feat(memory): track summary-only wiki git history #4168

Open: senamakel wants to merge 7 commits into tinyhumansai:main from senamakel:feat/memory-wiki-summary-git-history

@senamakel (Member) commented Jun 26, 2026

Summary

  • Add a summary-only git repository under the memory wiki root for derived summary-node markdown.
  • Commit wiki summaries at tree seal / summary ingest boundaries, not for every low-level file stage.
  • Include tree scope, level range, child count, token count, time range, summary ids, and paths in commit messages.
  • Add timestamped read-pointer tags plus a moving latest tag for cheap current-pointer lookup.
  • Add Rust E2E coverage for summary ingest git history and timestamped read tags.
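
The commit-message fields listed above could be rendered roughly as follows. This is a minimal sketch, not the PR's code: `SealCommitInfo`, `render_commit_message`, the subject-line wording, and the `key: value` trailer layout are all hypothetical names and choices made for illustration.

```rust
/// Hypothetical metadata for one summary commit. The field names are
/// illustrative and do not mirror the PR's `SummaryCommitBatch`.
pub struct SealCommitInfo {
    pub tree_scope: String,
    pub level_range: (u32, u32),
    pub child_count: usize,
    pub token_count: u64,
    pub time_range: (String, String),
    /// Pairs of (summary id, path relative to the wiki root).
    pub summaries: Vec<(String, String)>,
}

/// Render a one-line subject followed by a `key: value` body.
pub fn render_commit_message(info: &SealCommitInfo) -> String {
    let (lo, hi) = info.level_range;
    let (start, end) = (&info.time_range.0, &info.time_range.1);

    // Subject line, then a blank line, as git conventions expect.
    let mut msg = format!("wiki: seal {} L{lo}-L{hi}\n\n", info.tree_scope);

    msg.push_str(&format!("tree-scope: {}\n", info.tree_scope));
    msg.push_str(&format!("level-range: {lo}..={hi}\n"));
    msg.push_str(&format!("child-count: {}\n", info.child_count));
    msg.push_str(&format!("token-count: {}\n", info.token_count));
    msg.push_str(&format!("time-range: {start}..{end}\n"));

    // One line per summary so ids and paths stay greppable in `git log`.
    for (id, path) in &info.summaries {
        msg.push_str(&format!("summary: {id} {path}\n"));
    }
    msg
}
```

Keeping one field per line means history can be filtered with plain text search, for example by tree scope or summary id.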

Problem

  • Memory wiki files did not have a durable summary-only history.
  • A generic git repo rooted at the broader memory workspace risks mixing raw source artifacts and implementation details with derived summary nodes.
  • Read high-water marks need to be represented without creating extra summary-history commits.

Solution

  • Introduce memory_store::content::wiki_git to initialize <content_root>/wiki/.git lazily and track only summaries/** plus .gitignore.
  • Move git commits to the summary seal/ingest boundary so each commit represents a completed summary-tree update.
  • Store read pointers as lightweight timestamped tags at refs/tags/read/<hex(pointer_id)>/<timestamp> and update refs/tags/read/<hex(pointer_id)>/latest.
  • Keep raw artifacts, Obsidian defaults, and future non-summary wiki files out of git history.
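
The read-pointer ref layout described above can be sketched with the standard library alone. The function signatures here are assumptions (the PR names a `read_pointer_latest_ref` symbol, but its real signature is not shown), and the timestamp is assumed to arrive already formatted without characters that git forbids in ref names, such as `:` or spaces.

```rust
/// Lowercase hex encoding of the pointer id bytes, so that ids containing
/// `/`, spaces, or other ref-unsafe characters still yield a valid ref name.
fn hex_encode(bytes: &[u8]) -> String {
    bytes.iter().map(|b| format!("{b:02x}")).collect()
}

/// Hypothetical helper: timestamped tag ref for one read-pointer update.
/// `timestamp` is assumed to be ref-safe, e.g. `20260626T113500Z`.
pub fn read_pointer_tag_ref(pointer_id: &str, timestamp: &str) -> String {
    format!(
        "refs/tags/read/{}/{}",
        hex_encode(pointer_id.as_bytes()),
        timestamp
    )
}

/// Hypothetical helper: the moving `latest` ref for the same pointer.
pub fn read_pointer_latest_ref(pointer_id: &str) -> String {
    format!("refs/tags/read/{}/latest", hex_encode(pointer_id.as_bytes()))
}
```

With this layout, looking up the current pointer is a single ref read on `.../latest`, while the timestamped refs keep the history of earlier positions.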

Submission Checklist

If a section does not apply to this change, mark the item as N/A with a one-line reason. Do not delete items.

  • Tests added or updated (happy path + at least one failure / edge case) per Testing Strategy
  • Diff coverage ≥ 80% — N/A: full local coverage was not run; targeted unit/E2E coverage was added and CI coverage gate will enforce changed-line coverage.
  • Coverage matrix updated — N/A: internal memory persistence/history behavior, no feature-row change.
  • All affected feature IDs from the matrix are listed in the PR description under ## Related — N/A: no matrix feature IDs changed.
  • No new external network dependencies introduced (mock backend used per Testing Strategy)
  • Manual smoke checklist updated if this touches release-cut surfaces (docs/RELEASE-MANUAL-SMOKE.md) — N/A: internal memory persistence behavior, not a release manual smoke surface.
  • Linked issue closed via Closes #NNN in the ## Related section — N/A: no linked issue provided.

Impact

  • Runtime/platform impact: local Rust core memory persistence only.
  • Compatibility: wiki git repo is lazily initialized; existing summary markdown remains readable.
  • Security/privacy: git history is scoped to derived summary nodes only; raw source mirrors and non-summary wiki files are ignored/pruned from the index.
  • Performance: one git commit per summary seal/ingest boundary; read pointer tags do not create commits.
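
The summary-only scoping could be expressed as a path predicate along these lines. This is a sketch under assumptions: `is_tracked_wiki_path` is a hypothetical name, paths are taken to be relative to the wiki root, and the rule is read literally as "anything under `summaries/` plus `.gitignore`".

```rust
use std::path::{Component, Path};

/// Returns true when a wiki-root-relative path belongs in the
/// summary-only history: `.gitignore` or anything under `summaries/`.
pub fn is_tracked_wiki_path(rel: &Path) -> bool {
    // Reject absolute paths and any `.` or `..` components so a path
    // cannot escape the `summaries/` directory.
    if !rel.components().all(|c| matches!(c, Component::Normal(_))) {
        return false;
    }
    if rel == Path::new(".gitignore") {
        return true;
    }
    // `starts_with` compares whole components, so `summaries-old/x.md`
    // does not match. Require at least one component below `summaries`.
    rel.starts_with("summaries") && rel.components().count() > 1
}
```

A predicate like this can serve both when staging new files and when pruning entries that should no longer be in the index.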

Related

  • Closes: N/A
  • Follow-up PR(s)/TODOs: N/A

AI Authored PR Metadata (required for Codex/Linear PRs)

Keep this section for AI-authored PRs. For human-only PRs, mark each field N/A.

Linear Issue

  • Key: N/A
  • URL: N/A

Commit & Branch

  • Branch: feat/memory-wiki-summary-git-history
  • Commit SHA: 72346d103f27f1fd764e448df0792b2c21b7627c

Validation Run

  • pnpm --filter openhuman-app format:check
  • pnpm typecheck
  • Focused tests: cargo test --manifest-path Cargo.toml openhuman::memory_store::content::wiki_git --lib; bash scripts/test-rust-e2e.sh --suite memory_artifacts_e2e
  • Rust fmt/check (if changed): cargo fmt --all --manifest-path Cargo.toml; cargo check --manifest-path app/src-tauri/Cargo.toml via pre-push hook
  • Tauri fmt/check (if changed): cargo fmt --manifest-path app/src-tauri/Cargo.toml --all --check and cargo check --manifest-path app/src-tauri/Cargo.toml via pre-push hook

Validation Blocked

  • command: N/A
  • error: N/A
  • impact: N/A

Behavior Changes

  • Intended behavior change: memory wiki summary nodes now get summary-only git history, descriptive seal commits, and timestamped read-pointer tags.
  • User-visible effect: developers can inspect summary-tree history under the wiki git repository without raw artifacts being tracked.

Parity Contract

  • Legacy behavior preserved: summary markdown paths and SQLite content pointers remain unchanged.
  • Guard/fallback/dispatch parity checks: staging failures still abort before DB persistence, and git failures at seal/ingest boundaries now abort before DB persistence as well.
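
The ordering in the parity contract (stage, then git commit, then DB persistence, aborting on the first failure) can be illustrated with a small generic function. This is a sketch only: `seal_summary` and its closure parameters are hypothetical stand-ins for the real staging, `wiki_git::commit_summaries`, and persistence calls, and errors are plain strings for brevity.

```rust
/// Run the three steps in order and stop at the first error.
/// Each closure is a hypothetical stand-in for the real operation.
pub fn seal_summary<S, G, D>(stage: S, git_commit: G, db_persist: D) -> Result<(), String>
where
    S: FnOnce() -> Result<(), String>,
    G: FnOnce() -> Result<(), String>,
    D: FnOnce() -> Result<(), String>,
{
    // 1. Write the summary markdown to disk.
    stage().map_err(|e| format!("stage summary markdown: {e}"))?;
    // 2. Record the summary in wiki git history. A failure here
    //    returns early, so the database is never touched.
    git_commit().map_err(|e| format!("commit wiki summaries: {e}"))?;
    // 3. Persist the summary node only after both steps succeeded.
    db_persist().map_err(|e| format!("persist summary node: {e}"))
}
```

The ordering has a cost that review comments on this PR point out: if step 3 fails after step 2 succeeded, git history is ahead of the database, so this sequence would need compensation or a different order to keep the two stores consistent.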

Duplicate / Superseded PR Handling

  • Duplicate PR(s): N/A
  • Canonical PR: this PR
  • Resolution (closed/superseded/updated): N/A

Summary by CodeRabbit

  • New Features

    • Added Git-backed tracking for saved summary content, including read-progress pointers.
    • Summary ingests and seal operations now record summary history with richer metadata.
  • Bug Fixes

    • Ensured only summary markdown is kept in tracked history, preventing unrelated files from being included.
    • Improved consistency by writing summary history before database persistence.
  • Tests

    • Added integration coverage for summary history, commit details, and read-pointer tagging.

@senamakel senamakel requested a review from a team June 26, 2026 11:35
@coderabbitai (Bot, Contributor) commented Jun 26, 2026

Walkthrough

Adds a wiki-git-backed store for summary-node commits and read-pointer tags, then routes summary ingestion and seal paths through it before persistence. Tests cover tracked paths, commit metadata, and pointer refs.

Changes

Wiki Git-backed summary persistence

| Layer | File(s) | Summary |
| --- | --- | --- |
| Module surface | src/openhuman/memory_store/content/README.md, src/openhuman/memory_store/content/mod.rs, src/openhuman/memory_store/content/wiki_git.rs | wiki_git is exported, its summary commit payload types are added, and the content README lists the new module. |
| Summary commit pipeline | src/openhuman/memory_store/content/wiki_git.rs | The wiki repo is initialized on demand, .gitignore is rewritten to summary-only content, tracked non-summary paths are pruned, and commits are skipped when the staged tree matches HEAD. |
| Read-pointer tags | src/openhuman/memory_store/content/wiki_git.rs | Timestamped refs/tags/read/* names are generated, latest is advanced to the target commit, and the current latest pointer can be read back. |
| Summary write wiring | src/openhuman/memory_tree/ingest.rs, src/openhuman/memory_tree/tree/bucket_seal.rs | Summary ingestion and both seal paths now build SummaryCommitBatch values and call wiki_git::commit_summaries before persistence. |
| Validation coverage | src/openhuman/memory_store/content/wiki_git.rs, tests/memory_artifacts_e2e.rs | Unit and end-to-end tests cover summary-only Git tracking, commit message fields, path filtering, and read-pointer tag updates. |

Sequence Diagram(s)

sequenceDiagram
  participant ingest_summary
  participant commit_ingested_summary
  participant seal_one_level
  participant seal_explicit_children
  participant commit_summary_seal
  participant wiki_git as wiki_git::commit_summaries
  participant repo as content_root/wiki/.git
  participant db_tx as database transaction

  ingest_summary->>commit_ingested_summary: SummaryCommitBatch with one SummaryCommitEntry
  commit_ingested_summary->>wiki_git: commit_summaries(batch)
  wiki_git->>repo: stage summaries/** and .gitignore
  wiki_git-->>commit_ingested_summary: commit result
  commit_ingested_summary-->>ingest_summary: continue before DB persistence
  ingest_summary->>db_tx: persist SummaryNode and update L1 buffer

  seal_one_level->>commit_summary_seal: SummaryCommitBatch for sealed node
  seal_explicit_children->>commit_summary_seal: SummaryCommitBatch for sealed node
  commit_summary_seal->>wiki_git: commit_summaries(batch)
  wiki_git->>repo: stage summaries/** and .gitignore
  wiki_git-->>commit_summary_seal: commit result

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Suggested labels

feature, rust-core, memory

Suggested reviewers

  • M3gA-Mind

Poem

A bunny hopped through git tonight,
with summary leaves tucked in just right.
Read tags twinkled, soft and new,
and latest blinked a tiny “boo!” 🐇

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title is concise and accurately summarizes the main change: tracking summary-only wiki git history. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00%, which is sufficient. The required threshold is 80.00%. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Description Check | ✅ Passed | Check skipped: CodeRabbit’s high-level summary is enabled. |

Comment @coderabbitai help to get the list of available commands.

@coderabbitai (Bot) added the feature, memory, and rust-core labels Jun 26, 2026

@coderabbitai (Bot, Contributor) left a comment

Actionable comments posted: 6

🧹 Nitpick comments (3)
src/openhuman/memory_tree/ingest.rs (1)

165-189: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Add debug logs around the new wiki-git write.

commit_ingested_summary() adds a new external git call, but there is no debug/trace log on entry, success, or failure. That makes ingest↔wiki drift much harder to diagnose.

As per coding guidelines, "Add debug logging to entry/exit, branches, external calls, retries/timeouts, state transitions, and errors using log/tracing at debug/trace level in Rust".

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/openhuman/memory_tree/ingest.rs` around lines 165 - 189, Add debug/trace
logging around the new wiki-git write in commit_ingested_summary so ingest↔wiki
drift is easier to diagnose. Use the function name commit_ingested_summary as
the entry/exit point, log before calling wiki_git::commit_summaries with key
context like tree.id, tree.scope, node.id, and content_path, and log both
success and failure paths. Make sure the external call failure is logged with
the error detail before returning the Result.

Source: Coding guidelines

src/openhuman/memory_tree/tree/bucket_seal.rs (1)

893-935: 📐 Maintainability & Code Quality | 🔵 Trivial | 🏗️ Heavy lift

Extract this new wiki-git batching helper into its own module.

bucket_seal.rs is already well past the repo's module-size cap, and adding more cross-store commit logic here makes the seal flow even harder to reason about. Pulling the wiki-git batching into a sibling module would keep this file focused on seal orchestration.

As per coding guidelines, "Rust modules must be ≤ ~500 lines in size".

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/openhuman/memory_tree/tree/bucket_seal.rs` around lines 893 - 935, The
wiki-git batching logic is still embedded in commit_summary_seal within
bucket_seal.rs, which keeps the seal module oversized and mixes orchestration
with cross-store commit behavior. Move the commit_summaries batching code,
including SummaryCommitEntry and SummaryCommitBatch creation plus the call into
openhuman::memory_store::content::wiki_git, into a new sibling module and have
commit_summary_seal delegate to it; keep bucket_seal.rs focused on seal flow and
reuse the new helper by its module/function name.

Source: Coding guidelines

src/openhuman/memory_store/content/wiki_git.rs (1)

48-131: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Add the missing debug/trace coverage on the new public API.

commit_summaries, set_read_pointer_tag, and get_read_pointer_tag currently log some success paths, but they still miss the required entry/exit and branch logging for the empty-batch, repo-missing, and tag-missing cases.

As per path instructions, "Add debug logging to entry/exit, branches, external calls, retries/timeouts, state transitions, and errors using log/tracing at debug/trace level in Rust".

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/openhuman/memory_store/content/wiki_git.rs` around lines 48 - 131, The
new public APIs in wiki_git.rs are missing the required debug/trace coverage for
entry, exit, and important branches. Add log/tracing statements in
commit_summaries, set_read_pointer_tag, and get_read_pointer_tag for function
entry/exit, the empty-batch early return, repo-open/path branches, and the
missing-tag/missing-repository cases, while keeping existing success logging and
wrapping external calls like open_prepared_repo, repo.reference, and
repo.find_reference with trace/debug context and error logs.

Source: Path instructions

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/openhuman/memory_store/content/wiki_git.rs`:
- Around line 1-527: The wiki git implementation is too large for a new
root-level Rust file and should be moved into a dedicated wiki_git submodule.
Split the current wiki_git logic into content/wiki_git/mod.rs with sibling
modules such as ops.rs and tests.rs, keeping key symbols like commit_summaries,
set_read_pointer_tag, get_read_pointer_tag, and open_prepared_repo organized
there. Then update content/mod.rs to re-export the new module so existing
callers keep working, and ensure each resulting Rust module stays under the size
cap.
- Around line 143-154: The open/init flow in open_or_init_repo currently falls
back to Repository::init for every Repository::open failure, which can hide real
repository problems. Update the match on Repository::open(wiki_root) so only the
NotFound case triggers initialization, and propagate any other error from open
unchanged; keep the existing logging/context around the init path for the
Repository::init(wiki_root) branch.
- Around line 125-129: The `read_pointer_latest_ref` lookup in `wiki_git.rs` is
swallowing all `find_reference` failures as missing data, which hides real git
errors. Update the `repo.find_reference(&tag_ref)` handling so only
`ErrorCode::NotFound` is mapped to `Ok(None)`, while all other errors are
propagated back to the caller. Keep the existing `target` flow and
`read_pointer_latest_ref`/`find_reference` symbols so the fix is localized to
the reference lookup path.

In `@src/openhuman/memory_tree/ingest.rs`:
- Around line 141-144: The git commit in commit_ingested_summary is happening
before persist_and_buffer(), which can leave wiki history ahead of the SQLite/L1
state if the DB write fails. Move the wiki git update to after the DB commit
succeeds, or make the flow replayable from persisted state by using
compensation/outbox logic so memory_tree::ingest::commit_ingested_summary and
the persistence path advance together with the same summary_id.

In `@src/openhuman/memory_tree/tree/bucket_seal.rs`:
- Around line 718-730: The seal flow in bucket_seal currently writes git history
via commit_summary_seal before the durable SQLite transaction and follow-up
updates finish, which can leave HEAD sealed while the DB still thinks the item
is unsealed. Update the bucket_seal path around commit_summary_seal,
insert_summary_tx, backlinking, buffer clear/upsert, and enqueue handling so the
git write happens only after the DB boundary succeeds, or introduce
compensation/outbox handling to keep git and SQLite consistent across retries.
Apply the same fix to both affected call sites, including the one used by the
buffer seal retry path.

In `@tests/memory_artifacts_e2e.rs`:
- Around line 159-164: The git-history assertion in the memory_artifacts_e2e
test is too weak because raw/not-tracked.md is never created, so it cannot prove
raw wiki files are excluded. Update the test around ingest_summary() to first
write a real file under wiki/raw/ using the existing test helpers, then assert
tree_obj.get_path(...) still fails for that seeded raw file. Keep the fix
localized to the e2e test and use the existing tree_obj, ingest_summary, and
get_path flow so the summary-only contract is actually exercised.

---

Nitpick comments:
In `@src/openhuman/memory_store/content/wiki_git.rs`:
- Around line 48-131: The new public APIs in wiki_git.rs are missing the
required debug/trace coverage for entry, exit, and important branches. Add
log/tracing statements in commit_summaries, set_read_pointer_tag, and
get_read_pointer_tag for function entry/exit, the empty-batch early return,
repo-open/path branches, and the missing-tag/missing-repository cases, while
keeping existing success logging and wrapping external calls like
open_prepared_repo, repo.reference, and repo.find_reference with trace/debug
context and error logs.

In `@src/openhuman/memory_tree/ingest.rs`:
- Around line 165-189: Add debug/trace logging around the new wiki-git write in
commit_ingested_summary so ingest↔wiki drift is easier to diagnose. Use the
function name commit_ingested_summary as the entry/exit point, log before
calling wiki_git::commit_summaries with key context like tree.id, tree.scope,
node.id, and content_path, and log both success and failure paths. Make sure the
external call failure is logged with the error detail before returning the
Result.

In `@src/openhuman/memory_tree/tree/bucket_seal.rs`:
- Around line 893-935: The wiki-git batching logic is still embedded in
commit_summary_seal within bucket_seal.rs, which keeps the seal module oversized
and mixes orchestration with cross-store commit behavior. Move the
commit_summaries batching code, including SummaryCommitEntry and
SummaryCommitBatch creation plus the call into
openhuman::memory_store::content::wiki_git, into a new sibling module and have
commit_summary_seal delegate to it; keep bucket_seal.rs focused on seal flow and
reuse the new helper by its module/function name.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 232be485-c9ba-48eb-9736-db3840f21e92

📥 Commits

Reviewing files that changed from the base of the PR and between 6d5d62b and b48bb73.

📒 Files selected for processing (6)
  • src/openhuman/memory_store/content/README.md
  • src/openhuman/memory_store/content/mod.rs
  • src/openhuman/memory_store/content/wiki_git.rs
  • src/openhuman/memory_tree/ingest.rs
  • src/openhuman/memory_tree/tree/bucket_seal.rs
  • tests/memory_artifacts_e2e.rs

Comment thread src/openhuman/memory_store/content/wiki_git.rs Outdated
Comment thread src/openhuman/memory_store/content/wiki_git/mod.rs
Comment thread src/openhuman/memory_store/content/wiki_git/mod.rs
Comment thread src/openhuman/memory_tree/ingest.rs Outdated
Comment thread src/openhuman/memory_tree/tree/bucket_seal.rs Outdated
Comment thread tests/memory_artifacts_e2e.rs

@chatgpt-codex-connector (Bot) left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 7f694302e1

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you:

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/openhuman/memory_store/content/wiki_git/mod.rs Outdated
Comment thread src/openhuman/memory_store/content/wiki_git/mod.rs
coderabbitai Bot previously approved these changes Jun 26, 2026

@chatgpt-codex-connector (Bot) left a comment

💡 Codex Review

Reviewed commit: 4378efc1d9

Comment thread src/openhuman/memory_tree/tree/bucket_seal.rs

@chatgpt-codex-connector (Bot) left a comment

💡 Codex Review

Reviewed commit: b7c97bcc3a

Comment thread src/openhuman/memory_tree/ingest.rs

@chatgpt-codex-connector (Bot) left a comment

💡 Codex Review

Reviewed commit: 72346d103f

.add_path(Path::new(path))
.with_context(|| format!("stage wiki summary: {path}"))?;
}
stage_existing_summary_paths(&mut index, &wiki_root)?;

P2: Avoid committing summaries that failed to persist

When an ingest/seal stages the markdown file and then the SQLite transaction fails before inserting the mem_tree_summaries row, that file is left under wiki/summaries with no DB record. This unconditional sweep stages every existing summary file on the next successful commit, so the wiki Git history and read pointers can start exposing aborted/orphan summaries that the core database does not know about. Limit recovery staging to summaries known to have committed in SQLite, or clean up staged files on DB failure.
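
One way to read the first suggestion (limit recovery staging to summaries known to SQLite) is as a filter over the files found on disk. The sketch below is illustrative only: `paths_safe_to_stage` is a hypothetical helper, and it assumes the caller can supply each file's summary id and the set of ids that have a committed `mem_tree_summaries` row.

```rust
use std::collections::HashSet;

/// Keep only on-disk summary paths whose id has a committed database row.
/// `on_disk` holds (summary id, wiki-relative path) pairs; how those ids
/// are derived from files is outside this sketch.
pub fn paths_safe_to_stage<'a>(
    on_disk: &'a [(String, String)],
    persisted_ids: &HashSet<String>,
) -> Vec<&'a str> {
    on_disk
        .iter()
        // Orphans left by an aborted ingest/seal have no DB row and are skipped.
        .filter(|(id, _)| persisted_ids.contains(id))
        .map(|(_, path)| path.as_str())
        .collect()
}
```

Orphan files stay on disk under this approach but never enter git history; cleaning them up on DB failure, the comment's other option, would be a separate step.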
