feat(memory): track summary-only wiki git history#4168
📝 Walkthrough
Adds a wiki-git-backed store for summary-node commits and read-pointer tags, then routes summary ingestion and seal paths through it before persistence. Tests cover tracked paths, commit metadata, and pointer refs.
Changes
Wiki Git-backed summary persistence
Sequence Diagram(s)
sequenceDiagram
participant ingest_summary
participant commit_ingested_summary
participant seal_one_level
participant seal_explicit_children
participant commit_summary_seal
participant wiki_git as wiki_git::commit_summaries
participant repo as content_root/wiki/.git
participant db_tx as database transaction
ingest_summary->>commit_ingested_summary: SummaryCommitBatch with one SummaryCommitEntry
commit_ingested_summary->>wiki_git: commit_summaries(batch)
wiki_git->>repo: stage summaries/** and .gitignore
wiki_git-->>commit_ingested_summary: commit result
commit_ingested_summary-->>ingest_summary: continue before DB persistence
ingest_summary->>db_tx: persist SummaryNode and update L1 buffer
seal_one_level->>commit_summary_seal: SummaryCommitBatch for sealed node
seal_explicit_children->>commit_summary_seal: SummaryCommitBatch for sealed node
commit_summary_seal->>wiki_git: commit_summaries(batch)
wiki_git->>repo: stage summaries/** and .gitignore
wiki_git-->>commit_summary_seal: commit result
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
Actionable comments posted: 6
🧹 Nitpick comments (3)
src/openhuman/memory_tree/ingest.rs (1)
165-189: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win
Add debug logs around the new wiki-git write.
`commit_ingested_summary()` adds a new external git call, but there is no debug/trace log on entry, success, or failure. That makes ingest↔wiki drift much harder to diagnose.
As per coding guidelines, "Add debug logging to entry/exit, branches, external calls, retries/timeouts, state transitions, and errors using `log`/`tracing` at `debug`/`trace` level in Rust".
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/openhuman/memory_tree/ingest.rs` around lines 165 - 189, Add debug/trace logging around the new wiki-git write in commit_ingested_summary so ingest↔wiki drift is easier to diagnose. Use the function name commit_ingested_summary as the entry/exit point, log before calling wiki_git::commit_summaries with key context like tree.id, tree.scope, node.id, and content_path, and log both success and failure paths. Make sure the external call failure is logged with the error detail before returning the Result.
Source: Coding guidelines
src/openhuman/memory_tree/tree/bucket_seal.rs (1)
893-935: 📐 Maintainability & Code Quality | 🔵 Trivial | 🏗️ Heavy lift
Extract this new wiki-git batching helper into its own module.
`bucket_seal.rs` is already well past the repo's module-size cap, and adding more cross-store commit logic here makes the seal flow even harder to reason about. Pulling the wiki-git batching into a sibling module would keep this file focused on seal orchestration.
As per coding guidelines, "Rust modules must be ≤ ~500 lines in size".
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/openhuman/memory_tree/tree/bucket_seal.rs` around lines 893 - 935, The wiki-git batching logic is still embedded in commit_summary_seal within bucket_seal.rs, which keeps the seal module oversized and mixes orchestration with cross-store commit behavior. Move the commit_summaries batching code, including SummaryCommitEntry and SummaryCommitBatch creation plus the call into openhuman::memory_store::content::wiki_git, into a new sibling module and have commit_summary_seal delegate to it; keep bucket_seal.rs focused on seal flow and reuse the new helper by its module/function name.
Source: Coding guidelines
src/openhuman/memory_store/content/wiki_git.rs (1)
48-131: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win
Add the missing debug/trace coverage on the new public API.
`commit_summaries`, `set_read_pointer_tag`, and `get_read_pointer_tag` currently log some success paths, but they still miss the required entry/exit and branch logging for the empty-batch, repo-missing, and tag-missing cases.
As per path instructions, "Add debug logging to entry/exit, branches, external calls, retries/timeouts, state transitions, and errors using `log`/`tracing` at `debug`/`trace` level in Rust".
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/openhuman/memory_store/content/wiki_git.rs` around lines 48 - 131, The new public APIs in wiki_git.rs are missing the required debug/trace coverage for entry, exit, and important branches. Add log/tracing statements in commit_summaries, set_read_pointer_tag, and get_read_pointer_tag for function entry/exit, the empty-batch early return, repo-open/path branches, and the missing-tag/missing-repository cases, while keeping existing success logging and wrapping external calls like open_prepared_repo, repo.reference, and repo.find_reference with trace/debug context and error logs.
Source: Path instructions
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/openhuman/memory_store/content/wiki_git.rs`:
- Around line 1-527: The wiki git implementation is too large for a new
root-level Rust file and should be moved into a dedicated wiki_git submodule.
Split the current wiki_git logic into content/wiki_git/mod.rs with sibling
modules such as ops.rs and tests.rs, keeping key symbols like commit_summaries,
set_read_pointer_tag, get_read_pointer_tag, and open_prepared_repo organized
there. Then update content/mod.rs to re-export the new module so existing
callers keep working, and ensure each resulting Rust module stays under the size
cap.
- Around line 143-154: The open/init flow in open_or_init_repo currently falls
back to Repository::init for every Repository::open failure, which can hide real
repository problems. Update the match on Repository::open(wiki_root) so only the
NotFound case triggers initialization, and propagate any other error from open
unchanged; keep the existing logging/context around the init path for the
Repository::init(wiki_root) branch.
- Around line 125-129: The `read_pointer_latest_ref` lookup in `wiki_git.rs` is
swallowing all `find_reference` failures as missing data, which hides real git
errors. Update the `repo.find_reference(&tag_ref)` handling so only
`ErrorCode::NotFound` is mapped to `Ok(None)`, while all other errors are
propagated back to the caller. Keep the existing `target` flow and
`read_pointer_latest_ref`/`find_reference` symbols so the fix is localized to
the reference lookup path.
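
The two error-handling comments above ask for the same pattern: treat only a "not found" result as the expected case and propagate everything else. A minimal sketch of that pattern follows. It assumes a hypothetical `RepoError` enum in place of the real git error type, and the closures stand in for calls such as `Repository::open`, `Repository::init`, and `repo.find_reference`; none of these names are taken from the PR's code.

```rust
/// Hypothetical stand-in for a git error; only the NotFound / other split matters here.
#[derive(Debug, Clone, PartialEq, Eq)]
enum RepoError {
    NotFound,
    Other(String),
}

/// Open-or-init: fall back to `init` only when `open` reports NotFound.
/// Any other open failure is returned unchanged so real problems stay visible.
fn open_or_init<R>(
    open: impl FnOnce() -> Result<R, RepoError>,
    init: impl FnOnce() -> Result<R, RepoError>,
) -> Result<R, RepoError> {
    match open() {
        Ok(repo) => Ok(repo),
        Err(RepoError::NotFound) => init(),
        Err(other) => Err(other),
    }
}

/// Optional lookup: a missing reference maps to `Ok(None)`,
/// while every other failure propagates to the caller.
fn find_optional<T>(
    lookup: impl FnOnce() -> Result<T, RepoError>,
) -> Result<Option<T>, RepoError> {
    match lookup() {
        Ok(value) => Ok(Some(value)),
        Err(RepoError::NotFound) => Ok(None),
        Err(other) => Err(other),
    }
}
```

With this shape, a corrupt or unreadable repository surfaces as an error instead of being silently re-initialized or reported as "no tag yet".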
In `@src/openhuman/memory_tree/ingest.rs`:
- Around line 141-144: The git commit in commit_ingested_summary is happening
before persist_and_buffer(), which can leave wiki history ahead of the SQLite/L1
state if the DB write fails. Move the wiki git update to after the DB commit
succeeds, or make the flow replayable from persisted state by using
compensation/outbox logic so memory_tree::ingest::commit_ingested_summary and
the persistence path advance together with the same summary_id.
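
The ordering concern above can be illustrated with a small sketch. It assumes in-memory vectors as stand-ins for the SQLite store and the wiki git history; `Stores`, `persist`, `commit_to_git`, and the `fail_db` knob are hypothetical names for illustration, not the PR's API.

```rust
/// Hypothetical in-memory stand-ins for the database and the wiki git history.
#[derive(Default)]
struct Stores {
    db_rows: Vec<String>,     // summary ids persisted in the database
    git_commits: Vec<String>, // summary ids committed to wiki git
    fail_db: bool,            // knob that simulates a failed DB transaction
}

impl Stores {
    /// Simulated DB write; fails when `fail_db` is set.
    fn persist(&mut self, summary_id: &str) -> Result<(), String> {
        if self.fail_db {
            return Err(format!("db write failed for {summary_id}"));
        }
        self.db_rows.push(summary_id.to_string());
        Ok(())
    }

    /// Simulated git commit; always succeeds in this sketch.
    fn commit_to_git(&mut self, summary_id: &str) -> Result<(), String> {
        self.git_commits.push(summary_id.to_string());
        Ok(())
    }

    /// DB first, git second: a failed DB write returns early,
    /// so git history can never run ahead of the persisted state.
    fn ingest(&mut self, summary_id: &str) -> Result<(), String> {
        self.persist(summary_id)?;
        self.commit_to_git(summary_id)
    }
}
```

This ordering only closes one side of the gap: if the git step fails after the DB write succeeds, git lags behind the database, which is the case a replayable outbox or compensation step would need to cover.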
In `@src/openhuman/memory_tree/tree/bucket_seal.rs`:
- Around line 718-730: The seal flow in bucket_seal currently writes git history
via commit_summary_seal before the durable SQLite transaction and follow-up
updates finish, which can leave HEAD sealed while the DB still thinks the item
is unsealed. Update the bucket_seal path around commit_summary_seal,
insert_summary_tx, backlinking, buffer clear/upsert, and enqueue handling so the
git write happens only after the DB boundary succeeds, or introduce
compensation/outbox handling to keep git and SQLite consistent across retries.
Apply the same fix to both affected call sites, including the one used by the
buffer seal retry path.
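
One way to read the "compensation/outbox" option is sketched below, under stated assumptions: the outbox entry is recorded inside the same DB transaction as the sealed summary, and a separate retryable step drains it into git. `Outbox`, `enqueue`, and `drain` are hypothetical names, and the `git_commit` closure stands in for the real wiki-git write.

```rust
use std::collections::VecDeque;

/// Hypothetical outbox: entries are added alongside the DB write,
/// then drained into wiki git by a step that is safe to retry.
#[derive(Default)]
struct Outbox {
    pending: VecDeque<String>, // summary ids persisted in the DB but not yet in git
    committed: Vec<String>,    // summary ids whose git commit succeeded
}

impl Outbox {
    /// Assumed to run inside the DB transaction that inserts the summary row.
    fn enqueue(&mut self, summary_id: &str) {
        self.pending.push_back(summary_id.to_string());
    }

    /// Drain pending entries in order. An entry is removed only after its git
    /// commit succeeds, so a failure leaves it queued for the next retry.
    fn drain(
        &mut self,
        mut git_commit: impl FnMut(&str) -> Result<(), String>,
    ) -> Result<usize, String> {
        let mut done = 0;
        while let Some(id) = self.pending.front().cloned() {
            git_commit(&id)?;
            self.pending.pop_front();
            self.committed.push(id);
            done += 1;
        }
        Ok(done)
    }
}
```

Because entries leave the queue only after a successful commit, a retry resumes at the first summary that git has not yet recorded, and both seal call sites could share the same drain step.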
In `@tests/memory_artifacts_e2e.rs`:
- Around line 159-164: The git-history assertion in the memory_artifacts_e2e
test is too weak because raw/not-tracked.md is never created, so it cannot prove
raw wiki files are excluded. Update the test around ingest_summary() to first
write a real file under wiki/raw/ using the existing test helpers, then assert
tree_obj.get_path(...) still fails for that seeded raw file. Keep the fix
localized to the e2e test and use the existing tree_obj, ingest_summary, and
get_path flow so the summary-only contract is actually exercised.
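
The test comment above comes down to seeding a file that should be excluded and then asserting it stays out. A std-only sketch of that check follows. `is_tracked_wiki_path` and `collect_tracked` are hypothetical helpers that mirror the "`summaries/**` plus `.gitignore`" rule on a plain directory; they do not use the PR's git-backed `tree_obj.get_path` flow.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical allowlist: only `.gitignore` and files under `summaries/` count as tracked.
/// `rel_path` is relative to the wiki root and uses `/` separators.
fn is_tracked_wiki_path(rel_path: &str) -> bool {
    rel_path == ".gitignore"
        || rel_path
            .strip_prefix("summaries/")
            .map_or(false, |rest| !rest.is_empty())
}

/// Recursively list files under `root` that pass the allowlist, as sorted relative paths.
fn collect_tracked(root: &Path) -> io::Result<Vec<String>> {
    fn walk(root: &Path, dir: &Path, out: &mut Vec<String>) -> io::Result<()> {
        for entry in fs::read_dir(dir)? {
            let path = entry?.path();
            if path.is_dir() {
                walk(root, &path, out)?;
            } else if let Ok(rel) = path.strip_prefix(root) {
                // Normalize separators so the allowlist check is platform independent.
                let rel = rel.to_string_lossy().replace('\\', "/");
                if is_tracked_wiki_path(&rel) {
                    out.push(rel);
                }
            }
        }
        Ok(())
    }
    let mut out = Vec::new();
    walk(root, root, &mut out)?;
    out.sort();
    Ok(out)
}
```

A test built on this would first write a real `raw/not-tracked.md` next to a summary file and then assert that only the summary and `.gitignore` appear in the tracked set, so the exclusion is exercised against a file that actually exists.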
📒 Files selected for processing (6)
- src/openhuman/memory_store/content/README.md
- src/openhuman/memory_store/content/mod.rs
- src/openhuman/memory_store/content/wiki_git.rs
- src/openhuman/memory_tree/ingest.rs
- src/openhuman/memory_tree/tree/bucket_seal.rs
- tests/memory_artifacts_e2e.rs
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 7f694302e1
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
Reviewed commit: 4378efc1d9
Reviewed commit: b7c97bcc3a
Reviewed commit: 72346d103f
            .add_path(Path::new(path))
            .with_context(|| format!("stage wiki summary: {path}"))?;
    }
    stage_existing_summary_paths(&mut index, &wiki_root)?;
Avoid committing summaries that failed to persist
When an ingest/seal stages the markdown file and then the SQLite transaction fails before inserting the mem_tree_summaries row, that file is left under wiki/summaries with no DB record. This unconditional sweep stages every existing summary file on the next successful commit, so the wiki Git history and read pointers can start exposing aborted/orphan summaries that the core database does not know about. Limit recovery staging to summaries known to have committed in SQLite, or clean up staged files on DB failure.
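
A minimal sketch of the first suggested remedy, limiting recovery staging to summaries the database knows about, is shown here. It assumes the caller can supply the set of content paths whose rows committed in SQLite; `partition_recovery_paths` and its parameters are hypothetical names, not functions from the PR.

```rust
use std::collections::BTreeSet;

/// Split on-disk summary paths into those safe to stage and orphans to skip or clean up.
/// `persisted` is assumed to hold the content paths of rows that committed in the database.
fn partition_recovery_paths<'a>(
    on_disk: &'a [String],
    persisted: &BTreeSet<String>,
) -> (Vec<&'a str>, Vec<&'a str>) {
    let mut stage = Vec::new();
    let mut orphans = Vec::new();
    for path in on_disk {
        if persisted.contains(path) {
            // The database has a row for this file, so it may enter git history.
            stage.push(path.as_str());
        } else {
            // No committed row: keep the file out of the commit.
            orphans.push(path.as_str());
        }
    }
    (stage, orphans)
}
```

The orphan list gives the caller a choice between leaving those files unstaged and deleting them, which corresponds to the second remedy of cleaning up after a failed DB write.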
Summary
- `latest` tag for cheap current-pointer lookup.
Problem
Solution
- `memory_store::content::wiki_git` to initialize `<content_root>/wiki/.git` lazily and track only `summaries/**` plus `.gitignore`.
- `refs/tags/read/<hex(pointer_id)>/<timestamp>` and update `refs/tags/read/<hex(pointer_id)>/latest`.
Submission Checklist
- `## Related` - N/A: no matrix feature IDs changed.
- `docs/RELEASE-MANUAL-SMOKE.md`) - N/A: internal memory persistence behavior, not a release manual smoke surface.
- `Closes #NNN` in the `## Related` section - N/A: no linked issue provided.
Impact
Related
AI Authored PR Metadata (required for Codex/Linear PRs)
Linear Issue
Commit & Branch
- `feat/memory-wiki-summary-git-history`
- `72346d103f27f1fd764e448df0792b2c21b7627c`
Validation Run
- `pnpm --filter openhuman-app format:check`
- `pnpm typecheck`
- `cargo test --manifest-path Cargo.toml openhuman::memory_store::content::wiki_git --lib`; `bash scripts/test-rust-e2e.sh --suite memory_artifacts_e2e`
- `cargo fmt --all --manifest-path Cargo.toml`; `cargo check --manifest-path app/src-tauri/Cargo.toml` via pre-push hook
- `cargo fmt --manifest-path app/src-tauri/Cargo.toml --all --check` and `cargo check --manifest-path app/src-tauri/Cargo.toml` via pre-push hook
Validation Blocked
- command: N/A
- error: N/A
- impact: N/A
Behavior Changes
Parity Contract
Duplicate / Superseded PR Handling
Summary by CodeRabbit
New Features
Bug Fixes
Tests