Summary
The push worker unconditionally clears dirty_since = None after wait_until_done(), creating a up-to-30-second window where concurrent user writes can be silently lost. If a user writes to a file while the push worker is waiting for server processing, the pull reconciler can overwrite their latest edits with stale server content.
Root Cause
In push.rs, after a successful POST/PATCH + wait_until_done():
// push.rs:194-197
wait_until_done(api, remote_id, WAIT_DONE_MAX).await; // blocks up to 30s
db.set_mirrored_state(ino, None, Some("done"), Some(now_ms()));
db.set_dirty_since(ino, None); // UNCONDITIONAL CLEAR
set_dirty_since(ino, None) clears the dirty flag without checking whether a new write has occurred since the job was claimed. This affects all 3 push paths (PATCH, POST, and binary upload).
Reproduction Timeline
T=0s User writes "version A" -> dirty_since = T0, push enqueued
T=0.5s Push worker claims job, POSTs content to server
T=1s wait_until_done() starts polling (up to 30s)
T=2s User writes "version B" -> dirty_since = T2
(push_queue pending slot correctly filled)
T=4s wait_until_done() returns -> set_dirty_since(ino, None)
Destroys T2's dirty flag!
T=5s Pull reconciler runs, fetches server doc (still "version A")
dirty_since = None -> no protection -> OVERWRITES local file
"version B" is SILENTLY LOST
Compounding Issue: wait_until_done return value ignored
The return value of wait_until_done() (indicating timeout vs success) is completely ignored:
wait_until_done(api, remote_id, WAIT_DONE_MAX).await; // return value discarded
db.set_mirrored_state(ino, None, Some("done"), Some(now_ms())); // stamps "done" anyway
If the server times out, the code still stamps last_status = "done", causing the inflight poller to stop tracking the document. If the server subsequently fails, the error is never detected.
Why It's Hard to Catch
- The pending slot mechanism works correctly, the push queue properly coalesces writes
dirty_since is refreshed by the user's write, but then destroyed 2s later by the push worker
- Only manifests when: server processing > 500ms AND user saves during that window AND a pull loop runs before the second push completes
- With autosave editors and the 30s
WAIT_DONE_MAX + 5s pull interval, this is very plausible in production
Proposed Fix
Compare-and-swap: Capture dirty_since at job claim time, only clear if unchanged.
// In push_queue_claim_next: snapshot the current value
let dirty_since_at_claim = db.get_dirty_since(ino);
// After wait_until_done: conditional clear instead of unconditional
db.clear_dirty_since_if_unchanged(ino, dirty_since_at_claim);
// SQL: "UPDATE fs_inode SET dirty_since = NULL WHERE ino = ?1 AND dirty_since = ?2"
This ensures that if the user writes during the wait window, the new dirty_since is preserved and the pull reconciler correctly returns SkippedDirty.
Summary
The push worker unconditionally clears
dirty_since = Noneafterwait_until_done(), creating a up-to-30-second window where concurrent user writes can be silently lost. If a user writes to a file while the push worker is waiting for server processing, the pull reconciler can overwrite their latest edits with stale server content.Root Cause
In
push.rs, after a successful POST/PATCH +wait_until_done():set_dirty_since(ino, None)clears the dirty flag without checking whether a new write has occurred since the job was claimed. This affects all 3 push paths (PATCH, POST, and binary upload).Reproduction Timeline
Compounding Issue:
wait_until_donereturn value ignoredThe return value of
wait_until_done()(indicating timeout vs success) is completely ignored:If the server times out, the code still stamps
last_status = "done", causing the inflight poller to stop tracking the document. If the server subsequently fails, the error is never detected.Why It's Hard to Catch
dirty_sinceis refreshed by the user's write, but then destroyed 2s later by the push workerWAIT_DONE_MAX+ 5s pull interval, this is very plausible in productionProposed Fix
Compare-and-swap: Capture
dirty_sinceat job claim time, only clear if unchanged.This ensures that if the user writes during the wait window, the new
dirty_sinceis preserved and the pull reconciler correctly returnsSkippedDirty.