Atomically swap vector search engine index files on save #1460

Open
edwinyyyu wants to merge 1 commit into MemMachine:main from edwinyyyu:atomic_index_swap

Conversation

@edwinyyyu

Contributor

Problem

SQLiteVectorStore persists each collection's index by calling the search engine's save(), which wrote directly to the final path. A crash partway through that write left a truncated/corrupt file on disk.

This is worse than a transient I/O error because of the durability contract: once index_saved=True, the on-disk index is treated as authoritative — a missing or unloadable file raises IndexLoadError rather than silently rebuilding an empty index. So an interrupted save could leave a collection permanently unrecoverable.
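The contract described above can be sketched roughly as follows. This is an illustration only, not the project's code: `load_or_create`, its parameters, and the local `IndexLoadError` stand-in are hypothetical names chosen for the example.

```python
from collections.abc import Callable
from pathlib import Path


class IndexLoadError(Exception):
    """Local stand-in for the project's exception of the same name."""


def load_or_create(
    path: Path,
    index_saved: bool,
    load: Callable[[Path], object],
    create_empty: Callable[[], object],
) -> object:
    """Hypothetical helper illustrating the durability contract."""
    if not index_saved:
        # Nothing was ever persisted, so starting empty is legitimate.
        return create_empty()
    if not path.exists():
        # A save was recorded, so a missing file is a hard error.
        raise IndexLoadError(f"index file missing: {path}")
    try:
        return load(path)
    except Exception as exc:
        # A truncated or corrupt file must not fall back to an empty index.
        raise IndexLoadError(f"index file unloadable: {path}") from exc
```

Under this reading, a truncated file left by an interrupted save lands in the last branch on every subsequent load, which is why the collection stays broken until someone intervenes.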

Fix

Write the index to a sibling temp file and swap it into place with os.replace (via Path.replace), which is atomic on POSIX and Windows when source and destination share a filesystem (guaranteed here — the temp file is adjacent to the target). As a result:

  • A reader sees either the old index or the new one, never a partial write.
  • A failed save removes the temp file and leaves the previous index intact.
  • Leftover temp files are cleared on load() so a crash doesn't leak them across restarts.

A best-effort fsync of the temp file before the swap guards against the data-vs-rename durability gap on power loss; the atomic swap remains the actual correctness guarantee.

Where it lives

Implemented in the engines via a shared index_persistence helper (atomic_index_write, clear_stale_index_temp), wired into both HnswlibVectorSearchEngine and USearchVectorSearchEngine, not in SQLiteVectorStore/SQLiteVectorStoreCollection. The index save location and the number of files written differ across engine implementations, so the atomic-swap responsibility belongs with the engine that owns its file layout.

Tests

  • test_index_persistence.py: swap-on-success, preserve-existing + cleanup on failure, no target created on failure, stale-temp clearing.
  • Per-engine integration tests (hnswlib + usearch): no temp file lingers after a successful save; load() clears a stale temp file and still loads correctly.

All vector_search_engine and test_sqlite_vector_store suites pass; ruff and ty clean.

🤖 Generated with Claude Code

SQLiteVectorStore persists each collection's index by calling the search
engine's save(), which wrote directly to the final path. A crash mid-write
left a truncated/corrupt file. Because index_saved=True makes the on-disk
index a durable contract (missing/corrupt is a hard IndexLoadError, not a
silent empty rebuild), an interrupted save could render a collection
unrecoverable.

Write the index to a sibling temp file and swap it into place with
os.replace (atomic on POSIX and Windows on the same filesystem), so a reader
sees either the old or new index, never a partial write; a failed save leaves
the previous index intact. Leftover temp files are cleared on load so a crash
does not leak them across restarts.

Implemented in the engines (shared index_persistence helper) rather than in
SQLiteVectorStore/SQLiteVectorStoreCollection, since the index save location
and number of files written differ across engine implementations.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>