Batch deletes in Neo4jVectorGraphStore #1111

Closed
edwinyyyu wants to merge 2 commits into MemMachine:main from edwinyyyu:batch_delete

Conversation

@edwinyyyu

Contributor

Purpose of the change

Large deletions can fail for lack of resources or time out.

Description

Deletes are now batched, with 10000 rows per batch.
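As a rough illustration of the batching idea (a sketch under stated assumptions, not the code in this PR), the loop below keeps deleting up to `batch_size` rows per call until a call removes fewer rows than the limit. The names `delete_in_batches`, `FakeStore`, and `DELETE_BATCH_QUERY`, along with the `Episode` label and `session_id` property, are hypothetical and chosen only for the example.

```python
import asyncio
from collections.abc import Awaitable, Callable

# Hypothetical Cypher for a single batch. The label and property are
# illustrative; they are not taken from the MemMachine schema.
DELETE_BATCH_QUERY = """
MATCH (n:Episode {session_id: $session_id})
WITH n LIMIT $batch_size
DETACH DELETE n
RETURN count(*) AS deleted
"""


async def delete_in_batches(
    delete_batch: Callable[[int], Awaitable[int]],
    batch_size: int = 10_000,
) -> int:
    """Call delete_batch(batch_size) until it removes fewer rows than the limit.

    Each call is expected to commit on its own, so the overall deletion is
    not atomic: a failure part-way through leaves earlier batches deleted.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    total = 0
    while True:
        deleted = await delete_batch(batch_size)
        total += deleted
        if deleted < batch_size:
            return total


class FakeStore:
    """In-memory stand-in for a graph store, used only to exercise the loop."""

    def __init__(self, rows: int) -> None:
        self.rows = rows
        self.calls = 0

    async def delete_batch(self, limit: int) -> int:
        self.calls += 1
        deleted = min(limit, self.rows)
        self.rows -= deleted
        return deleted
```

With a real driver, `delete_batch` would run something like `DELETE_BATCH_QUERY` and return the `deleted` count from the result. Stopping on a short batch costs one extra round trip when the row count is an exact multiple of `batch_size`.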

Type of change

Breaks existing behavior: deletions are no longer atomic.

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Refactor (does not change functionality, e.g., code style improvements, linting)
  • Documentation update
  • Project Maintenance (updates to build scripts, CI, etc., that do not affect the main project)
  • Security (improves security without changing functionality)

How Has This Been Tested?

  • Unit Test
  • Integration Test
  • End-to-end Test
  • Test Script (please provide)
  • Manual verification (list step-by-step instructions)

Checklist

  • I have signed the commit(s) within this pull request
  • My code follows the style guidelines of this project (See STYLE_GUIDE.md)
  • I have performed a self-review of my own code
  • I have commented my code
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added unit tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • Any dependent changes have been merged and published in downstream modules
  • I have checked my code and corrected any misspellings

Maintainer Checklist

  • Confirmed all checks passed
  • Contributor has signed the commit(s)
  • Reviewed the code
  • Run, Tested, and Verified the change(s) work as expected

Signed-off-by: Edwin Yu <edwinyyyu@gmail.com>
Signed-off-by: Edwin Yu <edwinyyyu@gmail.com>
@edwinyyyu

Contributor Author

This change makes it possible to deadlock if certain deletions are done concurrently.

@dora-zhang


I deployed the version with this PR, but I still see the following warning in the logs:

    [WARNING] memmachine.server.api_v2.exceptions - exception handling request, code 500, message: Unable to delete project, payload: code=500 message='Unable to delete project' internal_error='failed to obtain a connection from the pool within 60.0s (timeout)' exception='ConnectionAcquisitionTimeoutError' trace='Traceback (most recent call last):
      File "/app/.venv/lib/python3.12/site-packages/mmp_api_server/router.py", line 423, in delete_project_internal
        await memmachine.delete_session(session_data)
      File "/app/.venv/lib/python3.12/site-packages/memmachine/main/memmachine.py", line 371, in delete_session
        await asyncio.gather(*tasks)
      File "/app/.venv/lib/python3.12/site-packages/memmachine/main/memmachine.py", line 348, in _delete_episodic_memory
        await episodic_memory_manager.delete_episodic_session(
      File "/app/.venv/lib/python3.12/site-packages/memmachine/episodic_memory/episodic_memory_manager.py", line 300, in delete_episodic_session
        await instance.delete_session_episodes()
      File "/app/.venv/lib/python3.12/site-packages/memmachine/episodic_memory/episodic_memory.py", line 285, in delete_session_episodes
        await asyncio.gather(*tasks)
      File "/app/.venv/lib/python3.12/site-packages/memmachine/episodic_memory/long_term_memory/long_term_memory.py", line 221, in delete_matching_episodes
        await self._declarative_memory.delete_episodes(
      File "/app/.venv/lib/python3.12/site-packages/memmachine/episodic_memory/declarative_memory/declarative_memory.py", line 603, in delete_episodes
        for derivative_nodes in await asyncio.gather(
                                ^^^^^^^^^^^^^^^^^^^^^
      File "/app/.venv/lib/python3.12/site-packages/memmachine/common/vector_graph_store/neo4j_vector_graph_store.py", line 823, in search_related_nodes
        records, _, _ = await self._driver.execute_query(
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/app/.venv/lib/python3.12/site-packages/neo4j/_async/driver.py", line 947, in execute_query
        return await session._run_transaction(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/app/.venv/lib/python3.12/site-packages/neo4j/_async/work/session.py", line 532, in _run_transaction
        await self._open_transaction(
      File "/app/.venv/lib/python3.12/site-packages/neo4j/_async/work/session.py", line 398, in _open_transaction
        await self._connect(access_mode=access_mode)
      File "/app/.venv/lib/python3.12/site-packages/neo4j/_async/work/session.py", line 126, in _connect
        await super()._connect(
      File "/app/.venv/lib/python3.12/site-packages/neo4j/_async/work/workspace.py", line 184, in _connect
        self._connection = await self._pool.acquire(**acquire_kwargs_)
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/app/.venv/lib/python3.12/site-packages/neo4j/_async/io/_pool.py", line 1254, in acquire
        connection = await self._acquire(
                     ^^^^^^^^^^^^^^^^^^^^
      File "/app/.venv/lib/python3.12/site-packages/neo4j/_async/io/_pool.py", line 416, in _acquire
        raise ConnectionAcquisitionTimeoutError(
    neo4j.exceptions.ConnectionAcquisitionTimeoutError: failed to obtain a connection from the pool within 60.0s (timeout)'

Probably we also need to increase the Neo4j connection pool size?

The error occurred when I tried deleting 6000 memories.

@edwinyyyu

Contributor Author

Probably we also need to increase the Neo4j connection pool size?

This is set on the Neo4j server-side.

@sscargal

sscargal commented Feb 25, 2026

Contributor

@dora-zhang

Probably we also need to increase the Neo4j connection pool size?

In MemMachine v0.2.4, we introduced configuration options that let you change the connection pool size. Make sure you have the latest configuration.yml file from the sample; you should see:

    my_storage_id:
      provider: neo4j
      config:
        uri: 'bolt://localhost:7687'
        username: neo4j
        password: <YOUR_PASSWORD_HERE>
        max_connection_pool_size: 100
        connection_acquisition_timeout: 60.0
        range_index_creation_threshold: 10000
        vector_index_creation_threshold: 10000

We use Neo4j's defaults, but you are welcome to modify these values and test again.
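For readers wondering how the two pool settings might reach the driver, here is a minimal sketch. It assumes the values are forwarded as keyword arguments of the same name, which the Neo4j Python driver accepts. The helper `pool_kwargs` is hypothetical, and MemMachine's actual wiring may differ.

```python
from typing import Any


def pool_kwargs(config: dict[str, Any]) -> dict[str, Any]:
    """Extract connection-pool settings from a storage config mapping.

    Missing keys fall back to the values shown in the sample configuration.
    """
    size = int(config.get("max_connection_pool_size", 100))
    timeout = float(config.get("connection_acquisition_timeout", 60.0))
    if size <= 0:
        raise ValueError("max_connection_pool_size must be positive")
    if timeout <= 0:
        raise ValueError("connection_acquisition_timeout must be positive")
    return {
        "max_connection_pool_size": size,
        "connection_acquisition_timeout": timeout,
    }


# Assumed usage with the Neo4j Python driver (not executed here):
#
#   from neo4j import AsyncGraphDatabase
#   driver = AsyncGraphDatabase.driver(
#       config["uri"],
#       auth=(config["username"], config["password"]),
#       **pool_kwargs(config),
#   )
```

Raising `max_connection_pool_size` only helps if the server can handle the extra concurrent work. Otherwise the queueing simply moves from the driver to the database.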

@edwinyyyu, we shouldn't cause the DB to stall out like this due to a "Thundering Herd" issue from large-volume operations. This looks to be one use case we should track and resolve alongside #940, which has similar symptoms but is caused by a very high volume of search operations rather than very large delete operations.
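The traceback above shows the queries being fanned out through `asyncio.gather`, so every pending query asks the pool for a connection at once. One common way to soften that kind of fan-out, sketched here as an assumption rather than as the fix MemMachine adopted, is to cap how many awaitables run concurrently with a semaphore. The names `gather_bounded` and `Probe` are hypothetical.

```python
import asyncio
from collections.abc import Awaitable, Iterable
from typing import TypeVar

T = TypeVar("T")


async def gather_bounded(awaitables: Iterable[Awaitable[T]], limit: int) -> list[T]:
    """Like asyncio.gather, but with at most `limit` awaitables in flight.

    Pass coroutine objects, not Tasks: a Task starts running as soon as it is
    created, so the semaphore could no longer hold it back.
    """
    if limit <= 0:
        raise ValueError("limit must be positive")
    semaphore = asyncio.Semaphore(limit)

    async def run(awaitable: Awaitable[T]) -> T:
        async with semaphore:
            return await awaitable

    # Results come back in input order, as with asyncio.gather.
    return await asyncio.gather(*(run(a) for a in awaitables))


class Probe:
    """Fake query runner that records the peak number of concurrent calls."""

    def __init__(self) -> None:
        self.active = 0
        self.peak = 0

    async def query(self, value: int) -> int:
        self.active += 1
        self.peak = max(self.peak, self.active)
        await asyncio.sleep(0)  # yield control, as a real network call would
        self.active -= 1
        return value
```

Keeping `limit` below `max_connection_pool_size` leaves headroom for other requests that share the same driver.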

@dora-zhang

Got it! Thanks for the explanation, Steve!

@sscargal mentioned this pull request Mar 2, 2026 (13 tasks)
@edwinyyyu closed this Apr 6, 2026