[GH-1146] Cap features sent to LLM during semantic ingestion update #1163
Merged
o-love merged 4 commits on Mar 23, 2026
74fa23c to
9bf3eb4
jealous approved these changes on Mar 3, 2026
9bf3eb4 to
8bfc2e9
Add configurable max_features_per_update (default 50) to prevent LengthFinishReasonError when a large profile causes the LLM response to overflow its output token budget. The limit flows from SemanticMemoryConf (YAML) through SemanticService into IngestionService, and is passed as page_size to get_feature_set in the update path.
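The flow described in this commit message can be pictured with a minimal Python sketch. Everything here is an assumption made for illustration: the dataclasses are simplified stand-ins for SemanticMemoryConf, SemanticService.Params and IngestionService.Params, and FakeStorage is a hypothetical in-memory substitute whose get_feature_set takes only page_size. The real classes and the real storage call may be shaped differently (for example, asynchronous and with additional filters).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SemanticMemoryConf:
    """Hypothetical stand-in for the YAML-backed configuration model."""
    max_features_per_update: int = 50


@dataclass(frozen=True)
class ServiceParams:
    """Hypothetical stand-in for SemanticService.Params."""
    max_features_per_update: int = 50


@dataclass(frozen=True)
class IngestionParams:
    """Hypothetical stand-in for IngestionService.Params."""
    max_features_per_update: int = 50


class FakeStorage:
    """In-memory substitute for the semantic storage backend (illustrative only)."""

    def __init__(self, features):
        self._features = list(features)

    def get_feature_set(self, page_size=None):
        # page_size=None means "no limit", which mirrors the unbounded behaviour
        # the PR describes as the cause of the error.
        if page_size is None:
            return list(self._features)
        return self._features[:page_size]


def features_for_update(conf, storage):
    """Thread the configured cap through each layer down to the storage query."""
    service_params = ServiceParams(conf.max_features_per_update)
    ingestion_params = IngestionParams(service_params.max_features_per_update)
    return storage.get_feature_set(page_size=ingestion_params.max_features_per_update)


storage = FakeStorage(f"feature-{i}" for i in range(60))
default_batch = features_for_update(SemanticMemoryConf(), storage)
small_batch = features_for_update(SemanticMemoryConf(max_features_per_update=10), storage)
```

With 60 stored features, the default configuration yields a batch of 50 and an operator override of 10 yields a batch of 10, while a call without page_size still returns all 60.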
8bfc2e9 to
85a65b4
sscargal (Contributor) requested changes on Mar 4, 2026 and left a comment:
CI tests are failing with
FAILED packages/server/server_tests/memmachine_server/semantic_memory/test_semantic_ingestion.py::test_process_single_set_limits_features_sent_to_llm[neo4j_semantic_storage-count_cache_episode_storage-sqlalchemy_sqlite_engine] - ModuleNotFoundError: No module named 'memmachine'
This is caused by the package refactor, so the PR needs to be updated to use the new locations and names. Thanks.
@o-love, please resolve the merge conflicts so we can proceed with the merge process. Thanks.
sscargal (Contributor) previously requested changes on Mar 18, 2026 and left a comment:
Merge conflicts need resolving.
…es-per-ingestion-update # Conflicts: # packages/server/server_tests/memmachine_server/semantic_memory/test_semantic_ingestion.py
edwinyyyu approved these changes on Mar 23, 2026
Purpose of the change
Prevent LengthFinishReasonError during semantic memory background ingestion when a profile has accumulated a large number of features. The unbounded feature set was causing the LLM response to overflow its output token budget (16384 tokens), logging errors on every background pass.
Description
The get_feature_set call in the ingestion update path had no page_size limit, sending the entire feature set to llm_feature_update. After many messages (168+ in the reported case), the profile grew large enough (~8034 prompt tokens) that the LLM's structured JSON response exceeded the completion token limit.
This PR adds a configurable max_features_per_update parameter (default 50) that caps the number of existing features sent to the LLM per update call. The parameter flows through the full configuration chain:
- SemanticMemoryConf (YAML config file): operator-configurable
- SemanticService.Params: service layer
- IngestionService.Params: ingestion layer
- get_feature_set(page_size=...): storage query
Files changed:
- src/memmachine/common/configuration/__init__.py: Add max_features_per_update to SemanticMemoryConf
- src/memmachine/semantic_memory/semantic_memory.py: Thread through SemanticService.Params and into IngestionService
- src/memmachine/common/resource_manager/semantic_manager.py: Wire YAML config value into SemanticService.Params
- src/memmachine/semantic_memory/semantic_ingestion.py: Pass page_size to get_feature_set in the update path
- tests/memmachine/semantic_memory/test_semantic_ingestion.py: Add test verifying the feature cap
YAML configuration:
Fixes/Closes
Fixes #1146
Type of change
How Has This Been Tested?
New test test_process_single_set_limits_features_sent_to_llm verifies that when 60 features exist, fewer than 60 are passed to the LLM call. The full semantic memory test suite was run (224 passed, 2 skipped). Ruff and ty checks pass with no new diagnostics.
Checklist
Screenshots/Gifs
N/A
Further comments
The default of 50 is conservative — it keeps the feature context well within typical model output limits while still providing enough profile context for meaningful updates. Operators can tune this via the YAML config if their deployment uses models with different token budgets.
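The check described under "How Has This Been Tested?" can be sketched roughly as follows. This is not the PR's actual test: run_update is a hypothetical stand-in for the ingestion update step, and the features keyword passed to the LLM callable is an assumption. The sketch only shows the idea of recording what reaches the LLM call and asserting that it is smaller than the stored feature set.

```python
from unittest.mock import Mock


def run_update(existing_features, llm_feature_update, max_features_per_update=50):
    """Hypothetical update step: cap the existing features, then call the LLM once."""
    capped = existing_features[:max_features_per_update]
    return llm_feature_update(features=capped)


def check_feature_cap():
    # A Mock records its call arguments, so no real model is needed.
    llm = Mock(return_value=[])
    existing = [f"feature-{i}" for i in range(60)]

    run_update(existing, llm)

    llm.assert_called_once()
    sent = llm.call_args.kwargs["features"]
    # Same property the PR's test is described as checking: fewer than 60 reach the LLM.
    assert len(sent) < len(existing)
    return len(sent)


sent_count = check_feature_cap()
```

Asserting "fewer than the stored count" rather than an exact number keeps such a test valid if the default cap is tuned later.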