feat(python): bulk/projected accessors for the Python facade (#180) #181
Merged
Add set-at-a-time, field-projected reads to the Python facade so callers can enumerate the application in one round-trip instead of paying the per-entity reconstruction get_all_methods_in_application() does (tens of thousands of Bolt round-trips on large apps via the Neo4j backend).

New `PyCallableOverview` projection model and three accessors on the PythonAnalysisBackend ABC, both backends, and the facade:

- get_callables_overview() -> List[PyCallableOverview]: every callable (methods, module-level and nested functions) as a lightweight projection.
- get_method_bodies(signatures) -> Dict[str, str]: batch source-body fetch.
- get_decorated_callables(markers) -> List[PyCallableOverview]: overviews filtered by decorator (fills the get_methods_with_decorators gap).

The in-process and Neo4j backends enumerate the same callable set (a "method" is one a class declares directly, mirroring PY_HAS_METHOD). Offline unit tests cover the in-process walk; the Neo4j test module asserts byte-for-byte parity against it when a server is reachable.

Refs #180
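For readers who want a feel for the return shapes, here is a minimal in-memory sketch of the three accessors. It is not the SDK's implementation: `CallableOverview`, `STORE`, and the three free functions are hypothetical stand-ins, the real `PyCallableOverview` fields may differ, and both the exact-name decorator match and the choice to omit unknown signatures are assumptions.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class CallableOverview:
    """Hypothetical stand-in for PyCallableOverview; real fields may differ."""
    signature: str
    kind: str  # "method" when a class declares it directly, else "function"
    decorators: Tuple[str, ...] = ()


# Hypothetical in-memory store: signature -> (overview, source body).
STORE: Dict[str, Tuple[CallableOverview, str]] = {
    "app.models.User.save": (
        CallableOverview("app.models.User.save", "method", ("transaction.atomic",)),
        "def save(self): ...",
    ),
    "app.views.index": (
        CallableOverview("app.views.index", "function", ("app.route",)),
        "def index(request): ...",
    ),
    "app.utils.slugify": (
        CallableOverview("app.utils.slugify", "function"),
        "def slugify(text): ...",
    ),
}


def callables_overview(store: Dict[str, Tuple[CallableOverview, str]]) -> List[CallableOverview]:
    # One pass over the store; no per-entity reconstruction.
    return [overview for overview, _body in store.values()]


def method_bodies(store: Dict[str, Tuple[CallableOverview, str]], signatures: Iterable[str]) -> Dict[str, str]:
    # Assumption: signatures that are not in the store are simply omitted.
    return {sig: store[sig][1] for sig in signatures if sig in store}


def decorated_callables(store: Dict[str, Tuple[CallableOverview, str]], markers: Iterable[str]) -> List[CallableOverview]:
    # Assumption: a callable matches when any decorator equals a marker exactly.
    wanted = set(markers)
    return [ov for ov, _body in store.values() if wanted.intersection(ov.decorators)]
```

A catalog-style consumer would call `callables_overview(STORE)` once, pick a handful of signatures, and then fetch only those bodies with `method_bodies(STORE, chosen)`.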
PyNeo4jBackend._run opened a fresh driver session on every call, so the N+1 reconstruction fan-out (get_symbol_table / get_all_methods_in_application) paid session-acquisition overhead on each of its tens of thousands of queries. Reuse a single lazily-opened session for the backend's lifetime, dropping it on error so the next call reopens cleanly. Closed in close(). Refs #180
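The session-reuse pattern described above can be sketched as follows. This is an illustration under stated assumptions rather than the actual `PyNeo4jBackend` code: `ReusableSessionRunner`, `FakeDriver`, and `FakeSession` are hypothetical names, and the fakes only mimic the `session()`, `run()`, and `close()` calls so the example runs without a server.

```python
class ReusableSessionRunner:
    """Hypothetical sketch: one lazily-opened session, dropped on error."""

    def __init__(self, driver):
        self._driver = driver
        self._session = None  # opened on first use

    def _run(self, query, **params):
        if self._session is None:
            self._session = self._driver.session()
        try:
            # Materialize the rows before returning them to the caller.
            return list(self._session.run(query, **params))
        except Exception:
            # Drop the session so the next call reopens cleanly,
            # then re-raise the original exception unchanged.
            self._drop_session()
            raise

    def _drop_session(self):
        session, self._session = self._session, None
        if session is not None:
            try:
                session.close()
            except Exception:
                pass  # best-effort cleanup; the session is discarded anyway

    def close(self):
        self._drop_session()


class FakeSession:
    """Stand-in for a driver session, for demonstration only."""

    def run(self, query, **params):
        if query == "BOOM":
            raise RuntimeError("simulated query failure")
        return [{"query": query, **params}]

    def close(self):
        pass


class FakeDriver:
    """Stand-in for a driver; counts how many sessions were opened."""

    def __init__(self):
        self.opened = 0

    def session(self):
        self.opened += 1
        return FakeSession()
```

With the fakes, two successful `_run` calls open one session, and a failing call followed by another query opens a second one. Sessions in the Neo4j Python driver are not thread-safe, so a long-lived shared session like this assumes the backend is driven from a single thread.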
The lockfile still pinned codeanalyzer-typescript 0.4.0 while pyproject was bumped to 0.4.3 (#179); regenerate so the two agree. No runtime deps added (black stays in the test dependency group, not runtime).
Batch call-site fetch keyed by owning signature, the last of the four bulk accessors from #180. One projected Cypher statement on the Neo4j backend (an OPTIONAL MATCH over PY_HAS_CALLSITE so an existing callable with no call sites still gets an empty-list entry, matching the in-process backend); one symbol-table walk in-process. Added to the ABC, both backends, and the facade, with offline and Neo4j-parity test coverage. Refs #180
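To illustrate why the OPTIONAL MATCH matters, the sketch below pairs a hypothetical Cypher statement with the fold that turns its rows into a signature-keyed dict. Only `PY_HAS_CALLSITE` comes from this PR; the `PyCallable` label, the `signature` and `callee` properties, the `$signatures` parameter, and `group_callsites` are assumptions made for illustration.

```python
from typing import Dict, Iterable, List, Optional, Tuple

# Hypothetical query: label, property, and parameter names are assumptions.
# OPTIONAL MATCH keeps a callable that has no call sites as a row whose
# `callee` column is null, instead of dropping the callable entirely.
CALLSITES_QUERY = """
MATCH (c:PyCallable)
WHERE c.signature IN $signatures
OPTIONAL MATCH (c)-[:PY_HAS_CALLSITE]->(s)
RETURN c.signature AS signature, s.callee AS callee
"""


def group_callsites(rows: Iterable[Tuple[str, Optional[str]]]) -> Dict[str, List[str]]:
    """Fold (signature, callee-or-None) rows into signature -> list of callees."""
    grouped: Dict[str, List[str]] = {}
    for signature, callee in rows:
        # Create the entry even when the only row for this signature is null.
        bucket = grouped.setdefault(signature, [])
        if callee is not None:
            bucket.append(callee)
    return grouped
```

With a plain MATCH in place of the OPTIONAL MATCH, a callable without call sites would produce no row at all and would be missing from the result, which would break parity with the in-process backend.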
rahlk added a commit that referenced this pull request Jun 27, 2026
Minor release: new backward-compatible API since v1.2.0.

- Python facade bulk/projected accessors (get_callables_overview, get_method_bodies, get_decorated_callables, get_callsites_for) + the PyCallableOverview model (#181).
- TypeScript tsc_only toggle (TSCodeAnalyzerConfig) and synthesized anonymous callables (TSSynthesizedCallable, get_synthesized_callables) (#179).
- Neo4j Python backend reuses a single read session.
- codeanalyzer-typescript 0.4.0 -> 0.4.3.
Adds bulk, field-projected accessors to the Python facade so agent workloads can enumerate the application set-at-a-time instead of paying the per-entity reconstruction `get_all_methods_in_application()` does. Implements all four accessors from #180.

Motivation and Context
On the Neo4j backend, `get_methods()` fans out into an N+1 reconstruction (per module → per class → per callable, recursing): tens of thousands of serialized Bolt round-trips on a large app. Catalog-style consumers only need `{signature, decorators}` per callable plus bodies/call-sites for a chosen few, yet pay to rebuild everything they discard. The SDK offered only one-at-a-time, fully-reconstructed reads, so consumers hand-wrote Cypher (leaking graph schema). These accessors give the SDK ownership of the projected query and return shape. See #180 for the full diagnosis.

How Has This Been Tested?
- Offline unit tests (`tests/analysis/python/test_python_bulk_accessors.py`) build an in-memory `PyApplication` and exercise `_iter_callables` plus all four accessors on the in-process backend, with no analyzer run and no Neo4j. 5 passed.
- Neo4j parity: `test_bulk_accessors_parity` asserts that the Neo4j backend returns overviews, bodies, decorated results, and call-sites identical to the in-process backend's (runs when a server is reachable; skipped otherwise).
- `tests/analysis/python` + `tests/models/python`: 25 passed, 6 skipped (the skips need a live Neo4j).
- `black` is not enforced on this repo (12 pre-existing files would reformat at line-length 180), so the code matches the surrounding hand-formatted multi-line style rather than black output; line length verified ≤180.

Breaking Changes
None; additive only. Four new methods on the `PythonAnalysisBackend` ABC, both backends, and the facade; a new `PyCallableOverview` model. The session-reuse change preserves `_run`'s observable behavior (same return, same exceptions).

Types of changes
Checklist
Additional context
Commits:
- `PyCallableOverview` + `get_callables_overview()`, `get_method_bodies(signatures)`, `get_decorated_callables(markers)`. Both backends enumerate the same callable set; a callable is a `"method"` only when a class declares it directly (mirrors `PY_HAS_METHOD`).
- `_run` opened a fresh session per query; now one lazily-opened session for the backend's lifetime, dropped on error. Speeds up the existing reconstruction path too.
- `get_callsites_for`: batch call-site fetch keyed by owning signature (`PY_HAS_CALLSITE` via OPTIONAL MATCH so call-site-less callables still get an empty-list entry, matching in-process).