feat(python): bulk/projected accessors for the Python facade (#180) #181

Merged
rahlk merged 4 commits into main from feat/issue-180-granular-accessors on Jun 27, 2026

@rahlk rahlk commented Jun 27, 2026

Collaborator

Adds bulk, field-projected accessors to the Python facade so agent workloads can enumerate the application set-at-a-time instead of paying for the per-entity reconstruction that get_all_methods_in_application() performs. Implements all four accessors from #180.

Motivation and Context

On the Neo4j backend, get_methods() fans out into an N+1 reconstruction (per module → per class → per callable, recursing) — tens of thousands of serialized Bolt round-trips on a large app. Catalog-style consumers only need {signature, decorators} per callable plus bodies/call-sites for a chosen few, yet pay to rebuild everything they discard. The SDK offered only one-at-a-time, fully-reconstructed reads, so consumers hand-wrote Cypher (leaking graph schema). These accessors give the SDK ownership of the projected query and return shape. See #180 for the full diagnosis.
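
To make the round-trip cost concrete, here is a minimal sketch that counts queries against a toy store. `CountingStore` and all of its methods are hypothetical stand-ins invented for illustration; they are not the SDK's backend or the Neo4j driver, and the real reconstruction does far more work per entity.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class CountingStore:
    """Hypothetical stand-in for a remote graph store that counts round-trips."""

    # module name -> class name -> callable signatures
    modules: Dict[str, Dict[str, List[str]]]
    round_trips: int = 0

    def list_modules(self) -> List[str]:
        self.round_trips += 1
        return list(self.modules)

    def list_classes(self, module: str) -> List[str]:
        self.round_trips += 1
        return list(self.modules[module])

    def list_callables(self, module: str, cls: str) -> List[str]:
        self.round_trips += 1
        return list(self.modules[module][cls])

    def load_callable(self, signature: str) -> Dict[str, str]:
        # A full per-entity reconstruction would happen here.
        self.round_trips += 1
        return {"signature": signature}

    def project_all(self) -> List[Dict[str, str]]:
        # One set-at-a-time query that returns only the projected fields.
        self.round_trips += 1
        return [
            {"signature": sig}
            for classes in self.modules.values()
            for sigs in classes.values()
            for sig in sigs
        ]


def n_plus_one(store: CountingStore) -> List[Dict[str, str]]:
    """Per module, per class, per callable: one query at every level."""
    out = []
    for module in store.list_modules():
        for cls in store.list_classes(module):
            for sig in store.list_callables(module, cls):
                out.append(store.load_callable(sig))
    return out


store = CountingStore(
    modules={
        "m1": {"A": ["m1.A.f", "m1.A.g"], "B": ["m1.B.h"]},
        "m2": {"C": ["m2.C.k"]},
    }
)
slow = n_plus_one(store)
slow_trips = store.round_trips

store.round_trips = 0
fast = store.project_all()
fast_trips = store.round_trips

# Same rows either way; 10 round-trips versus 1 for this toy data.
print(slow_trips, fast_trips, slow == fast)
```

The gap grows with the number of modules, classes, and callables, which is the effect described above for large applications.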

How Has This Been Tested?

  • New offline unit tests (tests/analysis/python/test_python_bulk_accessors.py) build an in-memory PyApplication and exercise _iter_callables + all four accessors on the in-process backend — no analyzer run, no Neo4j. 5 passed.
  • The Neo4j test module gains test_bulk_accessors_parity, asserting the Neo4j backend returns identical overviews / bodies / decorated results / call-sites vs the in-process backend (runs when a server is reachable; skipped otherwise).
  • Full tests/analysis/python + tests/models/python: 25 passed, 6 skipped (the skips need a live Neo4j).
  • black is not enforced on this repo (12 pre-existing files would reformat at line-length 180), so the code matches the surrounding hand-formatted multi-line style rather than black output; line length verified ≤180.
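
The offline tests above can be pictured with a toy in-memory application. This is a sketch under stated assumptions, not the code in this PR: `ToyCallable`, `ToyClass`, `ToyModule`, and `Overview` are hypothetical stand-ins for the real models (including PyCallableOverview), the `is_method` field is assumed, and decorator markers are assumed to match by exact name.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, Iterator, List, Tuple


@dataclass
class ToyCallable:
    signature: str
    decorators: List[str] = field(default_factory=list)
    code: str = ""
    inner: List["ToyCallable"] = field(default_factory=list)  # nested functions


@dataclass
class ToyClass:
    methods: List[ToyCallable] = field(default_factory=list)


@dataclass
class ToyModule:
    functions: List[ToyCallable] = field(default_factory=list)
    classes: List[ToyClass] = field(default_factory=list)


@dataclass(frozen=True)
class Overview:
    """Hypothetical lightweight projection of one callable."""

    signature: str
    decorators: Tuple[str, ...]
    is_method: bool


def iter_callables(modules: Iterable[ToyModule]) -> Iterator[Tuple[ToyCallable, bool]]:
    """Yield every callable once; only directly class-declared ones count as methods."""

    def walk(c: ToyCallable, is_method: bool) -> Iterator[Tuple[ToyCallable, bool]]:
        yield c, is_method
        for child in c.inner:
            yield from walk(child, False)  # nested functions are never methods

    for module in modules:
        for fn in module.functions:
            yield from walk(fn, False)
        for cls in module.classes:
            for method in cls.methods:
                yield from walk(method, True)


def get_callables_overview(modules: Iterable[ToyModule]) -> List[Overview]:
    return [Overview(c.signature, tuple(c.decorators), m) for c, m in iter_callables(modules)]


def get_method_bodies(modules: Iterable[ToyModule], signatures: Iterable[str]) -> Dict[str, str]:
    wanted = set(signatures)
    return {c.signature: c.code for c, _ in iter_callables(modules) if c.signature in wanted}


def get_decorated_callables(modules: List[ToyModule], markers: Iterable[str]) -> List[Overview]:
    wanted = set(markers)  # assumption: exact decorator-name match
    return [o for o in get_callables_overview(modules) if wanted.intersection(o.decorators)]


helper = ToyCallable("pkg.Svc.run.helper", code="def helper(): ...")
run = ToyCallable("pkg.Svc.run", decorators=["cached"], code="def run(self): ...", inner=[helper])
app = [
    ToyModule(
        functions=[ToyCallable("pkg.main", code="def main(): ...")],
        classes=[ToyClass(methods=[run])],
    )
]

overviews = get_callables_overview(app)
bodies = get_method_bodies(app, ["pkg.main", "pkg.missing"])
decorated = get_decorated_callables(app, ["cached"])
```

Because every accessor goes through the same walk, a parity test only has to compare the outputs of two backends for the same inputs.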

Breaking Changes

None — additive only. Four new methods on the PythonAnalysisBackend ABC, both backends, and the facade; a new PyCallableOverview model. The session-reuse change preserves _run's observable behavior (same return, same exceptions).
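
The session-reuse behavior can be sketched roughly as follows. `FakeDriver`, `FakeSession`, and `ReusingBackend` are hypothetical stand-ins so the example runs without a server; only the shape (lazy open, reuse, drop on error, close on `close()`) is taken from the description, and the real `_run` may differ in detail.

```python
from typing import List, Optional


class FakeSession:
    """Hypothetical stand-in for a driver session."""

    def __init__(self) -> None:
        self.closed = False

    def run(self, query: str) -> List[str]:
        if query == "BOOM":
            raise RuntimeError("query failed")
        return [query]

    def close(self) -> None:
        self.closed = True


class FakeDriver:
    """Hypothetical stand-in for a driver; counts sessions opened."""

    def __init__(self) -> None:
        self.sessions_opened = 0

    def session(self) -> FakeSession:
        self.sessions_opened += 1
        return FakeSession()


class ReusingBackend:
    def __init__(self, driver: FakeDriver) -> None:
        self._driver = driver
        self._session: Optional[FakeSession] = None

    def _run(self, query: str) -> List[str]:
        # Open lazily on first use, then keep reusing the same session.
        if self._session is None:
            self._session = self._driver.session()
        try:
            return self._session.run(query)
        except Exception:
            # Drop the session so the next call reopens cleanly,
            # then re-raise so callers see the same exception as before.
            self._drop_session()
            raise

    def _drop_session(self) -> None:
        session, self._session = self._session, None
        if session is not None:
            session.close()

    def close(self) -> None:
        self._drop_session()


driver = FakeDriver()
backend = ReusingBackend(driver)
backend._run("A")
backend._run("B")
opened_after_two_queries = driver.sessions_opened

try:
    backend._run("BOOM")
    error_propagated = False
except RuntimeError:
    error_propagated = True

backend._run("C")  # reopens after the failure
opened_after_recovery = driver.sessions_opened
backend.close()
```

Re-raising inside the `except` branch is what keeps the observable behavior unchanged: the caller gets the same return value on success and the same exception on failure.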

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation update

Checklist

  • I have read the Codellm-Devkit Documentation
  • My code follows the repository's style guidelines
  • New and existing tests pass locally
  • I have added appropriate error handling
  • I have added or updated documentation as needed

Additional context

Commits:

  1. feat(python): add bulk/projected accessors - PyCallableOverview + get_callables_overview(), get_method_bodies(signatures), get_decorated_callables(markers). Both backends enumerate the same callable set; a callable is a "method" only when a class declares it directly (mirrors PY_HAS_METHOD).
  2. perf(python): reuse one Neo4j read session - the orthogonal quick win from #180. _run opened a fresh session per query; it now uses one lazily-opened session for the backend's lifetime, dropped on error. This speeds up the existing reconstruction path too.
  3. chore: sync uv.lock - the lockfile still pinned codeanalyzer-typescript 0.4.0 while pyproject was at 0.4.3 (left stale by #179); regenerated to agree. No runtime deps added.
  4. feat(python): add get_callsites_for - batch call-sites keyed by owning signature (PY_HAS_CALLSITE via OPTIONAL MATCH so call-site-less callables still get an empty-list entry, matching in-process).
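
The empty-list behavior of the call-site accessor is sketched below. The Cypher string is illustrative only: PY_HAS_CALLSITE comes from this PR, but the `signature` node property and the overall query shape are assumptions, not the statement the backend actually runs. The Python function is a hypothetical in-process equivalent over a plain dict, and it assumes that signatures not present in the application are omitted from the result.

```python
from typing import Dict, Iterable, List

# Illustrative query shape (assumed property name `signature`).
# collect() skips nulls, so a callable with no PY_HAS_CALLSITE edges
# still produces a row with an empty list.
CALLSITES_QUERY = """
UNWIND $signatures AS sig
MATCH (c {signature: sig})
OPTIONAL MATCH (c)-[:PY_HAS_CALLSITE]->(s)
RETURN c.signature AS signature, collect(s) AS callsites
"""


def callsites_for(symbol_table: Dict[str, List[str]], signatures: Iterable[str]) -> Dict[str, List[str]]:
    """Batch lookup keyed by owning signature.

    Known callables always get an entry, even when they have no call sites;
    unknown signatures are skipped (an assumption for this sketch).
    """
    return {sig: list(symbol_table[sig]) for sig in signatures if sig in symbol_table}


# Toy symbol table: owning signature -> names of the calls it makes.
table = {"a.f": ["print", "a.g"], "a.g": []}
result = callsites_for(table, ["a.f", "a.g", "a.missing"])
```

Returning an explicit empty list lets callers distinguish "exists but calls nothing" from "not found" without a second lookup.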

rahlk added 4 commits June 27, 2026 15:23
Add set-at-a-time, field-projected reads to the Python facade so callers can
enumerate the application in one round-trip instead of paying the per-entity
reconstruction get_all_methods_in_application() does (tens of thousands of Bolt
round-trips on large apps via the Neo4j backend).

New `PyCallableOverview` projection model and three accessors on the
PythonAnalysisBackend ABC, both backends, and the facade:

- get_callables_overview() -> List[PyCallableOverview]: every callable
  (methods, module-level and nested functions) as a lightweight projection.
- get_method_bodies(signatures) -> Dict[str, str]: batch source-body fetch.
- get_decorated_callables(markers) -> List[PyCallableOverview]: overviews
  filtered by decorator (fills the get_methods_with_decorators gap).

The in-process and Neo4j backends enumerate the same callable set (a "method"
is one a class declares directly, mirroring PY_HAS_METHOD). Offline unit tests
cover the in-process walk; the Neo4j test module asserts byte-for-byte parity
against it when a server is reachable.

Refs #180

PyNeo4jBackend._run opened a fresh driver session on every call, so the N+1
reconstruction fan-out (get_symbol_table / get_all_methods_in_application) paid
session-acquisition overhead on each of its tens of thousands of queries. Reuse
a single lazily-opened session for the backend's lifetime, dropping it on error
so the next call reopens cleanly. Closed in close().

Refs #180

The lockfile still pinned codeanalyzer-typescript 0.4.0 while pyproject was
bumped to 0.4.3 (#179); regenerate so the two agree. No runtime deps added
(black stays in the test dependency group, not runtime).

Batch call-site fetch keyed by owning signature, the last of the four bulk
accessors from #180. One projected Cypher statement on the Neo4j backend (an
OPTIONAL MATCH over PY_HAS_CALLSITE so an existing callable with no call sites
still gets an empty-list entry, matching the in-process backend); one
symbol-table walk in-process. Added to the ABC, both backends, and the facade,
with offline and Neo4j-parity test coverage.

Refs #180
rahlk marked this pull request as ready for review June 27, 2026 19:38
rahlk merged commit 9f0260e into main Jun 27, 2026
rahlk self-assigned this Jun 27, 2026
rahlk added the enhancement label (New feature or request) Jun 27, 2026
rahlk deleted the feat/issue-180-granular-accessors branch June 27, 2026 19:42
rahlk mentioned this pull request Jun 27, 2026
rahlk added a commit that referenced this pull request Jun 27, 2026
Minor release: new backward-compatible API since v1.2.0.

- Python facade bulk/projected accessors (get_callables_overview,
  get_method_bodies, get_decorated_callables, get_callsites_for) + the
  PyCallableOverview model (#181).
- TypeScript tsc_only toggle (TSCodeAnalyzerConfig) and synthesized anonymous
  callables (TSSynthesizedCallable, get_synthesized_callables) (#179).
- Neo4j Python backend reuses a single read session.
- codeanalyzer-typescript 0.4.0 -> 0.4.3.