
Analysis venv (uv + Jedi wiring), external_symbols, app-scoped prune, --no-venv (#44 #45 #46 #47)#48

Merged
rahlk merged 4 commits into main from fix/issues-44-45-46-47 on Jun 22, 2026

rahlk (Contributor) commented on Jun 22, 2026

Summary

Fixes four issues on the Neo4j / analysis pipeline.

Changes

#47 -- install the analysis venv with uv and wire it to Jedi

The per-project analysis venv was built and populated but never used: __init__
left self.virtualenv = None and never reassigned it, so SymbolTableBuilder got
None and Jedi resolved against the default environment, ignoring the installed
dependencies. Now self.virtualenv is set on both a fresh build and a lazy reuse.
Dependency installation uses uv (uv pip install --python <venv>) instead of pip
for speed (parallel resolution and downloads, shared cache). uv ships as a
self-contained binary in its wheel, so it is present wherever canpy is installed,
including Docker; if uv cannot be located, installation falls back to
python -m pip.
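
The install step described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the function names, the PATH lookup via shutil.which, and the venv layout used for the pip fallback are all assumptions. Only the `uv pip install --python <venv>` invocation and the `python -m pip` fallback come from the description.

```python
import os
import shutil
from pathlib import Path
from typing import List, Optional


def locate_uv() -> Optional[str]:
    # Assumption: find uv on PATH. The PR does not say how canpy locates it.
    return shutil.which("uv")


def venv_python(venv_dir: str) -> str:
    # Standard venv layout: Scripts/python.exe on Windows, bin/python elsewhere.
    if os.name == "nt":
        return str(Path(venv_dir) / "Scripts" / "python.exe")
    return str(Path(venv_dir) / "bin" / "python")


def build_install_command(
    venv_dir: str, requirements: List[str], uv_path: Optional[str]
) -> List[str]:
    """Return the argv that installs `requirements` into the analysis venv."""
    if uv_path is not None:
        # Preferred path: uv installs into the environment named by --python.
        return [uv_path, "pip", "install", "--python", venv_dir, *requirements]
    # Fallback path: run pip with the venv's own interpreter.
    return [venv_python(venv_dir), "-m", "pip", "install", *requirements]
```

A caller would pass the result to subprocess.run(cmd, check=True), for example build_install_command(venv_dir, deps, locate_uv()).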

#44 -- first-class external_symbols (codeanalyzer-typescript parity)

External call targets are now first-class in the IR instead of being re-derived
ad hoc in the projection. Adds PyExternalSymbol{name, module} and
PyApplication.external_symbols (mirrors TSExternalSymbol). The analyzer
classifies every call-graph endpoint not declared in the symbol table as an
external. The Neo4j projection emits :PyExternal authoritatively from that set,
so an imported module name (a :PyPackage) can no longer shadow a call target and
silently drop the PY_CALLS edge. :PyExternal gains a module property
(SCHEMA_VERSION 1.0.0 -> 1.1.0, additive). Node identity is also keyed by
(merge_label, value) so deferred PY_EXTENDS / PY_RESOLVES_TO edges cannot be
shadowed. Fixes the ~3.7% of call edges (targets like os/re/json) that were
dropped from the emitted graph.
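
As a rough sketch of the classification described above: every call-graph endpoint that is not declared in the symbol table becomes an external symbol. PyExternalSymbol and its name/module fields come from this PR; the function name, the edge representation, and the rule for splitting a dotted signature into module and name are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Set, Tuple


@dataclass(frozen=True)
class PyExternalSymbol:
    name: str
    module: str


def compute_external_symbols(
    call_edges: Iterable[Tuple[str, str]],
    declared: Set[str],
) -> Dict[str, PyExternalSymbol]:
    """Map each undeclared call-graph endpoint signature to an external symbol."""
    externals: Dict[str, PyExternalSymbol] = {}
    for source, target in call_edges:
        for signature in (source, target):
            if signature in declared or signature in externals:
                continue
            # Assumption: signatures are dotted paths; the last segment is the
            # name and the rest is the module. A bare name is its own module.
            module, _, name = signature.rpartition(".")
            externals[signature] = PyExternalSymbol(name=name, module=module or name)
    return externals
```

With this shape, a target such as os is recorded as an external even when an imported package of the same name exists elsewhere in the graph.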

#45 -- scope the bolt full-run prune to the application anchor

The full-run orphan prune deleted any :PyModule not in the current emit across
the entire database, so a push for application B wiped application A's modules.
The prune is now anchored:

MATCH (:PyApplication {name: $app})-[:PY_HAS_MODULE]->(m:PyModule)
WHERE NOT m.file_key IN $present
OPTIONAL MATCH (m)-[...]->(x)
DETACH DELETE x, m

so it only removes that application's vanished modules. A single Neo4j database
can now hold multiple applications via full-run --emit neo4j.
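
The difference between the old and new prune can be illustrated with an in-memory stand-in for the database. This is only a model of the behaviour, not the bolt code: the dictionary layout and both function names are hypothetical.

```python
from typing import Dict, Set

# Hypothetical stand-in for the graph: application name -> file_keys of its modules.
Database = Dict[str, Set[str]]


def prune_global(db: Database, present: Set[str]) -> None:
    """Old behaviour: drop every module not in the current emit, in all apps."""
    for modules in db.values():
        modules.intersection_update(present)


def prune_scoped(db: Database, app: str, present: Set[str]) -> None:
    """New behaviour: drop only the named application's vanished modules."""
    db[app].intersection_update(present)
```

Pushing app-b with present = {"b/y.py"} empties app-a under prune_global, while prune_scoped leaves app-a untouched and removes only app-b's stale modules.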

#46 -- add --no-venv

New --no-venv/--venv flag (AnalysisOptions.no_venv). When set, skip venv
creation and dependency installation and resolve imports against the ambient
interpreter (self.virtualenv stays None). Useful in CI / containers where the
deps are already installed, for sandboxed runs without network, and for speed.
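
A paired --venv/--no-venv flag can be sketched with the standard library. The PR does not state which CLI library the tool uses, so argparse is a stand-in here; the parser, the function name, and the one-field AnalysisOptions are illustrative assumptions. Only the flag names and AnalysisOptions.no_venv come from the description.

```python
import argparse
from dataclasses import dataclass
from typing import List


@dataclass
class AnalysisOptions:
    # Only the field named in this PR; the real class presumably has more.
    no_venv: bool = False


def parse_options(argv: List[str]) -> AnalysisOptions:
    parser = argparse.ArgumentParser(prog="analyzer-sketch")
    # BooleanOptionalAction (Python 3.9+) registers both --venv and --no-venv.
    parser.add_argument(
        "--venv",
        action=argparse.BooleanOptionalAction,
        default=True,
        help="create and populate an analysis venv (default: enabled)",
    )
    args = parser.parse_args(argv)
    return AnalysisOptions(no_venv=not args.venv)
```

Here parse_options(["--no-venv"]).no_venv is True, and omitting the flag keeps the default of building the venv.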

Testing

Closes

Closes #44
Closes #45
Closes #46
Closes #47

rahlk added 4 commits June 22, 2026 16:33
Closes #47

The per-project analysis venv was built and populated but never used: __init__
left self.virtualenv = None and never reassigned it, so SymbolTableBuilder got
virtualenv=None and Jedi resolved against the default environment, ignoring the
installed dependencies. Set self.virtualenv to the venv path on both a fresh
build and a lazy reuse so Jedi resolves the project's third-party imports.
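
The shape of the fix can be sketched as below. The class and method names are hypothetical and the actual venv creation is replaced by a counter; the point is only that self.virtualenv is assigned on both the fresh-build path and the reuse path.

```python
from typing import Optional


class AnalyzerSketch:
    """Hypothetical stand-in for the analyzer's venv handling."""

    def __init__(self, venv_dir: str, already_built: bool) -> None:
        self.venv_dir = venv_dir
        self.already_built = already_built
        self.virtualenv: Optional[str] = None  # the bug: this was never reassigned
        self.build_count = 0

    def ensure_venv(self) -> Optional[str]:
        if not self.already_built:
            # Stand-in for creating the venv and installing dependencies.
            self.build_count += 1
            self.already_built = True
        # The fix: set the path whether the venv was just built or reused.
        self.virtualenv = self.venv_dir
        return self.virtualenv
```

Whatever consumes self.virtualenv afterwards (SymbolTableBuilder in this PR) then receives the venv path instead of None.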

Also install dependencies with uv (uv pip install --python <venv>) instead of
pip: uv resolves and downloads in parallel with a shared global cache, which is
dramatically faster for large dependency trees (e.g. Odoo). uv ships as a
self-contained binary in its wheel, so it is present wherever canpy is installed
(including Docker); fall back to python -m pip when uv cannot be located.

…ges (#44)

Closes #44

Adopt the model codeanalyzer-typescript uses: external call targets are now
first-class in the IR instead of being re-derived ad hoc during Neo4j projection.

- schema: add PyExternalSymbol{name, module} and PyApplication.external_symbols,
  keyed by signature (mirrors TSExternalSymbol).
- core: _compute_external_symbols() classifies every call-graph endpoint not
  declared in the symbol table as an external (name/module from the signature),
  so analysis.json carries external info that was previously a bare target string.
- neo4j: :PyExternal gains a `module` property (SCHEMA_VERSION 1.0.0 -> 1.1.0,
  additive). project()'s _call_endpoint classifies authoritatively from
  external_symbols rather than a "present in the graph" heuristic, so an imported
  module name (a :PyPackage) can no longer shadow a call target and silently drop
  the PY_CALLS edge.
- rows: track node identity by (merge_label, value) so deferred PY_EXTENDS /
  PY_RESOLVES_TO edges can't be shadowed either.
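
To illustrate the last bullet, here is a minimal sketch of an index that keys node identity by (merge_label, value). The NodeIndex class and its methods are hypothetical; only the key shape and the :PyPackage / :PyExternal labels come from this PR.

```python
from typing import Dict, Tuple

NodeKey = Tuple[str, str]  # (merge_label, value)


class NodeIndex:
    """Hypothetical registry of emitted nodes, keyed by label and value."""

    def __init__(self) -> None:
        self._nodes: Dict[NodeKey, Dict[str, str]] = {}

    def add(self, merge_label: str, value: str, **props: str) -> None:
        # Two nodes with the same value but different labels stay distinct.
        self._nodes.setdefault((merge_label, value), {}).update(props)

    def has(self, merge_label: str, value: str) -> bool:
        return (merge_label, value) in self._nodes

    def __len__(self) -> int:
        return len(self._nodes)
```

With this key, registering a PyPackage named os does not make a lookup for a PyExternal named os succeed, so one cannot stand in for the other when deferred edges are resolved.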

Fixes the ~3.7% of call edges (e.g. targets os/re/json) that were dropped from
the emitted graph. Adds a regression test and exercises external_symbols in the
sample app; regenerates schema.neo4j.json.

Closes #45

The full-run prune deleted any :PyModule whose file_key was not in the current
emit across the ENTIRE database -- not just the application being written -- so a
full-run push for application B wiped application A's modules, leaving an orphaned
:PyApplication with zero PY_HAS_MODULE edges. A single Neo4j database therefore
could not hold multiple applications via full-run --emit neo4j.

Anchor the prune to the :PyApplication {name} being emitted
(MATCH (:PyApplication {name:$app})-[:PY_HAS_MODULE]->(m:PyModule) WHERE NOT
m.file_key IN $present ...), so it only removes that application's vanished
modules. Adds a container regression test (app-b push leaves app-a intact).

…ent env

Closes #46

Add a --no-venv flag (AnalysisOptions.no_venv) that skips virtualenv creation and
dependency installation and resolves imports against the ambient interpreter
(self.virtualenv stays None, so Jedi uses the default environment). Useful in CI /
containers where the project's dependencies are already installed, for sandboxed
runs where network installs are disallowed, and for speed. Tradeoff: import /
call-resolution quality then depends on what is installed in the ambient env.

Regenerates the README --help block; adds a CLI regression test (no virtualenv is
created and analysis.json is still produced).

rahlk added the bug and enhancement labels on Jun 22, 2026
rahlk merged commit 41cc449 into main on Jun 22, 2026
rahlk deleted the fix/issues-44-45-46-47 branch on Jun 22, 2026 21:27