Comparing changes
base repository: codellm-devkit/codeanalyzer-python
base: main
head repository: codellm-devkit/codeanalyzer-python
compare: feat/level3-dataflow-sdg
- 11 commits
- 37 files changed
- 1 contributor
Commits on Jul 2, 2026
feat(dataflow): stage 1 — exceptional statement-level CFG per callable
Level-3 groundwork (#67): hand-built CFG from the stdlib ast with the shared node/edge vocabulary, Python lowering rules (try/except/else/finally, with, yield/await resume kinds, break/continue, synthetic escape edge for infinite loops), dead-code pruning, and source-span-ordered node ids (ENTRY=0, EXIT=last). Dataflow fixture project and CFG gate tests included.
Commit 2c08649
feat(dataflow): stage 2 — post-dominators and control dependence
Cooper–Harvey–Kennedy iterative post-dominators over the reverse CFG (unique root EXIT, guaranteed by stage 1's synthetic escape edges) and Ferrante–Ottenstein–Warren control dependence with ENTRY as the region root. Gate tests pin exact hand-computed CDG sets for the fixture's if/loop/early-return functions. (#67)
Commit 6af65e0
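As a rough illustration of the post-dominator step in stage 2, the sketch below runs a Cooper–Harvey–Kennedy style iteration over the reverse CFG. It is not the repository's code: the function name `immediate_postdominators`, the dict-of-successors input, and the example graph are assumptions made for this example, and it assumes every node can reach EXIT (which the commit says stage 1 guarantees).

```python
def immediate_postdominators(succ, exit_node):
    """Return {node: immediate post-dominator}; EXIT maps to itself.

    succ maps each CFG node to its list of successors (hypothetical input shape).
    """
    # CFG predecessors are the successors of the reverse graph.
    preds = {n: [] for n in succ}
    for n, targets in succ.items():
        for t in targets:
            preds[t].append(n)

    # Iterative DFS postorder of the reverse graph, rooted at EXIT.
    order, seen = [], {exit_node}
    stack = [(exit_node, iter(preds[exit_node]))]
    while stack:
        node, it = stack[-1]
        for nxt in it:
            if nxt not in seen:
                seen.add(nxt)
                stack.append((nxt, iter(preds[nxt])))
                break
        else:
            order.append(node)
            stack.pop()
    rank = {n: i for i, n in enumerate(order)}  # EXIT gets the highest rank

    ipdom = {exit_node: exit_node}

    def intersect(a, b):
        # Walk both fingers up the current tree until they meet.
        while a != b:
            while rank[a] < rank[b]:
                a = ipdom[a]
            while rank[b] < rank[a]:
                b = ipdom[b]
        return a

    changed = True
    while changed:
        changed = False
        for n in reversed(order):  # reverse postorder of the reverse graph
            if n == exit_node:
                continue
            # Reverse-graph predecessors of n are its CFG successors.
            processed = [s for s in succ[n] if s in ipdom]
            new = processed[0]
            for s in processed[1:]:
                new = intersect(new, s)
            if ipdom.get(n) != new:
                ipdom[n] = new
                changed = True
    return ipdom


# Diamond: ENTRY(0) -> 1 -> {2, 3} -> 4 -> EXIT(5)
cfg = {0: [1], 1: [2, 3], 2: [4], 3: [4], 4: [5], 5: []}
ipdom = immediate_postdominators(cfg, exit_node=5)
# Both branch arms and the branch itself are post-dominated by the join node 4.
```

Running the iteration on the reverse graph rather than writing a separate post-dominator solver keeps the code identical in shape to the forward dominator algorithm; only the direction of the edges changes.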
feat(dataflow): stage 3 — access paths, reaching definitions, DDG
k-limited access-path model with per-scope base classification (local/param/self/global/capture), header-only facts for compound statements, comprehension scoping, closure-capture and call-mutation rules; classic worklist reaching definitions with strong kills on exact non-wildcard paths; DDG edges via textual interference plus the type-based may-alias oracle (the locked MVP points-to substrate: unknown types conservatively alias, incompatible types don't). Gate tests cover the loop-carried dependency, scope shadowing, and the aliased write/read pair. (#67)
Commit 7377dc3
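The worklist reaching-definitions pass named in stage 3 can be sketched as follows. This is a simplified stand-in, not the commit's implementation: it tracks plain variable names rather than k-limited access paths, treats every definition as a strong kill, and the names `reaching_definitions`, `succ`, and `defs` are hypothetical.

```python
from collections import deque


def reaching_definitions(succ, defs):
    """Return IN sets of (variable, defining_node) pairs for every CFG node.

    succ: node -> list of successor nodes.
    defs: node -> variable name defined at that node (nodes without a def are absent).
    """
    preds = {n: [] for n in succ}
    for n, targets in succ.items():
        for t in targets:
            preds[t].append(n)

    in_sets = {n: set() for n in succ}
    out_sets = {n: set() for n in succ}
    work = deque(succ)
    while work:
        n = work.popleft()
        # IN[n] is the union of OUT over all predecessors.
        in_sets[n] = set().union(*(out_sets[p] for p in preds[n]))
        var = defs.get(n)
        if var is None:
            new_out = set(in_sets[n])
        else:
            # Strong kill: drop earlier defs of the same variable, then gen this one.
            new_out = {(v, d) for (v, d) in in_sets[n] if v != var} | {(var, n)}
        if new_out != out_sets[n]:
            out_sets[n] = new_out
            work.extend(s for s in succ[n] if s not in work)
    return in_sets


# 0 ENTRY; 1 total = 0; 2 while cond; 3 total = total + x; 4 return total; 5 EXIT
cfg = {0: [1], 1: [2], 2: [3, 4], 3: [2], 4: [5], 5: []}
rd_in = reaching_definitions(cfg, {1: "total", 3: "total"})
# Node 3 sees its own definition via the back edge: the loop-carried dependency.
```

A DDG edge would then run from each defining node in `rd_in[n]` to `n` whenever `n` reads that variable; here that yields both `1 -> 3` and `3 -> 3`.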
feat(dataflow): stage 4 — PDG assembly and exact backward-slice gate
PDG = CDG ∪ DDG per callable over the same node ids; intraprocedural backward slice as reverse reachability. Gate pins hand-computed exact slices: the early-return arm is excluded from the other arm's slice, loop slices close over the loop-carried dependency. (#67)
Commit d21fc0a
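Stage 4's "backward slice as reverse reachability" reduces to a graph walk against the direction of the PDG edges. The sketch below assumes the PDG is given as a flat list of `(source, target)` pairs, where the target depends on the source; the function name and the tiny example graph are illustrative, not taken from the fixture.

```python
def backward_slice(pdg_edges, criterion):
    """Return every node the criterion transitively depends on (criterion included)."""
    # Index edges by target so we can walk them backwards.
    dependents_of = {}
    for src, dst in pdg_edges:
        dependents_of.setdefault(dst, set()).add(src)

    seen = {criterion}
    stack = [criterion]
    while stack:
        node = stack.pop()
        for src in dependents_of.get(node, ()):
            if src not in seen:
                seen.add(src)
                stack.append(src)
    return seen


# 0 ENTRY
# 1 if x < 0:
# 2     return -1
# 3 y = x * 2
# 4 return y
cdg = [(0, 1), (1, 2), (1, 3), (1, 4)]  # 3 and 4 only run if the early return is not taken
ddg = [(3, 4)]                          # y flows from node 3 to node 4
slice_of_4 = backward_slice(cdg + ddg, criterion=4)
# The early-return arm (node 2) is not part of the slice of the other arm.
```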
feat(dataflow): stage 5 — alias oracle wiring, Tarjan SCC, global qualification
Iterative Tarjan SCC condensation of the frozen call-graph oracle (reverse topological schedule for bottom-up summaries); call mutations become suffixed weak defs so caller-visible mutation is distinguishable from local rebinding; global bases gain module::name qualification for the interprocedural build. (#67)
Commit 789bd1c
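A minimal sketch of an iterative (non-recursive) Tarjan SCC pass over a call graph is shown below. It assumes the call graph is a dict from caller to a list of callees; the function name `tarjan_scc` and the example graph are made up for illustration. Tarjan's algorithm completes a component only after everything reachable from it, so the output order is already the reverse topological (callees first) schedule that bottom-up summaries need.

```python
def tarjan_scc(graph):
    """Return SCCs of a directed graph in reverse topological order."""
    index, low = {}, {}
    on_stack, stack, sccs = set(), [], []
    counter = 0

    for root in graph:
        if root in index:
            continue
        index[root] = low[root] = counter
        counter += 1
        stack.append(root)
        on_stack.add(root)
        # Explicit work stack of (node, iterator over its callees) replaces recursion.
        work = [(root, iter(graph.get(root, ())))]
        while work:
            node, callees = work[-1]
            descended = False
            for nxt in callees:
                if nxt not in index:
                    index[nxt] = low[nxt] = counter
                    counter += 1
                    stack.append(nxt)
                    on_stack.add(nxt)
                    work.append((nxt, iter(graph.get(nxt, ()))))
                    descended = True
                    break
                if nxt in on_stack:
                    low[node] = min(low[node], index[nxt])
            if descended:
                continue
            # All callees handled: propagate lowlink to the parent frame.
            work.pop()
            if work:
                parent = work[-1][0]
                low[parent] = min(low[parent], low[node])
            if low[node] == index[node]:
                component = []
                while True:
                    member = stack.pop()
                    on_stack.discard(member)
                    component.append(member)
                    if member == node:
                        break
                sccs.append(sorted(component))
    return sccs


# main -> a <-> b -> leaf: a and b are mutually recursive.
call_graph = {"main": ["a"], "a": ["b"], "b": ["a", "leaf"], "leaf": []}
schedule = tarjan_scc(call_graph)
# → [['leaf'], ['a', 'b'], ['main']]
```

The explicit work stack matters for real call graphs: a recursive formulation can hit Python's recursion limit on long call chains.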
feat(dataflow): stages 6–7 — function summaries and SDG assembly
Relational summaries (params/captures/read-globals → return/mutations/written-globals) composed bottom-up over the Tarjan condensation DAG, monotone fixpoint within SCCs, callee global footprints injected at callsites and reaching definitions re-solved; HRB parameter structure (formal/actual in/out nodes in the owning function's id space after EXIT), CALL/PARAM_IN/PARAM_OUT edges, SUMMARY edges from composed flows, globals as extra formals, closure captures bound at definition sites; builder maps symbol-table signatures to AST by (file, line) and treats the call graph and Jedi callsite resolutions as frozen oracles. Gates: arity, no dangling endpoints, transitive-chain SUMMARY, cross-file global flow, deterministic double-run. (#67)
Commit c6f990f
feat(dataflow): stage 8a — two-phase context-sensitive backward slicing
Classic HRB traversal over the assembled SDG: phase 1 ascends and skips across callsites via SUMMARY edges (never PARAM_OUT), phase 2 descends (never PARAM_IN/CALL) — call–return matching without re-descent. Gate pins an exact hand-computed interprocedural slice (caller_of_mutate → mutate) plus cross-file global descent and no-reascend properties. (#67)
Commit 43e0e69
feat(dataflow): program_graphs emission, -a 3, --graphs, --graph-field-depth
program_graphs schema section (PyProgramGraphs and friends, versioned 1.0.0 independently of the application schema) attached to PyApplication; -a extended to 3 (cumulative: level 3 keeps PyCG enrichment); --graphs cfg,dfg,pdg,sdg selector with strict validation (unknown values and level<3 usage exit non-zero, never silently fall back); --graph-field-depth k-limit knob recorded in the output. -a 1/2 emit no program_graphs and their pipeline is untouched. (#67)
Commit 6479c04
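The strict-validation behaviour described for `--graphs` could look roughly like the sketch below. Only the flag names and the allowed values `cfg,dfg,pdg,sdg` come from the commit message; the use of `argparse`, the `parse_args` helper, the program name, and the defaults are assumptions, and the project's real CLI may be built differently.

```python
import argparse

ALLOWED_GRAPHS = ("cfg", "dfg", "pdg", "sdg")


def parse_args(argv):
    # Hypothetical stand-in parser; defaults here are illustrative only.
    parser = argparse.ArgumentParser(prog="analyzer-sketch")
    parser.add_argument("-a", dest="level", type=int, choices=(1, 2, 3), default=1)
    parser.add_argument("--graphs", default=None)
    parser.add_argument("--graph-field-depth", type=int, default=None)
    args = parser.parse_args(argv)

    if args.graphs is not None:
        # Fail loudly instead of silently falling back to a lower level.
        if args.level < 3:
            parser.error("--graphs requires -a 3")
        selected = args.graphs.split(",")
        unknown = [g for g in selected if g not in ALLOWED_GRAPHS]
        if unknown:
            parser.error("unknown --graphs value(s): " + ", ".join(unknown))
        args.graphs = selected
    return args


ok = parse_args(["-a", "3", "--graphs", "cfg,sdg", "--graph-field-depth", "2"])
# ok.graphs is ['cfg', 'sdg']; invalid combinations exit with a non-zero status
# because ArgumentParser.error() raises SystemExit(2).
```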
feat(dataflow): stage 8b — CPG projection through the Neo4j emitter
CFGNode label (merge key id = <signature>#<node_id>) carrying both CFG statements and HRB parameter nodes, plus the shared cross-language edge vocabulary HAS_CFG_NODE / CFG_NEXT / CDG / DDG / PARAM_IN / PARAM_OUT / SUMMARY (deliberately unprefixed — parity clause). Additive schema.neo4j.json bump to 1.2.0; sample app extended so the conformance tests exercise every new row family; count-parity and no-dangling gates on the real fixture at -a 3. CALL stays at the callable level (PY_CALLS twin). (#67)
Commit 3e82256
docs(dataflow): analysis levels, Architecture & Tooling, schema decisions
README gains the level table, the locked level-3 substrate decisions (CFG from stdlib ast, hand-built reaching defs, type-based may-alias MVP, documented unsoundness), a level-3 usage example, and a regenerated --help block; CHANGELOG Unreleased entry; level-3 schema decision log tracked at .claude/SCHEMA_DECISIONS.md as SDK-model input (un-ignored past the global .claude exclude). (#67)
Commit ac7acb3
fix(neo4j): namespace the CPG overlay per language (PyCFGNode, PY_* edges)
Unprefixed CFGNode/CFG_NEXT/CDG/DDG/PARAM_IN/PARAM_OUT/SUMMARY would mingle analyzers' dependence edges in a Neo4j database holding more than one language's graph; SDK backends scope queries by label/type prefix. The vocabulary stays cross-language in shape (same suffixes, props, semantics) but is PY_-namespaced in the projection like every other row family; the JSON program_graphs section keeps the unprefixed contract since each analysis.json is its own namespace. Decision recorded in .claude/SCHEMA_DECISIONS.md. (#67)
Commit ddae36c
You can try running this command locally to see the comparison on your machine:
git diff main...feat/level3-dataflow-sdg