Quickstart
This guide gets you from a clone to a working analysis.json in a couple of minutes. For installation alternatives (native binary, pre-built JAR via the Python SDK), see Installation.
Prerequisites
- A Linux, macOS, or WSL machine
- A JDK, version 11 or above (Java 17 recommended). We suggest installing it with SDKMan!
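To check the JDK prerequisite from a script, something like the following may help. It is a minimal sketch and not part of codeanalyzer: the function names are illustrative, and it assumes the usual `java -version` banner, which contains a quoted version string such as "17.0.10" or the legacy "1.8.0_292".

```python
import re
import subprocess


def parse_java_major(banner: str) -> int:
    """Extract the major Java version from `java -version` output.

    Handles the modern scheme ("17.0.10" -> 17) and the legacy scheme
    used up to Java 8 ("1.8.0_292" -> 8).
    """
    match = re.search(r'version "(\d+)(?:\.(\d+))?', banner)
    if match is None:
        raise ValueError("no quoted version string found in banner")
    major = int(match.group(1))
    if major == 1 and match.group(2) is not None:
        # Legacy "1.x" numbering: the real major version is the second field.
        major = int(match.group(2))
    return major


def installed_java_major() -> int:
    """Run `java -version` and return the major version it reports."""
    # The JDK prints its version banner to stderr, not stdout.
    result = subprocess.run(
        ["java", "-version"], capture_output=True, text=True, check=True
    )
    return parse_java_major(result.stderr)


def meets_minimum(major: int, minimum: int = 11) -> bool:
    """True when the JDK satisfies the 'version 11 or above' prerequisite."""
    return major >= minimum
```

Calling `meets_minimum(installed_java_major())` before the build gives an early, readable failure instead of a Gradle error.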
Build the JAR
- Install a JDK (Java 17 shown here, via SDKMan!):

      sdk install java 17.0.10-sem
      sdk use java 17.0.10-sem

- Clone and build the fat JAR:

      git clone https://github.com/codellm-devkit/codeanalyzer-java
      cd codeanalyzer-java
      ./gradlew fatJar

  The build produces a self-contained JAR at build/libs/codeanalyzer-2.3.7.jar.

- Confirm it runs:

      java -jar build/libs/codeanalyzer-2.3.7.jar --version
Run your first analysis
Symbol table only (fast)
Analysis level 1 parses source and builds the symbol table. It does not require building the target project, so it’s quick:

    java -jar build/libs/codeanalyzer-2.3.7.jar \
      -i /path/to/your/project \
      -a 1 \
      -o ./output

This writes ./output/analysis.json containing the symbol_table for every .java file.
Symbol table + call graph
Analysis level 2 additionally builds the WALA call graph. By default, codeanalyzer builds the target project first, so that WALA has compiled classes and resolved dependencies to work from:

    java -jar build/libs/codeanalyzer-2.3.7.jar \
      -i /path/to/your/project \
      -a 2 \
      -o ./output \
      -v

The -v flag streams progress logs so you can watch the build and call-graph construction.
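If you drive these runs from a script rather than a shell, the two commands above can be wrapped as follows. This is a sketch under stated assumptions: `build_command` and `run_analysis` are hypothetical helper names, only the flags shown in this guide (-i, -a, -o, -v) are used, and the output location is taken from the description of analysis.json above.

```python
import subprocess
from pathlib import Path
from typing import List


def build_command(jar: str, project_dir: str, level: int,
                  output_dir: str, verbose: bool = False) -> List[str]:
    """Assemble the argv for one codeanalyzer run.

    Only the flags shown in this guide are used: -i, -a, -o and -v.
    """
    if level not in (1, 2):
        # Assumption of this sketch: it only covers the two levels shown here.
        raise ValueError("this sketch covers analysis levels 1 and 2 only")
    cmd = ["java", "-jar", jar,
           "-i", project_dir,
           "-a", str(level),
           "-o", output_dir]
    if verbose:
        cmd.append("-v")
    return cmd


def run_analysis(jar: str, project_dir: str, level: int,
                 output_dir: str, verbose: bool = False) -> Path:
    """Run the analysis and return the path where analysis.json is expected."""
    # check=True raises CalledProcessError if the JAR exits non-zero.
    subprocess.run(
        build_command(jar, project_dir, level, output_dir, verbose),
        check=True,
    )
    return Path(output_dir) / "analysis.json"
```

Passing the arguments as a list avoids shell quoting problems when the project path contains spaces.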
Analyze a single source string
This mode needs no project and no build. Pass Java source directly and get a symbol table back on stdout:

    java -jar build/libs/codeanalyzer-2.3.7.jar \
      -s "public class Hello { public static void main(String[] a){} }" \
      -a 1

Read the output
analysis.json has this top-level shape:

    {
      "symbol_table": {
        "/abs/path/File.java": { /* compilation unit */ }
      },
      "call_graph": [ /* caller→callee edges, present at level 2 */ ],
      "version": "2.3.7"
    }

Continue to the Output schema for the full structure, or the CLI reference for every flag.
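As a quick sanity check, the top-level shape shown above can be summarized with a few lines of standard-library Python. This is an illustrative sketch: `load_analysis` and `summarize_analysis` are hypothetical names, only the three top-level keys shown above are read, and a missing call_graph (as at level 1) is treated as zero edges.

```python
import json
from pathlib import Path
from typing import Any, Dict


def load_analysis(path: str) -> Dict[str, Any]:
    """Read analysis.json from disk into a plain dictionary."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def summarize_analysis(analysis: Dict[str, Any]) -> Dict[str, Any]:
    """Summarize the top-level keys without assuming any nested structure."""
    symbol_table = analysis.get("symbol_table") or {}
    # call_graph is only present at level 2, so fall back to an empty list.
    call_graph = analysis.get("call_graph") or []
    return {
        "version": analysis.get("version"),
        "files": sorted(symbol_table),        # keys are absolute source paths
        "call_graph_edges": len(call_graph),
    }
```

For example, `summarize_analysis(load_analysis("./output/analysis.json"))` lists the analyzed source files and the number of call-graph edges.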
Emit to Neo4j
analysis.json is self-contained, but it doesn’t compose: to ask a question across a portfolio, you have to load every file into memory and stitch the results together yourself. --emit neo4j instead projects the same symbol table and call graph into a Neo4j property graph, a queryable system of record that many applications can share. --emit selects a single output target, so a run with --emit neo4j does not write analysis.json.
The quickest path needs no running database. With no Bolt URI, codeanalyzer renders a self-contained, re-runnable graph.cypher snapshot:

    java -jar build/libs/codeanalyzer-2.3.7.jar \
      -i /path/to/your/project \
      -a 2 \
      --emit neo4j \
      --app-name daytrader8 \
      -o ./output
    # -> ./output/graph.cypher

--app-name is the tenancy key: it anchors this app’s subgraph at a :JApplication node, so one database can host many apps side by side. Load the snapshot into any Neo4j instance whenever you’re ready. The script declares its constraints and indexes, does a scoped wipe of just this app’s prior subgraph, then MERGE-loads the graph:

    cypher-shell -a bolt://localhost:7687 -u neo4j < ./output/graph.cypher

To push live and incrementally to a running cluster, re-sending only the compilation units whose content_hash changed, set a Bolt URI instead. Prefer the NEO4J_PASSWORD environment variable so the secret never lands in shell history:

    export NEO4J_URI=bolt://localhost:7687
    export NEO4J_USERNAME=neo4j
    export NEO4J_PASSWORD=secret

    java -jar build/libs/codeanalyzer-2.3.7.jar \
      -i /path/to/your/project -a 2 \
      --emit neo4j --app-name daytrader8

Once the graph is populated, the CLDK Python SDK reads it back with no JDK, binary, or project source, using only read-only credentials. See the Neo4j graph output guide for the two emit modes, deployment as a Kubernetes Job, and the --emit schema contract, or Python SDK integration to read the graph from Python.
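When the live push is launched from Python, for example in a CI job, the same three NEO4J_* variables can be handed to the child process directly. The sketch below is an assumption-laden illustration rather than part of codeanalyzer or the CLDK SDK: `emit_command`, `neo4j_env`, and `push_live` are hypothetical helpers, and only the flags and variable names shown above are used.

```python
import os
import subprocess
from typing import Dict, List, Mapping


def emit_command(jar: str, project_dir: str, app_name: str) -> List[str]:
    """argv for a level-2 run that emits to Neo4j (flags as shown in this guide)."""
    return ["java", "-jar", jar,
            "-i", project_dir, "-a", "2",
            "--emit", "neo4j", "--app-name", app_name]


def neo4j_env(base: Mapping[str, str], uri: str,
              username: str, password: str) -> Dict[str, str]:
    """Copy an environment mapping and add the three NEO4J_* variables."""
    env = dict(base)
    env.update({
        "NEO4J_URI": uri,
        "NEO4J_USERNAME": username,
        "NEO4J_PASSWORD": password,
    })
    return env


def push_live(jar: str, project_dir: str, app_name: str,
              uri: str, username: str, password: str) -> None:
    """Run the emit with credentials in the child's environment, not in argv."""
    subprocess.run(
        emit_command(jar, project_dir, app_name),
        env=neo4j_env(os.environ, uri, username, password),
        check=True,  # raise CalledProcessError on a non-zero exit
    )
```

Because the password travels only in the child's environment, it does not appear in the command line that process listings or logs might capture.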