Quickstart

This guide gets you from a clone to a working analysis.json in a couple of minutes. For installation alternatives (native binary, pre-built JAR via the Python SDK), see Installation.

Prerequisites:

  • A Linux, macOS, or WSL machine
  • JDK 11 or later (Java 17 recommended). We suggest installing it with SDKMAN!

  1. Install a JDK (Java 17 shown here, via SDKMAN!):

    Terminal window
    sdk install java 17.0.10-sem
    sdk use java 17.0.10-sem
  2. Clone and build the fat JAR:

    Terminal window
    git clone https://github.com/codellm-devkit/codeanalyzer-java
    cd codeanalyzer-java
    ./gradlew fatJar

    The build produces a self-contained JAR at build/libs/codeanalyzer-2.3.7.jar.

  3. Confirm it runs:

    Terminal window
    java -jar build/libs/codeanalyzer-2.3.7.jar --version
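If the build or the --version check fails, the JDK version is usually the first thing worth verifying against the "11 or above" prerequisite. The following is a minimal sketch of such a check; the helper names parse_java_major and meets_minimum are hypothetical and not part of codeanalyzer.

```python
import re


def parse_java_major(version_output):
    """Extract the major Java version from `java -version` output.

    Handles the legacy scheme ("1.8.0_392" -> 8) and the modern one
    ("17.0.10" -> 17). Returns None when no version string is found.
    """
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        return None
    major = int(match.group(1))
    # Java 8 and earlier report themselves as 1.x, so the real major is x.
    if major == 1 and match.group(2) is not None:
        return int(match.group(2))
    return major


def meets_minimum(version_output, minimum=11):
    """True when the reported JDK is at least `minimum` (11 per the prerequisites)."""
    major = parse_java_major(version_output)
    return major is not None and major >= minimum
```

Note that `java -version` writes to stderr, so when capturing it with `subprocess.run(["java", "-version"], capture_output=True, text=True)` the text to pass in is the `.stderr` attribute of the result.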

Analysis level 1 parses source and builds the symbol table. It does not require building the target project, so it’s quick:

Terminal window
java -jar build/libs/codeanalyzer-2.3.7.jar \
-i /path/to/your/project \
-a 1 \
-o ./output

This writes ./output/analysis.json containing the symbol_table for every .java file.

Analysis level 2 additionally builds the WALA call graph. By default codeanalyzer will build the target project (so WALA has compiled classes and resolved dependencies to work from):

Terminal window
java -jar build/libs/codeanalyzer-2.3.7.jar \
-i /path/to/your/project \
-a 2 \
-o ./output \
-v

The -v flag streams progress logs so you can watch the build and call-graph construction.

No project, no build — pass Java source directly and get a symbol table back on stdout:

Terminal window
java -jar build/libs/codeanalyzer-2.3.7.jar \
-s "public class Hello { public static void main(String[] a){} }" \
-a 1
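Because the inline-source mode prints to stdout, it is easy to drive from a script. The sketch below wraps the same command in Python; build_command and analyze_source are hypothetical helper names, and the code assumes that stdout carries a single JSON document and nothing else.

```python
import json
import subprocess

# Path produced by the fatJar build step earlier in this guide.
JAR = "build/libs/codeanalyzer-2.3.7.jar"


def build_command(source, level=1, jar=JAR):
    """Assemble the argv list for an inline-source (-s) run at the given level."""
    return ["java", "-jar", jar, "-s", source, "-a", str(level)]


def analyze_source(source, level=1, jar=JAR):
    """Run codeanalyzer on a source string and parse what it prints on stdout.

    Assumption: stdout contains only JSON. If log lines are interleaved,
    json.loads raises and the output would need filtering first.
    """
    result = subprocess.run(
        build_command(source, level, jar),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return json.loads(result.stdout)
```

Passing the arguments as a list rather than a single shell string means the Java source needs no shell quoting, which matters once the snippet contains quotes or brackets.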

analysis.json has this top-level shape:

{
"symbol_table": { "/abs/path/File.java": { /* compilation unit */ } },
"call_graph": [ /* caller→callee edges, present at level 2 */ ],
"version": "2.3.7"
}
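As a quick sanity check on a run, the top-level shape above can be read with a few lines of Python. This is a minimal sketch that relies only on the three keys shown; summarize_analysis is a hypothetical name, and the contents of each compilation unit are left untouched.

```python
import json
from pathlib import Path


def summarize_analysis(path):
    """Return a small summary of an analysis.json file."""
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    symbol_table = data.get("symbol_table", {})
    # call_graph is only present at analysis level 2, so default to empty.
    call_graph = data.get("call_graph") or []
    return {
        "version": data.get("version"),
        "files": sorted(symbol_table),  # keys are absolute source paths
        "call_edges": len(call_graph),
    }
```

For example, `summarize_analysis("./output/analysis.json")` after a level 1 run should report zero call edges, since the call graph is only built at level 2.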

Continue to the Output schema for the full structure, or the CLI reference for every flag.

analysis.json is self-contained, but it doesn’t compose: to ask a question across a portfolio you load every blob into memory and stitch it together yourself. --emit neo4j projects the same symbol table and call graph into a Neo4j property graph instead — a queryable system of record that many applications can share. --emit selects one output target, so --emit neo4j returns without writing analysis.json.
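To make the "stitch it together yourself" cost concrete, here is a rough sketch of what a cross-application question looks like with plain analysis.json files. The function names and the per-application labels are hypothetical; only the symbol_table and call_graph keys come from the output shape.

```python
import json
from pathlib import Path


def load_portfolio(outputs):
    """Load several analysis.json files into one in-memory index.

    `outputs` maps a label you choose per application to its analysis.json path.
    Every file is read in full, which is the cost the paragraph above describes.
    """
    portfolio = {}
    for app, path in outputs.items():
        data = json.loads(Path(path).read_text(encoding="utf-8"))
        portfolio[app] = {
            "files": set(data.get("symbol_table", {})),
            "edges": len(data.get("call_graph") or []),
        }
    return portfolio


def apps_containing(portfolio, filename_suffix):
    """Return the apps that have at least one source file ending with the suffix."""
    return sorted(
        app
        for app, info in portfolio.items()
        if any(name.endswith(filename_suffix) for name in info["files"])
    )
```

Every new question needs another hand-written function over this index, and every consumer has to reload all the files, which is the gap a shared graph database is meant to close.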

The quickest path needs no running database. With no Bolt URI, codeanalyzer renders a self-contained, re-runnable graph.cypher snapshot:

Terminal window
java -jar build/libs/codeanalyzer-2.3.7.jar \
-i /path/to/your/project \
-a 2 \
--emit neo4j \
--app-name daytrader8 \
-o ./output
# -> ./output/graph.cypher

--app-name is the tenancy key — it anchors this app’s subgraph at a :JApplication node, so one database can host many apps side by side. Load the snapshot into any Neo4j whenever you’re ready; the script declares its constraints and indexes, does a scoped wipe of just this app’s prior subgraph, then MERGE-loads the graph:

Terminal window
cypher-shell -a bolt://localhost:7687 -u neo4j < ./output/graph.cypher

To push live and incrementally to a running cluster — only re-sending the compilation units whose content_hash changed — set a Bolt URI instead. Prefer the NEO4J_PASSWORD environment variable so the secret never lands in shell history:

Terminal window
export NEO4J_URI=bolt://localhost:7687
export NEO4J_USERNAME=neo4j
export NEO4J_PASSWORD=secret   # placeholder; load the real value from a secret store
java -jar build/libs/codeanalyzer-2.3.7.jar \
-i /path/to/your/project -a 2 \
--emit neo4j --app-name daytrader8
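The idea behind the incremental push, re-sending only compilation units whose content_hash changed, can be illustrated in a few lines. This is a conceptual sketch only: how codeanalyzer actually computes and stores content_hash is not described here, so the use of SHA-256 over the source text and both function names are assumptions.

```python
import hashlib


def content_hash(text):
    """Hash a compilation unit's source text (SHA-256 is an assumption here)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def changed_units(previous_hashes, current_sources):
    """Return paths whose hash is new or differs from the previously stored one.

    previous_hashes: path -> hash recorded on the last push
    current_sources: path -> current source text
    """
    changed = []
    for path, text in current_sources.items():
        if previous_hashes.get(path) != content_hash(text):
            changed.append(path)
    return sorted(changed)
```

Unchanged files produce the same hash and are skipped, so the work on each run scales with the size of the change rather than the size of the project.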

Once the graph is populated, the CLDK Python SDK reads it back with no JDK, binary, or project source — only read-only credentials. See the Neo4j graph output guide for the two emit modes, deployment as a Kubernetes Job, and the --emit schema contract, or Python SDK integration to read the graph from Python.