codebase-intelligence

CLI-first codebase analysis for TypeScript projects.

Parse your codebase, build a dependency graph, compute architectural metrics, and query everything from your terminal/CI. MCP support is included as an optional secondary interface.


Quick Start

CLI (recommended)

npx codebase-intelligence overview ./src

Common workflows:

npx codebase-intelligence hotspots ./src --metric complexity --limit 10
npx codebase-intelligence opportunities ./src --limit 10
npx codebase-intelligence duplicates ./src --mode mild --min-tokens 30
npx codebase-intelligence impact ./src parseCodebase
npx codebase-intelligence dead-exports ./src --limit 20
npx codebase-intelligence changes ./src --json
npx codebase-intelligence boundaries ./src --preset layered --list --json

MCP (optional)

claude mcp add -s user -t stdio codebase-intelligence -- npx -y codebase-intelligence@latest .

Features

  • 24 CLI commands for architecture analysis, dependency impact, duplicate families, focused map/context packs, content drift, health scoring, boundary enforcement, Highways convergence, improvement opportunities, dead code detection, search, CI rules, and agent setup
  • Machine-readable JSON output (--json) for automation and CI pipelines
  • Auto-cached index in .codebase-intelligence/ for fast repeat queries
  • Cache migration facts in JSON (cacheDir, legacyCacheDir, migrated, gitignoreUpdated, warnings[])
  • Quality metrics — PageRank, betweenness, coupling, cohesion, tension, churn, cyclomatic/cognitive complexity, blast radius, dead exports, test coverage, escape velocity, risk, maintainability, and CRAP score
  • Symbol-level analysis — callers/callees, symbol importance, impact blast radius
  • BM25 search — ranked keyword search across files, symbols, and type/shape facts
  • Process tracing — detect entry points and execution flows through the call graph
  • Codebase maps + context packs — focused file/symbol/test graph with deterministic evidence IDs and token-bounded context for agents
  • Content drift — report-only detection for file/name/scope/side-effect/shape/test placement mismatch with stable evidence
  • Health score — CI-gateable composite score with per-file maintainability, CRAP, coverage source, and risk hotspots
  • Highways analysis — find repeated routes that bypass canonical dataflow paths and synthesize safe proposed highways
  • Community detection — Louvain clustering for natural file groupings
  • Agent adoption: init writes per-agent instruction files and can install a skill (opt-in via --skill) so AI agents query CI before grep/read
  • MCP parity (secondary) — same analysis and rules gate available as 25 MCP tools, 2 prompts, and 3 resources
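
To make the "BM25 search" item above concrete, here is a minimal sketch of standard Okapi BM25 ranking. It is not the project's implementation: the tokenizer, the parameter defaults (k1 = 1.2, b = 0.75), and the function names are assumptions chosen for illustration.

```typescript
// Hypothetical tokenizer: lowercase, split on anything that is not a-z or 0-9.
function tokenize(text: string): string[] {
  return text.toLowerCase().split(/[^a-z0-9]+/).filter((t) => t.length > 0);
}

// Standard Okapi BM25 score of every document against the query.
// k1 and b are the commonly used defaults, not values taken from this project.
function bm25Scores(docs: string[], query: string, k1 = 1.2, b = 0.75): number[] {
  const tokenized = docs.map(tokenize);
  const n = tokenized.length;
  const avgLen = tokenized.reduce((sum, d) => sum + d.length, 0) / Math.max(n, 1);
  const terms = tokenize(query);

  return tokenized.map((doc) => {
    let score = 0;
    for (const term of terms) {
      const tf = doc.filter((t) => t === term).length; // term frequency in this doc
      if (tf === 0) continue;
      const df = tokenized.filter((d) => d.indexOf(term) !== -1).length; // docs containing term
      const idf = Math.log(1 + (n - df + 0.5) / (df + 0.5));
      const lengthNorm = tf + k1 * (1 - b + (b * doc.length) / avgLen);
      score += (idf * tf * (k1 + 1)) / lengthNorm;
    }
    return score;
  });
}

// Made-up documents standing in for file or symbol descriptions.
const sampleDocs = [
  "parse codebase into graph",
  "compute graph metrics for graph nodes",
  "render cli output",
];
const sampleScores = bm25Scores(sampleDocs, "graph");
console.log(sampleScores);
```

Documents that never mention a query term score 0, and repeated mentions raise the score with diminishing returns, which is why BM25 is a common choice for ranked keyword search.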

Installation

Run directly with npx (no install):

npx codebase-intelligence overview ./src

Or install globally:

npm install -g codebase-intelligence
codebase-intelligence overview ./src

CLI Usage

codebase-intelligence <command> <path> [options]

Commands

| Command | What it does |
| --- | --- |
| overview | High-level codebase snapshot |
| hotspots | Rank files by metric (coupling, churn, complexity, cognitive complexity, blast radius, coverage, risk, etc.) |
| file | Full context for one file |
| search | BM25 keyword and shape search |
| changes | Git diff analysis with risk metrics |
| dependents | File-level blast radius |
| modules | Module architecture + cross-dependencies |
| forces | Cohesion/tension/escape-velocity analysis |
| dead-exports | Unused export detection |
| opportunities | Ranked code-quality and refactoring opportunities |
| duplicates | Duplicate function families (strict, mild, weak) |
| groups | Top-level directory groups + aggregate metrics |
| symbol | Callers/callees and symbol metrics |
| impact | Symbol-level blast radius |
| rename | Reference discovery for rename planning |
| processes | Entry-point execution flow tracing |
| map | Focused codebase graph + token-bounded context pack |
| drift | Report-only content drift findings with evidence and recommendations |
| health | CI-gateable health score with maintainability, CRAP, coverage, and risk hotspots |
| boundaries | Architecture boundary zones, allow/forbid import rules, and violation evidence |
| highways | Repeated route convergence, canonical path opportunities, and synthesis proposals |
| clusters | Community-detected file clusters |
| owners | Ownership, bus-factor, and risk grouping |
| architecture | Evidence-backed extraction/seam/locality recommendations |
| workspaces | Package workspace scope and changed-workspace detection |
| lsp | Advisory editor diagnostics snapshot or minimal LSP server |
| watch | Local watch readiness and debounced analysis events |
| check | Rules-engine gate for CI, including opt-in dead-code and dependency gates |
| ci | One PR quality gate around check, changes, health, formats, baselines, and history |
| doctor | Read-only setup auditor for local, CI, MCP, and agent workflows |
| explain | Explain one analyzer rule and next action |
| migrate-config | Dry-run config migration to codebase-intelligence.json |
| hooks | Plan or install local hooks that run the same CI gate |
| history | Read local finding history from .codebase-intelligence/ |
| init | Set up AI agents to use CI: writes per-agent instruction files (skill opt-in via --skill) |

Useful flags

| Flag | Description |
| --- | --- |
| --json | Stable JSON output |
| --force | Rebuild index even if cache is valid |
| --limit <n> | Limit results on supported commands |
| --metric <m> | Select ranking metric for hotspots |
| --scope <s> | Select git diff scope for changes; directory/module scope for map and drift |
| --mode <m> | Select clone mode for duplicates: strict, mild, weak |
| --min-tokens <n> | Minimum duplicate token size for duplicates |
| --skip-local | Ignore duplicate families confined to one file |
| --trace <id> | Return token evidence for one duplicate family |
| --focus <name> | Focus map or drift on one symbol, file, or scope |
| --context-budget <n> | Bound map context pack size |
| --min-score <n> | Minimum drift score for drift findings; minimum health score before health exits 1 |
| --score | Print compact health score text |
| --preset <name> | Run boundaries with bulletproof, layered, hexagonal, or feature-sliced |
| --list | List resolved boundaries zones and rules |
| --format <fmt> | Export map as markdown, json, dot, or graphml; export check/ci as text, json, sarif, markdown, annotations, pr-comment-github, pr-comment-gitlab, badge, codeclimate, or compact |
| --operation <verb> | Focus highways on one operation verb |
| --shape <name> | Focus highways on one type/DTO shape |
| --production | Exclude test/dev files from production-risk check/ci output |
| --changed-since <ref> | Line-level check/ci filtering from a git diff |
| --diff-file <path> | Line-level check/ci filtering from a unified diff file |

The scanner always excludes common generated and agent-workspace directories such as .codebase-intelligence/, legacy .code-visualizer/, .next/, dist/, coverage/, .worktrees/, and .claude/worktrees/.

Analysis commands with --json and init --json include a top-level cache object with the canonical cache path, legacy cache path, migration status, .gitignore update status, and non-fatal cache warnings.
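
For scripts that consume --json output, the cache object described above could be modeled roughly as follows. The field names come from the Features list (cacheDir, legacyCacheDir, migrated, gitignoreUpdated, warnings[]); the types, the helper name readCacheInfo, and the sample payload are assumptions, so treat docs/cli-reference.md as the authority on the real shape.

```typescript
// Assumed shape of the top-level `cache` object in --json output.
// Field names are from this README; the types are guesses.
interface CacheInfo {
  cacheDir: string;
  legacyCacheDir: string;
  migrated: boolean;
  gitignoreUpdated: boolean;
  warnings: string[];
}

// Hypothetical helper: pull `cache` out of parsed JSON without trusting its shape.
function readCacheInfo(output: unknown): CacheInfo | null {
  if (typeof output !== "object" || output === null) return null;
  const cache = (output as Record<string, unknown>)["cache"];
  if (typeof cache !== "object" || cache === null) return null;

  const { cacheDir, legacyCacheDir, migrated, gitignoreUpdated, warnings } =
    cache as Record<string, unknown>;
  if (
    typeof cacheDir !== "string" ||
    typeof legacyCacheDir !== "string" ||
    typeof migrated !== "boolean" ||
    typeof gitignoreUpdated !== "boolean" ||
    !Array.isArray(warnings)
  ) {
    return null;
  }
  return { cacheDir, legacyCacheDir, migrated, gitignoreUpdated, warnings: warnings.map(String) };
}

// Made-up payload for illustration only; real output carries more fields.
const samplePayload: unknown = JSON.parse(
  '{"cache":{"cacheDir":".codebase-intelligence","legacyCacheDir":".code-visualizer",' +
    '"migrated":false,"gitignoreUpdated":true,"warnings":[]}}'
);
const cacheInfo = readCacheInfo(samplePayload);
if (cacheInfo !== null && cacheInfo.warnings.length > 0) {
  console.warn("cache warnings:", cacheInfo.warnings.join("; "));
}
```

Because the warnings are described as non-fatal, a CI script would typically log them and continue instead of failing the job.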

For full command details, see docs/cli-reference.md.

Agent Adoption

codebase-intelligence has the data — but AI agents only benefit if they actually query it instead of defaulting to grep/read. init closes that gap.

codebase-intelligence init                  # interactive picker (TTY)
codebase-intelligence init --agents claude,cursor
codebase-intelligence init --all --skill    # every agent + global skill
codebase-intelligence init --yes            # non-interactive defaults

Nothing is written unless you choose it. On a terminal, init shows an interactive picker (AGENTS.md + CLAUDE.md preselected); non-interactively it defaults to those two. The global skill is opt-in (--skill). For each selected agent, init writes an idempotent, marked instruction block ("query CI before grep/read") into that agent's native file:

| Layer | Target | Default |
| --- | --- | --- |
| Repo instructions | AGENTS.md, CLAUDE.md | selected |
| Repo instructions | .cursor/rules/*.mdc, .github/copilot-instructions.md, GEMINI.md, CONVENTIONS.md (Aider) | opt-in |
| Portable skill | ~/.claude/skills/codebase-intelligence/SKILL.md | opt-in (--skill) |

Writes are idempotent — only content between the <!-- codebase-intelligence:start --> / :end markers is ever touched, so re-running is safe and your own edits are preserved.
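
The marker-based write described above can be sketched as a pure string transformation. This is an illustration of the technique, not the code init actually runs: the function name upsertBlock is hypothetical, and the full end-marker text is inferred from the ":end" shorthand.

```typescript
const START_MARKER = "<!-- codebase-intelligence:start -->";
// Assumed full form of the ":end" marker mentioned in the text.
const END_MARKER = "<!-- codebase-intelligence:end -->";

// Replace the marked block if one exists, otherwise append it.
// Text outside the markers is never modified.
function upsertBlock(existing: string, body: string): string {
  const block = `${START_MARKER}\n${body}\n${END_MARKER}`;
  const startIdx = existing.indexOf(START_MARKER);
  const endIdx =
    startIdx === -1 ? -1 : existing.indexOf(END_MARKER, startIdx + START_MARKER.length);

  if (startIdx === -1 || endIdx === -1) {
    // No complete marked block yet: append one after the existing content.
    if (existing.length === 0) return block + "\n";
    const separator = existing.endsWith("\n") ? "\n" : "\n\n";
    return existing + separator + block + "\n";
  }
  // Swap only the span between (and including) the two markers.
  return existing.slice(0, startIdx) + block + existing.slice(endIdx + END_MARKER.length);
}

const firstRun = upsertBlock("# My notes\n", "Query CI before grep/read.");
const secondRun = upsertBlock(firstRun, "Query CI before grep/read.");
console.log(firstRun === secondRun); // re-running with the same body changes nothing
```

Running the function twice with the same body yields identical output, which is the property that makes re-running safe; running it with a new body changes only the marked span.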

Install the skill directly

The skill is also published to the skills.sh registry:

ags install codebase-intelligence
# or
npx skills add github.com/bntvllnt/codebase-intelligence

MCP Integration (Secondary)

Running without a subcommand starts the MCP stdio server (backward compatible):

npx codebase-intelligence ./src

Claude Code (manual)

Add to .mcp.json:

{
  "mcpServers": {
    "codebase-intelligence": {
      "type": "stdio",
      "command": "npx",
      "args": ["-y", "codebase-intelligence@latest", "./src"],
      "env": {}
    }
  }
}

Cursor / VS Code

Add to .cursor/mcp.json or .vscode/mcp.json:

{
  "servers": {
    "codebase-intelligence": {
      "command": "npx",
      "args": ["-y", "codebase-intelligence@latest", "./src"]
    }
  }
}

For MCP tool details, see docs/mcp-tools.md.

Metrics

| Metric | What it reveals |
| --- | --- |
| PageRank | Most-referenced files (importance) |
| Betweenness | Bridge files between disconnected modules |
| Coupling | How tangled a file is (fan-out / total connections) |
| Cohesion | Does a module belong together? (internal / total deps) |
| Tension | Is a file torn between modules? (entropy of cross-module pulls) |
| Escape Velocity | Should this module be its own package? |
| Churn | Git commit frequency |
| Complexity | Average cyclomatic and cognitive complexity of exports/symbols |
| Blast Radius | Transitive dependents affected by a change |
| Dead Exports | Unused exports (safe to remove) |
| Test Coverage | Whether a test file exists for each source file |
| Risk Score | Composite file risk from complexity, churn, coupling, size, blast radius, and test reachability |
| Maintainability Index | Health-derived 0-100 per-file maintainability score |
| CRAP Score | Complexity x uncovered-code risk, using Istanbul coverage when present or static test reachability otherwise |
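
The ratios given in parentheses above translate directly into code. The sketch below follows those one-line definitions; the zero-denominator conventions, the use of unnormalized Shannon entropy in bits for tension, and the classic CRAP formula (complexity² × (1 − coverage)³ + complexity) are assumptions, since this README does not state the tool's exact formulas.

```typescript
// Coupling per the table: fan-out / (fan-in + fan-out).
// Returning 0 for an unconnected file is a convention picked for this sketch.
function coupling(fanIn: number, fanOut: number): number {
  const total = fanIn + fanOut;
  return total === 0 ? 0 : fanOut / total;
}

// Cohesion per the table: internal deps / total deps (0 when there are none; assumed).
function cohesion(internalDeps: number, totalDeps: number): number {
  return totalDeps === 0 ? 0 : internalDeps / totalDeps;
}

// Tension as Shannon entropy (bits) over how many imports pull toward each
// foreign module. Any normalization the tool applies is not reproduced here.
function tension(pullCounts: number[]): number {
  const counts = pullCounts.filter((n) => n > 0);
  const total = counts.reduce((a, b) => a + b, 0);
  if (total === 0) return 0;
  let entropy = 0;
  for (const n of counts) {
    const p = n / total;
    entropy -= p * (Math.log(p) / Math.LN2);
  }
  return entropy;
}

// Classic CRAP formula; coverage is a fraction in [0, 1].
// Assumed here, the tool may compute its score differently.
function crapScore(complexity: number, coverage: number): number {
  return complexity * complexity * Math.pow(1 - coverage, 3) + complexity;
}

console.log(coupling(1, 3));   // 0.75
console.log(cohesion(4, 5));   // 0.8
console.log(crapScore(5, 1));  // 5
console.log(crapScore(5, 0));  // 30
console.log(tension([2, 2]));  // two equal pulls: 1 bit
```

A file pulled toward a single module has zero tension under this definition, while equal pulls from several modules maximize it.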

Architecture

codebase-intelligence <command> <path>
        |
        v
   +---------+     +---------+     +----------+
   | Parser  | --> | Graph   | --> | Analyzer |
   | TS AST  |     | grapho- |     | metrics  |
   |         |     | logy    |     |          |
   +---------+     +---------+     +----------+
        |
        +--> CLI output (default)
        +--> MCP stdio (optional mode)

  1. Parser: extracts files, functions, and imports via the TypeScript Compiler API.
  2. Graph: builds dependency/call graphs with graphology.
  3. Analyzer: computes file/module/symbol metrics.
  4. Interfaces: CLI is primary; MCP is available for agent integrations.
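
As a rough illustration of steps 2 and 3, the sketch below builds a reverse dependency graph from import edges and walks it to collect transitive dependents, which is the idea behind blast radius. It uses plain Map/Set instead of graphology to stay self-contained; the ImportEdge type, the function names, and the file names are hypothetical.

```typescript
// `from` imports `to`. In the real pipeline the parser would supply these edges.
type ImportEdge = { from: string; to: string };

// Graph step (simplified): map each file to the set of files that import it.
function buildDependents(edges: ImportEdge[]): Map<string, Set<string>> {
  const dependents = new Map<string, Set<string>>();
  for (const { from, to } of edges) {
    let importers = dependents.get(to);
    if (!importers) {
      importers = new Set<string>();
      dependents.set(to, importers);
    }
    importers.add(from);
  }
  return dependents;
}

// Analyzer step (simplified): breadth-first walk over importers to find every
// file that transitively depends on `file`. Cycles are handled via `seen`.
function blastRadius(file: string, dependents: Map<string, Set<string>>): Set<string> {
  const seen = new Set<string>();
  const queue: string[] = [file];
  while (queue.length > 0) {
    const current = queue.shift() as string;
    const importers = dependents.get(current);
    if (!importers) continue;
    importers.forEach((importer) => {
      if (importer !== file && !seen.has(importer)) {
        seen.add(importer);
        queue.push(importer);
      }
    });
  }
  return seen;
}

// Made-up project layout.
const sampleEdges: ImportEdge[] = [
  { from: "cli.ts", to: "analyzer.ts" },
  { from: "mcp.ts", to: "analyzer.ts" },
  { from: "analyzer.ts", to: "graph.ts" },
  { from: "graph.ts", to: "parser.ts" },
];
const sampleDependents = buildDependents(sampleEdges);
console.log(blastRadius("parser.ts", sampleDependents).size); // 4
console.log(blastRadius("cli.ts", sampleDependents).size);    // 0
```

Files near the bottom of the import chain, such as parser.ts here, have the largest blast radius, while entry points have none.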

Requirements

  • Node.js >= 18
  • TypeScript codebase (.ts / .tsx files)

Limitations

  • TypeScript-focused analysis
  • Static analysis only (no runtime tracing)
  • Call graph confidence varies by symbol resolution quality

Release

Publishing is automated through GitHub Actions. See CHANGELOG.md for release notes.

Normal CI (before release)

  • CI workflow runs on every PR and push to main:
    • lint → typecheck → build → test

Canary publish

  • Pushes to main trigger a canary publish.
  • The package is published to npm with the canary tag.
  • Canary versions are derived from the current package version plus the short commit SHA.

Create a release

  1. Bump package.json version in a normal PR.
  2. Merge that PR to main.
  3. Run the Publish workflow manually from GitHub Actions.
  4. The workflow will:
    • verify the tag does not already exist
    • create and push vX.Y.Z
    • publish to npm with provenance via OIDC
    • create a GitHub Release with generated notes

No PAT is required for npm publish. The workflow uses GitHub repository permissions for tagging and OIDC for npm publishing.

Security

Please do not report security vulnerabilities in public issues.

  • Read SECURITY.md for supported versions and disclosure guidance.
  • Use GitHub Security Advisories or private maintainer contact for sensitive reports.

Contributing

Contributions are welcome.

Quick setup:

git clone https://github.com/bntvllnt/codebase-intelligence.git
cd codebase-intelligence
pnpm install
pnpm dev          # tsx watch mode
pnpm test         # vitest
pnpm lint         # eslint
pnpm typecheck    # tsc --noEmit
pnpm build        # production build
pnpm verify:cli-real        # default real-repo CLI matrix
pnpm verify:cli-real:heavy  # large /home/ubuntu repo matrix

License

MIT
