CLAUDE.md - mfbt CLI

Project Overview

mfbt CLI is a Python-based CLI tool for the mfbt platform, targeting publication to PyPI. It provides both an interactive TUI mode (K9S-style) and traditional subcommands for managing mfbt projects.

  • Language: Python 3.10+ (uses modern features: pattern matching, improved type hints)
  • Backend: Integrates with mfbt REST API ({BASE_URL}/api/v1/)
  • Auth: OAuth 2.1 with PKCE (1hr access tokens, 30-day refresh tokens), plus API key management (mfbtsk-{uuid}, passed as Authorization: Bearer {key})
  • Config: Global ~/.mfbt/ directory stores auth tokens and configuration (no per-project config)
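
As a rough illustration of the API key format noted above, the sketch below checks that a credential has the shape mfbtsk-{uuid} and builds the Authorization header. The helper names looks_like_api_key and auth_headers are hypothetical and are not taken from the mfbt codebase.

```python
import uuid

API_KEY_PREFIX = "mfbtsk-"  # prefix from the documented key format mfbtsk-{uuid}


def looks_like_api_key(credential: str) -> bool:
    """Return True if the credential has the shape mfbtsk-{uuid}.

    Hypothetical helper. uuid.UUID is lenient (it also accepts hex without
    hyphens), so this is a shape check, not proof that the key is valid.
    """
    if not credential.startswith(API_KEY_PREFIX):
        return False
    try:
        uuid.UUID(credential[len(API_KEY_PREFIX):])
    except ValueError:
        return False
    return True


def auth_headers(api_key: str) -> dict[str, str]:
    """Build the header that carries an API key as a Bearer credential."""
    return {"Authorization": f"Bearer {api_key}"}
```

The backend remains the only authority on whether a key is accepted; a local check like this can only catch obvious typos before a request is sent.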

API Specification

The file openapi.json in the project root contains the full OpenAPI spec for the mfbt backend. Do NOT read this file in full — it is very large. Instead, use Grep or search tools to find specific endpoints, schemas, or parameters as needed.

Architecture

Core Systems

  1. Authentication & Configuration — Browser-based OAuth flow with PKCE, auto token refresh + interactive re-auth on 401, API key support. Shared auth flow in auth_flow.py. Config/auth in ~/.mfbt/ (schema v2, no per-project project_id).
  2. Interactive TUI Mode — Launched when CLI runs without subcommands; K9S-style keyboard-driven navigation: Projects > Phases > Modules > Features. Ralph orchestration integrated via r key.
  3. Subcommands: auth, projects (list, show, create, archive, delete), ralph, tui. Consistent output formatting (table, JSON, etc.)
  4. API Integration Layer — REST client with on_auth_required callback for auto re-auth on 401. Paginated responses ({items, total, page, page_size, total_pages}), error handling (402 for token limits), job polling, WebSocket support.
  5. API Key Management — CRUD via /api/v1/users/me/api-keys
  6. Coding Agent Checks: coding_agents.py provides pre-flight checks (agent installed, MCP configured) before Ralph orchestration.
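
The paginated envelope handled by the API integration layer could be consumed roughly as follows. This is a minimal sketch, not the project's client: fetch_page is a hypothetical callable that returns the decoded JSON body for one page, and the assumption that pages are numbered from 1 is not confirmed by the source.

```python
from typing import Any, Callable, Iterator


def extract_items(body: Any) -> list[Any]:
    """Return the item list from a paginated envelope or a plain list body."""
    if isinstance(body, list):
        return body  # some endpoints return a bare list
    return body["items"]


def iter_all_items(
    fetch_page: Callable[[int], dict[str, Any]],
    first_page: int = 1,  # assumption: the backend numbers pages from 1
) -> Iterator[Any]:
    """Yield every item across all pages of a paginated list endpoint."""
    page = first_page
    while True:
        body = fetch_page(page)
        yield from body["items"]
        # total_pages comes from the envelope {items, total, page, page_size, total_pages}
        if page >= body["total_pages"]:
            return
        page += 1
```

Passing the fetch function in keeps the loop independent of any HTTP library, which also makes it easy to exercise with canned pages in unit tests.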

Key API Endpoints

  • Projects CRUD
  • Brainstorming phases
  • Modules, features, and implementations
  • Job monitoring: GET /api/v1/jobs/{job_id}
  • Thread comments
  • MCP config retrieval: GET /api/v1/projects/{id}/mcp-config
  • API key management: /api/v1/users/me/api-keys
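
Polling the job monitoring endpoint might look like the sketch below. It rests on several assumptions: get_job is a hypothetical callable wrapping GET /api/v1/jobs/{job_id}, the response is assumed to carry a "status" field, and the terminal status names are placeholders. The real schema should be looked up in openapi.json.

```python
import time
from typing import Any, Callable

# Placeholder terminal states; the real values live in the job schema.
TERMINAL_STATUSES = frozenset({"completed", "failed", "cancelled"})


def poll_job(
    get_job: Callable[[str], dict[str, Any]],
    job_id: str,
    *,
    interval: float = 2.0,
    timeout: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> dict[str, Any]:
    """Fetch the job repeatedly until it reaches a terminal status.

    Raises TimeoutError if the job is still running once `timeout` seconds
    have elapsed.
    """
    deadline = clock() + timeout
    while True:
        job = get_job(job_id)
        if job.get("status") in TERMINAL_STATUSES:
            return job
        if clock() >= deadline:
            raise TimeoutError(f"job {job_id} did not finish within {timeout}s")
        sleep(interval)
```

The sleep and clock parameters exist only so tests can run without real delays.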

Constraints

  • Must work with existing mfbt FastAPI backend
  • Support both UUIDs and short URL identifiers
  • Handle ISO 8601 UTC timestamps
  • Graceful degradation when token limits reached (HTTP 402)
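
One practical wrinkle with the timestamp constraint: on Python 3.10, datetime.fromisoformat rejects a trailing "Z", which is a common way to write UTC in ISO 8601. The sketch below normalizes that suffix before parsing. The function name parse_utc_timestamp is hypothetical, and treating naive values as UTC is an assumption about the backend, not something the source states.

```python
from datetime import datetime, timezone


def parse_utc_timestamp(value: str) -> datetime:
    """Parse an ISO 8601 timestamp into a timezone-aware UTC datetime."""
    # Python 3.10's fromisoformat does not accept "Z"; rewrite it as an offset.
    if value.endswith("Z"):
        value = value[:-1] + "+00:00"
    parsed = datetime.fromisoformat(value)
    if parsed.tzinfo is None:
        # Assumption: timestamps without an offset are already UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.astimezone(timezone.utc)
```

For example, parse_utc_timestamp("2024-01-02T03:04:05Z") yields an aware datetime for 03:04:05 UTC on 2 January 2024.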

MFBT MCP Server (Virtual Filesystem)

This project uses the mfbt MCP server, which exposes a virtual filesystem (VFS) for navigating mfbt project data. Always call readMeFirst at the start of a session to get the full VFS guide, available tools, and recommended workflow.

Key MCP Concepts

  • The VFS exposes project phases, modules, features, and implementations as a navigable filesystem
  • Use UNIX-like commands: ls, cat, tree, find, grep, head, tail
  • Smart metadata on directories guides what to work on next (progress %, next feature, completion status)
  • Two feature sources: system-generated/ (AI-created from brainstorming) and user-defined/ (manually created or imported)

MCP VFS Structure

/
├── phases/
│   ├── system-generated/{phase}/
│   │   ├── phase-spec/          # full.md, summary.md, by-section/
│   │   ├── phase-prompt-plan/   # full.md, by-section/
│   │   └── features/{module}/{feature}/
│   │       ├── implementations/{impl}/
│   │       │   ├── spec.md          # WHAT to build
│   │       │   ├── prompt_plan.md   # HOW to build
│   │       │   └── notes.md         # Writable learnings
│   │       └── conversations/conversations.md
│   └── user-defined/features/{module}/{feature}/
│       └── (same structure as above)
├── project-info/team-info.json
├── system-info/users/available-users-list.csv
└── for-coding-agents/
    ├── agents.md              # Grounding context (read at session start)
    └── mfbt-usage-guide/      # Workflow guides

MCP Recommended Workflow

  1. cat /for-coding-agents/agents.md --branch_name='your-branch' — grounding context
  2. ls / → ls /phases/ → drill into phases, modules, features
  3. Read spec.md (what to build) and prompt_plan.md (how to build)
  4. setMetadataValueForKey .../features/{module}/{feature}/ in_progress true — mark as started
  5. Implement the feature
  6. write .../implementations/{impl}/notes.md '<learnings>' — document what you learned
  7. setMetadataValueForKey .../implementations/{impl}/ is_complete true — mark done (auto-syncs feature status)

MCP Tips

  • Include coding_agent_name: "claude_code" and branch_name in tool calls
  • Use summary.md for quick spec overviews; by-section/ for large specs
  • Focus on must_have features first, then important, then optional
  • notes.md auto-feeds into agents.md grounding — document architectural decisions and gotchas
  • Use grep to search across specs and prompt plans
  • For team communication (posting comments), see /for-coding-agents/mfbt-usage-guide/
  • To update status: in_progress on features when starting, is_complete on implementations when done

Development

  • Venv: .venv/bin/python, .venv/bin/pytest
  • Run tests: .venv/bin/pytest tests/unit/ -v
  • Editable install with uv: uv sync && uv pip install -e . && uv pip install -r requirements-dev.txt — then .venv/bin/mfbt ... runs the in-development CLI with edits picked up live. Note: uv sync only installs runtime deps from pyproject.toml and will uninstall anything not declared there, so re-run the requirements-dev.txt step after a sync to restore pytest/mypy/ruff/etc.
  • Config functions no longer take project_root: load_config(), save_config(), load_auth(), save_auth(), init_mfbt_dir(), resolve_config() all operate on ~/.mfbt/ via get_mfbt_dir(). TokenManager.__init__ takes only base_url.
  • API response formats: List endpoints (/features, /implementations) return paginated {"items": [...], "total": N, ...} — always extract via body["items"]. Some endpoints (/brainstorming-phases) return plain lists. See src/mfbt/tui/data_provider.py for the canonical parsing pattern.
  • TUI navigation: Projects > Phases > Modules > Features (4 levels). NavigationState stack depth: 0=projects, 1=phases, 2=modules, 3=features. Phase list shows brainstorming phases + virtual "Orphan modules" entry.
  • Ralph in TUI: Integrated into main TUI (r key), not a standalone app. Ralph runs in place inside the unified FeatureListPanel (tui/screens/feature_list.py) in #main-content — there is no separate Ralph panel — with PreflightModal for agent checks. Standalone tui_app.py and the old ralph_panel.py are deleted.
  • Ralph subcommand: src/mfbt/commands/ralph/ — orchestrator, display (console), tui_display (Textual adapter), ralph_widgets (TUI widgets), agent runner, progress (API), prompt builder, types.
  • Display duck typing: RalphOrchestrator.display is typed as Any — both RalphDisplay and RalphTUIDisplay are structurally compatible (same display protocol).
  • Ralph sandboxing (two-mechanism, hard-fail — current design): Exactly two sandbox mechanisms, each a hard requirement on its platform: sandbox-exec on macOS, bwrap (bubblewrap) on Linux. There is no unsandboxed mode, no docker/firejail, no config override, no --sandbox/--yes flag, no consent/feedback layer. If the platform's mechanism is unavailable (or the platform is neither macOS nor Linux), Ralph aborts before any agent subprocess with an actionable message.
    • sandbox/ package: models.py (SandboxMechanism={SANDBOX_EXEC,BWRAP} only; SandboxConfig, SandboxPermissions, resolve_paths), detect.py, exceptions.py, wrapper.py, adapters/ (sandbox_exec.py, bwrap.py, __init__.py). No models SandboxResult/SandboxType/CandidateOutcome/SandboxDetectionReport; no feedback.py; no docker.py/firejail.py. models.py imports the sibling leaf exceptions (acyclic, still no external deps / no import-time side effects) and does fail-loud, filesystem-free __post_init__ validation: SandboxPermissions rejects a writable filesystem-root or bare $HOME (incl. project_dir) → SandboxSetupError; SandboxConfig rejects project_dir != permissions.project_dir (the bound permissions is stored as-is, no silent replace). SandboxConfig.base_url is inert (no adapter consumes it; never a confinement input).
    • Detection: one function detect_sandbox(which_fn=shutil.which, system_fn=platform.system, *, run_fn=subprocess.run) -> SandboxMechanism. macOS→SANDBOX_EXEC (binary-presence check), Linux→BWRAP (functional liveness probe bwrap --ro-bind / / true), else→raise SandboxDetectionError. No detect_sandbox_with_report, no explicit=, no config loader. detect.py does not import mfbt.config (cycle severed; config.validate_config has no sandbox block; no _VALID_SANDBOX_TYPES/_CONFIG_OVERRIDE_KEY).
    • Adapters: each is pure build_prefix(config: SandboxConfig) -> list[str] + classify_exit(rc, sandbox_stderr) -> SandboxError|None. adapters/__init__.py get_adapter/get_classifier are strict 2-entry dispatch (unknown mechanism → SandboxError, no identity/no-op). SBPL asset commands/ralph/profiles/claude.sb (MCLI-126 write-confine: (allow default) + global write-deny + per-writable re-allow + a curated /dev device-node allow-list: /dev/null|zero|random|urandom|tty|dtracehelper|ptmx, ^/dev/ttys[0-9]+$, (subpath "/dev/fd"), not a blanket (subpath "/dev"); kept a single line so the comment-breakout test's balanced-sexpr invariant holds; network open) loaded via importlib.resources; hatchling auto-bundles non-.py package files. classify_exit re-attributes a non-zero exit to the sandbox only on the sandbox's own line-anchored stderr signature ([Sandbox] wrapper prefix stripped first; sandbox_exec: deny(\d+) regex anywhere on a line OR a line starting sandbox-exec:; bwrap: a line starting bwrap:); else None = Claude's own exit, returned as AgentResult.
    • SandboxWrapper: wrap_command(cmd) canonical (wrap(cmd) alias); prepare() is a no-arg no-op lifecycle hook; cleanup() removes the sandbox-exec SBPL temp profile in run()'s finally. mechanism=None → detect_sandbox() opt-in (orchestrator passes a concrete mechanism; bad value→SandboxError). Sandbox stderr: pure format_sandbox_stderr ([Sandbox] prefix) + note_stderr_line (always attributes) / get_sandbox_stderr; sandbox tool + claude share ONE stderr fd by design (attribution by [Sandbox] marker, not OS-fd; do not split processes). AgentResult.sandbox_stderr defaulted.
    • AgentRunner: __init__(*, project_dir=None, base_url=None) (no assume_yes/presented). sandbox_wrapper is None→auto-detect once; injected wrapper used as-is. The no-wrapper default base_url is sourced from mfbt.config.DEFAULT_CONFIG["base_url"] (single source, no hand-synced _DEFAULT_BASE_URL constant). build_command always inserts --dangerously-skip-permissions at argv[1] (a real OS sandbox is always the boundary; headless claude -p/stdin=DEVNULL can't answer tool prompts). The subprocess Popen is launched with cwd= pinned to the sandbox's own realpath'd project_dir (single source = the wrapper's SandboxConfig; resolved exactly as the adapters resolve it) so the run location is always inside the confined/bound area, never the inherited parent CWD; cwd is part of the frozen Popen signature. After the timed stderr join, run() logs (WARNING) if the drain was truncated, and on a non-zero exit not attributed to a denial logs INFO (WARNING if truncated) so a possibly-missed denial is never fully silent. run() = a try/except SandboxError setup phase (build→prepare()→wrap_command) + the subprocess try/…/finally cleanup() phase; ordered except clauses (base SandboxError last) each _emit_sandbox_error(exc); raise (re-raise, not SystemExit). Popen OSError EPERM/EACCES→SandboxDeniedError. No _handle_none_sandbox_fallback/acknowledge_unsandboxed/_emit_sandbox_note/_NONE_SANDBOX_WARNING.
    • Exceptions: SandboxError(message, *, mechanism=None) (__str__="[<mechanism>] <message>"); subclasses SandboxDetectionError, SandboxSetupError, SandboxDeniedError, SandboxCompatibilityError (reserved, defensively handled, never raised by current paths). All 5 re-exported from sandbox/__init__.
    • Hard-fail wiring (TWO chokepoints: construction AND run): (1) construction: RalphOrchestrator.__init__ calls detect_sandbox(), logs+re-raises SandboxDetectionError. (2) post-construction: orchestrator._implement_feature has an except SandboxError: logger.error(...); raise before its broad except Exception so a SandboxSetupError/SandboxDeniedError from agent.run() is never downgraded to a per-feature FAILED (which would loop on a broken sandbox and mask a denial); it propagates out of run(). CLI commands/ralph/__init__.py: except SandboxDetectionError → display.error+typer.Exit(1); display.sandbox_status(...) before orchestrator.run(); except SandboxError → display.error("sandbox failure — Ralph aborted: …")+typer.Exit(1); summary.failed>0 → typer.Exit(1). TUI feature_list.py::_run_orchestrator: except SandboxDetectionError (now logger.error(exc_info=True), parity with CLI) wrapping the RalphOrchestrator(...) construction → notify(error) + _return_to_browsing; a separate except SandboxError around run() → notify("Sandbox failure — Ralph aborted: …", error) + _return_to_browsing (both fail closed visibly). RalphConfig has no assume_yes/sandbox; the pre-run typer.confirm prompt is gone; --status only conflicts with --quiet. RalphHeader._sandbox_label() always shows the mechanism (no UNSANDBOXED branch). RalphDisplay/RalphTUIDisplay keep sandbox_status (stdout/log, quiet-suppressed, enum .value verbatim); sandbox_warning removed.
    • Debug: module-scope _debug_enabled = False in wrapper.py AND detect.py (literal # TODO: decision needed — wire to --<flag-name> and <ENV_VAR_NAME>, intentionally unwired; tests monkeypatch.setattr(<mod>, "_debug_enabled", True)). [sandbox-debug] via wrapper.format_sandbox_debug/emit_debug_sandbox_stderr()/detect._probe_liveness.
    • Test gotchas: __init__'s logger param shadows the module logger — use logging.getLogger(__name__). test_orchestrator.py autouse fixture patches orchestrator.detect_sandbox (NOT _with_report) returning a bare SandboxMechanism. test_agent.py autouse fixture patches agent.detect_sandbox + agent.SandboxWrapper; _stub_wrapper defaults to SANDBOX_EXEC (a real sandbox) and sets classify_exit.return_value=None, so build_command includes --dangerously-skip-permissions at index 1 — assert presence, never absence; mock wrappers need wrap_command.side_effect (not .wrap). test_detect.py::test_import_has_no_side_effects must keep the parent-package detect attr save/restore in finally (full-suite-order hermeticity). Real-wrapper integration tests in test_agent.py exercise BWRAP/SANDBOX_EXEC through run() mocking only Popen/which.
  • Key files: auth_flow.py (shared OAuth), coding_agents.py (agent pre-flight checks), tui/screens/phase_list.py, tui/screens/preflight_modal.py, tui/screens/feature_list.py (unified browse + in-place Ralph panel), commands/ralph/ralph_widgets.py.
  • TUI shortcuts: r = Ralph, ctrl+r = Refresh, d = Describe, enter = Open/Detail, esc = Back, ? = Help, q = Quit.
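
The hard-fail detection rule in the Ralph sandboxing notes above can be sketched as follows. This is an illustration built from the documented signature and behavior, not a copy of sandbox/detect.py: the enum and exception are local stand-ins, the enum member values, error messages, and probe timeout are assumptions, and the real module may differ in detail.

```python
import enum
import platform
import shutil
import subprocess


class SandboxMechanism(enum.Enum):
    """Stand-in for the enum in sandbox/models.py; member values are assumed."""
    SANDBOX_EXEC = "sandbox-exec"
    BWRAP = "bwrap"


class SandboxDetectionError(Exception):
    """Stand-in for the exception in sandbox/exceptions.py."""


def detect_sandbox(which_fn=shutil.which, system_fn=platform.system, *,
                   run_fn=subprocess.run) -> SandboxMechanism:
    """Return the platform's sandbox mechanism or raise; never fall back."""
    system = system_fn()
    if system == "Darwin":  # platform.system() reports macOS as "Darwin"
        # macOS: a binary-presence check is enough.
        if which_fn("sandbox-exec") is None:
            raise SandboxDetectionError("sandbox-exec not found on PATH")
        return SandboxMechanism.SANDBOX_EXEC
    if system == "Linux":
        bwrap = which_fn("bwrap")
        if bwrap is None:
            raise SandboxDetectionError("bwrap (bubblewrap) not found on PATH")
        # Linux: functional liveness probe, since bwrap can be installed but unusable.
        try:
            probe = run_fn([bwrap, "--ro-bind", "/", "/", "true"],
                           capture_output=True, timeout=10)
        except (OSError, subprocess.SubprocessError) as exc:
            raise SandboxDetectionError(f"bwrap probe failed: {exc}") from exc
        if probe.returncode != 0:
            raise SandboxDetectionError("bwrap is installed but cannot create a sandbox")
        return SandboxMechanism.BWRAP
    raise SandboxDetectionError(f"unsupported platform: {system}")
```

Injecting which_fn, system_fn, and run_fn is what lets unit tests cover each platform branch without touching the host system.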