214 commits
57a0189
fix(db): split referral_legacy migration to handle PostgreSQL enum li…
brandonkachen Feb 4, 2026
d6d19fa
refactor(db): Switch from drizzle-kit push to migrate for safer produ…
brandonkachen Feb 4, 2026
b3b8644
fix(db): Remove backfill migration to fix PostgreSQL enum transaction…
brandonkachen Feb 4, 2026
92a7603
chore: Remove backfill script (already applied manually)
brandonkachen Feb 4, 2026
a07936a
Enable invoice creation and tax id collection in stripe checkout
jahooma Feb 4, 2026
e0435f8
fix(db): Remove trailing comma in migration journal JSON
brandonkachen Feb 4, 2026
1141a88
Revert "refactor(db): Switch from drizzle-kit push to migrate for saf…
brandonkachen Feb 5, 2026
d9efa47
fix: use dynamic WEBSITE_URL instead of hardcoded codebuff.com in usa…
brandonkachen Feb 5, 2026
119670c
Use full width for terminal command preview
brandonkachen Feb 5, 2026
2ef223e
fix: move Stripe webhook helpers to separate file to fix Next.js rout…
brandonkachen Feb 5, 2026
ab065a3
fix: add missing env variable (#427)
Roshan-anand Feb 5, 2026
70d3787
Try to fix some timeout errors
jahooma Feb 5, 2026
ae0f600
New doc: What makes Codebuff unique (generated from all our marketing…
jahooma Feb 5, 2026
3946e0f
Delete where codebuff shines (mostly obvious stuff b/c 1 year later, …
jahooma Feb 5, 2026
2c423c3
Subscription client changes (#424)
jahooma Feb 5, 2026
9b50b8e
Bump version to 1.0.610
github-actions[bot] Feb 5, 2026
34ac8ee
Subscription endpoint: support token bearer auth
jahooma Feb 5, 2026
7c7710e
Don't show input box and subscriptoin limit banner together
jahooma Feb 5, 2026
af93b6f
Don't allow escaping the subscription limit banner unless you click t…
jahooma Feb 5, 2026
e3ef8e5
Bump version to 1.0.611
github-actions[bot] Feb 5, 2026
ebd3048
Trigger buffbench remotely
jahooma Feb 5, 2026
7001603
Change always use a la carte to simply whether credit spending is ena…
jahooma Feb 5, 2026
363c40f
Bump version to 1.0.612
github-actions[bot] Feb 5, 2026
ecaff08
Fix updating a la carte preference in prod: allow auth via bearer token
jahooma Feb 5, 2026
68dc003
refactor: remove Pure Usage subscription and stripe_price_id from use…
brandonkachen Feb 5, 2026
97e94e4
Update pricing page title/description metadata
jahooma Feb 5, 2026
f8383bc
Fix anthropic to open router mapping for opus 4.6
jahooma Feb 5, 2026
db5ca02
Opus 4.6 (#428)
jahooma Feb 6, 2026
752a623
Bump version to 1.0.613
github-actions[bot] Feb 6, 2026
e437455
Add a step prompt to read relevant skills
jahooma Feb 6, 2026
7175592
Tweak to not claim 3x speedup in general compared to CC
jahooma Feb 6, 2026
fff1734
Allow free requests to go through even when subscription depleted and…
jahooma Feb 6, 2026
86b012e
Add standalone /subscribe page which only has buttons to subscribe. R…
jahooma Feb 7, 2026
0ba93b6
Remove npm token from sdk-release workflow now that it is using Trust…
jahooma Feb 7, 2026
09d8c2b
Bump version to 1.0.614
github-actions[bot] Feb 7, 2026
cbad21d
Simplify onboarding flow copy
jahooma Feb 7, 2026
147e0ad
Replace /referral with /refer-friends and let them copy referral url
jahooma Feb 7, 2026
52135d9
Fix test
jahooma Feb 7, 2026
ed33ed3
web: simplify buy credits panel. no custom purchase amount
jahooma Feb 7, 2026
8f736f8
Update to node 24 for sdk release script to try to make it work
jahooma Feb 7, 2026
5bf6772
Bump SDK version to 0.10.3
github-actions[bot] Feb 7, 2026
04c832c
sdk: Codebuff's fixes so index.d.ts is bundled in dist
jahooma Feb 7, 2026
78ecb9b
Bump SDK version to 0.10.4
github-actions[bot] Feb 7, 2026
9469d3d
grant credits script
jahooma Feb 7, 2026
ed92ef3
Ads info banner. Encourage users to hide ads if they don't like them
jahooma Feb 7, 2026
2bc8693
Bump version to 1.0.615
github-actions[bot] Feb 7, 2026
0d772d2
Add some model types to standard agent definition
jahooma Feb 7, 2026
2458834
Fix anthropic toolcall bug (#431)
jahooma Feb 9, 2026
49d633b
update sdk changelog
jahooma Feb 9, 2026
23a38ca
Bump SDK version to 0.10.5
github-actions[bot] Feb 9, 2026
17de998
Fix markdown table rendering: wrap text instead of truncating (#432)
aether-agent[bot] Feb 9, 2026
55c683d
agent-runtime: Add case for aborted tools without tool results to be …
jahooma Feb 9, 2026
b487109
subscription: Remove description that hardcoded tier
jahooma Feb 9, 2026
c826050
subscription: Delete change subscription route, which is unused
jahooma Feb 9, 2026
790a6ae
Add claude sub deprecation notice to cli / web
jahooma Feb 9, 2026
595836b
Bump version to 1.0.616
github-actions[bot] Feb 10, 2026
de46526
Update docs with Opus 4.6
jahooma Feb 10, 2026
ae056a7
cli: Tweak display layout for suggestion menu
jahooma Feb 10, 2026
ceca36a
Continue the turn if only <think> tags were included in the response
jahooma Feb 10, 2026
a0c593a
Don't call phantom skills
jahooma Feb 10, 2026
d576e74
Add systeminformation as cli dependency
jahooma Feb 10, 2026
e63160c
Revert "cli: Tweak display layout for suggestion menu"
jahooma Feb 10, 2026
8033e0b
Bump version to 1.0.617
github-actions[bot] Feb 10, 2026
911f66f
Handle aborts within agent-runtime
jahooma Feb 11, 2026
020ccb7
feat: add and use new feedback API endpoint with validation
brandonkachen Feb 9, 2026
cd0f3b5
fix: resolve Bedrock assistant message prefill error in CLI agents
brandonkachen Feb 12, 2026
94e787a
If no tool result content, insert an empty json content
jahooma Feb 12, 2026
17c85fa
Rename editor-glm to editor-lite and update to MiniMax M2.5
jahooma Feb 12, 2026
e94eccf
Add code reviewer lite (minimax m2.5). Remove editor-lite from base2-…
jahooma Feb 12, 2026
2043cce
Update docs with MiniMax M2.5
jahooma Feb 12, 2026
412e148
Bump version to 1.0.618
github-actions[bot] Feb 12, 2026
7297a7f
Fix tests
jahooma Feb 13, 2026
6cda457
Update buffbench to run with base2-free
jahooma Feb 13, 2026
cd25699
chore: remove Ctrl+U debug log from multiline-input
brandonkachen Feb 12, 2026
b70f174
Make code reviewer lite free in free mode
jahooma Feb 13, 2026
3f5837f
Pass costmode through correctly for free mode
jahooma Feb 14, 2026
6e9d923
Bump version to 1.0.619
github-actions[bot] Feb 14, 2026
755a22f
Fix free agents so commander-lite matches right model
jahooma Feb 14, 2026
0ff2d0a
sdk: skills dir parameter
jahooma Feb 14, 2026
9e275a8
sdk: update changelog
jahooma Feb 14, 2026
d7f211b
Bump SDK version to 0.10.6
github-actions[bot] Feb 14, 2026
30d3216
Add more detail to subscription page
jahooma Feb 14, 2026
7e8fe65
Add build free button in plan mode
jahooma Feb 15, 2026
777cbab
Claude Code and Codebuff Comparison Table UI Improved (#439)
parikhvedant2003 Feb 18, 2026
82e773d
gravity api: Set relevancy to 0.3
jahooma Feb 19, 2026
22cc1b5
Make ad CTA bold with larger click target
jahooma Feb 20, 2026
8573955
Bump version to 1.0.620
github-actions[bot] Feb 20, 2026
004811b
Add script to analyze subscription usage
jahooma Feb 20, 2026
a7c73cf
Make write_file completely deterministic. No edit snippet
jahooma Feb 21, 2026
5ecf1f0
Fix prompt cache test: remove mocks, add git status case
jahooma Feb 21, 2026
8092589
fix contrib to not say bun dev runs both
jahooma Feb 23, 2026
3112583
Fix types
jahooma Feb 23, 2026
48bb77f
Fix modes table
jahooma Feb 23, 2026
91e9d5d
Slim down quick start page
jahooma Feb 23, 2026
38686e3
Improve troubleshooting
jahooma Feb 23, 2026
bbe89f6
Fix another markdown table in docs
jahooma Feb 23, 2026
d9fe5a0
init analytics earlier
jahooma Feb 24, 2026
f5b7ea2
Bump version to 1.0.621
github-actions[bot] Feb 24, 2026
558174d
Add mapping for sonnet 4.6
jahooma Feb 25, 2026
518639d
web: Add account tab that shows your email address
jahooma Feb 25, 2026
b625411
Update discord bot to give more context on "email"
jahooma Feb 25, 2026
18a91b4
Add confirmation dialog to auto top-up if it would place a charge imm…
jahooma Feb 25, 2026
468c5cc
completions endpoint: Move out of credits check to after checking for…
jahooma Feb 25, 2026
99ac3a1
loading agents: Don't load files within skills
jahooma Feb 26, 2026
18fc360
pricing: Add tooltip with credits per dollar info on subscription tier
jahooma Feb 26, 2026
f10f5c1
web endpoints: Create subscription credit block first before credits …
jahooma Feb 26, 2026
f9a3660
Codex agent (#444)
jahooma Feb 26, 2026
e457e3f
Bump version to 1.0.622
github-actions[bot] Feb 26, 2026
4748e45
Test codex agent in buffbench
jahooma Feb 26, 2026
2924bf4
Update sdk changelog
jahooma Mar 1, 2026
e6cff22
Bump SDK version to 0.10.7
github-actions[bot] Mar 1, 2026
e8efaff
Increase subscription credits in tier 100 by 20%, tier 500 by 5%
jahooma Mar 2, 2026
5fce596
Disable ads when you have a subscription
jahooma Mar 2, 2026
e470294
feat: Add Hippo memory integration for persistent context across sess…
reillyse Feb 2, 2026
3a3662e
feat(cli): Improve hippo integration with file tracking and concept e…
reillyse Feb 2, 2026
8d96c04
update
reillyse Feb 11, 2026
850fc68
refactor(hippo): Simplify hippo integration and remove redundant cont…
reillyse Mar 15, 2026
5c28ec1
fix(cli): Fix init-direnv test mock path and consolidate HIPPO_BINARY…
reillyse Mar 15, 2026
d2097b6
refactor(hippo): Replace server sub-agent with async local CLI call
reillyse Mar 17, 2026
8c3d01d
fix(hippo): Only show green indicator when Neo4j connection is verified
reillyse Mar 17, 2026
31cc36f
feat(agent-runtime): Retry transient API errors (500/502/503/504) wit…
reillyse Mar 17, 2026
b26ff31
fix(cli): Reset hippo state when toggling hippo enabled/disabled
reillyse Mar 17, 2026
e20b9ac
fix(hippo): Cache logging setting, fix connection health tracking, lo…
reillyse Mar 17, 2026
12b4c96
chore: Remove commented-out dead code
reillyse Mar 17, 2026
1a10fed
feat(cli): Improve hippo status indicator and add abortable retry sleep
reillyse Mar 18, 2026
01c3e4c
chore: Add build-codebuff.sh script, update update-global-codebuff.sh…
reillyse Mar 18, 2026
d9285c1
feat: Enable CLAUDE_OAUTH_ENABLED=true in build script for local builds
reillyse Mar 18, 2026
b72138d
fix(hippo): resolve binary from PATH before hardcoded dev fallback
reillyse Mar 19, 2026
5f4b628
feat: hippo error handling, Claude OAuth refresh token for K8s, deplo…
reillyse Mar 20, 2026
8c4e2d5
feat: add K8s secrets generation script and update deployment docs
reillyse Mar 20, 2026
13f05b4
feat: add cli-lite TUI-free CLI package
reillyse Mar 20, 2026
3c22afd
fix: add stubs for upstream-only features (chatgpt-oauth, IS_FREEBUFF…
reillyse Mar 20, 2026
db525a3
feat(cli-lite): add hippo hooks, prompt/response logger, and REPL int…
reillyse Mar 20, 2026
79cc9aa
feat(cli-lite): add agent system, build scripts, and REPL commands
reillyse Mar 20, 2026
dfcbc95
Remove max token output limit
jahooma Mar 11, 2026
2abda78
Fix streaming
jahooma Mar 9, 2026
73f3e69
fix(write-file): use error.message instead of error.msg in catch bloc…
nil957 Mar 9, 2026
aba097f
fix(cli): check publish command at argv[2] position only (#468)
nil957 Mar 9, 2026
b30246a
Trim diff viewer of new lines
jahooma Mar 10, 2026
550bdb5
Fix todo rendering
jahooma Mar 10, 2026
3727091
fix: preserve line breaks in expanded thinking content (#456)
hobostay Mar 9, 2026
dd8550c
fix: improve environment validation error messages (#461)
hiSandog Mar 9, 2026
9dcaf75
fix: preserve MCP tool params when MCP schemas are rendered as allOf …
sonwr Mar 9, 2026
8696ad4
cli: Trim new lines before/after assistant message
jahooma Mar 10, 2026
6be27d9
Improvements for set_output tool prompt/params parsing
jahooma Mar 19, 2026
5466b24
Parse error from aisdk to properly show Forbidden
jahooma Mar 14, 2026
600cd5d
Only use amazon bedrock for our base2 opus, so there are fewer prompt…
jahooma Mar 10, 2026
a8a0f2c
Revamped context pruner: separate budget for user messages; build by…
jahooma Mar 18, 2026
1a9e047
context pruner: Include file editing results in summary. exclude some…
jahooma Mar 18, 2026
ceabff5
Route openai requests through direct api instead of open router
jahooma Mar 13, 2026
d507db7
fix: resolve type errors from upstream cherry-picks
reillyse Mar 23, 2026
8f0a4f3
feat: sync cli-lite with hippo-integration-clean, add terminal size s…
reillyse Mar 23, 2026
b083b84
feat: add Claude OAuth credential validation and prevent silent fallb…
reillyse Mar 24, 2026
0b2b0ff
refactor: remove all ANSI escape codes and animations from cli-lite f…
reillyse Mar 25, 2026
02666b2
feat: add Claude token service, prompt logger, OAuth deployment docs,…
reillyse Mar 25, 2026
9ba9295
updating
reillyse Mar 29, 2026
b8035f9
feat(cli-lite): debounce input lines to prevent multiple agent invoca…
reillyse Mar 30, 2026
927842f
feat(cli-lite): show tool and agent names in spinner messages, verbos…
reillyse Mar 31, 2026
8f3bae0
feat: include git SHA in version output for cli and cli-lite, improve…
reillyse Apr 1, 2026
ae69f1b
feat: improve agent error handling, deduplicate status code constants…
reillyse Apr 1, 2026
1bf3ace
feat: add subagent retry tests, fix AgentOutput type casts, deduplica…
reillyse Apr 1, 2026
e024686
fix: improve transient error retry logic with authoritative status co…
reillyse Apr 1, 2026
c19ef08
refactor: fix import ordering in spawn-agents.ts
reillyse Apr 1, 2026
3332874
fix: address review feedback - shared isTransientApiError, cost track…
reillyse Apr 1, 2026
573b104
fix: add missing env vars to test setup (fixes loop-agent-steps tests)
reillyse Apr 3, 2026
89ccf55
fix: add message schema validation to set_messages, defensive null co…
reillyse Apr 3, 2026
3353c4c
fix: resolve TS2322 in set-messages.ts with Message[] type assertion
reillyse Apr 5, 2026
7dfef45
fix: omit undefined statusCode/error fields from AgentOutput error ob…
reillyse Apr 7, 2026
ad4e9da
ci: add notify-weft workflow to trigger agent rebuild on push
reillyse Apr 7, 2026
31402ce
ci: use GITHUB_TOKEN secret for weft dispatch instead of WEFT_DISPATC…
reillyse Apr 7, 2026
4c4e301
ci: use CODEBUFF_GITHUB_TOKEN for weft dispatch (existing PAT with re…
reillyse Apr 7, 2026
377f52b
fix: default CODEBUFF_VERBOSE and CODEBUFF_PROMPT_LOG to disabled
reillyse Apr 16, 2026
b20b778
fix: add post-append truncation to prompt logger
reillyse Apr 16, 2026
add8550
docs: add CODEBUFF_PROMPT_LOG to deployment env vars table
reillyse Apr 16, 2026
0cdecb9
feat: centralize model version constants and upgrade to latest models
reillyse Apr 16, 2026
effc40c
fix: suppress Tool:/Agent: spinner messages in non-verbose mode in co…
reillyse Apr 16, 2026
c5e1820
fix: update stale openrouterModels entries and migrate Gemini 2.5 Flash
reillyse Apr 16, 2026
5c72359
fix: update openrouter_claude_sonnet_4_5 to sonnet-4.6 to match CURRE…
reillyse Apr 16, 2026
c855385
feat(cli-lite): add ask_user tool handler and progress dot timer
reillyse Apr 20, 2026
820d8ba
fix(ci): escape commit message in notify-weft.yml with toJSON()
reillyse Apr 20, 2026
6032bb7
feat(cli-lite): add message queue for input during agent runs
reillyse Apr 20, 2026
eac1043
fix(cli-lite): change default agent mode from MAX to DEFAULT
reillyse Apr 21, 2026
00ef9a5
feat(agents): downgrade default model from claude-opus-4.7 to 4.6 wit…
reillyse Apr 28, 2026
2e48725
Prompt cache debugging
jahooma Mar 6, 2026
72bb6d3
More comprehensive prompt cache debugging logs
jahooma Mar 7, 2026
31ba76d
Update cache debug script
jahooma Mar 10, 2026
c20e90e
Further cache debugging code to track usage
jahooma Mar 10, 2026
f241743
Add /connect:chatgpt
jahooma Mar 12, 2026
312524e
Enable /review and /connect:chatgpt in Freebuff
jahooma Mar 12, 2026
a204887
Get chatgpt oauth working
jahooma Mar 13, 2026
ec9ed7b
UX improvements for connecting chatgpt
jahooma Mar 13, 2026
2b46b83
chore(cli): canonicalize /connect:chatgpt and /connect:claude commands
github-actions[bot] Apr 23, 2026
6b3ad88
feat(agent-runtime): normalizeConversation pre-model-call invariant
github-actions[bot] Apr 22, 2026
fbc43f7
feat(telemetry): add Sparrow OTel telemetry module and config
github-actions[bot] Apr 23, 2026
0307355
feat(cli): add /telemetry command and telemetry lifecycle
github-actions[bot] Apr 23, 2026
b3eeb1b
feat(agent-runtime,sdk): instrument prompt/agent/step/llm/tool spans
github-actions[bot] Apr 23, 2026
015ce81
docs(openspec,sparrow): document telemetry change + update SPARROW_CH…
github-actions[bot] Apr 23, 2026
31cb81a
feat(telemetry): auto-flush at end of every top-level turn
github-actions[bot] Apr 23, 2026
2bd17f2
sparrow: telemetry — propagate identity attrs and add oauth account id
github-actions[bot] Apr 25, 2026
1aaae55
sparrow(telemetry): add chatgpt/claude oauth_eligible attrs to gen_ai…
github-actions[bot] Apr 25, 2026
a38f200
fixup(telemetry): install OTel deps, add errorToolResult helper, fix …
reillyse Apr 29, 2026
7ba775d
fix(agents): add missing tools to file-picker and file-lister toolNames
reillyse Apr 30, 2026
42530b4
chore: upgrade models — Opus 4.8, GPT-5.2, Grok 4.3, Haiku 4.5, Sonne…
reillyse Jun 1, 2026
8a9536f
feat(hippo): add subagent lifecycle hooks for hippo context injection
reillyse Jun 2, 2026
1c51977
chore(cli-lite): add stamp-version script for automatic git ref stamping
reillyse Jun 2, 2026
7c288ce
fix(scripts): use `which codebuff` to detect correct install location
reillyse Jun 2, 2026
3605e33
feat(hippo): prose-over-file-lists for subagent stores + tests
reillyse Jun 2, 2026
bf149bf
refactor: standardize agents to OpenAI + Anthropic models only via OAuth
reillyse Jun 2, 2026
28f7f3a
chore(models): remove unused openrouter_grok_4 model definition
reillyse Jun 4, 2026
0341bf2
feat(cli-lite): improve debug output for tool calls and subagents
reillyse Jun 5, 2026
6529e59
chore(cli-lite): regenerate bundled agents (gemini-free)
reillyse Jun 5, 2026
dafe462
fix(retry): retry mid-stream Anthropic overloads (AI_NoOutputGenerate…
reillyse Jun 5, 2026
b816c59
feat(retry): surface user-facing reason for mid-stream transient retries
reillyse Jun 5, 2026
fb4e667
fix(agent-runtime): add per-subagent timeout to prevent fan-out deadlock
reillyse Jun 18, 2026
15 changes: 7 additions & 8 deletions .agents/claude-code-cli.ts
Original file line number Diff line number Diff line change
@@ -10,7 +10,7 @@ const baseDefinition = createCliAgent({
startCommand: 'claude --dangerously-skip-permissions',
permissionNote:
'Always use `--dangerously-skip-permissions` when testing to avoid permission prompts that would block automated tests.',
- model: 'anthropic/claude-opus-4.5',
+ model: 'anthropic/claude-opus-4.8',
})

// Constants must be inside handleSteps since it gets serialized via .toString()
@@ -23,9 +23,8 @@ const definition: AgentDefinition = {
yield {
toolName: 'add_message',
input: {
- role: 'assistant',
- content: 'I\'ll first gather context and prepare before starting the ' + CLI_NAME + ' CLI session.\n\n' +
- 'Let me read relevant files and understand the task to provide better guidance to the CLI.',
+ role: 'user',
+ content: 'Before starting the ' + CLI_NAME + ' CLI session, gather context by reading relevant files and understanding the task to provide better guidance to the CLI.',
},
includeToolCall: false,
}
@@ -92,10 +91,10 @@ const definition: AgentDefinition = {
yield {
toolName: 'add_message',
input: {
- role: 'assistant',
- content: 'I have started a ' + CLI_NAME + ' tmux session: `' + sessionName + '`\n\n' +
- 'I will use this session for all CLI interactions. The session name must be included in my final output.\n\n' +
- 'Now I\'ll proceed with the task using the helper scripts:\n' +
+ role: 'user',
+ content: 'A ' + CLI_NAME + ' tmux session has been started: `' + sessionName + '`\n\n' +
+ 'Use this session for all CLI interactions. The session name must be included in your final output.\n\n' +
+ 'Proceed with the task using the helper scripts:\n' +
'- Send commands: `./scripts/tmux/tmux-cli.sh send "' + sessionName + '" "..."`\n' +
'- Capture output: `./scripts/tmux/tmux-cli.sh capture "' + sessionName + '" --label "..."`\n' +
'- Stop when done: `./scripts/tmux/tmux-cli.sh stop "' + sessionName + '"`',
10 changes: 5 additions & 5 deletions .agents/codebuff-local-cli.ts
@@ -10,7 +10,7 @@ const baseDefinition = createCliAgent({
startCommand: 'bun --cwd=cli run dev',
permissionNote:
'No permission flags needed for Codebuff local dev server.',
- model: 'anthropic/claude-opus-4.5',
+ model: 'anthropic/claude-opus-4.8',
skipPrepPhase: true,
spawnerPromptExtras: `**Purpose:** E2E visual testing of the Codebuff CLI itself. This agent starts a local dev Codebuff CLI instance and interacts with it to verify UI behavior.

@@ -95,10 +95,10 @@ const definition: AgentDefinition = {
yield {
toolName: 'add_message',
input: {
- role: 'assistant',
- content: 'I have started a ' + CLI_NAME + ' tmux session: `' + sessionName + '`\n\n' +
- 'I will use this session for all CLI interactions. The session name must be included in my final output.\n\n' +
- 'Now I\'ll proceed with the task using the helper scripts:\n' +
+ role: 'user',
+ content: 'A ' + CLI_NAME + ' tmux session has been started: `' + sessionName + '`\n\n' +
+ 'Use this session for all CLI interactions. The session name must be included in your final output.\n\n' +
+ 'Proceed with the task using the helper scripts:\n' +
'- Send commands: `./scripts/tmux/tmux-cli.sh send "' + sessionName + '" "..."`\n' +
'- Capture output: `./scripts/tmux/tmux-cli.sh capture "' + sessionName + '" --label "..."`\n' +
'- Stop when done: `./scripts/tmux/tmux-cli.sh stop "' + sessionName + '"`',
15 changes: 7 additions & 8 deletions .agents/codex-cli.ts
@@ -81,7 +81,7 @@ const baseDefinition = createCliAgent({
startCommand: 'codex -a never -s danger-full-access',
permissionNote:
'Always use `-a never -s danger-full-access` when testing to avoid approval prompts that would block automated tests.',
- model: 'anthropic/claude-opus-4.5',
+ model: 'anthropic/claude-opus-4.8',
extraInputParams: {
reviewType: {
type: 'string',
@@ -103,9 +103,8 @@ const definition: AgentDefinition = {
yield {
toolName: 'add_message',
input: {
- role: 'assistant',
- content: 'I\'ll first gather context and prepare before starting the ' + CLI_NAME + ' CLI session.\n\n' +
- 'Let me read relevant files and understand the task to provide better guidance to the CLI.',
+ role: 'user',
+ content: 'Before starting the ' + CLI_NAME + ' CLI session, gather context by reading relevant files and understanding the task to provide better guidance to the CLI.',
},
includeToolCall: false,
}
@@ -172,10 +171,10 @@ const definition: AgentDefinition = {
yield {
toolName: 'add_message',
input: {
- role: 'assistant',
- content: 'I have started a ' + CLI_NAME + ' tmux session: `' + sessionName + '`\n\n' +
- 'I will use this session for all CLI interactions. The session name must be included in my final output.\n\n' +
- 'Now I\'ll proceed with the task using the helper scripts:\n' +
+ role: 'user',
+ content: 'A ' + CLI_NAME + ' tmux session has been started: `' + sessionName + '`\n\n' +
+ 'Use this session for all CLI interactions. The session name must be included in your final output.\n\n' +
+ 'Proceed with the task using the helper scripts:\n' +
'- Send commands: `./scripts/tmux/tmux-cli.sh send "' + sessionName + '" "..."`\n' +
'- Capture output: `./scripts/tmux/tmux-cli.sh capture "' + sessionName + '" --label "..."`\n' +
'- Stop when done: `./scripts/tmux/tmux-cli.sh stop "' + sessionName + '"`',
15 changes: 7 additions & 8 deletions .agents/gemini-cli.ts
@@ -10,7 +10,7 @@ const baseDefinition = createCliAgent({
startCommand: 'gemini --yolo',
permissionNote:
'Always use `--yolo` (or `--approval-mode yolo`) when testing to auto-approve all tool actions and avoid prompts that would block automated tests.',
- model: 'anthropic/claude-opus-4.5',
+ model: 'anthropic/claude-opus-4.8',
cliSpecificDocs: `## Gemini CLI Commands

Gemini CLI uses slash commands for navigation:
@@ -29,9 +29,8 @@ const definition: AgentDefinition = {
yield {
toolName: 'add_message',
input: {
- role: 'assistant',
- content: 'I\'ll first gather context and prepare before starting the ' + CLI_NAME + ' CLI session.\n\n' +
- 'Let me read relevant files and understand the task to provide better guidance to the CLI.',
+ role: 'user',
+ content: 'Before starting the ' + CLI_NAME + ' CLI session, gather context by reading relevant files and understanding the task to provide better guidance to the CLI.',
},
includeToolCall: false,
}
@@ -98,10 +97,10 @@ const definition: AgentDefinition = {
yield {
toolName: 'add_message',
input: {
- role: 'assistant',
- content: 'I have started a ' + CLI_NAME + ' tmux session: `' + sessionName + '`\n\n' +
- 'I will use this session for all CLI interactions. The session name must be included in my final output.\n\n' +
- 'Now I\'ll proceed with the task using the helper scripts:\n' +
+ role: 'user',
+ content: 'A ' + CLI_NAME + ' tmux session has been started: `' + sessionName + '`\n\n' +
+ 'Use this session for all CLI interactions. The session name must be included in your final output.\n\n' +
+ 'Proceed with the task using the helper scripts:\n' +
'- Send commands: `./scripts/tmux/tmux-cli.sh send "' + sessionName + '" "..."`\n' +
'- Capture output: `./scripts/tmux/tmux-cli.sh capture "' + sessionName + '" --label "..."`\n' +
'- Stop when done: `./scripts/tmux/tmux-cli.sh stop "' + sessionName + '"`',
3 changes: 3 additions & 0 deletions .agents/lib/create-cli-agent.ts
@@ -43,6 +43,9 @@ export function createCliAgent(config: CliAgentConfig): AgentDefinition {
id: config.id,
displayName: config.displayName,
model: config.model,
+ providerOptions: {
+ ignore: ['Amazon Bedrock'],
+ },

spawnerPrompt: getSpawnerPrompt(config),

42 changes: 42 additions & 0 deletions .agents/sessions/03-02-14:07-chatgpt-oauth-direct/LESSONS.md
@@ -0,0 +1,42 @@
# LESSONS — ChatGPT OAuth Direct Routing

Session: `.agents/sessions/03-02-14:07-chatgpt-oauth-direct/`

## What went well
- Building this feature behind a strict feature flag (`CHATGPT_OAUTH_ENABLED=false`) reduced rollout risk while allowing full end-to-end wiring.
- Reusing the Claude OAuth architectural pattern (credentials helpers, refresh mutex, routing split) accelerated implementation without coupling the two providers.
- Splitting policy logic into `classifyChatGptOAuthStreamError` made fallback/auth/fail-fast behavior easier to test and reason about.
- Adding focused CLI tests for `/connect:chatgpt` gating and utility sanitization caught regression risk early.

## Current confidence / known gaps
- Runtime ChatGPT stream policy is **partially tested**: `classifyChatGptOAuthStreamError` is covered, but we do not yet have full behavioral tests for `promptAiSdkStream` recursion branches (actual fallback recursion and post-partial-output behavior).
- CLI routing coverage is strongest for **feature-flag OFF** paths; flag-ON auth-code routing should get explicit dedicated tests in a future pass.

## What was tricky
- The repo had unrelated local drift during implementation; explicit scope cleanup (`git checkout -- <unrelated files>`) was necessary to avoid accidental cross-feature commits.
- CLI module mocking is path-sensitive. Test modules under `cli/src/commands/__tests__` must mock sibling modules with correct relative paths (e.g. `../../state/chat-store`), or mocks silently fail.
- Over-mocking analytics can break transitive imports (`setAnalyticsErrorLogger` export expectations). A safe pattern is spreading real analytics exports and overriding only `trackEvent`.

## Unexpected behaviors / gotchas
- A staged unrelated file can survive despite working-tree revert; both staged and worktree states must be checked before final handoff.
- “Looks correct” tests can still miss runtime branches if they only validate helper classification, not route wiring; reviewer loops were useful to force coverage on practical paths.
- For OAuth tooling/scripts, sanitize error text aggressively. Returning status-only errors avoids accidental token payload leakage.

## Useful patterns discovered
- Keep direct-provider routing stream-only initially; explicitly forcing non-streaming/structured calls to backend avoided broad compatibility risk.
- Use deterministic model allowlist + normalization mapping in constants to avoid relying on provider-side parsing/errors for unsupported models.
- Treat temporary protocol validation scripts as first-class validation artifacts: they are valuable for real-account smoke checks without coupling to full CLI runtime.
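
The allowlist-plus-normalization pattern in the second bullet above can be sketched as follows. This is a minimal illustration, not the repository's actual constants: the model IDs and the names `ALLOWED_MODELS`, `MODEL_ALIASES`, and `normalizeModel` are hypothetical and chosen only to show the shape of the check.

```typescript
// Hypothetical allowlist of provider-native model IDs (placeholder values).
const ALLOWED_MODELS: ReadonlySet<string> = new Set([
  'gpt-example-large',
  'gpt-example-mini',
])

// Hypothetical mapping from OpenRouter-style IDs to provider-native IDs.
const MODEL_ALIASES: ReadonlyMap<string, string> = new Map([
  ['openai/gpt-example-large', 'gpt-example-large'],
  ['openai/gpt-example-mini', 'gpt-example-mini'],
])

type NormalizeResult =
  | { ok: true; model: string }
  | { ok: false; reason: string }

// Normalize first, then validate locally, so unsupported models are rejected
// before any request reaches the provider.
function normalizeModel(requested: string): NormalizeResult {
  const trimmed = requested.trim()
  const candidate = MODEL_ALIASES.get(trimmed) ?? trimmed
  if (!ALLOWED_MODELS.has(candidate)) {
    return { ok: false, reason: `Unsupported model for direct routing: ${trimmed}` }
  }
  return { ok: true, model: candidate }
}
```

Because both tables are plain constants, the same input always produces the same decision, which is what makes the allowlist easy to cover with unit tests.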

## Temporary script disposition
- `scripts/chatgpt-oauth-validate.ts` is currently kept as a **dev utility** for manual protocol revalidation while the feature remains experimental/off by default.
- Removal criteria: if protocol endpoints are either officially documented or the CLI flow gets stable automated integration coverage, this script can be retired.

## Repeatable security verification
- For redaction checks, run targeted searches against changed code/log handling paths for sensitive markers before handoff, e.g. `access_token`, `refresh_token`, and `Authorization: Bearer`.
- Keep surfaced token exchange errors status-only and avoid echoing raw provider response bodies.
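
A redaction check along these lines could also be automated. The sketch below assumes the three markers listed above are the ones to scan for; `findSensitiveMarkers` and `toStatusOnlyError` are hypothetical helper names, not functions from the repository.

```typescript
// Markers taken from the checklist above; extend as needed.
const SENSITIVE_MARKERS = ['access_token', 'refresh_token', 'Authorization: Bearer'] as const

// Returns every marker that appears in the text (case-insensitive).
function findSensitiveMarkers(text: string): string[] {
  const lower = text.toLowerCase()
  return SENSITIVE_MARKERS.filter((marker) => lower.includes(marker.toLowerCase()))
}

// Builds an error that carries only the HTTP status, never the response body.
function toStatusOnlyError(status: number): Error {
  return new Error(`Token exchange failed with status ${status}`)
}
```

A test could pass each surfaced error message or log line through `findSensitiveMarkers` and fail if the result is non-empty.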

## Follow-up improvements worth considering
- Add deeper runtime-behavior tests for `promptAiSdkStream` recursive fallback branches (not just policy classifier).
- Add explicit CLI test for flag-ON connect flow path once flag toggling is test-harness friendly.
- If feature graduates from experimental, add richer direct-path observability while preserving strict token redaction.
- Add periodic protocol drift checks (authorize/token/callback PKCE assumptions) before enabling the feature flag in production defaults.
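
For reference, the PKCE assumptions mentioned in the last bullet come down to the S256 challenge derivation from RFC 7636. The sketch below uses Node's `crypto` module; the endpoint, client id, and redirect URI passed to `buildAuthorizeUrl` are placeholders supplied by the caller, not the values used by the feature, and all function names are hypothetical.

```typescript
import { createHash, randomBytes } from 'node:crypto'

// 32 random bytes encode to a 43-character base64url verifier.
function createCodeVerifier(): string {
  return randomBytes(32).toString('base64url')
}

// S256 challenge: base64url(SHA-256(verifier)), without padding.
function deriveCodeChallenge(verifier: string): string {
  return createHash('sha256').update(verifier).digest('base64url')
}

// Assembles an authorization URL with the standard PKCE query parameters.
function buildAuthorizeUrl(params: {
  authorizeEndpoint: string
  clientId: string
  redirectUri: string
  state: string
  codeChallenge: string
}): string {
  const url = new URL(params.authorizeEndpoint)
  url.searchParams.set('response_type', 'code')
  url.searchParams.set('client_id', params.clientId)
  url.searchParams.set('redirect_uri', params.redirectUri)
  url.searchParams.set('state', params.state)
  url.searchParams.set('code_challenge', params.codeChallenge)
  url.searchParams.set('code_challenge_method', 'S256')
  return url.toString()
}
```

A drift check could regenerate the URL with the current constants and compare the parameter set against what the provider accepts during a manual smoke run.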
104 changes: 104 additions & 0 deletions .agents/sessions/03-02-14:07-chatgpt-oauth-direct/PLAN.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,104 @@
# PLAN — ChatGPT Subscription OAuth Direct Routing

## Implementation Steps
1. **Add shared ChatGPT OAuth constants**
- Create `common/src/constants/chatgpt-oauth.ts` with:
- feature flag (`CHATGPT_OAUTH_ENABLED=false`)
- endpoints/client id/redirect URI/env var
- model allowlist + normalization helpers
- Export through `common/src/constants/index.ts`.

2. **Build core OAuth utility + temporary protocol validation script (early gate)**
- Create `cli/src/utils/chatgpt-oauth.ts` with PKCE URL generation, browser-open helper, pasted code/URL parsing, token exchange helper.
- Create `scripts/chatgpt-oauth-validate.ts` to test OAuth URL generation + paste parsing + token exchange interaction.
- **Run this script before full integration** as go/no-go checkpoint for endpoint assumptions.

3. **Add SDK env + credential support**
- Extend `sdk/src/env.ts` with `getChatGptOAuthTokenFromEnv()`.
- Extend `sdk/src/credentials.ts` with `chatgptOAuth` schema and helpers:
- get/save/clear
- valid-check + refresh mutex
- get-valid-with-refresh
- Preserve all non-target credentials in read/write operations.

4. **Add CLI connect flow UI and command routing**
- Create `cli/src/components/chatgpt-connect-banner.tsx` with state machine + `handleChatGptAuthCode`.
- Update input modes (`connect:chatgpt`) and banner registry.
- Add `/connect:chatgpt` command + alias handling and slash command entry (feature-gated).
- Extend router to process pasted auth code in `connect:chatgpt` mode.
- Verify command visibility: hidden when flag OFF, present when flag ON.

5. **Implement direct routing primitives in model-provider (decomposed)**
- 5.1 Add ChatGPT direct eligibility checks (feature flag + creds + model scope + skip flag + rate-limit cache state).
- 5.2 Add model normalization + prevalidation helpers (OpenRouter-style -> provider-native).
- 5.3 Add strict payload sanitization helper for direct requests.
- 5.4 Add ChatGPT OAuth direct model construction using OpenAI-compatible transport.
- 5.5 Add ChatGPT rate-limit cache helpers (parallel to Claude cache pattern).
- Keep Claude OAuth path unchanged.

6. **Update stream execution + fallback/error policy**
- Extend `sdk/src/impl/llm.ts` to:
- recognize ChatGPT direct route usage
- emit ChatGPT OAuth analytics
- fallback only on rate-limit errors
- fail with reconnect guidance on auth errors
- fail fast for all other direct errors
- skip cost accounting for successful ChatGPT direct requests
- avoid fallback once output has already streamed

7. **Wire startup refresh + CLI status surfacing**
- Update `cli/src/init/init-app.ts` for background ChatGPT OAuth credential refresh when enabled.
- Update `cli/src/chat.tsx`, `cli/src/components/bottom-status-line.tsx`, and `cli/src/components/usage-banner.tsx` to surface ChatGPT connection/active status.

8. **Add analytics constants + SDK exports**
- Extend `common/src/constants/analytics-events.ts` with ChatGPT OAuth request/rate-limit/auth-error events.
- Ensure SDK exports newly needed helper(s) in `sdk/src/index.ts`.

9. **Add/adjust tests (explicit matrix)**
- SDK credentials tests:
- env precedence
- persisted read/write/clear
- refresh success/failure + mutex
- Model-provider tests:
- rate-limit cache lifecycle
- allowlist prevalidation + unsupported-model error
- normalization behavior for mapped/unknown variants
- LLM routing/fallback tests (targeted):
- 429 fallback
- 401/403 no-fallback + reconnect path
- timeout/5xx fail-fast
- no fallback after content emitted
- CLI tests/wiring checks:
- command/mode visibility by feature flag
- connect mode routing and handler call.
- Non-streaming/structured guard check:
- confirm backend-only behavior unchanged.

10. **Validation and cleanup decision for temporary script**
- Run targeted tests/typechecks for touched packages.
- Run OAuth validation script in manual mode (with your account interaction if needed).
- Decide and apply final disposition of temporary script:
- keep as dev utility, or
- remove before finalization.

11. **Security/redaction verification**
- Validate no token values are logged in direct feature code paths.
- Grep/check for accidental logging of authorization headers, token payload fields, or raw callback query params.
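
The "valid-check + refresh mutex" item in Step 3 could be sketched as below. This is an assumption-laden illustration: the `OAuthCredentials` shape, the 60-second skew, and the function names are hypothetical, and the real helpers in `sdk/src/credentials.ts` may be structured differently.

```typescript
// Hypothetical credential shape (expiresAt is a Unix timestamp in milliseconds).
type OAuthCredentials = { accessToken: string; expiresAt: number }

// Module-level slot holding the refresh that is currently in flight, if any.
let inFlightRefresh: Promise<OAuthCredentials> | null = null

// Concurrent callers share one refresh instead of each hitting the token endpoint.
function refreshOnce(
  doRefresh: () => Promise<OAuthCredentials>,
): Promise<OAuthCredentials> {
  if (inFlightRefresh !== null) return inFlightRefresh
  const pending = doRefresh().finally(() => {
    // Clear the slot on success or failure so a later call can retry.
    inFlightRefresh = null
  })
  inFlightRefresh = pending
  return pending
}

// Treat credentials as expired slightly early to avoid racing the expiry.
function isStillValid(creds: OAuthCredentials, nowMs: number, skewMs = 60_000): boolean {
  return creds.expiresAt - skewMs > nowMs
}

async function getValidCredentials(
  current: OAuthCredentials,
  doRefresh: () => Promise<OAuthCredentials>,
  nowMs: number = Date.now(),
): Promise<OAuthCredentials> {
  return isStillValid(current, nowMs) ? current : refreshOnce(doRefresh)
}
```

Sharing the pending promise, rather than guarding with a boolean flag, means every waiting caller receives the refreshed credentials or the same rejection.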

## Dependencies / Ordering
- Step 1 must be first.
- Step 2 must run before deep integration (early protocol validation gate).
- Step 3 precedes Steps 5–7.
- Step 4 can run in parallel with Step 3 after constants/util setup.
- Step 5 must precede Step 6.
- Step 8 can be implemented alongside Steps 5–6 but must complete before final validation.
- Step 9 follows core implementation completion.
- Steps 10–11 are final validation/cleanup/security passes.

## Risk Areas
1. **Unofficial OAuth contract drift** — endpoint/field incompatibility can break token exchange.
2. **Direct payload compatibility** — strict sanitization must retain required OpenAI fields.
3. **Error classification correctness** — misclassification can violate requested fallback policy.
4. **Model normalization accuracy** — wrong mapping yields avoidable provider failures.
5. **Token redaction** — avoid leakage in logs, errors, or analytics payloads.
6. **Streaming boundary behavior** — fallback must not happen after partial output is emitted.
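
Risks 3 and 6 both concern the fallback policy from Step 6, which is small enough to express as a pure function. The sketch below is not the signature of `classifyChatGptOAuthStreamError`; the name `classifyDirectError`, its parameters, and the choice to fail fast on a rate limit after partial output are assumptions made for illustration.

```typescript
type DirectErrorAction = 'fallback' | 'reconnect' | 'fail-fast'

// statusCode is undefined for errors without an HTTP status (e.g. timeouts).
function classifyDirectError(
  statusCode: number | undefined,
  hasStreamedOutput: boolean,
): DirectErrorAction {
  // Auth errors never fall back; the user is asked to reconnect.
  if (statusCode === 401 || statusCode === 403) return 'reconnect'
  // Rate limits fall back to the backend, but only before any output was emitted.
  if (statusCode === 429 && !hasStreamedOutput) return 'fallback'
  // Everything else (timeouts, 5xx, rate limit after partial output) fails fast.
  return 'fail-fast'
}
```

Keeping the decision free of I/O lets the test matrix in Step 9 (429, 401/403, timeout/5xx, post-output) run as plain table-driven assertions.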