feat: instance meter query #6380

Merged: Flo4604 merged 2 commits into main from feat/meter-query on Jun 12, 2026

@Flo4604 Flo4604 commented Jun 8, 2026

Member

ClickHouse meter aggregation query

The foundation of the Deploy metering pipeline: clickhouse.GetInstanceMeterUsage computes billable usage from Heimdall checkpoint data (instance_checkpoints) over a time window, returning one row per (workspace, project, environment, resource) for the four meters (CPU, memory, egress, and disk).

  • CPU and egress are monotonic counters — each consecutive sample pair contributes its non-negative delta, which telescopes to last − first when there are no gaps.
  • Memory and disk are gauges — each pair contributes value × dt, a left-Riemann integral over time.
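
As a rough illustration of the two rules above, here is a minimal Go sketch. It is not the query from this PR, which runs as ClickHouse SQL; the `Sample` type and both function names are hypothetical, and the gauge sum simply uses the earlier sample of each pair.

```go
package main

import "fmt"

// Sample is a hypothetical checkpoint: a timestamp in milliseconds, a
// monotonic counter reading, and a gauge reading.
type Sample struct {
	TimeMs  int64
	Counter int64   // e.g. cumulative CPU time or egress bytes
	Gauge   float64 // e.g. memory bytes in use
}

// counterUsage sums the non-negative delta of each consecutive pair.
// With no gaps or resets this telescopes to last - first.
func counterUsage(s []Sample) int64 {
	var total int64
	for i := 0; i+1 < len(s); i++ {
		if d := s[i+1].Counter - s[i].Counter; d > 0 {
			total += d
		}
	}
	return total
}

// gaugeUsage integrates the gauge with a left-Riemann sum: each pair
// contributes the earlier sample's value times the elapsed milliseconds.
func gaugeUsage(s []Sample) float64 {
	var total float64
	for i := 0; i+1 < len(s); i++ {
		dt := float64(s[i+1].TimeMs - s[i].TimeMs)
		total += s[i].Gauge * dt
	}
	return total
}

func main() {
	samples := []Sample{
		{TimeMs: 0, Counter: 100, Gauge: 10},
		{TimeMs: 15000, Counter: 130, Gauge: 20},
		{TimeMs: 30000, Counter: 175, Gauge: 20},
	}
	fmt.Println(counterUsage(samples)) // 75, the same as last - first
	fmt.Println(gaugeUsage(samples))   // 450000 = 10*15000 + 20*15000
}
```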

Correctness properties that matter for billing real money:

  • Every meter is derived from consecutive checkpoint pairs within a single container_uid (pod UID + restart count). A container restart resets the cgroup/network counters, so we never diff across that boundary.
  • Sample pairs more than ~2 minutes apart are dropped, so an agent outage under-counts rather than over-charges.
  • The query reads the instance_checkpoints view (with FINAL applied), so un-merged duplicate inserts can't double-count the integrals.
  • Memory/disk products accumulate in Float64 — byte-milliseconds over a month overflow Int64.

Exposed on *clickhouse.Client, the Querier interface, and the noop. Units (CPU-seconds, GiB-hours, bytes) are converted to the Stripe meter units by the consumer in the stacked worker PR (#6381).

Closes ENG-2863.

How to test

mise exec -- bazel test //pkg/clickhouse:clickhouse_test --test_filter=TestGetInstanceMeterUsage

The integration test runs against a real ClickHouse container and covers:

  • counter deltas + time-integration over a clean 15s-spaced series,
  • the 2-minute gap drop — the counter keeps climbing across a simulated outage, but the spanning pair is excluded,
  • restart-boundary isolation — counters reset on a new container_uid; totals sum per container life and never diff across the boundary,
  • cross-workspace aggregation (empty WorkspaceID filter), and
  • an empty result outside the query window.

The billing formulas are also documented inline in pkg/clickhouse/schema/025_instance_checkpoints_v1.sql for reference.

Flo4604 commented Jun 8, 2026

Member Author

This stack of pull requests is managed by jj-ryu.

@pullfrog pullfrog Bot left a comment

Contributor

✅ No new issues found.

Reviewed changes — adds GetInstanceMeterUsage to the ClickHouse client, a billing query that computes deploy meter usage (CPU, memory, egress, disk) from Heimdall checkpoint data using window-function pair integration over adjustable time windows.

  • Add GetInstanceMeterUsage query with leadInFrame pair integration — counter deltas for CPU/egress, gauge × interval for memory/disk, gap-drop at 2 minutes, conservative least() for resize safety.
  • Add GetInstanceMeterUsageRequest / InstanceMeterUsage types with ch: struct tags for ClickHouse-go deserialization and Float64 accumulators to prevent int64 overflow on large billing windows.
  • Add integration tests with real ClickHouse covering counter deltas, outage-gap dropping, restart-boundary isolation, cross-workspace aggregation, and empty-window returning zero rows.
  • Add no-op implementation and Querier interface entry matching existing patterns.

linear-code Bot commented Jun 8, 2026

Contributor

ENG-2863

@pullfrog pullfrog Bot left a comment

Contributor

✅ No new issues found.

Reviewed changes — the delta since the prior pullfrog review (72885e6) is entirely a dashboard settings UI refactor; the GetInstanceMeterUsage ClickHouse query is unchanged. This run reviewed the new @unkey/ui page-layout primitives, the settings-page migration onto them, and the supporting design-app docs.

  • Add @unkey/ui page-layout primitives — new PageContainer (compound, width="default"|"full" propagated via a group/page data attribute), PageBody, the PageHeader* family, and a cva-based SecondaryNav* family under web/internal/ui/src/components/page-header/, re-exported from the package index.
  • Migrate settings pages to the primitives: general, team, root-keys, and billing (new billing-container.tsx) drop WorkspaceNavbar/SettingsShell for the new layout; a new settings/layout.tsx renders the SecondaryNav rail.
  • Remove the settings sub-nav from the global sidebar: buildSettingsLinks is deleted from leaves.ts and case "settings" in sidebar-body.tsx now returns buildWorkspaceSections; deleted settings/workspace-navbar.tsx and settings/root-keys/navigation.tsx.
  • Design app additions: Preview gains a bleed prop, copy-button.ts gains an inline option, and a new patterns/layout MDX page with example sources documents the primitives.

I verified the load-bearing parts of the refactor: general/page.tsx drops the old workspace && guard but useWorkspaceNavigation() redirects on a null workspace so the access is safe; buildSettingsLinks, WorkspaceNavbar, the settings Navigation, and SettingsShell have no remaining importers in settings; and the billing page never double-wraps BillingContainer since its branches and Client are mutually exclusive.

Comment thread pkg/clickhouse/instance_meter.go

@chronark chronark left a comment

Collaborator

just nits really, looks good

Comment thread pkg/clickhouse/instance_meter.go Outdated
Comment thread pkg/clickhouse/instance_meter_test.go Outdated

@pullfrog pullfrog Bot left a comment

Contributor

✅ No new issues found.

Reviewed changes — the delta since the prior pullfrog review (f3f4377) addresses the two nits from chronark's approval. Both land in pkg/clickhouse and leave the query's billing semantics unchanged.

  • Switch the gap constant to a time.Duration: maxSampleGapMillis int64 = 2*60*1000 becomes maxSampleGap = 2 * time.Minute, converted with .Milliseconds() at the query-param boundary; doc comments updated to drop the Millis suffix. The wire value is still 120000ms, so query behavior is identical.
  • Derive the test anchor from time.Now() — the integration test's base moves from a hardcoded 2026-06-01 to time.Now().UTC().Truncate(24h).Add(-24h), keeping inserted rows inside the table's 95-day TTL so ClickHouse can't silently drop the parts before the query runs. This fixes a latent flakiness that would have surfaced once the fixed date aged past the TTL.

Flo4604 added 2 commits June 11, 2026 18:28
…27.dahlia)

Sits at the bottom of the deploy-billing stack so every change above it is
written and tested against one SDK/API version. Full breaking-change audit in
ENG-2895; this change migrates the code that exists on main:

- API version pins -> 2026-05-27.dahlia (lib/stripe.ts, webhook route).
- New lib/stripe/invoiceCompat.ts: dual-shape readers for invoice
  subscription / payment intent / line price / line proration and the
  per-item subscription period (basil reshaped all of these). Webhook payload
  shapes follow the ENDPOINT's pinned API version, not the SDK's, so the
  helpers accept both shapes; after deploy, bump the webhook endpoint to
  2026-05-27.dahlia in Stripe, then the legacy halves can be removed.
- paymentUtils reads all go through the compat helpers.
- isAutomatedBillingRenewal compares only fields present in
  previous_attributes: basil+ endpoints report renewals as per-item
  current_period_* diffs, which the old comparison misread as plan changes.
- createSubscription pins billing_mode classic (clover defaults new
  subscriptions to flexible, which itemizes prorations differently).
- handleStripeError: v22 exports error classes as values, not types.

Stack changes above this one carry their own dahlia-isms: subscribeDeploy
billing_mode pin and price.product reads (deploy-subscribe-flows),
invoices.createPreview (deploy-project-gate), credit grants (ENG-2870).

ENG-2895
Base automatically changed from eng-2895-upgrade-stripe-node-to-v22-api-2026-05-27dahlia-and-migrate to main June 12, 2026 16:13
@Flo4604 Flo4604 added this pull request to the merge queue Jun 12, 2026
Merged via the queue into main with commit 4d02635 Jun 12, 2026
15 checks passed
@Flo4604 Flo4604 deleted the feat/meter-query branch June 12, 2026 16:27