Simplify and tighten the @perf_event API#23

Merged: congwang-mk merged 5 commits into main from perf-event-fold-type on Jun 26, 2026

@congwang-mk

Contributor

Summary

Reworks the @perf_event user-facing API for safety, consistency, and a smaller compiler footprint. Five focused commits:

  1. Fold perf_type + perf_config into one event constant. The two old fields had to agree (every config belongs to exactly one type — cache_misses is always hardware), and mismatching them was a silent footgun. A single perf_event enum now packs the perf_event_attr.type tag in the high 32 bits and the config in the low 32 bits, so naming the event fixes both.

    // before
    attach(prog, perf_options { perf_type: perf_type_hardware, perf_config: cache_misses }, 0)
    // after
    attach(prog, perf_options { event: cache_misses }, 0)
    
  2. Single-source the encoding in stdlib. The PMU-slot analysis had hard-coded the bit layout (>> 32, type tags) and the enum name in the type checker, duplicating what stdlib defined. The layout now lives only in stdlib (perf_event_pack); the type checker resolves the constant generically.

  3. Remove the host-dependent static perf-group PMU-budget check. It rejected groups needing more PMU counters than a limit read from the build host's sysfs (or a guessed default of 4) — the wrong layer for a host-dependent resource check, and wrong under cross-compilation. The kernel already enforces group sizing authoritatively at perf_event_open(2) on the real host, and the generated runtime reports it. This removes ~240 lines and all perf-group/PMU/encoding knowledge from type_checker.ml (perf references: 55 → 11, the survivors being the same uniform per-program-type handling @xdp/@kprobe get).

  4. Drop the raw group_fd field from perf_options. It exposed a kernel ABI integer alongside the typed group handle. Since every perf fd comes from attach() as a perf_attachment, the raw form was redundant; grouping is now expressed only via group. Passing group_fd: now fails type checking as an unknown field.

  5. snake_case the perf types. PerfAttachment → perf_attachment, PerfRead → perf_read, so the whole perf surface matches perf_options and the rest of the language.
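
To make the encoding in items 1 and 2 concrete, here is a minimal C sketch of a type-in-the-high-half, config-in-the-low-half layout. It is not the project's stdlib or codegen: every `EX_`/`ex_` name is hypothetical, and the numeric constants are the kernel UAPI values from `<linux/perf_event.h>`, restated so the sketch compiles on its own.

```c
#include <assert.h>
#include <stdint.h>

/* Kernel UAPI values from <linux/perf_event.h>, restated so the sketch is
 * self-contained. All EX_/ex_ names are hypothetical, not KernelScript's. */
enum { EX_PERF_TYPE_HARDWARE = 0, EX_PERF_TYPE_SOFTWARE = 1 };
enum { EX_PERF_COUNT_HW_CACHE_MISSES = 3, EX_PERF_COUNT_SW_PAGE_FAULTS = 2 };

/* Pack: perf_event_attr.type in the high 32 bits, config in the low 32. */
static inline uint64_t ex_perf_event_pack(uint32_t type, uint32_t config)
{
    return ((uint64_t)type << 32) | (uint64_t)config;
}

/* Decode the two halves again, as codegen must before filling the attr. */
static inline uint32_t ex_perf_event_type(uint64_t ev)
{
    return (uint32_t)(ev >> 32);
}

static inline uint32_t ex_perf_event_config(uint64_t ev)
{
    return (uint32_t)(ev & 0xFFFFFFFFu);
}

/* Usage: naming the event fixes both fields, so they cannot disagree. */
static inline void ex_roundtrip_check(void)
{
    uint64_t cache_misses = ex_perf_event_pack(EX_PERF_TYPE_HARDWARE,
                                               EX_PERF_COUNT_HW_CACHE_MISSES);
    uint64_t page_faults = ex_perf_event_pack(EX_PERF_TYPE_SOFTWARE,
                                              EX_PERF_COUNT_SW_PAGE_FAULTS);

    assert(ex_perf_event_type(cache_misses) == EX_PERF_TYPE_HARDWARE);
    assert(ex_perf_event_config(cache_misses) == EX_PERF_COUNT_HW_CACHE_MISSES);
    assert(ex_perf_event_type(page_faults) == EX_PERF_TYPE_SOFTWARE);
    assert(ex_perf_event_config(page_faults) == EX_PERF_COUNT_SW_PAGE_FAULTS);

    /* Hardware type is 0, so its events fit in 32 bits; software events do not. */
    assert(cache_misses <= UINT32_MAX);
    assert(page_faults > UINT32_MAX);
}
```

Under these assumptions a software event is the smallest case that needs the full 64-bit width, which is why a software-event example is a useful check of the decode path.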

Behavioral notes

  • Capability trimmed (deliberately): the unified event enum currently covers the named hardware/software events (everything examples/tests used). The old tracepoint/hw_cache/raw/breakpoint type tags and arbitrary numeric configs are not expressible; they can return as encoded enum entries or a raw_event(type, config) constructor if needed.
  • No safety regression from removing the static check: oversized hardware groups fail at perf_event_open(2); >16-member software groups truncate-with-warning in the bounded group read (no overflow); group cycles are already prevented by use-before-definition; non-leader group references are rejected by the kernel.
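
The truncate-with-warning behaviour can be pictured with a small C sketch. It is an illustration under stated assumptions, not the generated runtime: it assumes the buffer uses the plain `PERF_FORMAT_GROUP` layout (a member count followed by one value per member), it takes the cap of 16 from the note above, and all `ex_` names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define EX_MAX_GROUP_MEMBERS 16 /* cap taken from the behavioural note */

/* Hypothetical fixed-size result; not the project's perf_read type. */
struct ex_group_counts {
    uint64_t nr;                           /* members actually stored */
    uint64_t values[EX_MAX_GROUP_MEMBERS];
    int truncated;                         /* 1 if members were dropped */
};

/* Assumed layout of raw: raw[0] = member count, raw[1..] = counter values. */
static inline struct ex_group_counts ex_parse_group_read(const uint64_t *raw)
{
    struct ex_group_counts out = {0};
    uint64_t reported = raw[0];

    out.nr = reported;
    if (reported > EX_MAX_GROUP_MEMBERS) {
        fprintf(stderr, "warning: group has %llu members, keeping first %d\n",
                (unsigned long long)reported, EX_MAX_GROUP_MEMBERS);
        out.nr = EX_MAX_GROUP_MEMBERS;     /* clamp before copying */
        out.truncated = 1;
    }
    for (uint64_t i = 0; i < out.nr; i++)
        out.values[i] = raw[1 + i];        /* never writes past values[15] */
    return out;
}

/* Usage: an 18-member group is clamped to 16 stored values. */
static inline void ex_group_read_demo(void)
{
    uint64_t raw[1 + 18] = { 18 };
    for (uint64_t i = 0; i < 18; i++)
        raw[1 + i] = 100 + i;

    struct ex_group_counts c = ex_parse_group_read(raw);
    assert(c.nr == 16 && c.truncated == 1);
    assert(c.values[15] == 115);
}
```

Clamping the count before the copy loop is what keeps the read bounded: the destination array is never indexed past its last slot, whatever count the source reports.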

Verification

  • Full test suite green (dune runtest, exit 0).
  • Both perf examples (perf_cache_miss, perf_page_fault) compile and the generated C builds to working binaries — including the software-event path that crosses the 32-bit encoding boundary.
  • Docs (README, SPEC, BUILTINS) updated throughout.

Net: +180 / −633 lines.

🤖 Generated with Claude Code

perf_options previously required both perf_type and perf_config, two
fields that had to agree (every config belongs to exactly one type, e.g.
cache_misses is always hardware). Mismatching them was a silent footgun.

Replace the perf_type / perf_hw_config / perf_sw_config enums with one
perf_event enum whose value packs the perf_event_attr.type tag in the
high 32 bits and the config in the low 32 bits. perf_options now takes a
single required `event` field; codegen decodes the two halves via
KS_PERF_EVENT_TYPE / KS_PERF_EVENT_CONFIG when filling perf_event_attr.

This drops the tracepoint/hw_cache/raw/breakpoint type tags and arbitrary
numeric configs from the public surface, which were unused by examples
and tests; they can return as encoded entries or a raw-event constructor.

Examples, docstring, tests, and docs (README/SPEC/BUILTINS) updated. Full
test suite passes; both perf examples build to working binaries.

Signed-off-by: Cong Wang <cwang@multikernel.io>

The PMU-slot analysis in the type checker hard-coded the perf_event bit
layout (value >> 32, type tags 1/2) and the enum name "perf_event",
duplicating the packing that stdlib already defines in the enum values.

Move the layout to stdlib: the perf_event enum is now built from
perf_event_pack, and perf_event_consumes_pmu_slot decodes with the same
named type constants. The type checker no longer knows the bit layout or
the enum name — it resolves the event constant generically across enum
definitions and asks stdlib whether it consumes a PMU slot. Adding a new
event to the stdlib enum now classifies correctly with no type_checker
change.

Signed-off-by: Cong Wang <cwang@multikernel.io>

The type checker rejected perf event groups that needed more PMU counter
slots than a limit read from the *build* host's sysfs (or an env var, or
a guessed default of 4). This was the wrong layer: it's a host-dependent
resource check, not a type rule, and the build host may not match the
deploy host. The kernel already enforces group sizing authoritatively at
perf_event_open(2) on the real host, and the generated ks_open_perf_event
reports that failure at runtime.
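
As a rough picture of runtime reporting, the Linux-only C sketch below calls
perf_event_open(2) directly and prints the kernel's error. It is not the
generated ks_open_perf_event: the `ex_` names and the message format are
hypothetical, and the only real interfaces used are the syscall and
`struct perf_event_attr` from `<linux/perf_event.h>`.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <linux/perf_event.h>
#include <stdio.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical wrapper: open one event for the calling thread on any CPU
 * and report whatever the kernel rejects, instead of checking statically. */
static inline int ex_open_perf_event(struct perf_event_attr *attr, int group_fd)
{
    /* pid = 0 (this thread), cpu = -1 (any CPU), flags = 0 */
    long fd = syscall(SYS_perf_event_open, attr, 0, -1, group_fd, 0UL);

    if (fd < 0) {
        fprintf(stderr,
                "perf_event_open(type=%u, config=%llu, group_fd=%d): %s\n",
                attr->type, (unsigned long long)attr->config, group_fd,
                strerror(errno));
        return -1;
    }
    return (int)fd;
}

/* Usage: a group_fd that is not an open descriptor is rejected by the
 * kernel (or by the sandbox), and the wrapper surfaces that as -1. */
static inline void ex_bad_group_demo(void)
{
    struct perf_event_attr attr;

    memset(&attr, 0, sizeof attr);
    attr.size = sizeof attr;
    attr.type = PERF_TYPE_SOFTWARE;
    attr.config = PERF_COUNT_SW_PAGE_FAULTS;
    attr.exclude_kernel = 1;

    assert(ex_open_perf_event(&attr, 1 << 30) == -1);
}
```

The point of the sketch is the layering: the decision is made by the kernel
on the host that actually runs the program, and user space only relays it.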

Deleting it removes ~240 lines and all the perf-group/PMU/event-encoding
knowledge from type_checker.ml — perf_type references drop from 55 to 11,
and those 11 are the same uniform per-program-type handling @xdp/@kprobe
get (attribute->program-type, signature check, attach() overload). The
type checker no longer understands perf groups at all.

Also drops the now-orphaned stdlib helpers (perf_event_consumes_pmu_slot,
perf_type_tracepoint) and the three tests that asserted the static check.
Docs updated to describe runtime/kernel enforcement.

Signed-off-by: Cong Wang <cwang@multikernel.io>

perf_options exposed two ways to join a perf group: the typed `group`
(a PerfAttachment handle) and the raw `group_fd: cache.perf_fd` form,
which leaked a kernel ABI integer into the user-facing struct. In
KernelScript every perf fd comes from attach() as a PerfAttachment, so
the raw form was redundant.

Remove the public `group_fd` field (and its default). Grouping is now
expressed only via `group`. Internally, ks_open_perf_event still resolves
a leader fd from the handle and defaults to -1 (no group); the generated
runtime keeps its group_fd plumbing. Passing `group_fd:` now fails type
checking as an unknown field.
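
The leader-fd resolution described above can be sketched in a few lines of C.
This is an assumption-laden illustration rather than the generated code: the
struct layout and `ex_` names are hypothetical, and only the `perf_fd` member
and the -1 default come from the text.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical C view of the typed handle; only perf_fd is implied above. */
struct ex_perf_attachment {
    int perf_fd;
};

/* No handle means "no group": perf_event_open(2) takes group_fd = -1 to
 * create a new group leader. Otherwise join the leader's group. */
static inline int ex_resolve_group_fd(const struct ex_perf_attachment *group)
{
    return group ? group->perf_fd : -1;
}

/* Usage: user code only ever passes a handle, never a raw integer. */
static inline void ex_group_demo(void)
{
    struct ex_perf_attachment leader = { .perf_fd = 7 };

    assert(ex_resolve_group_fd(NULL) == -1);
    assert(ex_resolve_group_fd(&leader) == 7);
}
```

Keeping the integer behind the handle means the raw fd still exists in the
runtime plumbing, but it is no longer something user code can spell.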

Tests, examples, and docs updated; the test for the raw group_fd form is
removed and its detach-cascade assertions folded into the group-handle
test.

Signed-off-by: Cong Wang <cwang@multikernel.io>

The three user-facing perf types were inconsistently cased: perf_options
(snake) alongside PerfAttachment and PerfRead (Pascal). Rename the latter
two to perf_attachment and perf_read so the whole perf surface is
snake_case, matching the rest of the language's builtin types.

KernelScript type name and generated C type name stay identical (the
generic IRStruct -> "struct <name>" mapping handles it), so no codegen
special-casing is needed; neither name collides with kernel structs.
Examples, tests, and docs updated.

Signed-off-by: Cong Wang <cwang@multikernel.io>

congwang-mk merged commit 68bbe2c into main on Jun 26, 2026
1 check passed
congwang-mk deleted the perf-event-fold-type branch on June 26, 2026 18:29