Simplify and tighten the @perf_event API #23
Merged
perf_options previously required both perf_type and perf_config, two fields that had to agree (every config belongs to exactly one type, e.g. cache_misses is always hardware). Mismatching them was a silent footgun.

Replace the perf_type / perf_hw_config / perf_sw_config enums with one perf_event enum whose value packs the perf_event_attr.type tag in the high 32 bits and the config in the low 32 bits. perf_options now takes a single required `event` field; codegen decodes the two halves via KS_PERF_EVENT_TYPE / KS_PERF_EVENT_CONFIG when filling perf_event_attr.

This drops the tracepoint/hw_cache/raw/breakpoint type tags and arbitrary numeric configs from the public surface, which were unused by examples and tests; they can return as encoded entries or a raw-event constructor.

Examples, docstring, tests, and docs (README/SPEC/BUILTINS) updated. Full test suite passes; both perf examples build to working binaries.

Signed-off-by: Cong Wang <cwang@multikernel.io>
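To make the packing concrete, here is a minimal C sketch of the encode/decode round trip. The macro names KS_PERF_EVENT_TYPE and KS_PERF_EVENT_CONFIG come from the commit message, but their bodies are assumed from the described layout and are not copied from the generated code. `demo_perf_event_pack`, `struct demo_attr`, and the `DEMO_*` constants are hypothetical stand-ins; the numeric values mirror PERF_TYPE_* and PERF_COUNT_* in `<linux/perf_event.h>`.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names; values mirror PERF_TYPE_* / PERF_COUNT_* in <linux/perf_event.h>. */
enum { DEMO_TYPE_HARDWARE = 0, DEMO_TYPE_SOFTWARE = 1 };
enum { DEMO_HW_CACHE_MISSES = 3, DEMO_SW_PAGE_FAULTS = 2 };

/* Assumed bodies: type tag in the high 32 bits, config in the low 32 bits. */
#define KS_PERF_EVENT_TYPE(ev)   ((uint32_t)((uint64_t)(ev) >> 32))
#define KS_PERF_EVENT_CONFIG(ev) ((uint32_t)((uint64_t)(ev) & 0xffffffffu))

/* Hypothetical packer: combine a type tag and a config into one event value. */
static uint64_t demo_perf_event_pack(uint32_t type, uint64_t config)
{
    assert(config <= UINT32_MAX); /* the config must fit in the low half */
    return ((uint64_t)type << 32) | config;
}

/* Hypothetical stand-in for the two perf_event_attr fields that matter here. */
struct demo_attr {
    uint32_t type;
    uint64_t config;
};

/* Decode both halves of a packed event when filling the attr. */
static struct demo_attr demo_fill_attr(uint64_t event)
{
    struct demo_attr attr;
    attr.type = KS_PERF_EVENT_TYPE(event);
    attr.config = KS_PERF_EVENT_CONFIG(event);
    return attr;
}
```

Under this layout a hardware event such as cache misses packs to the plain value 3, while a software event such as page faults packs to 0x100000002, which no longer fits in 32 bits. Because the type tag travels inside the event value, a caller cannot pair a config with the wrong type.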
The PMU-slot analysis in the type checker hard-coded the perf_event bit layout (value >> 32, type tags 1/2) and the enum name "perf_event", duplicating the packing that stdlib already defines in the enum values.

Move the layout to stdlib: the perf_event enum is now built from perf_event_pack, and perf_event_consumes_pmu_slot decodes with the same named type constants. The type checker no longer knows the bit layout or the enum name; it resolves the event constant generically across enum definitions and asks stdlib whether it consumes a PMU slot. Adding a new event to the stdlib enum now classifies correctly with no type_checker change.

Signed-off-by: Cong Wang <cwang@multikernel.io>
The type checker rejected perf event groups that needed more PMU counter slots than a limit read from the *build* host's sysfs (or an env var, or a guessed default of 4). This was the wrong layer: it's a host-dependent resource check, not a type rule, and the build host may not match the deploy host. The kernel already enforces group sizing authoritatively at perf_event_open(2) on the real host, and the generated ks_open_perf_event reports that failure at runtime.

Deleting it removes ~240 lines and all the perf-group/PMU/event-encoding knowledge from type_checker.ml. perf_type references drop from 55 to 11, and those 11 are the same uniform per-program-type handling @xdp/@kprobe get (attribute->program-type, signature check, attach() overload). The type checker no longer understands perf groups at all.

Also drops the now-orphaned stdlib helpers (perf_event_consumes_pmu_slot, perf_type_tracepoint) and the three tests that asserted the static check. Docs updated to describe runtime/kernel enforcement.

Signed-off-by: Cong Wang <cwang@multikernel.io>
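As a rough illustration of runtime enforcement, the sketch below opens an event with the Linux perf_event_open syscall and reports whatever error the kernel returns. It is not the generated ks_open_perf_event, whose body this PR does not show. `demo_open_perf_event` and its message format are hypothetical, and the block assumes a Linux host with the kernel uapi headers installed.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <linux/perf_event.h>

/* Hypothetical helper: open one event, optionally inside the group led by group_fd.
 * Returns the new fd, or -1 after printing the kernel's error. */
static int demo_open_perf_event(uint32_t type, uint64_t config, int group_fd)
{
    struct perf_event_attr attr;

    assert(group_fd >= -1); /* -1 means "no group" */

    memset(&attr, 0, sizeof(attr));
    attr.size = sizeof(attr);
    attr.type = type;
    attr.config = config;
    attr.disabled = 1; /* start stopped; the caller enables it later */

    /* pid = 0: calling thread; cpu = -1: any CPU; flags = 0. */
    long fd = syscall(SYS_perf_event_open, &attr, 0, -1, group_fd, 0UL);
    if (fd < 0) {
        /* The kernel on the real host is the authority; just surface its verdict. */
        fprintf(stderr,
                "perf_event_open(type=%u, config=%llu, group_fd=%d) failed: %s\n",
                (unsigned)type, (unsigned long long)config, group_fd,
                strerror(errno));
        return -1;
    }
    return (int)fd;
}
```

The point of the design is that no limit is baked in at build time: whether a group fits is decided by the kernel on the machine that actually runs the program, and the caller only has to handle a failed open.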
perf_options exposed two ways to join a perf group: the typed `group` (a PerfAttachment handle) and the raw `group_fd: cache.perf_fd` form, which leaked a kernel ABI integer into the user-facing struct. In KernelScript every perf fd comes from attach() as a PerfAttachment, so the raw form was redundant.

Remove the public `group_fd` field (and its default). Grouping is now expressed only via `group`. Internally, ks_open_perf_event still resolves a leader fd from the handle and defaults to -1 (no group); the generated runtime keeps its group_fd plumbing. Passing `group_fd:` now fails type checking as an unknown field.

Tests, examples, and docs updated; the test for the raw group_fd form is removed and its detach-cascade assertions folded into the group-handle test.

Signed-off-by: Cong Wang <cwang@multikernel.io>
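The internal leader-fd resolution described above might look roughly like the following C sketch. The struct layout is an assumption: the commit message only shows that a handle exposes `perf_fd`, so `struct demo_perf_attachment` and `demo_resolve_group_fd` are hypothetical names, not the generated runtime's.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical handle shape; only the perf_fd member is suggested by the PR text. */
struct demo_perf_attachment {
    int perf_fd;
};

/* Map the typed `group` handle to the group_fd argument of perf_event_open(2). */
static int demo_resolve_group_fd(const struct demo_perf_attachment *group)
{
    if (group == NULL)
        return -1; /* no `group` given: the event is opened without a group */

    /* Assumption: a live attachment always carries a valid fd. */
    assert(group->perf_fd >= 0);
    return group->perf_fd;
}
```

Keeping the integer behind the handle means user code never names a raw fd, while the runtime still passes the kernel the value it expects.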
The three user-facing perf types were inconsistently cased: perf_options (snake) alongside PerfAttachment and PerfRead (Pascal). Rename the latter two to perf_attachment and perf_read so the whole perf surface is snake_case, matching the rest of the language's builtin types.

KernelScript type name and generated C type name stay identical (the generic IRStruct -> "struct <name>" mapping handles it), so no codegen special-casing is needed; neither name collides with kernel structs.

Examples, tests, and docs updated.

Signed-off-by: Cong Wang <cwang@multikernel.io>
Summary
Reworks the `@perf_event` user-facing API for safety, consistency, and a smaller compiler footprint. Five focused commits:

1. Fold `perf_type` + `perf_config` into one `event` constant. The two old fields had to agree (every config belongs to exactly one type; `cache_misses` is always hardware), and mismatching them was a silent footgun. A single `perf_event` enum now packs the `perf_event_attr.type` tag in the high 32 bits and the config in the low 32 bits, so naming the event fixes both.
2. Single-source the encoding in stdlib. The PMU-slot analysis had hard-coded the bit layout (`>> 32`, type tags) and the enum name in the type checker, duplicating what stdlib defined. The layout now lives only in stdlib (`perf_event_pack`); the type checker resolves the constant generically.
3. Remove the host-dependent static perf-group PMU-budget check. It rejected groups needing more PMU counters than a limit read from the build host's sysfs (or a guessed default of 4), which is the wrong layer for a host-dependent resource check and wrong under cross-compilation. The kernel already enforces group sizing authoritatively at `perf_event_open(2)` on the real host, and the generated runtime reports it. This removes ~240 lines and all perf-group/PMU/encoding knowledge from `type_checker.ml` (perf references: 55 → 11, the survivors being the same uniform per-program-type handling `@xdp`/`@kprobe` get).
4. Drop the raw `group_fd` field from `perf_options`. It exposed a kernel ABI integer alongside the typed `group` handle. Since every perf fd comes from `attach()` as a `perf_attachment`, the raw form was redundant; grouping is now expressed only via `group`. `group_fd:` now fails type checking as an unknown field.
5. snake_case the perf types. `PerfAttachment` → `perf_attachment`, `PerfRead` → `perf_read`, so the whole perf surface matches `perf_options` and the rest of the language.

Behavioral notes

- The `event` enum currently covers the named hardware/software events (everything examples/tests used). The old `tracepoint`/`hw_cache`/`raw`/`breakpoint` type tags and arbitrary numeric configs are not expressible; they can return as encoded enum entries or a `raw_event(type, config)` constructor if needed.
- Over-budget groups are rejected by the kernel at `perf_event_open(2)`; >16-member software groups truncate-with-warning in the bounded group read (no overflow); group cycles are already prevented by use-before-definition; non-leader `group` references are rejected by the kernel.

Verification

- Full test suite passes (`dune runtest`, exit 0).
- Both perf examples (`perf_cache_miss`, `perf_page_fault`) compile and the generated C builds to working binaries, including the software-event path that crosses the 32-bit encoding boundary.

Net: +180 / −633 lines.
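The truncate-with-warning behavior mentioned under Behavioral notes can be pictured with a small C sketch. This is an assumption about the general technique, not the generated runtime: `struct demo_perf_read`, `DEMO_MAX_GROUP`, and `demo_decode_group` are hypothetical, and the input is assumed to follow the plain PERF_FORMAT_GROUP layout (a member count followed by one value per member).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define DEMO_MAX_GROUP 16 /* fixed capacity of the result struct */

/* Hypothetical fixed-size result; the real perf_read layout is not shown in the PR. */
struct demo_perf_read {
    uint64_t nr;
    uint64_t values[DEMO_MAX_GROUP];
};

/* Decode a raw group read: raw[0] is the member count, raw[1..] are the values.
 * Copies at most DEMO_MAX_GROUP values and warns when members are dropped. */
static size_t demo_decode_group(const uint64_t *raw, size_t raw_words,
                                struct demo_perf_read *out)
{
    assert(raw != NULL && out != NULL);
    memset(out, 0, sizeof(*out));
    if (raw_words == 0)
        return 0;

    uint64_t reported = raw[0];
    size_t avail = raw_words - 1; /* values actually present in the buffer */
    size_t n = reported < avail ? (size_t)reported : avail;

    if (n > DEMO_MAX_GROUP) {
        fprintf(stderr, "perf group reports %llu members; keeping the first %d\n",
                (unsigned long long)reported, DEMO_MAX_GROUP);
        n = DEMO_MAX_GROUP;
    }

    memcpy(out->values, raw + 1, n * sizeof(out->values[0]));
    out->nr = n;
    return n;
}
```

Clamping the count twice, first to what the buffer holds and then to the struct's capacity, is what keeps an oversized group from writing past the end of `values`.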
🤖 Generated with Claude Code