CPU Profiles part 2: MSR adjustment logic #8442

Draft
olivereanderson wants to merge 21 commits into cloud-hypervisor:main from olivereanderson:upstream-cpu-profiles-msrs

Conversation

@olivereanderson

Contributor

This is the second PR in a series adding support for x86_64 CPU profiles to Cloud Hypervisor, as requested in #7068.

The x86_64 CPU profile feature in its entirety is currently tracked in #7068 (comment), but you may also want to read the "full implementation overview" section below to get the big picture of what the complete implementation will look like.

Reviewers unfamiliar with CPU profiles/templates/models may want to jump directly to the "Motivation and background" section in this description first.

What this PR does

This PR does not add any user facing functionality to Cloud hypervisor. All it does is introduce data structures and logic for updating and filtering MSRs in accordance with a (at this point) hypothetical CPU profile.

The logic introduced here is (mostly) oblivious to the CPU manufacturer (as long as it is an x86_64 CPU), but focuses on the KVM hypervisor. This means that there are a few unimplemented!() invocations in the MSHV path, but note that these paths should not be encountered in the case of the host CPU profile.

If and when CPU profile support for MSHV is implemented, much of the code introduced in this PR should be reusable.

Differences with the implementation in the Cyberus Technology fork

For those familiar with the working prototype available in the Cyberus Technology fork there are a few differences to note here with regards to the implementation:

  1. We let KVM take care of most feature MSR compatibility checks for us. Since this is done when setting MSRs we unfortunately had to perform a more invasive refactor, but it saves us from introducing a lot of relatively complex code.
  2. The check for IA32_ARCH_CAPABILITIES is still necessary though because KVM does not check compatibility when setting that particular MSR. I noticed a bug in the initial implementation during this work and it has been fixed here.
  3. We now use a filter that explicitly permits MSRs specified by the CPU profile and denies everything else instead of declaring what to be denied in the filter and allowing everything else. We expect the explicit permit mode used here to be more robust and future proof.

Our working prototype

We have a full implementation for Intel CPUs on our fork.

Generating a new CPU profile

See our docs for instructions on how to generate a new CPU profile for your own Intel CPU(s)

Experimenting with existing CPU profiles

You can also test running Cloud Hypervisor with one of our pre-generated profiles, Intel Sapphire Rapids or Intel Skylake. Simply add ,profile=sapphire-rapids or ,profile=skylake to the --cpus <cpu option> argument when bringing up the VM.

How we have tested our implementation

We have performed a range of tests on deployments of various sizes including:

  • Live migration from an Intel Granite Rapids to an Intel Sapphire Rapids processor where the Sapphire Rapids CPU profile is being utilized. We ensured that the migration was successful while the guest was executing a workload involving the Intel AMX feature (available on both processors).
  • CPU profile restricted VMs boot with several images (CirrOS, Ubuntu, Windows Server, NixOS) and also with direct kernel boot.
  • Checks that MSRs not permitted by CPU profiles are not available to the guest.
  • A plethora of successful live migrations on our customer's clusters.

Full implementation overview

At a high level our implementation consists of:

  1. A tool for generating a CPU profile for the hardware and hypervisor the tool is executed on. The tool outputs JSON files describing the required CPUID and MSR adjustments.
  2. A build script that traverses all pre-generated CPU profile JSON files, bakes their compressed versions into the Cloud Hypervisor binary at build time, and generates a CpuProfile enum with a variant per CPU profile together with logic for extracting the compressed JSON files.
  3. Cloud Hypervisor is adapted to decompress and deserialize the aforementioned CPUID and MSR adjustment descriptions and to set the CPUID and MSR values visible to the guest accordingly.
  4. We perform compatibility checks at runtime to determine whether the chosen CPU profile can be applied given the hardware and hypervisor the VM is running on. If these checks fail a hard error is returned and the VM will not start.

We plan to bring all of this functionality to Cloud hypervisor through several PRs:

  1. Plumbing for enabling Cloud hypervisor to adjust CPUID entries according to a CPU profile.
  2. Plumbing for enabling Cloud hypervisor to adjust MSR entries according to a CPU profile (this PR).
  3. The CPU profile generation tool limited to CPUID adjustments on Intel CPUs with the KVM hypervisor.
  4. Extending the CPU profile generation tool to also consider MSRs on Intel CPUs with the KVM hypervisor.
  5. Introducing the build script that generates Rust code based on the CPU profile JSON files in the tree.
  6. Including some CPU profiles generated by our tool (Intel Sapphire Rapids and possibly also Intel Granite Rapids and Intel Skylake). This could be split into more PRs.
  7. Adding support for AMD CPUs. Note that we have yet to do this on our fork, but we want to get it done in the not too distant future. This step may of course also possibly be split into more PRs.

The CPU profile generation tool in more detail

Our CPU profile generation tool is aware of pretty much all (Intel) CPUID (sub) leaves and MSRs as well as CPUID and MSR entries tied to KVM.

It uses these hard coded lists, together with the policies specified when the tool is executed, to select which CPU features become part of the generated CPU profile and which are excluded.

We will do our best to make the tool automatically warn or error when it encounters CPUID leaves and/or MSRs it is not aware of as this is a good sign that the tool needs to be updated.

If/when individual reserved CPUID and/or MSR bits within a CPUID or MSR register become specified, then one may also want to update the CPU profile generation tool to take this into account. If this is forgotten then we primarily expect this to just lead to new profiles having a slightly reduced set of supported features. This is something we can fix upon detection by re-generating the profile (with a "v2" suffix for backward compatibility reasons).

Additional binary size per new CPU profile

Our experiments show that with compression each new CPU profile adds between 3 and 4 KB to the Cloud hypervisor binary. Without compression the pretty printed JSON files sum up to about 50 - 60 KB per CPU profile.

Further optimizations to binary size are definitely possible, but we consider 3 - 4 KB per CPU profile good enough for the time being.

Motivation and background

Recall that software is usually developed to run on a variety of processors with various features. In order for the software to dynamically discover which hardware features may be utilized one typically uses the CPUID instruction to query the CPU for information, or in some often more low-level cases one uses so called MSRs (model specific registers) to obtain relevant processor specific information.

In the context of live migration this can lead to a time-of-check to time-of-use bug: the guest obtains processor information through CPUID or MSRs on the migration source and later acts on it on the migration destination, whose processor may not support the same instructions.

To mitigate this problem Cloud hypervisor performs CPUID checks (MSR checks are done by KVM, but it is debatable whether these are sufficient) at the beginning of a live migration. If the CPUID entries reported by the destination's hypervisor are not compatible with those of the source VM then the migration is aborted.

Such compatibility checks, although important, are also somewhat limiting in clusters whose hosts/nodes run on different processors. There is no way to perform a live migration from a host with, say, an Intel Granite Rapids processor to a destination with an Intel Sapphire Rapids processor, even if the guest is not utilizing any functionality outside the capabilities of the Intel Sapphire Rapids machine.

The aforementioned checks also prevent migrations from older hardware to newer hardware in some cases where the older hardware supports deprecated CPU features (such as Intel MPX).

There are also certain CPU features that are unlikely to ever give accurate results in the context of live migration such as performance counters and debugging capabilities. We do not want guests making decisions based on these capabilities during live migration, even when all CPUs involved are identical!

Luckily hypervisors are capable of manipulating what guests see when executing the CPUID instruction or reading MSRs. This is a fact that we can and will use to our advantage. A CPU profile is thus a recipe for adjusting CPUID (sub) leaves and MSRs to hide certain CPU features from guests. This vastly increases the number of nodes/hosts in a cluster that a VM booted with a CPU profile may later live migrate to!

FAQ

What about other ISAs?

We focus on x86_64 for the time being. Other architectures such as ARM and RISC-V will only have the Host CPU profile for the foreseeable future, unless someone steps up and wants to tackle either of them already now.

Due to the vastly different nature of these architectures we expect there to be relatively little overlap with our work here in the context of x86_64 CPUs.

What about live migration between Intel and AMD CPUs?

It might work with an extremely limited minimal profile, but I don't think that would be suitable for workloads intended to be used in production.

The intended use-cases are thus Intel <-> Intel and AMD <-> AMD live migrations.

olivereanderson and others added 21 commits June 19, 2026 21:33
KVM defines feature MSRs as MSRs that expose host capabilities and
processor features.

CPU profiles will describe how to adjust such feature MSRs similarly to
how they describe CPUID modifications.

We take the first step in this direction by introducing a method on
the hypervisor trait to obtain a list of the indices of the supported
feature MSRs.

Signed-off-by: Oliver Anderson <oliver.anderson@cyberus-technology.de>
On-behalf-of: SAP oliver.anderson@sap.com
Signed-off-by: Oliver Anderson <oliver.anderson@cyberus-technology.de>
On-behalf-of: SAP oliver.anderson@sap.com
CPU profiles will describe a list of MSRs they explicitly permit.
When applying the profile we want to check that the host has all the
required MSRs otherwise we have an incompatibility issue.

In order to check which MSRs the host supports we start by exposing the
hypervisor's get_msr_index_list method which lists most of the MSRs
supported by the host.

Signed-off-by: Oliver Anderson <oliver.anderson@cyberus-technology.de>
On-behalf-of: SAP oliver.anderson@sap.com
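
The host check described in the commit message above can be sketched roughly as follows. This is only an illustration: the function name `missing_msrs` and the use of plain `u32` slices are assumptions, not the interface introduced by this PR.

```rust
use std::collections::BTreeSet;

/// Returns the MSR indices required by the profile that the host does not
/// report as supported, sorted and without duplicates.
/// An empty result means the host passes this particular check.
/// (Hypothetical helper, not part of the PR.)
fn missing_msrs(required_by_profile: &[u32], supported_by_host: &[u32]) -> Vec<u32> {
    let host: BTreeSet<u32> = supported_by_host.iter().copied().collect();
    let mut missing: Vec<u32> = required_by_profile
        .iter()
        .copied()
        .filter(|msr| !host.contains(msr))
        .collect();
    missing.sort_unstable();
    missing.dedup();
    missing
}
```

Since the index list only covers most of the MSRs supported by the host, a real check would presumably need to combine it with other sources (such as the feature MSR index list) before reporting an incompatibility.
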
Note that even though a host might support all MSRs required by the
CPU profile, it might still expose some MSRs that are not compatible
with the given CPU profile.

Such MSRs are not necessarily guarded by CPUID and we cannot simply set
them to zero either.

KVM supports a means to filter MSRs however and we intend to utilize
this capability to prevent the guest from accessing MSRs that are
not permitted by the CPU profile.

We start with adding a type representing an MsrFilter.

NOTE: We can consider removing our MsrFilter type once
rust-vmm/kvm#359 is integrated in
Cloud Hypervisor.

Signed-off-by: Oliver Anderson <oliver.anderson@cyberus-technology.de>
On-behalf-of: SAP oliver.anderson@sap.com
Our second step towards denying guests access to CPU profile
incompatible MSRs is to add the missing msr_filter method on
the KvmVm type.

We emphasize that this is a temporary workaround until
rust-vmm/kvm#359 is integrated in
CHV.

Signed-off-by: Oliver Anderson <oliver.anderson@cyberus-technology.de>
On-behalf-of: SAP oliver.anderson@sap.com
Signed-off-by: Oliver Anderson <oliver.anderson@cyberus-technology.de>
On-behalf-of: SAP oliver.anderson@sap.com
We place a new method on the Vm trait which we will later call from the
vmm crate in order to prevent the guest from accessing MSRs that are
incompatible with a selected CPU profile (if any).

We only provide an implementation for the KVM backend for now and leave
MSHV for later.

Signed-off-by: Oliver Anderson <oliver.anderson@cyberus-technology.de>
On-behalf-of: SAP oliver.anderson@sap.com
In order to help callers construct correct filter ranges when calling
set_msr_filter we provide a method that conveys the maximum number of
filter ranges the hypervisor backend permits.

We follow the existing precedent in Cloud Hypervisor of returning
`&'static T` from methods on the hypervisor related interfaces as a means
to provide constants associated with types as a workaround for the fact
that associated constants are not dyn compatible.

Signed-off-by: Oliver Anderson <oliver.anderson@cyberus-technology.de>
On-behalf-of: SAP oliver.anderson@sap.com
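
To illustrate the workaround mentioned in the commit message above: a trait with an associated constant cannot be used as a trait object, whereas a method returning `&'static T` can. The sketch below uses a hypothetical trait and type (`Vm`, `FakeKvmVm`, `max_msr_filter_ranges`), and the value 16 is an assumption chosen for illustration only.

```rust
/// Hypothetical stand-in for a hypervisor `Vm` trait.
trait Vm {
    /// Maximum number of MSR filter ranges the backend accepts.
    ///
    /// Declaring `const MAX_MSR_FILTER_RANGES: usize;` on the trait instead
    /// would make it impossible to use the trait as `dyn Vm`.
    fn max_msr_filter_ranges(&self) -> &'static usize;
}

/// Hypothetical backend used only for this example.
struct FakeKvmVm;

impl Vm for FakeKvmVm {
    fn max_msr_filter_ranges(&self) -> &'static usize {
        // Assumed value for illustration.
        static MAX_RANGES: usize = 16;
        &MAX_RANGES
    }
}

/// Works through a trait object, which is the point of the workaround.
fn max_ranges(vm: &dyn Vm) -> usize {
    *vm.max_msr_filter_ranges()
}
```
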
CPU Profiles will be serialized to JSON by the upcoming CPU Profile
generation tool and we want MSRs to be serialized as hex strings.

Signed-off-by: Oliver Anderson <oliver.anderson@cyberus-technology.de>
On-behalf-of: SAP oliver.anderson@sap.com
We create 64-bit analogues of the already existing hex (de-)serializer
helper functions for the 32-bit case.

These are necessary because the CPU profile needs associated data
describing how to adjust feature MSRs whose values are 64-bits.

In this case we prefer some small amount of code duplication over
macros and/or traits since we do not expect the need for further
variants of these helpers.

Signed-off-by: Oliver Anderson <oliver.anderson@cyberus-technology.de>
On-behalf-of: SAP oliver.anderson@sap.com
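
A minimal sketch of what 64-bit hex helpers might look like, written without serde so that it stays self-contained. The function names and the zero-padded `0x` format are assumptions; the helpers in the PR are serde (de-)serializer functions and may format values differently.

```rust
use std::num::ParseIntError;

/// Formats a 64-bit value as a `0x`-prefixed hex string padded to 16 digits.
/// (Hypothetical helper; the padding choice is an assumption.)
fn u64_to_hex_string(value: u64) -> String {
    format!("{:#018x}", value)
}

/// Parses a hex string with an optional `0x`/`0X` prefix into a `u64`.
fn u64_from_hex_str(s: &str) -> Result<u64, ParseIntError> {
    let digits = s
        .strip_prefix("0x")
        .or_else(|| s.strip_prefix("0X"))
        .unwrap_or(s);
    u64::from_str_radix(digits, 16)
}
```

With serde, functions of this shape would typically be wrapped so they can be referenced from a `#[serde(serialize_with = ...)]` or `#[serde(deserialize_with = ...)]` attribute on the relevant field.
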
We introduce a type analogous to CpuidOutputRegisterAdjustment, but for
feature MSRs.

Signed-off-by: Oliver Anderson <oliver.anderson@cyberus-technology.de>
On-behalf-of: SAP oliver.anderson@sap.com
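
One plausible shape for such an adjustment type is a mask plus replacement bits. The sketch below is an assumption about the general idea, not the definition used in this PR: the type name `FeatureMsrAdjustment`, its fields and the `apply` method are all hypothetical.

```rust
/// Hypothetical description of how to adjust one 64-bit feature MSR value.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct FeatureMsrAdjustment {
    /// Bits that the profile sets unconditionally.
    replacements: u64,
    /// Bits that are inherited from the host value; all others are cleared.
    mask: u64,
}

impl FeatureMsrAdjustment {
    /// Keeps the host bits selected by `mask` and then ORs in `replacements`.
    fn apply(&self, host_value: u64) -> u64 {
        (host_value & self.mask) | self.replacements
    }
}
```
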
We introduce a type describing MSR adjustments associated with a
CPU profile.

The upcoming CPU profile generation tool will serialize instances of
this struct when generating a CPU profile.

Signed-off-by: Oliver Anderson <oliver.anderson@cyberus-technology.de>
On-behalf-of: SAP oliver.anderson@sap.com
When Cloud hypervisor has not been configured for (KVM) Hyper-V we want
to adapt the CPU profiles not to require existence of Hyper-V related
MSRs.

The first step introduced here is to create a list of such MSRs.

Signed-off-by: Oliver Anderson <oliver.anderson@cyberus-technology.de>
On-behalf-of: SAP oliver.anderson@sap.com
We introduce a method on the CpuProfile enum that computes the required
MSR related updates in order to be compatible with the CPU profile.

Signed-off-by: Oliver Anderson <oliver.anderson@cyberus-technology.de>
On-behalf-of: SAP oliver.anderson@sap.com
In order to safely apply CPU profiles we need to ensure that the host's
feature MSRs that are permitted by the CPU profile are also compatible
with the values the CPU profile dictates.

KVM_SET_MSRS takes care of checking compatibility for most feature MSRs
on both Intel and AMD CPUs that we will permit CPU profiles to have
(more on this in the upcoming CPU profile generation tool PRs),
but there is one exception.

Userspace may set whatever value for the Intel exclusive
IA32_ARCH_CAPABILITIES MSR without receiving any complaints from KVM.

We thus introduce our own compatibility check for
IA32_ARCH_CAPABILITIES.

One might even argue that KVM_SET_MSRS is called relatively late when
creating or receiving a VM and that it would be preferable to have
compatibility checks for all permitted feature MSRs run earlier. This
would also mean more informative debug logs.

We argue however that those additional checks would lead to too much
code that is not strictly necessary which is why we decided against
doing that in this patch set.

Signed-off-by: Oliver Anderson <oliver.anderson@cyberus-technology.de>
On-behalf-of: SAP oliver.anderson@sap.com
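
As a rough illustration of what a userspace compatibility check for IA32_ARCH_CAPABILITIES could look like, the sketch below treats the profile value as compatible when every bit it sets is also set on the host. That subset rule is a simplifying assumption: the actual check in this PR is not shown here, and a real check likely needs per-bit rules because not all bits of this MSR have the same meaning. The function names are hypothetical.

```rust
/// Index of the IA32_ARCH_CAPABILITIES MSR.
const IA32_ARCH_CAPABILITIES: u32 = 0x10a;

/// Bits the profile wants to expose that the host does not report.
/// (Assumes a simple "profile must be a subset of host" rule.)
fn unsupported_arch_capability_bits(profile_value: u64, host_value: u64) -> u64 {
    profile_value & !host_value
}

/// Returns an error describing the offending bits if the assumed subset
/// rule is violated.
fn check_arch_capabilities(profile_value: u64, host_value: u64) -> Result<(), String> {
    let missing = unsupported_arch_capability_bits(profile_value, host_value);
    if missing == 0 {
        Ok(())
    } else {
        Err(format!(
            "MSR {:#x}: host lacks bits {:#x} required by the CPU profile",
            IA32_ARCH_CAPABILITIES, missing
        ))
    }
}
```
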
We introduce a function somewhat analogous to `generate_common_cpuid`,
except that it is only relevant for CPU profiles.

This function is more "high level" than the
CpuProfile::required_msr_updates method and is intended to be called
from the vmm crate.

Signed-off-by: Oliver Anderson <oliver.anderson@cyberus-technology.de>
On-behalf-of: SAP oliver.anderson@sap.com
Currently the KVM and MSHV Vm implementations have internal buffers
used as scratch space when snapshotting/restoring MSRs.

Upon construction the hypervisor sets an entry in this buffer for
essentially each MSR it supports. There are a few exceptions such as
MSRs that are treated as sregs, or MCE banks, but that is not relevant
here.

With CPU profiles some of the MSR entries in the buffer may however no
longer be permitted and we do not want to attempt to save or restore
their state.

Since the hypervisor crate (where KvmVm resides) has no awareness of CPU
profiles, it is more appropriate for the buffer to live in CpuManager,
allowing dynamic updates based on the active CPU profile.

Note that since we work with the Vm abstraction in the vmm crate we
also needed to adapt the MSHV implementation even though CPU profiles
will only be available to KVM to start with.

This commit builds on commit dd47149
on the Cyberus Technology fork by Philipp Schuster.

Signed-off-by: Oliver Anderson <oliver.anderson@cyberus-technology.de>
On-behalf-of: SAP oliver.anderson@sap.com
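
The idea of trimming the save/restore buffer to what the active CPU profile permits can be sketched as below. `MsrEntry` and `retain_permitted_msrs` are hypothetical names used for illustration; the actual buffer type and the way CpuManager updates it are not shown here.

```rust
use std::collections::BTreeSet;

/// Hypothetical stand-in for one entry of the MSR scratch buffer.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct MsrEntry {
    index: u32,
    data: u64,
}

/// Drops every buffer entry whose MSR index is not permitted by the profile,
/// so that its state is neither saved nor restored.
fn retain_permitted_msrs(entries: &mut Vec<MsrEntry>, permitted: &BTreeSet<u32>) {
    entries.retain(|entry| permitted.contains(&entry.index));
}
```
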
When a non-host CPU profile is selected the guest should only be able
to access MSRs that the CPU profile permits.

We thus introduce a function which takes a list of permitted MSRs and
produces a filter essentially only permitting the given MSRs.

The logic in this commit is somewhat complex, but we test it extensively
with property based testing.

Signed-off-by: Oliver Anderson <oliver.anderson@cyberus-technology.de>
On-behalf-of: SAP oliver.anderson@sap.com
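
To give a feel for the kind of logic involved, here is a simplified greedy sketch that turns a list of permitted MSR indices into allow ranges with bitmaps, where a set bit means "allowed" and everything outside the ranges is assumed to be denied by default. It is not the algorithm from this PR. The names `AllowRange` and `build_allow_ranges` are hypothetical, and the two limits are assumptions meant to mirror KVM's maximum number of filter ranges and maximum bitmap size.

```rust
/// Assumed limit on the number of filter ranges.
const MAX_RANGES: usize = 16;
/// Assumed limit on the bitmap size of one range, in bytes.
const MAX_BITMAP_BYTES: usize = 0x600;

/// Hypothetical allow range: bit `i` of `bitmap` covers MSR `base + i`.
#[derive(Debug, PartialEq, Eq)]
struct AllowRange {
    base: u32,
    nmsrs: u32,
    bitmap: Vec<u8>,
}

/// Greedily packs the permitted MSRs into as few ranges as the bitmap size
/// allows, and fails if more than `MAX_RANGES` ranges would be needed.
fn build_allow_ranges(permitted: &[u32]) -> Result<Vec<AllowRange>, String> {
    let mut msrs = permitted.to_vec();
    msrs.sort_unstable();
    msrs.dedup();

    let max_span = (MAX_BITMAP_BYTES * 8) as u64;
    let mut ranges = Vec::new();
    let mut i = 0;
    while i < msrs.len() {
        let base = msrs[i];
        // Extend the range over every MSR that still fits into one bitmap.
        let mut j = i;
        while j < msrs.len() && u64::from(msrs[j] - base) < max_span {
            j += 1;
        }
        let nmsrs = msrs[j - 1] - base + 1;
        let mut bitmap = vec![0u8; (nmsrs as usize + 7) / 8];
        for &msr in &msrs[i..j] {
            let offset = (msr - base) as usize;
            bitmap[offset / 8] |= 1u8 << (offset % 8);
        }
        ranges.push(AllowRange { base, nmsrs, bitmap });
        i = j;
    }

    if ranges.len() > MAX_RANGES {
        return Err(format!(
            "{} ranges needed but at most {} are allowed",
            ranges.len(),
            MAX_RANGES
        ));
    }
    Ok(ranges)
}
```

Invariants such as "every permitted MSR is covered by exactly one set bit" and "no bit is set for an MSR that was not permitted" are the kind of properties that lend themselves to property based testing.
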
We introduce a method on the CpuManager that prepares MSR related
changes required by the selected CPU profile.

This method is called upon constructing the CpuManager in
`Vm::create_cpu_manager` thus setting up all state that we need in
order to obtain MSR compatibility with the chosen CPU profile upon
creating (or restoring) vCPUs.

The `feature_msr` field introduced in this commit will be utilized to
set MSRs on vCPUs in a follow up commit.

Signed-off-by: Oliver Anderson <oliver.anderson@cyberus-technology.de>
On-behalf-of: SAP oliver.anderson@sap.com
We refactor the CPU configuration functionality so that we can set
feature MSRs according to a CPU profile.

Signed-off-by: Oliver Anderson <oliver.anderson@cyberus-technology.de>
On-behalf-of: SAP oliver.anderson@sap.com
In the context of live migration or restoring a snapshot more generally
we can only trust compatibility with the selected CPU profile as long
as setting the feature MSRs defined by the CPU profile succeeds.

This means that we have to adapt the current behavior, which only logs
a warning on MSRs that cannot be set, to instead error if the MSR is
declared to be crucial by the caller.

Alternatively we could perform compatibility checks for all necessary
feature MSRs before attempting to call `Vcpu::set_state`, but then we
would have to introduce a lot of complex code. We thus prefer to let
KVM do these checks for us when setting the MSRs.

Signed-off-by: Oliver Anderson <oliver.anderson@cyberus-technology.de>
On-behalf-of: SAP oliver.anderson@sap.com
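
The change in behavior described above (warn for ordinary MSRs, fail for MSRs the caller declares crucial) can be sketched as follows. The error type, the function name and the representation of failed MSRs as a slice of indices are assumptions made for this example, not the PR's interface.

```rust
use std::collections::BTreeSet;

/// Hypothetical error returned when a crucial MSR could not be set.
#[derive(Debug, PartialEq, Eq)]
struct CrucialMsrNotSet {
    index: u32,
}

/// Inspects the MSRs that the hypervisor refused to set.
///
/// Returns an error for the first failed MSR that the caller marked as
/// crucial. Failures of other MSRs only produce a warning and are returned
/// so the caller can see what was skipped.
fn check_failed_msrs(
    failed: &[u32],
    crucial: &BTreeSet<u32>,
) -> Result<Vec<u32>, CrucialMsrNotSet> {
    let mut tolerated = Vec::new();
    for &index in failed {
        if crucial.contains(&index) {
            return Err(CrucialMsrNotSet { index });
        }
        eprintln!("warning: could not set MSR {:#x}", index);
        tolerated.push(index);
    }
    Ok(tolerated)
}
```
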
@olivereanderson olivereanderson requested a review from a team as a code owner June 19, 2026 20:14
@olivereanderson

Contributor Author

@phip1611 and @tpressure could you please give me a first review here since there are quite a few changes compared with our fork?

@olivereanderson olivereanderson marked this pull request as draft June 19, 2026 20:17
@phip1611 phip1611 self-requested a review June 20, 2026 05:35