Offloaded snapshot/restore (part 1) by sboeuf · Pull Request #8403 · cloud-hypervisor/cloud-hypervisor

sboeuf · 2026-06-17T08:05:58Z

By relying on the existing local live migration support and reusing the
semantics and the protocol associated with it, we intend to provide a
way for snapshotting and restoring a VM to/from a dedicated process that
we can call the offload daemon.

By allowing an external process to perform the snapshot/restore actions on
behalf of Cloud Hypervisor, we give our users the opportunity to
implement their own offload daemon. The goal is to avoid bloating
Cloud Hypervisor with numerous features related to snapshot/restore, and
let the user decide how to perform the snapshot/restore actions. One
example is that we can decide to encrypt the guest RAM on the fly in
order to avoid writing an unencrypted version to local disk. Another
example is to be able to send guest RAM and associated state/config data
over the network without having to persist the data first to local
storage.

There might be other reasons to choose an offload daemon to
perform the snapshot/restore of the VM, but in every case, this empowers
the user to make their own choice.
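
To illustrate the "on the fly" idea from the description, here is a minimal sketch of how a daemon might transform a stream chunk by chunk without ever persisting the original bytes. It does not implement the migration protocol. The names `relay_transformed` and `xor_placeholder` are hypothetical, and the XOR step is only a stand-in for a real cipher.

```rust
use std::io::{self, Read, Write};

/// Streams `src` to `dst` in fixed-size chunks, applying `transform` to each
/// chunk in memory before it is forwarded. Returns the total bytes relayed.
/// Hypothetical helper for illustration; not part of Cloud Hypervisor.
fn relay_transformed<R: Read, W: Write>(
    src: &mut R,
    dst: &mut W,
    mut transform: impl FnMut(&mut [u8]),
) -> io::Result<u64> {
    let mut buf = vec![0u8; 64 * 1024];
    let mut total = 0u64;
    loop {
        let n = match src.read(&mut buf) {
            Ok(0) => break, // end of stream
            Ok(n) => n,
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        };
        // Only the transformed bytes ever leave this buffer.
        transform(&mut buf[..n]);
        dst.write_all(&buf[..n])?;
        total += n as u64;
    }
    dst.flush()?;
    Ok(total)
}

/// Placeholder transform (NOT real encryption): XORs every byte with `key`.
/// A real daemon would plug in an authenticated cipher here.
fn xor_placeholder(key: u8) -> impl FnMut(&mut [u8]) {
    move |chunk| chunk.iter_mut().for_each(|b| *b ^= key)
}
```

Because the function is generic over `Read` and `Write`, the source could be a socket connected to the VMM and the destination a file or a network stream, assuming the daemon has already handled whatever framing the protocol requires.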

rbradford · 2026-06-17T09:02:26Z

@sboeuf Oh noes! The CI failed on your new test

sboeuf · 2026-06-17T09:06:20Z

> @sboeuf Oh noes! The CI failed on your new test

Yes, I'm fixing it. It's a problem caused by splitting the original PR :)

phip1611

Thanks for working on this. Left a few remarks.

Expose VmMigrationConfig as a public facing structure that can be used by an offload daemon to act as if it was the VM to migrate to, or the VM to migrate from. Signed-off-by: Sebastien Boeuf <sboeuf@meta.com> Assisted-by: Claude:claude-opus-4-7

Add a new dedicated binary meant to serve as a reference implementation for validating that offloaded snapshot/restore works, and to be used by tests in general. Signed-off-by: Sebastien Boeuf <sboeuf@meta.com> Assisted-by: Claude:claude-opus-4-7

Signed-off-by: Sebastien Boeuf <sboeuf@meta.com> Assisted-by: Claude:claude-opus-4-7

Move next_data_extent and write_region_sparse out of memory_manager.rs into a new vmm::sparse module so the snapshot writer, the restore reader, and the offload daemon can share one implementation. No functional change intended. Signed-off-by: Sebastien Boeuf <sboeuf@meta.com> Assisted-by: Claude:claude-opus-4-7

Copy only populated extents when writing the snapshot file and when filling the restore memfd, leaving unwritten ranges as holes. Both the on-disk snapshot and the restored guest RAM stay sparse, so untouched guest pages cost no disk space or host memory. This brings the offload daemon closer to feature parity with CH's internal implementation of snapshot/restore. The only missing piece at this point is on-demand paging. Signed-off-by: Sebastien Boeuf <sboeuf@meta.com> Assisted-by: Claude:claude-opus-4-7
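
The extent-based copy described above can be sketched as follows. This is a simplified illustration, not the code from `vmm::sparse`: it detects populated extents by scanning for non-zero pages in an in-memory buffer, whereas the real `next_data_extent` may well query the backing file instead. The names `next_nonzero_extent` and `write_sparse`, and the 4 KiB `PAGE` granularity, are assumptions made for this example.

```rust
use std::io::{self, Seek, SeekFrom, Write};

/// Granularity at which zero runs are detected (assumed page size).
const PAGE: usize = 4096;

/// Returns the next run of non-zero pages at or after `start` as a half-open
/// byte range, or `None` if the rest of `mem` is all zeroes.
/// `start` is expected to be page-aligned.
fn next_nonzero_extent(mem: &[u8], start: usize) -> Option<(usize, usize)> {
    let is_zero = |off: usize| {
        let end = (off + PAGE).min(mem.len());
        mem[off..end].iter().all(|&b| b == 0)
    };
    // Skip leading all-zero pages.
    let mut begin = start;
    while begin < mem.len() && is_zero(begin) {
        begin += PAGE;
    }
    if begin >= mem.len() {
        return None;
    }
    // Extend the extent until the next all-zero page or the end of `mem`.
    let mut end = begin;
    while end < mem.len() && !is_zero(end) {
        end += PAGE;
    }
    Some((begin, end.min(mem.len())))
}

/// Writes only the non-zero extents of `mem` to `dst`, seeking over zero
/// runs instead of writing them. Returns the number of bytes written.
fn write_sparse<W: Write + Seek>(mem: &[u8], dst: &mut W) -> io::Result<usize> {
    let mut written = 0;
    let mut pos = 0;
    while let Some((begin, end)) = next_nonzero_extent(mem, pos) {
        dst.seek(SeekFrom::Start(begin as u64))?;
        dst.write_all(&mem[begin..end])?;
        written += end - begin;
        pos = end;
    }
    Ok(written)
}
```

When `dst` is a regular file on a filesystem that supports sparse files, the skipped ranges remain holes. A caller would still need to set the final file length (for example with `File::set_len`) so that a trailing zero run is reflected in the file size.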

Extend the snapshot/restore documentation so that it explains the goals behind the offloaded snapshot/restore feature and how to use it in practice, and documents the protocol used by the offload daemon so that anyone can write their own daemon. Signed-off-by: Sebastien Boeuf <sboeuf@meta.com> Assisted-by: Claude:claude-opus-4-7

sboeuf · 2026-06-18T13:32:53Z

@phip1611 are we good to go with this PR?

phip1611

Thanks for addressing all concerns so fast. LGTM 🎉

sboeuf requested a review from a team as a code owner June 17, 2026 08:05

sboeuf requested review from phip1611 and rbradford and removed request for a team June 17, 2026 08:06

phip1611 requested a review from tpressure June 17, 2026 08:15

sboeuf mentioned this pull request Jun 17, 2026

Extend live migration protocol for postcopy and ondemand paging #8264

Open

sboeuf force-pushed the offload_snapshot_part1 branch from 0acc069 to 0e0bbba Compare June 17, 2026 09:11

rbradford reviewed Jun 17, 2026

Comment thread offload_daemon/src/main.rs Outdated

sboeuf force-pushed the offload_snapshot_part1 branch from 0e0bbba to 3a49e96 Compare June 17, 2026 09:20

rbradford approved these changes Jun 17, 2026

Comment thread vmm/src/sparse.rs

Comment thread vmm/src/lib.rs Outdated

sboeuf force-pushed the offload_snapshot_part1 branch 6 times, most recently from 130c1c8 to e39187c Compare June 17, 2026 19:08

phip1611 requested changes Jun 18, 2026

sboeuf added 6 commits June 18, 2026 02:24

vmm: Export VmMigrationConfig as public

6a69225

ci: Add integration test for offload snapshot

880a023

sboeuf force-pushed the offload_snapshot_part1 branch from e39187c to 2902197 Compare June 18, 2026 09:24

phip1611 reviewed Jun 18, 2026

Comment thread offload_daemon/src/main.rs

phip1611 approved these changes Jun 18, 2026

sboeuf added this pull request to the merge queue Jun 18, 2026

Merged via the queue into cloud-hypervisor:main with commit 2f2f709 Jun 18, 2026
39 checks passed

sboeuf merged 6 commits into cloud-hypervisor:main from sboeuf:offload_snapshot_part1

3 participants
