Add a stop-the-world, serial Compressor #1340


Status: Merged (24 commits, merged Jul 24, 2025)

Commits
1ddbb5e  Add a vaguely Compressor-esque GC (no-defun-allowed, Jul 9, 2025)
b80ba55  Clean up some (no-defun-allowed, Jul 9, 2025)
cd136b1  rustfmt (no-defun-allowed, Jul 15, 2025)
43b094a  clippy fix (no-defun-allowed, Jul 15, 2025)
4e02f12  cargo fmt (no-defun-allowed, Jul 15, 2025)
10e5439  Review comments (no-defun-allowed, Jul 15, 2025)
e568453  Capitalise sentences (no-defun-allowed, Jul 15, 2025)
b30d542  Allow running MockVM with side metadata (qinsoon, Jul 16, 2025)
237fbbf  Add a feature to disable LOS and immortal space for Compressor (no-defun-allowed, Jul 16, 2025)
efcf713  Use MMTk metadata for the offset vector and a separate mark bitmap (no-defun-allowed, Jul 18, 2025)
def9e36  Revert "Allow running MockVM with side metadata" (no-defun-allowed, Jul 18, 2025)
4608807  Fix some comments (no-defun-allowed, Jul 18, 2025)
2470de7  Skip MockVM tests with the Compressor on unsupported configurations (no-defun-allowed, Jul 18, 2025)
cd3bde3  Include common plan for create_space_mapping (no-defun-allowed, Jul 18, 2025)
a004cd6  cargo fmt (no-defun-allowed, Jul 18, 2025)
dfe9185  i686 is spelled x86 in this instance (no-defun-allowed, Jul 18, 2025)
b420216  Clean up some more (no-defun-allowed, Jul 21, 2025)
1ef15cf  Fix up some names and comments, and add SideMetadataSpec::are_differe… (no-defun-allowed, Jul 21, 2025)
b819763  Add a warning about the compressor_single_space feature (no-defun-allowed, Jul 21, 2025)
fd84cbf  More spellings of things (no-defun-allowed, Jul 21, 2025)
5845836  Use Region and RegionIterator for blocks (no-defun-allowed, Jul 22, 2025)
438f248  Actually use the LOS (no-defun-allowed, Jul 23, 2025)
e32d7a3  cargo fmt and clippy (no-defun-allowed, Jul 23, 2025)
ce4e1f4  Skip mock VM tests for Compressor (no-defun-allowed, Jul 24, 2025)
11 changes: 11 additions & 0 deletions .github/scripts/ci-test.sh
@@ -12,6 +12,15 @@ if [[ $arch == "x86_64" && $os == "linux" ]]; then
fi

ALL_PLANS=$(sed -n '/enum PlanSelector/,/}/p' src/util/options.rs | sed -e 's;//.*;;g' -e '/^$/d' -e 's/,//g' | xargs | grep -o '{.*}' | grep -o '\w\+')
ALL_DISCONTIGUOUS_PLANS=$(echo -n "$ALL_PLANS" | sed '/Compressor/d')

if [[ $arch == "x86" ]]; then
ALL_PLANS=$ALL_DISCONTIGUOUS_PLANS
fi

# At the moment, the Compressor does not work with the mock VM tests.
# So we skip testing the Compressor entirely.
ALL_PLANS=$ALL_DISCONTIGUOUS_PLANS

# Test with mock VM:
# - Find all the files that start with mock_test_
@@ -25,6 +34,8 @@ find ./src ./tests -type f -name "mock_test_*" | while read -r file; do
PLANS=$(sed -n 's/^\/\/ *GITHUB-CI: *MMTK_PLAN=//p' $file | tr ',' '\n')
if [[ $PLANS == 'all' ]]; then
PLANS=$ALL_PLANS
elif [[ $PLANS == 'discontiguous' ]]; then
PLANS=$ALL_DISCONTIGUOUS_PLANS
elif [[ -z $PLANS ]]; then
PLANS=NoGC
fi
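Each mock test declares which plans it runs under via a magic comment that the script above parses. A minimal sketch of that extraction follows; the test-file contents here are hypothetical, made up only to demonstrate the pipeline:

```shell
# Hypothetical mock-test header; ci-test.sh reads the plan list from a
# "// GITHUB-CI: MMTK_PLAN=..." comment at the top of each test file.
file=$(mktemp)
printf '%s\n' '// GITHUB-CI: MMTK_PLAN=Compressor,MarkCompact' > "$file"

# Same extraction as the script: strip the comment prefix, then split the
# comma-separated plan list into one plan per line.
PLANS=$(sed -n 's/^\/\/ *GITHUB-CI: *MMTK_PLAN=//p' "$file" | tr ',' '\n')
echo "$PLANS"
rm -f "$file"
```

A file with no such comment yields an empty list, which the script above defaults to NoGC.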
6 changes: 6 additions & 0 deletions Cargo.toml
@@ -197,6 +197,12 @@ extreme_assertions = []
# Enable multiple spaces for NoGC, each allocator maps to an individual ImmortalSpace.
nogc_multi_space = []

# Disable multiple spaces for Compressor.
Review thread on this feature:

Member: Is there any reason that we want to keep this feature? Would it be useful to anyone?

Collaborator: This will be the most space-efficient an algorithm can be. I believe Steve mentioned it.

Collaborator: @no-defun-allowed Just mention in the comment there that this configuration will move all objects. Don't use it if you can't tolerate that. This is primarily for understanding how space-efficient we can be.

Member: Right. Also the non-moving semantic can't be used with this feature. It would be better if we could panic when the non-moving semantic is ever used. This probably would need a 'fake allocator' that panics when used, with the non-moving semantic mapped to it. I am not sure if it is worth it.

# Enabling this feature will cause the Compressor to move all objects, which will be
# slower for large objects and will be outright incorrect for bindings which allocate
# non-moving objects; but compacting all objects will be more space-efficient.
compressor_single_space = []

# To collect statistics for each GC work packet. Enabling this may introduce a small overhead (several percentage slowdown on benchmark time).
work_packet_stats = []

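A binding that wanted to opt into this trade-off would enable the feature from its own Cargo.toml in the usual way. A hypothetical sketch (the dependency source shown is illustrative, not taken from any particular binding):

```toml
[dependencies]
# Opt into single-space compaction. Per the warning above, all objects become
# movable under this feature, so it is unsuitable for bindings that rely on
# non-moving allocation semantics.
mmtk = { git = "https://github.com/mmtk/mmtk-core.git", features = ["compressor_single_space"] }
```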
114 changes: 114 additions & 0 deletions src/plan/compressor/gc_work.rs
@@ -0,0 +1,114 @@
use super::global::Compressor;
use crate::policy::compressor::CompressorSpace;
use crate::policy::compressor::{TRACE_KIND_FORWARD_ROOT, TRACE_KIND_MARK};
use crate::policy::largeobjectspace::LargeObjectSpace;
use crate::scheduler::gc_work::PlanProcessEdges;
use crate::scheduler::gc_work::*;
use crate::scheduler::GCWork;
use crate::scheduler::GCWorker;
use crate::scheduler::WorkBucketStage;
use crate::vm::ActivePlan;
use crate::vm::Scanning;
use crate::vm::VMBinding;
use crate::MMTK;
use std::marker::PhantomData;

/// Iterate through the heap and calculate the new location of live objects.
pub struct CalculateForwardingAddress<VM: VMBinding> {
compressor_space: &'static CompressorSpace<VM>,
}

impl<VM: VMBinding> GCWork<VM> for CalculateForwardingAddress<VM> {
fn do_work(&mut self, _worker: &mut GCWorker<VM>, _mmtk: &'static MMTK<VM>) {
self.compressor_space.calculate_offset_vector();
}
}

impl<VM: VMBinding> CalculateForwardingAddress<VM> {
pub fn new(compressor_space: &'static CompressorSpace<VM>) -> Self {
Self { compressor_space }
}
}

/// Create another round of root scanning work packets
/// to update object references.
pub struct UpdateReferences<VM: VMBinding> {
p: PhantomData<VM>,
}

// SAFETY: `UpdateReferences` only contains `PhantomData`, so there is no
// state that could be unsafely shared between threads.
unsafe impl<VM: VMBinding> Send for UpdateReferences<VM> {}

impl<VM: VMBinding> GCWork<VM> for UpdateReferences<VM> {
fn do_work(&mut self, _worker: &mut GCWorker<VM>, mmtk: &'static MMTK<VM>) {
// The following needs to be done right before the second round of root scanning
VM::VMScanning::prepare_for_roots_re_scanning();
mmtk.state.prepare_for_stack_scanning();
#[cfg(feature = "extreme_assertions")]
mmtk.slot_logger.reset();

// We do two passes of transitive closures. We clear the live bytes from the first pass.
mmtk.scheduler
.worker_group
.get_and_clear_worker_live_bytes();

for mutator in VM::VMActivePlan::mutators() {
mmtk.scheduler.work_buckets[WorkBucketStage::SecondRoots].add(ScanMutatorRoots::<
CompressorForwardingWorkContext<VM>,
>(mutator));
}

mmtk.scheduler.work_buckets[WorkBucketStage::SecondRoots]
.add(ScanVMSpecificRoots::<CompressorForwardingWorkContext<VM>>::new());
}
}

impl<VM: VMBinding> UpdateReferences<VM> {
pub fn new() -> Self {
Self { p: PhantomData }
}
}

/// Compact live objects based on the previously-calculated forwarding pointers.
pub struct Compact<VM: VMBinding> {
compressor_space: &'static CompressorSpace<VM>,
los: &'static LargeObjectSpace<VM>,
}

impl<VM: VMBinding> GCWork<VM> for Compact<VM> {
fn do_work(&mut self, worker: &mut GCWorker<VM>, _mmtk: &'static MMTK<VM>) {
self.compressor_space.compact(worker, self.los);
}
}

impl<VM: VMBinding> Compact<VM> {
pub fn new(
compressor_space: &'static CompressorSpace<VM>,
los: &'static LargeObjectSpace<VM>,
) -> Self {
Self {
compressor_space,
los,
}
}
}

/// Marking trace
pub type MarkingProcessEdges<VM> = PlanProcessEdges<VM, Compressor<VM>, TRACE_KIND_MARK>;
/// Forwarding trace
pub type ForwardingProcessEdges<VM> = PlanProcessEdges<VM, Compressor<VM>, TRACE_KIND_FORWARD_ROOT>;

pub struct CompressorWorkContext<VM: VMBinding>(std::marker::PhantomData<VM>);
impl<VM: VMBinding> crate::scheduler::GCWorkContext for CompressorWorkContext<VM> {
type VM = VM;
type PlanType = Compressor<VM>;
type DefaultProcessEdges = MarkingProcessEdges<VM>;
type PinningProcessEdges = UnsupportedProcessEdges<VM>;
}

pub struct CompressorForwardingWorkContext<VM: VMBinding>(std::marker::PhantomData<VM>);
impl<VM: VMBinding> crate::scheduler::GCWorkContext for CompressorForwardingWorkContext<VM> {
type VM = VM;
type PlanType = Compressor<VM>;
type DefaultProcessEdges = ForwardingProcessEdges<VM>;
type PinningProcessEdges = UnsupportedProcessEdges<VM>;
}
199 changes: 199 additions & 0 deletions src/plan/compressor/global.rs
@@ -0,0 +1,199 @@
use super::gc_work::CompressorWorkContext;
use super::gc_work::{
CalculateForwardingAddress, Compact, ForwardingProcessEdges, MarkingProcessEdges,
UpdateReferences,
};
use crate::plan::compressor::mutator::ALLOCATOR_MAPPING;
use crate::plan::global::CreateGeneralPlanArgs;
use crate::plan::global::CreateSpecificPlanArgs;
use crate::plan::global::{BasePlan, CommonPlan};
use crate::plan::plan_constraints::MAX_NON_LOS_ALLOC_BYTES_COPYING_PLAN;
use crate::plan::AllocationSemantics;
use crate::plan::Plan;
use crate::plan::PlanConstraints;
use crate::policy::compressor::CompressorSpace;
use crate::policy::space::Space;
use crate::scheduler::gc_work::*;
use crate::scheduler::GCWorkScheduler;
use crate::scheduler::WorkBucketStage;
use crate::util::alloc::allocators::AllocatorSelector;
use crate::util::heap::gc_trigger::SpaceStats;
#[allow(unused_imports)]
use crate::util::heap::VMRequest;
use crate::util::metadata::side_metadata::SideMetadataContext;
use crate::util::opaque_pointer::*;
use crate::vm::VMBinding;
use enum_map::EnumMap;
use mmtk_macros::{HasSpaces, PlanTraceObject};

/// Compressor is a stop-the-world, serial implementation of the Compressor
/// algorithm described in Kermany and Petrank,
/// [The Compressor: concurrent, incremental, and parallel compaction](https://dl.acm.org/doi/10.1145/1133255.1134023).
#[derive(HasSpaces, PlanTraceObject)]
pub struct Compressor<VM: VMBinding> {
#[parent]
pub common: CommonPlan<VM>,
#[space]
pub compressor_space: CompressorSpace<VM>,
}

/// The plan constraints for the Compressor plan.
pub const COMPRESSOR_CONSTRAINTS: PlanConstraints = PlanConstraints {
max_non_los_default_alloc_bytes: MAX_NON_LOS_ALLOC_BYTES_COPYING_PLAN,
moves_objects: true,
needs_forward_after_liveness: true,
..PlanConstraints::default()
};

impl<VM: VMBinding> Plan for Compressor<VM> {
fn constraints(&self) -> &'static PlanConstraints {
&COMPRESSOR_CONSTRAINTS
}

fn collection_required(&self, space_full: bool, _space: Option<SpaceStats<Self::VM>>) -> bool {
self.base().collection_required(self, space_full)
}

fn common(&self) -> &CommonPlan<VM> {
&self.common
}

fn base(&self) -> &BasePlan<VM> {
&self.common.base
}

fn base_mut(&mut self) -> &mut BasePlan<Self::VM> {
&mut self.common.base
}

fn prepare(&mut self, tls: VMWorkerThread) {
self.common.prepare(tls, true);
self.compressor_space.prepare();
}

fn release(&mut self, tls: VMWorkerThread) {
self.common.release(tls, true);
self.compressor_space.release();
}

fn end_of_gc(&mut self, tls: VMWorkerThread) {
self.common.end_of_gc(tls);
}

fn get_allocator_mapping(&self) -> &'static EnumMap<AllocationSemantics, AllocatorSelector> {
&ALLOCATOR_MAPPING
}

fn schedule_collection(&'static self, scheduler: &GCWorkScheduler<VM>) {
// TODO: use schedule_common once it can work with the Compressor.
// The main issue is that we need to use ForwardingProcessEdges
// in FinalizableForwarding.

// Stop & scan mutators (mutator scanning can happen before STW)
scheduler.work_buckets[WorkBucketStage::Unconstrained]
.add(StopMutators::<CompressorWorkContext<VM>>::new());

// Prepare global/collectors/mutators
scheduler.work_buckets[WorkBucketStage::Prepare]
.add(Prepare::<CompressorWorkContext<VM>>::new(self));

scheduler.work_buckets[WorkBucketStage::CalculateForwarding].add(
CalculateForwardingAddress::<VM>::new(&self.compressor_space),
);
// do another trace to update references
scheduler.work_buckets[WorkBucketStage::SecondRoots].add(UpdateReferences::<VM>::new());
scheduler.work_buckets[WorkBucketStage::Compact]
.add(Compact::<VM>::new(&self.compressor_space, &self.common.los));

// Release global/collectors/mutators
scheduler.work_buckets[WorkBucketStage::Release]
.add(Release::<CompressorWorkContext<VM>>::new(self));

// Reference processing
if !*self.base().options.no_reference_types {
use crate::util::reference_processor::{
PhantomRefProcessing, SoftRefProcessing, WeakRefProcessing,
};
scheduler.work_buckets[WorkBucketStage::SoftRefClosure]
.add(SoftRefProcessing::<MarkingProcessEdges<VM>>::new());
scheduler.work_buckets[WorkBucketStage::WeakRefClosure]
.add(WeakRefProcessing::<VM>::new());
scheduler.work_buckets[WorkBucketStage::PhantomRefClosure]
.add(PhantomRefProcessing::<VM>::new());

use crate::util::reference_processor::RefForwarding;
scheduler.work_buckets[WorkBucketStage::RefForwarding]
.add(RefForwarding::<ForwardingProcessEdges<VM>>::new());

use crate::util::reference_processor::RefEnqueue;
scheduler.work_buckets[WorkBucketStage::Release].add(RefEnqueue::<VM>::new());
}

// Finalization
if !*self.base().options.no_finalizer {
use crate::util::finalizable_processor::{Finalization, ForwardFinalization};
// finalization
// treat finalizable objects as roots and perform a closure (marking)
// must be done before calculating forwarding pointers
scheduler.work_buckets[WorkBucketStage::FinalRefClosure]
.add(Finalization::<MarkingProcessEdges<VM>>::new());
// update finalizable object references
// must be done before compacting
scheduler.work_buckets[WorkBucketStage::FinalizableForwarding]
.add(ForwardFinalization::<ForwardingProcessEdges<VM>>::new());
}

// VM-specific weak ref processing
scheduler.work_buckets[WorkBucketStage::VMRefClosure]
.set_sentinel(Box::new(VMProcessWeakRefs::<MarkingProcessEdges<VM>>::new()));

// VM-specific weak ref forwarding
scheduler.work_buckets[WorkBucketStage::VMRefForwarding]
.add(VMForwardWeakRefs::<ForwardingProcessEdges<VM>>::new());

// VM-specific work after forwarding, e.g. to implement ref enqueuing.
scheduler.work_buckets[WorkBucketStage::Release].add(VMPostForwarding::<VM>::default());

// Analysis GC work
#[cfg(feature = "analysis")]
{
use crate::util::analysis::GcHookWork;
scheduler.work_buckets[WorkBucketStage::Unconstrained].add(GcHookWork);
}
#[cfg(feature = "sanity")]
scheduler.work_buckets[WorkBucketStage::Final]
.add(crate::util::sanity::sanity_checker::ScheduleSanityGC::<Self>::new(self));
}

fn current_gc_may_move_object(&self) -> bool {
true
}

fn get_used_pages(&self) -> usize {
self.compressor_space.reserved_pages() + self.common.get_used_pages()
}
}

impl<VM: VMBinding> Compressor<VM> {
pub fn new(args: CreateGeneralPlanArgs<VM>) -> Self {
let mut plan_args = CreateSpecificPlanArgs {
global_args: args,
constraints: &COMPRESSOR_CONSTRAINTS,
global_side_metadata_specs: SideMetadataContext::new_global_specs(&[]),
};

let res = Compressor {
compressor_space: CompressorSpace::new(plan_args.get_space_args(
"compressor_space",
true,
false,
VMRequest::discontiguous(),
)),
common: CommonPlan::new(plan_args),
};

res.verify_side_metadata_sanity();

res
}
}
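For intuition about what `calculate_offset_vector` and the forwarding trace compute: the Compressor stores no per-object forwarding pointers, instead deriving each live object's destination from a mark bitmap plus a per-block offset vector. The following is a toy, word-granularity sketch of that idea, not the actual mmtk-core data structures or API:

```rust
/// Toy block size, in words (the real block size is much larger).
const BLOCK_WORDS: usize = 8;

/// For each block, record how many live words precede the block.
fn compute_offset_vector(mark_bits: &[bool]) -> Vec<usize> {
    let mut offsets = Vec::new();
    let mut live_so_far = 0;
    for block in mark_bits.chunks(BLOCK_WORDS) {
        offsets.push(live_so_far);
        live_so_far += block.iter().filter(|&&live| live).count();
    }
    offsets
}

/// Destination of the live word at `addr` = live words in all preceding
/// blocks (from the offset vector) + live words before `addr` within its
/// own block (counted from the mark bitmap).
fn forwarding_address(mark_bits: &[bool], offsets: &[usize], addr: usize) -> usize {
    let block = addr / BLOCK_WORDS;
    let within = mark_bits[block * BLOCK_WORDS..addr]
        .iter()
        .filter(|&&live| live)
        .count();
    offsets[block] + within
}

fn main() {
    // Two blocks of 8 words; words 0, 2, 5, 9 and 12 are live.
    let mut bits = vec![false; 16];
    for i in [0, 2, 5, 9, 12] {
        bits[i] = true;
    }
    let offsets = compute_offset_vector(&bits);
    assert_eq!(offsets, vec![0, 3]);
    // Live words slide left into a dense prefix.
    assert_eq!(forwarding_address(&bits, &offsets, 5), 2);
    assert_eq!(forwarding_address(&bits, &offsets, 12), 4);
}
```

This is why the plan needs two traces: the first (MarkingProcessEdges) populates the mark bitmap, and only after the offset vector is computed can the second (ForwardingProcessEdges) rewrite references to their new addresses.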
5 changes: 5 additions & 0 deletions src/plan/compressor/mod.rs
@@ -0,0 +1,5 @@
pub(super) mod gc_work;
pub(super) mod global;
pub(super) mod mutator;

pub use self::global::Compressor;