Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

analyze: implement Box<T> rewrites #1106

Merged
merged 20 commits into from
Oct 24, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
20 commits
Select commit Hold shift + click to select a range
894e8b0
analyze: refactor: change mir_op::RewriteKind::MutToImm to Reborrow
spernsteiner Jun 27, 2024
3dd9d5f
analyze: dataflow: add FREE/OFFSET special case to avoid impossible c…
spernsteiner Jun 27, 2024
3df3a0e
analyze: rewrite: initial ty and expr rewrites for FREE/Box<T>
spernsteiner Jun 27, 2024
701ae00
analyze: replace realloc1 test with one that can be made safe
spernsteiner Jun 27, 2024
8e86c79
analyze: initial DynOwned rewrites
spernsteiner Jun 28, 2024
780b22e
analyze: mir_op: refactor visit_place to distinguish Imm/Mut/Move con…
spernsteiner Jun 28, 2024
433e64e
analyze: mir_op: insert take() call on moves of DynOwned
spernsteiner Jul 1, 2024
1688386
analyze: rename ZeroizeType::Iterable to Array
spernsteiner Jul 3, 2024
92e01c9
analyze: add struct name to ZeroizeType::Struct
spernsteiner Jul 3, 2024
6b9904b
analyze: implement malloc -> Box::new rewrite
spernsteiner Jul 3, 2024
e2d8f91
analyze: mir_op: factor out "emit cast with adjusted TypeDesc"
spernsteiner Jul 5, 2024
82cc01b
analyze: implement free -> drop rewrite
spernsteiner Jul 5, 2024
82efb11
analyze: pointee: ensure free arg has a pointee type
spernsteiner Jul 5, 2024
6c23f5b
analyze: mir_op: enforce Ownership::Box in malloc/free casts
spernsteiner Jul 5, 2024
4f9b021
analyze: implement realloc rewrite
spernsteiner Jul 5, 2024
9b33750
analyze: fix tests/filecheck/alloc.rs realloc1 example
spernsteiner Jul 5, 2024
e4c4771
analyze: improve alloc test case
spernsteiner Jul 5, 2024
61e7876
analyze: rewrite::ty: use std::boxed::Box instead of alloc::boxed::Box
spernsteiner Jul 5, 2024
8b1b38b
analyze: implement calloc rewrite
spernsteiner Jul 5, 2024
95d4038
analyze: add comment about using Result<_, ()> for DynOwned
spernsteiner Oct 24, 2024
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
35 changes: 31 additions & 4 deletions c2rust-analyze/src/dataflow/mod.rs
Original file line number Diff line number Diff line change
Expand Up @@ -11,6 +11,18 @@ mod type_check;
#[derive(Clone, Debug)]
enum Constraint {
/// Pointer `.0` must have a subset of the permissions of pointer `.1`.
///
/// `Subset` and `SubsetExcept` have a special case involving `FREE` and `OFFSET` permissions.
/// The rewriter can't produce a cast that converts `Box<[T]>` to `Box<T>`; to avoid needing
/// such casts, we forbid assignment operations from discarding the `OFFSET` permission while
/// keeping `FREE`. We implement this restriction by adding an additional requirement to the
/// definition of `Subset(L, R)`: if `L` contains `FREE` and `R` contains `OFFSET`, then `L`
/// must also contain `OFFSET`. This is sufficient because all assignments and
/// pseudo-assignments generate `Subset` constraints.
///
/// If `L` does not contain `FREE`, then no additional requirement applies, even if `R` does
/// contain `OFFSET`. We allow discarding both `FREE` and `OFFSET` simultaneously during an
/// assignment.
Subset(PointerId, PointerId),
/// Pointer `.0` must have a subset of permissions of pointer `.1`, except
/// for the provided permission set.
Expand Down Expand Up @@ -102,10 +114,25 @@ impl DataflowConstraints {
| PermissionSet::OFFSET_SUB
| PermissionSet::FREE;

(
old_a & !(!old_b & (PROPAGATE_DOWN & !except)),
old_b | (old_a & (PROPAGATE_UP & !except)),
)
let remove_a = !old_b & PROPAGATE_DOWN & !except;
let add_b = old_a & PROPAGATE_UP & !except;

// Special case: as documented on `Constraint::Subset`, if the subset has `FREE`,
// we propagate `OFFSET` in the opposite direction. Specifically, if the superset
// has `OFFSET`, we add it to the subset, propagating "down". (Propagating "up"
// here could allow `OFFSET` and `!OFFSET` to propagated up into the same
// `PointerId` through two different constraints, creating a conflict.)
let add_a = if old_a.contains(PermissionSet::FREE) {
#[allow(bad_style)]
let PROPAGATE_DOWN_WHEN_FREE =
PermissionSet::OFFSET_ADD | PermissionSet::OFFSET_SUB;
old_b & PROPAGATE_DOWN_WHEN_FREE & !except
} else {
PermissionSet::empty()
};
debug_assert_eq!(add_a & remove_a, PermissionSet::empty());

((old_a | add_a) & !remove_a, old_b | add_b)
}

fn all_perms(
Expand Down
11 changes: 9 additions & 2 deletions c2rust-analyze/src/pointee_type/type_check.rs
Original file line number Diff line number Diff line change
Expand Up @@ -299,8 +299,15 @@ impl<'tcx> TypeChecker<'tcx, '_> {
self.assign(dest_lty.label, arg_lty.label);
}
Callee::Free => {
// No constraints on `free`, since it doesn't reveal anything about the concrete
// type.
// Here we create a fresh inference variable and associate it with the argument
// pointer. This doesn't constraint the type, since `free` doesn't reveal anything
// about the concrete type of the data, but it does ensure that the pointee type of
// the argument operand matches the pointee type of other pointers to the same
// allocation, which lets us remove a `void*` cast during rewriting.
let var = self.constraints.fresh_var();
assert_eq!(args.len(), 1);
let arg_lty = self.acx.type_of(&args[0]);
self.use_pointer_at_type(arg_lty.label, var);
}

Callee::Memcpy => {
Expand Down
171 changes: 166 additions & 5 deletions c2rust-analyze/src/rewrite/expr/convert.rs
Original file line number Diff line number Diff line change
Expand Up @@ -286,6 +286,119 @@ impl<'tcx> ConvertVisitor<'tcx> {
)
}

mir_op::RewriteKind::MallocSafe {
ref zero_ty,
elem_size,
single,
}
| mir_op::RewriteKind::CallocSafe {
ref zero_ty,
elem_size,
single,
} => {
// `malloc(n)` -> `Box::new(z)` or similar
ahomescu marked this conversation as resolved.
Show resolved Hide resolved
assert!(matches!(hir_rw, Rewrite::Identity));
let zeroize_expr = generate_zeroize_expr(zero_ty);
let mut stmts = match *rw {
mir_op::RewriteKind::MallocSafe { .. } => vec![
Rewrite::Let(vec![("byte_len".into(), self.get_subexpr(ex, 0))]),
Rewrite::Let1(
"n".into(),
Box::new(format_rewrite!("byte_len as usize / {elem_size}")),
),
],
mir_op::RewriteKind::CallocSafe { .. } => vec![
Rewrite::Let(vec![
("count".into(), self.get_subexpr(ex, 0)),
("size".into(), self.get_subexpr(ex, 1)),
]),
format_rewrite!("assert_eq!(size, {elem_size})"),
Rewrite::Let1("n".into(), Box::new(format_rewrite!("count as usize"))),
],
_ => unreachable!(),
};
let expr = if single {
stmts.push(Rewrite::Text("assert_eq!(n, 1)".into()));
format_rewrite!("Box::new({})", zeroize_expr)
} else {
stmts.push(Rewrite::Let1(
"mut v".into(),
Box::new(Rewrite::Text("Vec::with_capacity(n)".into())),
ahomescu marked this conversation as resolved.
Show resolved Hide resolved
));
stmts.push(format_rewrite!(
"for i in 0..n {{\n v.push({});\n}}",
zeroize_expr,
));
Rewrite::Text("v.into_boxed_slice()".into())
};
Rewrite::Block(stmts, Some(Box::new(expr)))
}

mir_op::RewriteKind::FreeSafe { single: _ } => {
// `free(p)` -> `drop(p)`
assert!(matches!(hir_rw, Rewrite::Identity));
Rewrite::Call("std::mem::drop".to_string(), vec![self.get_subexpr(ex, 0)])
}

mir_op::RewriteKind::ReallocSafe {
ref zero_ty,
elem_size,
src_single,
dest_single,
} => {
// `realloc(p, n)` -> `Box::new(...)`
assert!(matches!(hir_rw, Rewrite::Identity));
let zeroize_expr = generate_zeroize_expr(zero_ty);
let mut stmts = vec![
Rewrite::Let(vec![
("src_ptr".into(), self.get_subexpr(ex, 0)),
("dest_byte_len".into(), self.get_subexpr(ex, 1)),
]),
Rewrite::Let1(
"dest_n".into(),
Box::new(format_rewrite!("dest_byte_len as usize / {elem_size}")),
),
];
if dest_single {
stmts.push(Rewrite::Text("assert_eq!(dest_n, 1)".into()));
}
let expr = match (src_single, dest_single) {
(false, false) => {
stmts.push(Rewrite::Let1(
"mut dest_ptr".into(),
Box::new(Rewrite::Text("Vec::from(src_ptr)".into())),
));
stmts.push(format_rewrite!(
"dest_ptr.resize_with(dest_n, || {})",
zeroize_expr,
));
Rewrite::Text("dest_ptr.into_boxed_slice()".into())
}
(false, true) => {
format_rewrite!(
"src_ptr.into_iter().next().unwrap_or_else(|| {})",
ahomescu marked this conversation as resolved.
Show resolved Hide resolved
zeroize_expr
)
}
(true, false) => {
stmts.push(Rewrite::Let1(
"mut dest_ptr".into(),
Box::new(Rewrite::Text("Vec::with_capacity(dest_n)".into())),
));
stmts.push(Rewrite::Text(
"if dest_n >= 1 { dest_ptr.push(*src_ptr); }".into(),
));
stmts.push(format_rewrite!(
"dest_ptr.resize_with(dest_n, || {})",
zeroize_expr,
));
Rewrite::Text("dest_ptr.into_boxed_slice()".into())
}
(true, true) => Rewrite::Text("src_ptr".into()),
};
Rewrite::Block(stmts, Some(Box::new(expr)))
}

mir_op::RewriteKind::CellGet => {
// `*x` to `Cell::get(x)`
assert!(matches!(hir_rw, Rewrite::Identity));
Expand Down Expand Up @@ -566,7 +679,7 @@ fn generate_zeroize_code(zero_ty: &ZeroizeType, lv: &str) -> String {
match *zero_ty {
ZeroizeType::Int => format!("{lv} = 0"),
ZeroizeType::Bool => format!("{lv} = false"),
ZeroizeType::Iterable(ref elem_zero_ty) => format!(
ahomescu marked this conversation as resolved.
Show resolved Hide resolved
ZeroizeType::Array(ref elem_zero_ty) => format!(
"
{{
for elem in {lv}.iter_mut() {{
Expand All @@ -576,7 +689,7 @@ fn generate_zeroize_code(zero_ty: &ZeroizeType, lv: &str) -> String {
",
generate_zeroize_code(elem_zero_ty, "(*elem)")
),
ZeroizeType::Struct(ref fields) => {
ZeroizeType::Struct(_, ref fields) => {
eprintln!("zeroize: {} fields on {lv}: {fields:?}", fields.len());
let mut s = String::new();
writeln!(s, "{{").unwrap();
Expand All @@ -594,6 +707,27 @@ fn generate_zeroize_code(zero_ty: &ZeroizeType, lv: &str) -> String {
}
}

/// Generate an expression to produce a zeroized version of a value.
fn generate_zeroize_expr(zero_ty: &ZeroizeType) -> String {
ahomescu marked this conversation as resolved.
Show resolved Hide resolved
match *zero_ty {
ZeroizeType::Int => format!("0"),
ZeroizeType::Bool => format!("false"),
ZeroizeType::Array(ref elem_zero_ty) => format!(
"std::array::from_fn(|| {})",
generate_zeroize_expr(elem_zero_ty)
),
ZeroizeType::Struct(ref name, ref fields) => {
let mut s = String::new();
write!(s, "{} {{\n", name).unwrap();
for (name, field_zero_ty) in fields {
write!(s, "{}: {},\n", name, generate_zeroize_expr(field_zero_ty),).unwrap();
}
write!(s, "}}\n").unwrap();
s
}
}
}

fn take_prefix_while<'a, T>(slice: &mut &'a [T], mut pred: impl FnMut(&'a T) -> bool) -> &'a [T] {
let i = slice.iter().position(|x| !pred(x)).unwrap_or(slice.len());
let (a, b) = slice.split_at(i);
Expand All @@ -614,14 +748,14 @@ pub fn convert_cast_rewrite(kind: &mir_op::RewriteKind, hir_rw: Rewrite) -> Rewr
Rewrite::Ref(Box::new(elem), mutbl_from_bool(mutbl))
}

mir_op::RewriteKind::MutToImm => {
// `p` -> `&*p`
mir_op::RewriteKind::Reborrow { mutbl } => {
// `p` -> `&*p` / `&mut *p`
let hir_rw = match fold_mut_to_imm(hir_rw) {
Ok(folded_rw) => return folded_rw,
Err(rw) => rw,
};
let place = Rewrite::Deref(Box::new(hir_rw));
Rewrite::Ref(Box::new(place), hir::Mutability::Not)
Rewrite::Ref(Box::new(place), mutbl_from_bool(mutbl))
}

mir_op::RewriteKind::OptionUnwrap => {
Expand Down Expand Up @@ -661,6 +795,33 @@ pub fn convert_cast_rewrite(kind: &mir_op::RewriteKind, hir_rw: Rewrite) -> Rewr
Rewrite::MethodCall(ref_method, Box::new(hir_rw), vec![])
}

mir_op::RewriteKind::DynOwnedUnwrap => {
Rewrite::MethodCall("unwrap".to_string(), Box::new(hir_rw), vec![])
}
mir_op::RewriteKind::DynOwnedTake => {
// `p` -> `mem::replace(&mut p, Err(()))`
Rewrite::Call(
"std::mem::replace".to_string(),
vec![
Rewrite::Ref(Box::new(hir_rw), hir::Mutability::Mut),
Rewrite::Text("Err(())".into()),
],
)
}
mir_op::RewriteKind::DynOwnedWrap => {
Rewrite::Call("std::result::Result::<_, ()>::Ok".to_string(), vec![hir_rw])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couldn't this just be an Option?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It could, but DynOwned can be composed with Option<T> if we encounter a nullable owned pointer, and Option<DynOwned<T>> is clearer than Option<Option<T>>. But c2rust-analyze doesn't yet have a run-time support library where we could define DynOwned<T>, so for now we use the somewhat ugly Result<T, ()> in place of DynOwned<T>.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be useful to add a comment.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a comment on the translation of DynOwned types: 95d4038

}

mir_op::RewriteKind::DynOwnedDowngrade { mutbl } => {
let ref_method = if mutbl {
"as_deref_mut".into()
} else {
"as_deref".into()
};
let hir_rw = Rewrite::MethodCall(ref_method, Box::new(hir_rw), vec![]);
Rewrite::MethodCall("unwrap".into(), Box::new(hir_rw), vec![])
}

mir_op::RewriteKind::CastRefToRaw { mutbl } => {
// `addr_of!(*p)` is cleaner than `p as *const _`; we don't know the pointee
// type here, so we can't emit `p as *const T`.
Expand Down
Loading
Loading