Skip to content
This repository has been archived by the owner on Nov 7, 2024. It is now read-only.

Add new refescape module #107

Merged
merged 1 commit into from
Sep 30, 2021
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions lib/Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -45,6 +45,7 @@ tracing = "0.1"
[dev-dependencies]
clap = "2.33.3"
indoc = "1.0.3"
quickcheck = "1"
sh-inline = "0.1.0"
structopt = "0.3.21"

Expand Down
1 change: 1 addition & 0 deletions lib/src/lib.rs
Original file line number Diff line number Diff line change
Expand Up @@ -26,6 +26,7 @@ pub mod cli;
pub mod container;
pub mod diff;
pub mod ima;
pub mod refescape;
pub mod tar;
pub mod tokio_util;

Expand Down
198 changes: 198 additions & 0 deletions lib/src/refescape.rs
Original file line number Diff line number Diff line change
@@ -0,0 +1,198 @@
//! Escape strings for use in ostree refs.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now, perhaps this should go into https://github.com/ostreedev/ostree-rs - but that takes us down the new road of adding Rust-native API. Which...well, we'll visit ostreedev/ostree#2427 at some point.

//!
//! It can be desirable to map arbitrary identifiers, such as RPM/dpkg
//! package names or container image references (e.g. `docker://quay.io/examplecorp/os:latest`)
//! into ostree refs (branch names) which have a quite restricted set
//! of valid characters; basically alphanumeric, plus `/`, `-`, `_`.
//!
//! This escaping scheme uses `_` in a similar way as a `\` character is
//! used in Rust unicode escaped values. For example, `:` is `_3A_` (hexadecimal).
//! Because the empty path is not valid, `//` is escaped as `/_2F_` (i.e. the second `/` is escaped).
Copy link
Member

@lucab lucab Sep 29, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I fear that the rules for handling / are too weird and will lead to future footguns, as you can see trying to escape a// vs a/b/ vs a/b/c (and similar).

I think the problem is that this logic is conflating an input character / with the ostree fragments separator /. Per ostree regexp, those two are not the same thing. We do introduce ambiguity here.

My suggestions here would be:

  • always escape all / from the input string
  • clarify that the input stream is a single fragment, not a full ref (i.e. one can't do docker_name/my_ostree_branch)
  • if there is a need to encode a ref with multiple fragments, take them as already split inputs through a string slice (i.e. ["docker_name", "my_ostree_branch"].

Same for the unescape counterpart. With the current logic, once you get back an unescaped a/b/c it's ambiguous whether the initial input was a single docker name or three ostree fragments or something inbetween.

Copy link
Member Author

@cgwalters cgwalters Sep 29, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The way the container code uses this, it is unambiguous - see this https://github.com/ostreedev/ostree-rs-ext/pull/99/files#diff-d618b05deec14d83f95c25b128171241e11614efec182d232e6ae46a8f4c09feR21 - basically because we have a static string as prefix, and the rest is assumed defined/required to be escaped.

I think it's at least somewhat nicer to see / literally, you just learn to read _3A_ as :. Here's a random paste from my test VM:

[root@cosa-devsh tmp]# ostree refs
ostree/container/blob/sha256_3A_0f79a34559f64a74b08fd5bbad34072f8c1a7784f33bd750d878c7f387f1b5a1
ostree/container/blob/sha256_3A32d62d73008776158de3532b560bb4d1c087f9600eecc732594a5a31cdccaf5b
ostree/container/blob/sha256_3A_32d62d73008776158de3532b560bb4d1c087f9600eecc732594a5a31cdccaf5b
ostree/container/blob/sha256_3A0f79a34559f64a74b08fd5bbad34072f8c1a7784f33bd750d878c7f387f1b5a1
ostree/container/blob/sha256_3A655734cf6267d3ba1506d4dac15329679216c9169093a5a99b60b7484e4320d4
ostree/1/1/2
ostree/1/1/1
ostree/container/image/docker_3A_/_2F_quay_2E_io/cgwalters/fcos-derivation-example_3A_latest
ostree/1/1/0
ostree/container/blob/sha256_3A_809c66ca43fa2ac72c15a83adb27b1e1794a7f5ae8463b3a8c559164a9201d8b
ostree/container/blob/sha256_3A_655734cf6267d3ba1506d4dac15329679216c9169093a5a99b60b7484e4320d4
ostree/container/blob/sha256_3A809c66ca43fa2ac72c15a83adb27b1e1794a7f5ae8463b3a8c559164a9201d8b
ostree/container/blob/sha256_3A5707006e0d4d228ea1011b85c431727af86bf46309615b04555e95fb3849dab7
test
[root@cosa-devsh tmp]# 

With your proposal of escaping individual fragments...the calling app would need to have rules/schema for which portions of a ref are escaped. I guess I could imagine doing that, but I am having trouble thinking of a strong use case for that versus the "prefixing" approach.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually we should canonicalize docker:// to registry: I think instead of the other way around. Then it'd be:
ostree/container/image/registry_3A_quay_2E_io/cgwalters/fcos-derivation-example_3A_latest
which I think is nicer than reading
ostree/container/image/registry_3A_quay_2E_io_2F_cgwalters_2F_fcos-derivation-example_3A_latest

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the additional context! It looks the API you are aiming for is perhaps something like $input -> prefix/$escaped-input and back? But then is there anything preventing people to do prefix/$escaped-input/arbitrarystuff?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I reworked this to require a prefix to better demonstrate best practice.


use anyhow::Result;
use std::convert::TryInto;
use std::fmt::Write;

/// Escape a single string; this is a backend of [`prefix_escape_for_ref`].
fn escape_for_ref(s: &str) -> Result<String> {
if s.is_empty() {
return Err(anyhow::anyhow!("Invalid empty string for ref"));
}
fn escape_c(r: &mut String, c: char) {
write!(r, "_{:02X}_", c as u32).unwrap()
}
let mut r = String::new();
let mut it = s
.chars()
.map(|c| {
if c == '\0' {
Err(anyhow::anyhow!(
"Invalid embedded NUL in string for ostree ref"
))
} else {
Ok(c)
}
})
.peekable();

let mut previous_alphanumeric = false;
while let Some(c) = it.next() {
let has_next = it.peek().is_some();
let c = c?;
let current_alphanumeric = c.is_ascii_alphanumeric();
match c {
c if current_alphanumeric => r.push(c),
'/' if previous_alphanumeric && has_next => r.push(c),
// Pass through `-` unconditionally
'-' => r.push(c),
// The underscore `_` quotes itself `__`.
'_' => r.push_str("__"),
o => escape_c(&mut r, o),
}
previous_alphanumeric = current_alphanumeric;
}
Ok(r)
}

/// Compute a string suitable for use as an OSTree ref, where `s` can be a (nearly)
/// arbitrary UTF-8 string. This requires a non-empty prefix.
///
/// The restrictions on `s` are:
/// - The empty string is not supported
/// - There may not be embedded `NUL` (`\0`) characters.
///
/// The intention behind requiring a prefix is that a common need is to use e.g.
/// [`ostree::Repo::list_refs`] to find refs of a certain "type".
///
/// # Examples:
///
/// ```rust
/// # fn test() -> anyhow::Result<()> {
/// use ostree_ext::refescape;
/// let s = "registry:quay.io/coreos/fedora:latest";
/// assert_eq!(refescape::prefix_escape_for_ref("container", s)?,
/// "container/registry_3A_quay_2E_io/coreos/fedora_3A_latest");
/// # Ok(())
/// # }
/// ```
pub fn prefix_escape_for_ref(prefix: &str, s: &str) -> Result<String> {
Ok(format!("{}/{}", prefix, escape_for_ref(s)?))
}

/// Reverse the effect of [`escape_for_ref()`].
fn unescape_for_ref(s: &str) -> Result<String> {
let mut r = String::new();
let mut it = s.chars();
let mut buf = String::new();
while let Some(c) = it.next() {
match c {
c if c.is_ascii_alphanumeric() => {
r.push(c);
}
'-' | '/' => r.push(c),
'_' => {
let next = it.next();
if let Some('_') = next {
r.push('_')
} else if let Some(c) = next {
buf.clear();
buf.push(c);
while let Some(c) = it.next() {
if c == '_' {
break;
}
buf.push(c);
}
let v = u32::from_str_radix(&buf, 16)?;
let c: char = v.try_into()?;
r.push(c);
}
}
o => anyhow::bail!("Invalid character {}", o),
}
}
Ok(r)
}

/// Remove a prefix from an ostree ref, and return the unescaped remainder.
///
/// # Examples:
///
/// ```rust
/// # fn test() -> anyhow::Result<()> {
/// use ostree_ext::refescape;
/// let s = "registry:quay.io/coreos/fedora:latest";
/// assert_eq!(refescape::unprefix_unescape_ref("container", "container/registry_3A_quay_2E_io/coreos/fedora_3A_latest")?, s);
/// # Ok(())
/// # }
/// ```
pub fn unprefix_unescape_ref(prefix: &str, ostree_ref: &str) -> Result<String> {
let rest = ostree_ref
.strip_prefix(prefix)
.map(|s| s.strip_prefix('/'))
.flatten()
.ok_or_else(|| {
anyhow::anyhow!(
"ref does not match expected prefix {}/: {}",
ostree_ref,
prefix
)
})?;
Ok(unescape_for_ref(rest)?)
}

#[cfg(test)]
mod test {
use super::*;
use quickcheck::{quickcheck, TestResult};

const TESTPREFIX: &str = "testprefix/blah";

const UNCHANGED: &[&str] = &["foo", "foo/bar/baz-blah/foo"];
const ROUNDTRIP: &[&str] = &[
"localhost:5000/foo:latest",
"fedora/x86_64/coreos",
"/foo/bar/foo.oci-archive",
"docker://quay.io/exampleos/blah:latest",
"oci-archive:/path/to/foo.ociarchive",
];
const CORNERCASES: &[&str] = &["/", "blah/", "/foo/"];

#[test]
fn escape() {
// These strings shouldn't change
for &v in UNCHANGED {
let escaped = &escape_for_ref(v).unwrap();
ostree::validate_rev(escaped).unwrap();
assert_eq!(escaped.as_str(), v);
}
// Roundtrip cases, plus unchanged cases
for &v in UNCHANGED.iter().chain(ROUNDTRIP).chain(CORNERCASES) {
let escaped = &prefix_escape_for_ref(TESTPREFIX, v).unwrap();
ostree::validate_rev(escaped).unwrap();
let unescaped = unprefix_unescape_ref(TESTPREFIX, &escaped).unwrap();
assert_eq!(v, unescaped);
}
// Explicit test
assert_eq!(
escape_for_ref(ROUNDTRIP[0]).unwrap(),
"localhost_3A_5000/foo_3A_latest"
);
}

fn roundtrip(s: String) -> TestResult {
// Ensure we only try strings which match the predicates.
let r = prefix_escape_for_ref(TESTPREFIX, &s);
let escaped = match r {
Ok(v) => v,
Err(_) => return TestResult::discard(),
};
let unescaped = unprefix_unescape_ref(TESTPREFIX, &escaped).unwrap();
TestResult::from_bool(unescaped == s)
}

#[test]
fn qcheck() {
quickcheck(roundtrip as fn(String) -> TestResult);
}
}