Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

AEAD Trait #40

Merged
merged 17 commits into from
Aug 20, 2019
Merged

AEAD Trait #40

merged 17 commits into from
Aug 20, 2019

Conversation

jcape
Copy link
Contributor

@jcape jcape commented Jul 26, 2019

Hi,

I've previously spoken with @newpavlov via e-mail, and he asked me to create this PR and made some cursory comments on the original, internal-to-MobileCoin, source code for this crate. I've attempted to address the comments and clarify the Nonce object's purpose. My goal here is to get this interface into widespread usage, so I'm amenable to changing whatever needs to be changed to make that a reality.

The crate itself is intended to provide a misuse-resistant, rust-ish way to use RFC5116 AEAD ciphers.

aead/src/lib.rs Outdated
/// An enum describing possible failure modes
#[cfg(not(feature = "serde"))]
#[derive(Clone, Copy, Debug, Eq, Fail, Hash, Ord, PartialEq, PartialOrd)]
pub enum AeadError {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it might be worth considering an opaque error type here, and in general look at some papers, resources, and case studies on misuse resistance in the design of AEAD APIs.

This sort of information, leaked to the attacker, can potentially be used for attacks. Whether or not that is possible will depend on the code which impl's this trait, but I think it's very much worth considering.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even beyond that, AEADs built using the Encode-then-Encipher paradigm (HCTR, HCH, AEZ, and various others), which provides the maximal misuse resistance (RAE, in Rogaway's terminology), are incapable of distinguishing many of these. As a result, there exist AEADs that will have to choose which kind of misrepresentation they will perform, because this error type forces them to distinguish between things they cannot.

aead/src/lib.rs Outdated
+ Display
+ Into<Vec<u8>>
+ Send
+ Sized
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This bound disallows the use of variable-width nonces. Whether these are a good idea is debatable, but I'll point out this trait is incompatible with Miscreant in its current form:

https://github.com/miscreant/miscreant.rs/blob/master/src/aead.rs#L57

aead/src/lib.rs Outdated
}

/// A standard nonce implementation for AEAD algorithms which do not use
/// explicit nonces.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which algorithms are these? Is there precedent for these being used as part of an AEAD interface?

When I generally think of implicit nonces, or algorithms that do not need nonces (e.g. the many SIV algorithms implemented by Miscreant), I think of them as using an interface which lies below the AEAD abstraction layer (e.g. the SIV interface), or protocols which exist above the AEAD layer (TLS/Noise)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, that answers my question, but this seems like a single, draft, nonstandard AEAD construction (which self-identifies as such, per 2.1) which probably shouldn't figure into the design of the trait as a whole, except that the bounds on Nonce are sufficiently broad to allow concretely representing an empty nonce..

aead/src/lib.rs Outdated
/// encryption is meant to be managed internally by the implementation itself.
pub trait AuthenticatedBlockCipher: Sized + Send + Sync
where
for<'ciphertext> Vec<u8>: From<&'ciphertext Self::Ciphertext>,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm seeing a lot of HRTBs in this PR and having used them in the past I think they're often unnecessary and desirable to avoid if possible. Just a thought.


#![no_std]

extern crate alloc;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This precludes usage on platforms without a heap. It'd be nice to gate this on an alloc feature and make the core APIs no_std-friendly with allocating wrappers. See Misreant's seal_in_place:

https://github.com/miscreant/miscreant.rs/blob/master/src/aead.rs#L57

...vs its seal wrapper which allocates:

https://github.com/miscreant/miscreant.rs/blob/master/src/aead.rs#L74

aead/src/lib.rs Outdated
/// Encrypts the given plaintext into a new ciphertext object and the nonce
fn encrypt<'ad, RngType: CryptoRng + RngCore>(
&mut self,
csprng: &mut RngType,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume this is to randomly select the nonce. That's great from a misuse resistance perspective, however for protocols that use implicit nonces, they need to explicitly specify it.

I guess that's what you were going for with EmptyNonce? However if you're using an empty nonce, this parameter is irrelevant.

Overall I think this API is mixing concerns that belong at different layers.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was originally added to randomly select an IV for the draft linked above.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems a lot of the eccentricities of this draft are leaking out in this API design, whereas a generalized AEAD trait needs to be broad enough to cover much more than just that.

aead/src/lib.rs Outdated
additional_data: impl Iterator<Item = &'ad [u8]>,
nonce: Self::Nonce,
ciphertext: Self::Ciphertext,
) -> Result<Vec<u8>, AeadError>;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As mentioned earlier, I think it'd be nice for the core trait API that algorithm providers impl be no-std friendly. The rest of the RustCrypto crates are, so forcing allocation here is "missed it by that much"

@tarcieri
Copy link
Member

This seems like a good start, but it'd sure be nice if it were truly #![no_std]-friendly, and there seem to be a few things that I think should get solved at either a higher or lower layer which are making things a bit convoluted.

@burdges
Copy link

burdges commented Jul 27, 2019

I think being truly #[no_std] runs into the same issues with my approach in https://github.com/SymbolicSoft/noiseexplorer/pull/34 In that pull, I failed to avoid reinventing Vec but backed by a &mut [u8].

I agree with @tarcieri that we should work out doing AEAD without alloc though. Two questions:

Is a low-level "detached" interface that treats head, body, and tail separately wise?

Are we happy with some reinvention of Vec except with even more offsets than https://github.com/SymbolicSoft/noiseexplorer/pull/34/files#diff-bff167111f517232d669710d432efad3R15-R22

We could reduce the number of offsets by creating a new wrapper type for each message, so you'd have a builder for encryption, and ultimately leave the &mut [u8] in a properly encrypted state, while decryption would be simpler, only borrowing the interior, but leave the full &mut [u8] corrupted afterwards.

Any convenience methods for use with alloc should prefer Bytes over Vec probably.

@burdges
Copy link

burdges commented Jul 27, 2019

I could imagine some ATC tricks being helpful, like

pub trait AEAD {
    fn decrypt<'b>(&mut self, msg: &'b mut [u8], ...) -> (&'b mut [u8], ...);

    fn head_len(&self) -> usize;
    fn tail_len(&self) -> usize;
    fn overhead(&self) -> usize { self.head_len() + self.tail_len() }

    type Encrypt<'s,'b>: Encryptor<'b> + 's;
    fn encrypt<'b>(&'s mut self, buf: &'b mut [u8], ...) -> Encrypt<'s,'b>;
}
pub trait Encryptor<'b> {
    fn body(&mut self, len: usize) -> &mut [u8];
    ...
    fn done(self) -> &'b mut [u8];
}

@tarcieri
Copy link
Member

@burdges even that seems more complex and with more lifetimes than are necessary. I still don't see the need for anything resembling Vec except as an allocating convenience wrapper for those with alloc - it should be possible to build the API entirely in terms of slices and generic-array/const generics/associated consts (or fake typenum equivalents).

@burdges
Copy link

burdges commented Jul 27, 2019

Yes, if you know head and tail lengths at compile time, then you can handle them detached:

pub trait RawAEAD {
    fn decrypt<'b>(&mut self, msg: &'b mut [u8], ...) -> (&'b mut [u8], ...);

    const HEAD_LEN: usize;
    const TAIL_LEN: usize;
    fn encrypt<'b>(&mut self, buf: &'b mut [u8], ...) -> ([u8; HEAD_LEN], &'b [u8], [u8; TAIL_LEN],);
}

I think this only punts the problem though because now the user must reassemble them correctly. Right now, Noise explorer makes user's magically supply a correctly padded buffer, which makes debugging impossible.

In Noise, head_len changes among the early protocol messages, so you must handle everything using slices. I think detached schemes remain possible with slices, maybe even returning an impl Iterator<Item=&'b [u8]>, but again that only punts the problem.

In the scheme I provided, encrypt should tell users to call body, not prefill buf. And you could enforce this by zeroing buf, but regardless users were well warnned.

It's possible the detached scheme provides a internal foundational layer, from which once builds various convenience methods. I'd maybe do that via inherent impl though, not traits, not sure.

@tarcieri
Copy link
Member

Some other options:

  • seal_in_place or encrypt_in_place with separate arguments for &mut msg and &mut tag, with the allocating Vec wrapper handling the concatenation. In such an arrangement, it could have a default impl which appends the tag to the message tail, which could be overriden in special cases like AES-(PMAC-)SIV where the tag is prepended.
  • A Buffer (or thereabouts) type which handles the split_at_mut, possibly bounding the input message with Into<Buffer>

The first seems like a simple and straightforward approach to me.

@burdges
Copy link

burdges commented Jul 27, 2019

I took roughly that second buffer approach in my linked partial pull to noise explorer, but it's actually way messier than I wanted to go. You cannot make the user know their own message size when they supply the buffer.

If you adapt RawAEAD to slices then you'll take roughly that first detached approach. It's also the "most general" approach because it delays reassembly. Are there any use cases that benefit from detached usage though?

I have not considered all AEAD use cases, only network handshakes, so maybe detached comes up more in a raw AEAD context?

@burdges
Copy link

burdges commented Jul 27, 2019

In fact, we could combine everything so the impl provides the interface you describe, but a wrapper type provides a more friendly builder pattern, thus avoiding ATCs.

pub trait AEAD {
    fn head_len(&self) -> usize;
    fn tail_len(&self) -> usize;
    fn overhead(&self) -> usize { self.head_len() + self.tail_len() }

    fn decrypt_detached(&mut self, head: &[u8], text: &mut [u8], tail: &[u8], ad: &[u8]);
    fn decrypt<'b>(&mut self, ciphertext: &'b mut [u8], ad: &[u8]) -> &'b mut [u8] {
        let (head,text) = ciphertext.split_at_mut(self.head_len());
        let (text,tail) = text.split_at_mut(ciphertext.len() - self.tail_len());
        self.decrypt_detached(head,text,tail,ad)
        text
    }

    fn encrypt_detached(&mut self, head: &mut [u8], text: &mut [u8], tail: &mut [u8], ad: &[u8]);
    fn encrypt<'b>(&'s mut self, buf: &'b mut [u8]) -> Encryptor<'b,'s,Self> {
        Encryptor { aead: self, buf, len: self.aead.head_len(), }
    }
}
pub struct Encryptor<'b,'s,T: AEAD> {
    buf: &'b mut [u8],
    aead: &'s mut T,
    len: usize,
}
impl<'b,'s,T: AEAD> Encryptor<'b,'s,T> {
    fn append(&mut self, len: usize) -> &mut [u8] {
        let old_len = self.len;
        self.len += len;
        self.buf.split_at_mut(old_len).1.split_at_mut(len).0
    }
    fn done(self, ad: &[u8]) -> &'b mut [u8] {
        let h = self.aead.head_len();
        let (head,more) = self.buf.split_at_mut(h);
        let (text,tail) = more.split_at_mut(self.len - h);
        let tail = tail.split_at_mut(self.aead.tail_len())
        self.aead.encrypt_detached(head,text,tail, ad);
    }
}

We need to figure out the best sugar for types like Bytes too though.

@jcape
Copy link
Contributor Author

jcape commented Jul 29, 2019

So to summarize the comments, I see these as the primary things:

  1. AeadError as ZWT.
  2. No Vec usage.
  3. It should be possible to make miscreant's stuff implement this without having to do anything silly.

For the no-vec change, the suggestion is to make encrypt (as provided by the implementation) be encrypt_in_place(), but this leads to issues with AEAD implementations that want to stick things before the ciphertext, which in turn leads to the whole head_len()/tail_len() thing. The "tell me the layout of the ciphertext without telling me the layout of the ciphertext" honestly seems like it's in the wrong place. IMO, plaintext should start at byte zero, and if an AEAD wants the actual cipher output to live somewhere other than byte zero, it should be responsible for doing the memmove() to put it wherever it wants, and the caller's responsibility should be to provide a large-enough buffer to handle the ciphertext and whatever else the AEAD wants to throw into the mix.

This was part of the original intent of Ciphertext: an implementation would use these types to handle the output of encrypt() and input to decrypt(), and it would handle all the byte layout/serialization stuff. That is, it's meant to act as the Buffer type, but perhaps adding the corresponding Plaintext type is the answer?

Each AEAD implementation ships with a pair of Ciphertext/Plaintext objects which store references to slices and provide accessor(s) for the bits of the slice that should be used for different things.

That is, you take a mutable slice and a length, you give it to a Plaintext structure. That structure makes sure that the buffer is the right length for an AEAD on creation. It may/may not also memmove the data, and may/may not do something really slick like "take the RNG the draft needs" on construction.

Conversely, a Plaintext getting spit out from the decrypt() method would consume the interior slice of a Ciphertext and you'd use an accessor to get the actual plaintext, or maybe an into() that returns a slice?

So something like this:

struct BlahBlahCiphertext(&mut [u8]);
struct BlahBlahPlaintext(&mut [u8]);

let aead = BlahBlah::new(blahblah);
let mut plain_data = "this is the song that never ends."
let mut pt = vec![0u8; Plaintext::required_len(plain_data.len()];
(&pt[..plain_datalen()]).copy_from_slice(plain_data.as_bytes());
let mut plaintext = BlahBlahPlaintext::try_from(&pt)?;
let (ciphertext, nonce) = aead.encrypt(None, plaintext)?;
let plaintext = aead.decrypt(nonce, None, ciphertext)?;
let out_pt: &[u8] = plaintext.into();
assert_eq(out_pt, pt.as_ref());

How does that sound?

@burdges
Copy link

burdges commented Jul 30, 2019

It's okay to memmove instead of tracking offsets of course, but rust slices have only one length, so anyone calling encrypt cannot communicate their data length, which makes encryption impossible.

There are two reasonable options, the *_detached variants suggested by @tarcieri and some version that takes a buffer and data in separate calls. And memmove does not fit either approach.

You could supply a buffer and a separate length in one call, like C code might do, but such an interface is awkward, error prone, and not idiomatic rust. Do not do this.

@tarcieri
Copy link
Member

tarcieri commented Jul 30, 2019

One option I think is worth considering is punting on the question of an "attached" #![no_std]-friendly API, and only initially providing the detached API (which would be what cipher authors impl) and an "attached" allocating API which operates in terms of Vec<u8> or wrappers thereof, and which only supports postfix tags.

This dramatically simplifies the API and covers the most common cases. The only AEAD-ish modes I can even think of which don't fit this are AES-SIV and AES-PMAC-SIV (but, as it were, AES-GCM-SIV was amended to use postfix tags like other common AEAD modes).

This postpones the bikeshed over what exactly the "buffer" API looks like, which is really just much ado about split_at_mut. We can always circle back on methods which act on a "buffer" type, and eventually implement the allocating Vec<u8>-based methods in terms of those "buffer" methods, which can invoke the "detached" methods. This would allow redefining how to concatenate the tag and the ciphertext in a way that can support the older SIV modes and therefore Miscreant, in a way that's backwards compatible with everything else that handles that concatenation the "normal" way.

@burdges
Copy link

burdges commented Jul 30, 2019

I think the prefix vs postfix tags largely comes down to whether the same trait should handle handshakes too.

I think noise needs only postfix after the handshake, but noise requires messages during the handshake, ala certs, etc., so you want the same interface there. I've no idea if this even makes any sense for say rustls.

@jcape
Copy link
Contributor Author

jcape commented Jul 31, 2019

I don't think doing only the "detached" methods proposed by @burdges actually helps implement RFC5116-compliant AEADs. The ciphertext format is entirely up to the algorithm, which is why you can have the CBC-HMAC draft prepend a 16-byte IV to the ciphertext, and append a MAC, miscreant prepend the tag, and both of them are still well within the lines of RFC5116. Making the user care about what goes where seems like it's just making a lot of pain.

The Encryptor/Decryptor builder things: It's not too far from what I described, and it strikes me as though it's useful for the case where you're getting partial data and need to re-assemble it yourself, but I can't see why that functionality belongs in an AEAD interface.

My takeaway from all this is that you really want a mutable slice with some additional constraints (at minimum, it must have at least X bytes of unused space for the algo to fill stuff in). In my experience, "I want this existing type, but I want extra constraints on it" seems like exactly the place to create a new type, hence my description of the Plaintext associated type.

Is there a limitation around the Ciphertext/Plaintext helper types which I'm not seeing?

@tarcieri
Copy link
Member

tarcieri commented Jul 31, 2019

I don't think doing only the "detached" methods proposed by @burdges actually helps implement RFC5116-compliant AEADs. The ciphertext format is entirely up to the algorithm

The detached modes allow you to reassemble the ciphertext however you want. You can override the default impl of the Vec-based mode to do what you want.

My takeaway from all this is that you really want a mutable slice with some additional constraints (at minimum, it must have at least X bytes of unused space for the algo to fill stuff in). In my experience, "I want this existing type, but I want extra constraints on it" seems like exactly the place to create a new type, hence my description of the Plaintext associated type.

I think we're in violent agreement about this, but there's lots of bikeshedding happening about the precise way in which to implement it.

Is there a limitation around the Ciphertext/Plaintext helper types which I'm not seeing?

You say Ciphertext/Plaintext, and seem to be suggesting concrete types? (associated types? I'm not sure)

I think that can all be done with one Buffer type which is generic around A: Aead.

I'm seeing a lot of bikeshedding and talking past each other here, which I think can be addressed by punting on what that newtype (or types, or associated types, or a type generic around a trait), having an actual concrete trait to think about things in terms of, and it will still fit everyone's use cases, just not be the absolute optimal #![no_std] in-place/sans-alloc API, which the original PR punted on anyway.

Make sense?

@jcape
Copy link
Contributor Author

jcape commented Jul 31, 2019

I'm suggesting Ciphertext/Plaintext traits, and AEADs are on the hook to provide their own implementations. That is, each AEAD provides it's own set of Buffer types, because what goes where in the buffer is left entirely up to the AEAD in the RFC anyways.

@tarcieri
Copy link
Member

tarcieri commented Jul 31, 2019

Yes, I understand what you're suggesting. I think an alternative worth considering is if instead of creating two newtypes per algorithm, if instead that can be expressed as a single type which is generic around the single Aead trait, with the layout of the assembled encrypted message specified in terms of e.g. associated constants (especially post-const generics, which are quite close).

But also, I'm suggesting we punt on that debate for now, and move forward on a simple trait with the least common denominator of our many needs.

If I'm not expressing myself clearly enough in prose, perhaps I can open an alternative PR?

@burdges
Copy link

burdges commented Jul 31, 2019

Is there a limitation around the Ciphertext/Plaintext helper types which I'm not seeing?

I proposed the Ciphertext/Plaintext types approach in https://github.com/SymbolicSoft/noiseexplorer/pull/34 After @georgio ran into troubles doing it, I looked closer and realized I needed like 4ish offsets, not just two.

I think the Encryptor wrapper approach survives with only one ephemeral offset len. Also, there is no Decryptor, so that's only one type too.

I think that can all be done with one Buffer type which is generic around A: Aead.

Yes, you can reduce the offsets by making the buffer type dependent upon A: Aead, well Encryptor does exactly this.

I suspect an Encryptor wrapper provides some sweet spot though, minimal complexity, zeroing, write-friendly, only one extra type, etc.

It's true Encryptor does not exploit statically known sizes. You'll wants statically known plaintext sizes there too though, so that's some higher level with arrayref glue everywhere.

The detached modes allow you to reassemble the ciphertext however you want. You can override the default impl of the Vec-based mode to do what you want.

I think either attached or detached modes simplify backing the buffers with Vec, Bytes, SmallVec, future VLAs, static muts, etc. Just Vec and Bytes makes some !alloc interface worthwhile.

I agree with @jcape that's you're not actually complying with RFCs when doing detached mode. We all accept that detached mode exists only as glue to simplify everything else.

I'm suggesting Ciphertext/Plaintext traits, and AEADs are on the hook to provide their own implementations. That is, each AEAD provides it's own set of Buffer types, because what goes where in the buffer is left entirely up to the AEAD in the RFC anyways.

Any types satisfying these traits would still accepts arbitrary slices though? I do not believe you require so much flexibility, but who knows.

Are any AEAD modes commonly used on rope data structures? There is another interesting approach like

pub trait AEAD {
    decrypt<'a,I>(&'a mut self, iter: I) -> impl Iterator<Item=&'a mut [u8]>
    where I: Iterator<Item=&'a mut [u8]>;

    encrypt<'a,I>(&'a mut self, iter: I) -> impl Iterator<Item=&'a mut [u8]>
    where I: Iterator<Item=&'a mut [u8]>
}

which requires the AEAD allocate enough space for the prefix and postfix itself. We must tie the lifetimes of self and the buffers together though, which sounds annoying. And you need yet another allocation before merging, so this works poorly for building Bytes, Vec, etc. methods

@eternaleye
Copy link

eternaleye commented Jul 31, 2019

@burdges

Are any AEAD modes commonly used on rope data structures? There is another interesting approach like

pub trait AEAD {
    fn decrypt<'a,I>(&'a mut self, iter: I) -> impl Iterator<Item=&'a mut [u8]>
    where I: Iterator<Item=&'a mut [u8]>;

    fn encrypt<'a,I>(&'a mut self, iter: I) -> impl Iterator<Item=&'a mut [u8]>
    where I: Iterator<Item=&'a mut [u8]>
}

which requires the AEAD allocate enough space for the prefix and postfix itself. We must tie the lifetimes of self and the buffers together though, which sounds annoying. And you need yet another allocation before merging, so this works poorly for building Bytes, Vec, etc. methods

The big question here is whether this is expected to be a streaming/online AEAD where each element of the iterator can be decrypted as it is received (which is a very fraught topic; see Hoang, Reyhanitabar, Rogaway, & Vizár 2015), or if it's effectively just feeding discontiguous input to the AEAD (which should specify I: Clone + DoubleEndedIterator in addition to the bound you wrote, as some AEADs must make multiple passes, which may include backwards passes).

(Edit: added missing fns to fix syntax highlighting)

EDIT: Additionally, the interface as-written is not an AEAD - it is at most an AE, but even then it has no way to represent failure. It would need something more like this to be an AEAD, and to represent failure:

struct DecryptionError;

pub trait AEAD {
    fn encrypt<'a, 'd, A, P>(
        &mut self,
        ad: A,
        plaintext: P
    ) -> impl Iterator<Item = &'d mut [u8]>
    where
        A: Iterator<Item = &'a [u8]> + Clone + DoubleEndedIterator,
        P: Iterator<Item = &'d mut [u8]> + Clone + DoubleEndedIterator;

    fn decrypt<'a, 'd, A, C>(
        &mut self,
        ad: A,
        ciphertext: C,
    ) -> Result<impl Iterator<Item = &'d mut [u8]>, DecryptionError>
    where
        A: Iterator<Item = &'a [u8]> + Clone + DoubleEndedIterator,
        C: Iterator<Item = &'d mut [u8]> + Clone + DoubleEndedIterator;
}

(This becomes somewhat unwieldy, yes.)

Even so, this still has some key problems:

  1. This interface is stateful, and does not play well with uses of the AEAD which require particular nonce behavior (see also, the CHAIN and STREAM modes in the paper linked earlier). In addition, this means it takes &mut self rather than &self, which makes it more restrictive to use e.g. in parallel contexts.
  2. This interface still does not address expansion in the encrypt method, as it provides no way for the method to know how much of the plaintext iterator is actual plaintext, and how much is extra space. If the entire plaintext iterator is meant to be actual plaintext, then this interface cannot support it without not only allocating, but leaking the underlying Vec to make the lifetimes work.
  3. impl Trait in traits is still not available, so you'd need an associated type for each of A, P, and C (or, in the previous iteration, I), which is going to be troublesome due to the lifetimes involved.

The *_detached API addresses point (2), and avoids (3) entirely. However, it still needs to report failure.

In addition, an &mut self-using StatefulAead trait could be trivially implemented by wrappers around explicit-nonce interfaces.

Furthermore, the *_detached API naturally generalizes to discontiguous input, simply by replacing each &mut [u8] argument with a type parameter bounded by Iterator<Item=&mut [u8]> + Clone + DoubleEndedIterator


Some of this, such as the statefulness, is due to aspects of the draft that diverge from the common definition of "AEAD" in the literature - that is, a deterministic encryption function from a tuple (key, nonce, ad, plaintext) to a ciphertext, and a similarly deterministic decryption function from (key, nonce, ad, ciphertext) to either a plaintext or an uninformative error. This makes the draft itself a somewhat problematic basis for a trait named AEAD - users who hope to implement protocols from the literature that require an AEAD will be unable to do so.

If we take the standard definition at face value (and borrow a bit from Rogaway, in the use of $\tau$ to refer to the ciphertext expansion), we arrive at this API:

struct DecryptionError;

trait Aead {
    fn nonce_len() -> Option<usize>;
    fn tau_len() -> Option<usize>;
    fn encrypt(&self, nonce: &[u8], ad: &[u8], pt: &mut [u8], tau: &mut [u8]);
    fn decrypt<'a>(&self, nonce: &[u8], ad: &[u8], ct: &'a mut [u8], tau: &mut [u8])
    -> Result<&'a mut [u8], DecryptionError>;
}

(the tau in encrypt is semantically &out, but that's not in Rust)

A nonce_len or tau_len of None indicates support for arbitrary-length values at runtime, something supported by AEADs implemented according to the encode-then-encipher paradigm.

Higher-level constructs - such as the draft - can then be cleanly implemented in terms of this, maintaining and framing nonce, ct (ciphertext), and tau as they please.

@newpavlov
Copy link
Member

newpavlov commented Jul 31, 2019

So following the common abstract definition of AEAD, I think the simplest API would look like this:

// note: I would prefer to also have methods which will use fixed sizes
// for key and nonce, and ideally methods should be panic-free
trait Aead {
    fn new(key: &[u8]) -> Result<Self, CreationError>;
    // Encryption result is written to `buf`. Note that we probably should be able to
   // support AEAD schemes which prepend and append AD.
    fn encrypt<'a>(&self, nonce: &[u8], ad: &[u8], pt: &mut [u8], buf: &'a mut [u8])
        -> Result<&'a [u8], EncryptionError>;
    // IIUC not all AEAD schemes include nonce into AD, so we may have to
   // provide nonce. If scheme does include nonce into ciphertext, nonce argument
   // should be ignored. First returned slice is AD, second one is plaintext. We could
   // add a helper structure to make it less error-prone.
    fn decrypt<'a>(&self, ct: &'a mut [u8], nonce: Option<&[u8]>)
        -> Result<(&'a [u8], &'a [u8]), DecryptionError>;
}

The problem here is a memcpy of plaintext into the provided buffer, which could unnecessary bound performance of implementation. Also providing Vec-based convenience methods will be somewhat awkward.

An alternative could look like this:

trait Aead: Sized {
    fn new(key: &[u8]) -> Result<Self, CreationError>;
    // Here `f` will be called several times with AD, MAC, ciphertext and paddings.
    // If it returns an error, it will be bubbled to the caller.
    fn encrypt(
        &self, nonce: &[u8], ad: &[u8], pt: &mut [u8],
        f: impl FnMut(&[u8]) -> Result<(), EncryptionError>,
    ) -> Result<(), EncryptionError>;

    // Resulting ciphertext length must be equal or less of the hint length
    fn ciphertext_len_hint(nonce: &[u8], ad: &[u8], pt: &[u8]) -> usize;

    fn encrypt_to_slice(&self, nonce: &[u8], ad: &[u8], pt: &mut [u8], buf: &mut [u8])
        -> Result<(), EncryptionError>
    {
        let mut offset = 0;
        self.encrypt(nonce, ad, pt, |v| {
            // we can't use split_at_mut inside this closure, so we have to perform
            // a small dance to convince compiler to remove panic branch originating
            // from the potential overflow of offset after addition
            let n = v.len();
            if offset > buf.len() { return Err(EncryptionError) }
            let buf = &mut buf[offset..]
            if n > buf.len() { return Err(EncryptionError) }
            buf[..n].copy_from_slice(v);
            offset += n;
            Ok(())
        })?;
        Ok(())
    }

    #[cfg(feature="alloc")]
    fn encrypt_to_vec(&self, nonce: &[u8], ad: &[u8], pt: &'a mut [u8])
        -> Result<Vec<u8>, EncryptionError>
    {
        let n = Self::ciphertext_len_hint(nonce, ad, pt);
        let mut buf = Vec::with_capacity(n);
        self.encrypt(nonce, ad, pt, |v| {
            buf.extend_from_slice(v);
            Ok(())
        })?;
        Ok(buf)
    }

    fn decrypt<'a>(&self, ct: &'a mut [u8], nonce: Option<&[u8]>)
        -> Result<(&'a [u8], &'a [u8]), DecryptionError>;
}

What do you think? Have I missed something?

@eternaleye
Copy link

eternaleye commented Jul 31, 2019

@newpavlov My main issue with doing it precisely that way is that, by having tau (or buf) separate on the encrypt side, but as part of the ciphertext on the decrypt side, we open up more opportunities for error. An AEAD with that interface needs to presume its tau is at the start or the end, and the user must place it where the AEAD expects. Keeping tau separate for both encrypt and decrypt avoids this issue.

In addition, there are use cases where keeping them contiguous isn't useful to the user - for example, bcachefs' encryption keeps only the non-expanded portion of the ciphertext in the data extents, and stores tau in the btree data structure, so that the data extents do not become misaligned.

@tarcieri
Copy link
Member

tarcieri commented Jul 31, 2019

Here's what I'd suggest (I'll just write this with const generics syntax):

pub struct Error;

pub trait Aead {
    const KEY_SIZE: usize;
    const TAG_SIZE: usize;
    type Nonce;

    fn new(key: &[u8; KEY_SIZE]) -> Self;
    fn seal_in_place_detached(&self, nonce: Self::Nonce, associated_data: &[u8], buffer: &mut [u8], tag: &mut [u8; TAG_SIZE]);
    fn open_in_place_detached(&self, nonce: Self::Nonce, associated_data: &[u8], buffer: &mut [u8], tag: &[u8; TAG_SIZE])) -> Result<(), Error>;

    #[cfg(feature = "alloc")]
    fn seal(&self, nonce: Self::Nonce, associated_data: &[u8], plaintext: &[u8]) -> Vec<u8> {
        let mut buffer = vec![0; plaintext.len() + Self::TAG_SIZE];
        buffer[Self::TAG_SIZE..].copy_from_slice(plaintext);
        let (buf, tag) = buffer.split_at_mut(plaintext.len() - Self::TAG_SIZE);
        self.seal_in_place_detached(nonce, associated_data, buf, tag);
        buffer
    }

    #[cfg(feature = "alloc")]
    fn open(&self, nonce: Self::Nonce, associated_data: &[u8], ciphertext: &[u8]) -> Result<Vec<u8>, Error> {
        let mut buffer = Vec::from(ciphertext);
        if ciphertext.len() >= Self::TAG_SIZE {
            let (buf, tag) = buffer.split_at_mut(ciphertext.len() - Self::TAG_SIZE);
            self.open_in_place_detached(nonce, associated_data, buf, tag)?;
        } else {
            return Err(Error);
        }
        buffer.drain(..Self::TAG_SIZE);
        Ok(buffer)
    }
}

What I like about this:

  • Provides a happy path for the most common AEADs that concatenate the tag onto the end of the ciphertext, but can still be overridden for AES-(PMAC-)SIV or draft-mcgrew-aead-aes-cbc-hmac-sha2-05
  • Generic Nonce means Miscreant can use &[u8], draft-mcgrew-aead-aes-cbc-hmac-sha2-05 could use e.g. ()
  • Eliminates potential runtime errors with type safety
  • No lifetimes
  • Handles all slicing internally
  • Concrete instantiations of ciphers need only impl one trait
  • #![no_std] friendly (without requiring an allocator)
  • seal/open seem like popular names for these methods in other AEAD interfaces

I think it could still use a "buffer" type to handle the actual ciphertext assembly part, supplied as a parameter to hypothetical non-detached seal_in_place/open_in_place, e.g. Buffer<A: Aead> (or perhaps an impl AsRef<Buffer<A: Aead>> bound) which handles the split_at_mut (possibly with some sort of enum Layout associated const on Aead), but I think that can also be added retroactively.

@tarcieri
Copy link
Member

@jcape I agree, and would be happy with just encrypt and decrypt

@newpavlov
Copy link
Member

We can use vector-based encrypt/decrypt for v0.1 and revisit the naming in v0.2.

aead/src/lib.rs Outdated Show resolved Hide resolved
Copy link
Member

@tarcieri tarcieri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! Looks good to me now (aside from @newpavlov’s nit)

@tarcieri
Copy link
Member

@jcape I think you'll need to edit .travis.yml and change all of the instances of:

cargo test --verbose --all --release

to

cargo test --verbose --all --exclude aead --release

...and then add a build matrix entry specifically for AEAD on Rust 1.36 that does:

cargo test --verbose --package aead --release

aead/src/lib.rs Outdated
extern crate alloc;

use alloc::vec::Vec;
use generic_array::{GenericArray, ArrayLength};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It'd probably be good to have a pub use generic_array like the other trait crates

@tarcieri
Copy link
Member

tarcieri commented Aug 19, 2019

I implemented the ChaCha20Poly1305 AEAD using this trait:

RustCrypto/AEADs#3

A few thoughts:

  1. GenericArray is an essential part of the API right now, but unlike the other RustCrypto traits it isn't publicly re-exported (see above)
  2. It presently takes keys as a value, but nonces as a reference, which was a bit surprising and felt inconsistent to me. Nonces should always be small enough to Copy, so I think it makes sense to just take the nonces by value as well.
  3. When trying to use encrypt/decrypt I found it a bit surprising that AAD was the first parameter, and not a nonce, which is what I'm used to from other AEAD APIs. My expectation would be a bit more operation(nonce, aad, message) or operation(nonce, message, aad). With just a slice-based API, this runs the risk of someone confusing the aad/message, so perhaps it'd be good to have an aead::Aad (or thereabouts) newtype so people can use aead::Aad::default() in the common case.

One last thought: in implementing RFC 8439, the specification and test vectors are "detached" in several places. I ended up internally implementing a detached API in order to implement the attached one. I think a detached API would be extremely helpful for AEAD implementations to write their core impls to (in addition to real-world use cases for detached AEAD consumers).

For example, here is the pseudocode for the ChaCha20Poly1305 AEAD, which is defined in the RFC in terms of a detached API:

https://tools.ietf.org/html/rfc8439#section-2.8.1

chacha20_aead_encrypt(aad, key, iv, constant, plaintext):
         nonce = constant | iv
         otk = poly1305_key_gen(key, nonce)
         ciphertext = chacha20_encrypt(key, 1, nonce, plaintext)
         mac_data = aad | pad16(aad)
         mac_data |= ciphertext | pad16(ciphertext)
         mac_data |= num_to_8_le_bytes(aad.length)
         mac_data |= num_to_8_le_bytes(ciphertext.length)
         tag = poly1305_mac(mac_data, otk)
         return (ciphertext, tag)

@newpavlov
Copy link
Member

Nonces should always be small enough to Copy, so I think it makes sense to just take the nonces by value as well.

If data is bigger than register, than it may be better to pass it by reference. Plus other crates pass both key and nonce by reference, so I think it will be better to be consistent.

@tarcieri
Copy link
Member

I'd be fine with them both being references too

@newpavlov
Copy link
Member

With just a slice-based API, this runs the risk of someone confusing the aad/message, so perhaps it'd be good to have an aead::Aad (or thereabouts) newtype so people can use aead::Aad::default() in the common case.

Hm, it's indeed not a great property, but I am not sure about using an associated type to fix it. I see two alternatives:

Cipher.encrypt(nonce, Payload { ad, msg })
// builder pattern
Cipher.with_ad(ad).encrypt(nonce, msg)

@jcape
Copy link
Contributor Author

jcape commented Aug 19, 2019

Between the two, I dislike the builder pattern less, but why not go all the way (naming notwithstanding):

CiphertextBuilder.ad(ad).nonce(nonce).plaintext(msg).encrypt();

@tarcieri
Copy link
Member

@jcape that's an interesting idea. One nit: I think plaintext(msg).encrypt() is redundant, since any time you're dealing with plaintext you're encrypting.

How about:

CiphertextBuilder.ad(ad).nonce(nonce).encrypt(msg);

@jcape
Copy link
Contributor Author

jcape commented Aug 19, 2019

@tarcieri The interesting bit is when you do CiphertextBuilder::new(aead: impl Aead) or CiphertextBuilder::stateless(aead: impl StatelessAead) :-)

@tarcieri
Copy link
Member

@jcape it's an idea worth pursuing, but perhaps after we land this PR to get the ball rolling?

@newpavlov
Copy link
Member

but why not go all the way

Because it's significantly more typing without sufficient improvements on being less error-prone? The problem originates from ad and msg having the same type, so by splitting only them we have already achieved the objective.

To make it nicer to read we could use Cipher.with(nonce, ad).encrypt(msg).

@tarcieri
Copy link
Member

@jcape looks like you need to modify build_std.sh to exclude the aead directory

@tarcieri
Copy link
Member

nightly appears to be failing because it seems implicit --all no longer works on a toplevel workspace? It seems you just need to add --all anywhere you're using --exclude

@tarcieri
Copy link
Member

Can we land this? 😉

@newpavlov
Copy link
Member

Yeah, lets merge it and continue work in a separate PR. Before publishing I think we should implement a solution for the ad/msg problem.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants