Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature: Implement core::hash::Hasher on Digest #106

Closed
rjzak opened this issue Sep 12, 2023 · 3 comments
Closed

Feature: Implement core::hash::Hasher on Digest #106

rjzak opened this issue Sep 12, 2023 · 3 comments

Comments

@rjzak
Copy link

rjzak commented Sep 12, 2023

I'm trying to update the crc dependency for my crate, and I need the core::hash::Hasher trait. Related to #62, I'm having some issues I can't figure out.

Code: https://github.com/malwaredb/lzjd-rs/blob/ee6183a692b8691d60550ef51f22132243f0b263/src/crc32.rs#L21

The closest I've come is this weird code which doesn't work:

pub struct CRC32Hasher<'a> {
    crc: crc::Crc<u32>,
    digest: Option<&'a crc::Digest<'a, u32>>,
}

impl<'a> CRC32Hasher<'a> {
    fn new() -> Self {
        let crc = crc::Crc::<u32>::new(&CRC_32_BZIP2);
        Self { crc, digest: None }
    }
}

impl<'a> Hasher for CRC32Hasher<'a> {
    fn finish(&self) -> u64 {
        self.digest.clone().unwrap().finalize() as u64
    }

    fn write(&mut self, bytes: &[u8]) {
        if self.digest.is_none() {
            self.digest = Some(self.crc.digest()); // doesn't live long enough
        }

        //self.digest.as_ref().unwrap().update(bytes);
    }
}

/// std::hash::BuildHasher that builds CRC32Hashers
#[derive(Clone, Default)]
pub struct CRC32BuildHasher<'a> {
    phantom: Option<&'a [u8]>, // need lifetimes specified, and need a member to connect a lifetime, but this data isn't used
}

impl<'a> BuildHasher for CRC32BuildHasher<'a> {
    type Hasher = CRC32Hasher<'a>;

    fn build_hasher(&self) -> Self::Hasher {
        CRC32Hasher::new()
    }
}
  1. The finalize in the crc crate takes ownership, which causes a problem with the Hasher train.
  2. I can't have crc::Crc::<u32>::new(&CRC_32_BZIP2)::digest(); as the sole member of the struct of a complaint that the intermediate value doesn't last long enough. Nor does have the variable in the function scope work, since the borrow checker also complains about the reference to the local variable.

But having the Digest struct in this crate already implement the Hasher trait would be helpful, and probably useful to others as well. But maybe I'm missing something?

@KillingSpark
Copy link
Contributor

That is quite the interesting problem. The core of your immediate issues with the lifetimes is that you are basically trying to construct a self-refential struct with your CRC32Hasher. The digest field holds a reference to the crc field.

I'd propose to create a static variable that stores the Crc. This would allow to discard all lifetimes

pub struct CRC32Hasher {
    digest: crc::Digest<'static, u32>,
}
static CRC: crc::Crc<u32> = crc::Crc::<u32>::new(&CRC_32_BZIP2);

impl CRC32Hasher {
    fn new() -> Self {

        Self { digest: CRC.digest() }
    }
}

impl Hasher for CRC32Hasher {
    fn finish(&self) -> u64 {
        self.digest.clone().finalize() as u64
    }

    fn write(&mut self, bytes: &[u8]) {
        self.digest.update(bytes);
    }
}

/// std::hash::BuildHasher that builds CRC32Hashers
#[derive(Clone, Default)]
pub struct CRC32BuildHasher {}

impl BuildHasher for CRC32BuildHasher {
    type Hasher = CRC32Hasher;

    fn build_hasher(&self) -> Self::Hasher {
        CRC32Hasher::new()
    }
}

@KillingSpark
Copy link
Contributor

KillingSpark commented Oct 4, 2023

Edit Note: This originally said I couldn't figure out how to do this, but then I saw my mistake...

You can also just lift the Crc struct into the CRC32BuildHasher so you can avoid the static/const variable:

pub struct CRC32Hasher<'a> {
    digest: crc::Digest<'a, u32>,
}

impl<'a> CRC32Hasher<'a> {
    fn new(crc: &'a Crc<u32>) -> Self {
        Self { digest: crc.digest() }
    }
}

impl<'a> Hasher for CRC32Hasher<'a> {
    fn finish(&self) -> u64 {
        self.digest.clone().finalize() as u64
    }

    fn write(&mut self, bytes: &[u8]) {
        self.digest.update(bytes);
    }
}

pub struct CRC32BuildHasher {
    crc: crc::Crc<u32>,
}

impl CRC32BuildHasher {
    pub fn new() -> Self {
        Self { crc: crc::Crc::<u32>::new(&CRC_32_BZIP2) }
    }
}

impl<'a> BuildHasher for &'a CRC32BuildHasher {
    type Hasher = CRC32Hasher<'a>;
 
    fn build_hasher(&self) -> Self::Hasher { 
        CRC32Hasher::new(&self.crc)
    }
}

@rjzak
Copy link
Author

rjzak commented Oct 6, 2023

Thank you for that. Due to these difficulties, and having an issue trying to figure out which flavour of CRC32 was the correct one I needed for my algorithm (I unsuccessfully tried options which seemed usable), I switched to the crc32fast crate.

@rjzak rjzak closed this as completed Oct 6, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants