Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Some questions for a port to Go #33

Open
nlepage opened this issue Feb 1, 2023 · 11 comments
Open

Some questions for a port to Go #33

nlepage opened this issue Feb 1, 2023 · 11 comments

Comments

@nlepage
Copy link

nlepage commented Feb 1, 2023

Hello,

I discovered cuid2 (and cuid) today, and I gave a try at porting it to Go: https://github.com/nlepage/go-cuid2

It seems to be working fine, here are some ids generated with default length:

mhsx3pf2p8ocd2nlaioilwah
jt06wcr8l51t7wj9bxsdb0al
fvqdusrnkcohfi68gor7pooa
sdgqj4r3j0om523ykm0nch85
qw60k14c0kq14gnqjis8nh9f

I have some questions:

  • In createEntropy(), I'm calling a random prime number generator from the Go std library instead of picking in a predefined list of prime numbers
    • is ot OK?
    • isn't it too expensive? an alternative could be to generate a random prime numbers list once, then pick in this list when createEntropy() is called
  • In createFingerprint(), I don't have an equivalent of Object.keys(globalObj).toString()) to give to the hash function, is it a problem?

Thanks for your help!

@nlepage nlepage changed the title Some question for a port to Go Some questions for a port to Go Feb 1, 2023
@ericelliott
Copy link
Collaborator

You don't need to do the random prime thing in createEntropy. I will eventually pull it out of the JS version.

If you don't have host-unique global entropy to use as a fingerprint, you should find another source of host fingerprint entropy (e.g. hostname, pid). You should also parametrize it in case whatever you pick doesn't work for some users.

@ericelliott
Copy link
Collaborator

What's the status on your port?

@ericelliott
Copy link
Collaborator

@nlepage I simplified the createEntropy function in JavaScript. You may find it less confusing, now.

@nlepage
Copy link
Author

nlepage commented Feb 20, 2023

Hi @ericelliott
Thanks for your responses.

I'll try to integrate the simplified createEntropy into my port this week.

Regarding the host-unique global entropy, I'll try to build a fingerprint using some host information as you suggested.

@ericelliott
Copy link
Collaborator

Looking forward to it! 😎

@ericelliott
Copy link
Collaborator

Is this done? Can we add the Go port to the docs?

@nlepage
Copy link
Author

nlepage commented Feb 21, 2023

I made the changes, you can see the diff here nlepage/go-cuid2@3097e1d...c4d3129

For the fingerprint creation I used the OS environment, and added the hostname and pid to it.

I have some questions left:

  • In the JS version, why don't you give the random source to createFingerprint()? It calls createEntropy() which as a result uses Math.random()
  • Why are you using Object.keys() on globalObj? Wouldn't it be better to use Object.entries()?

@ericelliott
Copy link
Collaborator

In the JS version, why don't you give the random source to createFingerprint()? It calls createEntropy() which as a result uses Math.random()

This is a bug. 🐛

Why are you using Object.keys() on globalObj? Wouldn't it be better to use Object.entries()?

A few reasons:

  1. Keys usually supply sufficient cross-host entropy in JavaScript, because most hosts have different global keys, especially when paired with the random entropy.
  2. Values can grow extremely large and be many different types, which may or may not map well to strings, which could waste a lot of time and potentially cause errors.
  3. Security: Values may contain secrets that we don't want to potentially leak. Our entropy smoothie should be enough to prevent that from happening, but we should follow the principle of least knowledge here. Don't ask for information you don't actually need to do the job.

@nlepage
Copy link
Author

nlepage commented Feb 22, 2023

Okay, so using the system environment variables' keys and values as I did in the Go port is probably not a good idea (same potentially large values and security risk as Object.entries()).
I'm going to change that and use only the environment variables' keys.

One minor question : in bufToBigInt(), after shifting the number 8 bits left, I'm using a binary or (| operator) instead of an addition, I think it's OK, and might be a little faster, can you confirm?

Appart from that I think the port is OK and can be added to the list.
I still need to write some docs and more tests though.

@nlepage
Copy link
Author

nlepage commented Feb 22, 2023

Also is there a maximum length for generated ids?
What I see in practice is around 97/98 characters.

@ericelliott
Copy link
Collaborator

Okay, so using the system environment variables' keys and values as I did in the Go port is probably not a good idea (same potentially large values and security risk as Object.entries()).
I'm going to change that and use only the environment variables' keys.

Good call.

One minor question : in bufToBigInt(), after shifting the number 8 bits left, I'm using a binary or (| operator) instead of an addition, I think it's OK, and might be a little faster, can you confirm?

This sounds like it should be equivalent, but I wouldn't value a tiny micro-optimization over the potential to introduce a subtle bug in the crypto. Especially since compilers tend to automatically make optimizations like that anyway. Did you profile it?

Also is there a maximum length for generated ids? What I see in practice is around 97/98 characters.

I went with a conservative 32 char max length pretty arbitrarily, but mostly because ids should be long enough to be practically impossible to guess or extract entropy from - but not more. Excessive size causes other problems, such as storage, transport, and usability concerns. Users who want more probably need things like secret keys, or other things this standard was not designed for. They should probably be using a purpose-built, cryptographically secure algorithm. Better to have a sane range that solves the common case for 99.9% of users, IMO.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants