Skip to content

get random in UUID more cryptography secure #135226

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

LamentXU123
Copy link

random.getrandbits() is used widely in lib uuid espectially in generating clock_seqs in UUID v1 and v6

if 624*32//14+1 UUIDs are leaked to hackers, they could predict the next UUID generated.

reference: https://github.com/LamentXU123/cve/blob/main/UUID.md

to make it more cryptography secure, secrets.randbits() should be used instead of random.getrandbits()

@python-cla-bot
Copy link

python-cla-bot bot commented Jun 6, 2025

All commit authors signed the Contributor License Agreement.

CLA signed

@bedevere-app
Copy link

bedevere-app bot commented Jun 6, 2025

Most changes to Python require a NEWS entry. Add one using the blurb_it web app or the blurb command-line tool.

If this change has little impact on Python users, wait for a maintainer to apply the skip news label instead.

@LamentXU123 LamentXU123 changed the title get random in UUID more cryptography securty get random in UUID more cryptography secure Jun 6, 2025
Copy link
Member

@ZeroIntensity ZeroIntensity left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

uuid1() already has security warnings, and the rest that were changed here are documented as pseudo-random. I don't think we need to change anything here, apart from maybe adding some extra documentation notes.

@LamentXU123
Copy link
Author

You're right. But I think changing them to a more secure way has no disadvantage. Why not make it more unpredictable with no side effects?

@ZeroIntensity
Copy link
Member

Well, there are clearly side-effects. The UUID tests are totally broken by this change.

@LamentXU123
Copy link
Author

Well it seems like its broken because the test script uses random.getrandbits()

for example:

    def test_uuid1_time(self):
        with mock.patch.object(self.uuid, '_generate_time_safe', None), \
             mock.patch.object(self.uuid, '_last_timestamp', None), \
             mock.patch.object(self.uuid, 'getnode', return_value=93328246233727), \
             mock.patch('time.time_ns', return_value=1545052026752910643), \
             mock.patch('random.getrandbits', return_value=5317): # guaranteed to be random
            u = self.uuid.uuid1()
            self.assertEqual(u, self.uuid.UUID('a7a55b92-01fc-11e9-94c5-54e1acf6da7f'))

this will triger a assertion error because random.getrandbits() was changed to secrets.randbits() so teh script could not control the random number by patching ;)

I'll take a closer look soon

@LamentXU123
Copy link
Author

Well it seems like its broken because the test script uses random.getrandbits()

for example:

    def test_uuid1_time(self):
        with mock.patch.object(self.uuid, '_generate_time_safe', None), \
             mock.patch.object(self.uuid, '_last_timestamp', None), \
             mock.patch.object(self.uuid, 'getnode', return_value=93328246233727), \
             mock.patch('time.time_ns', return_value=1545052026752910643), \
             mock.patch('random.getrandbits', return_value=5317): # guaranteed to be random
            u = self.uuid.uuid1()
            self.assertEqual(u, self.uuid.UUID('a7a55b92-01fc-11e9-94c5-54e1acf6da7f'))

this will triger a assertion error because random.getrandbits() was changed to secrets.randbits() so the script could not control the random number by patching ;)

I'll take a closer look soon

@LamentXU123
Copy link
Author

Maybe I could add a parameter to uuid1() and uuid6() which is False by default, and if its true, using secret.randbits() instead of random.getrandbits() ?

@picnixz
Copy link
Member

picnixz commented Jun 7, 2025

First of all, please open an issue. This needs a wider discussion. I'll still post here.


I explicitly avoided using a CSPRNG for UUIDv8 because the RFC states that:

Implementations SHOULD NOT assume that UUIDs are hard to guess

And at the same time it states:

Implementations SHOULD utilize a cryptographically secure pseudorandom number generator (CSPRNG) to provide values that are both difficult to predict ("unguessable") and have a low likelihood of collision ("unique"). The exception is when a suitable CSPRNG is unavailable in the execution environment

Now, it's possible to predict the next UUIDv8 if we have enough UUIDv8s values (we can rewind MT-19937 on which random.getrandbits() is based if we have 624 32-bit outputs of the PRNG, but that's assuming we do have them). Since UUIDv8 directly embeds the random values, we could say that it's recoverable. But using a CSPRNG instead comes at a cost.

Now, since UUIDv8 is highly customizable, if someone wants to use a CSPRNG for the blocks, they should simply supply it outside the call. The wrapper is simple enough to write. In addition, if someone is worried about using a true random UUIDv8, they should use UUIDv4 instead. UUIDv8's purpose is not be secure or unguessable, that's the purpose of UUIDv4. Therefore, I don't want to change UUIDv8 (otherwise, the default output would behave as for UUIDv4 which is not interesting).


For the Node ID, it's something I overlooked and forgot. RFC 4122 actually didn't talk about CSPRNG, but its successor, RFC 9562, does:

Alternatively, implementations MAY elect to obtain a 48-bit cryptographic-quality random number as per Section 6.9 to use as the Node ID.

So for Node ID, I'm willing to switch to a CSPRNG, although we're already dealing with privacy issues due to the MAC address. Note that the random Node ID is only here in case we fail to fetch the MAC address.


Finally, for the clock sequence, it appears that other programming languages use a CSPRNG, even though the number of random bits is small (14) and thus, while not predictable with a single shot, it's possible to guess with a probability ~1/16000.


To summarize:

  • open an issue
  • don't change UUIDv8; a secure version of UUIDv8 is UUIDv4 (and I think I've said it in the docs, but we can still make it clearer)
  • change _random_node though it's only if we fail to guess the MAC and since gh-132710: only use stable _uuid.generate_time_safe() to deduce MAC address #132901, this function should be called less
  • benchmarks the performance impact when switching to secrets.getrandbits() for the clock sequence. Check if using os.urandom() + cutoff + int.from_bytes() is actually faster as well. Run those benchmarks under --enable-optimizations --with-lto and use pyperf to make the benchmarks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants