Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

database instance names contain redundant data #16

Closed
wydengyre opened this issue Sep 13, 2024 · 1 comment · Fixed by #17
Closed

database instance names contain redundant data #16

wydengyre opened this issue Sep 13, 2024 · 1 comment · Fixed by #17
Assignees
Labels
bug something isn't working

Comments

@wydengyre
Copy link

wydengyre commented Sep 13, 2024

Database instance names all look like the following:

3aa05350d41d8cd98f00b204e9800998ecf8427e
f1c841c1d41d8cd98f00b204e9800998ecf8427e
e5dde3f7d41d8cd98f00b204e9800998ecf8427e

Note that only the first eight characters change. The remaining are all identical: d41d8cd98f00b204e9800998ecf8427e

This is because the bytes are generated with this function:

func randomID(prefix string) (string, error) {

The function doesn't do what it's meant to. hash.Sum(bytes) is not generating a hash of the 4 random bytes, but rather generating a constant hash of empty data and appending it to the four random bytes.

Besides this bug in the code, though, the general idea of taking a hash of the 4 random bytes is odd. It adds no entropy and only serves to make the id longer and unwieldy.

I'd suggest replacing this with a call to stdlib uuid.

@peterldowns
Copy link
Owner

The function doesn't do what it's meant to. hash.Sum(bytes) is not generating a hash of the 4 random bytes, but rather generating a constant hash of empty data and appending it to the four random bytes.

Good catch! You're right.

Besides this bug in the code, though, the general idea of taking a hash of the 4 random bytes is odd. It adds no entropy and only serves to make the id longer and unwieldy.

Wish I could remember what I was thinking — I agree with you.

I'd suggest replacing this with a call to stdlib uuid.

I'm going to fix it by just hexifying the bytes and removing the hashing part, makes for a nice short string with the same amount of entropy. PR coming shortly.

peterldowns added a commit that referenced this issue Sep 25, 2024
As reported in #16, the
previous implementation of `randomID` used a hashing construction (a)
inappropriately, (b) incorrectly. As a result, the IDs that were being
generated had a random prefix and then a consistent suffix, like this:

```
3aa05350d41d8cd98f00b204e9800998ecf8427e
f1c841c1d41d8cd98f00b204e9800998ecf8427e
e5dde3f7d41d8cd98f00b204e9800998ecf8427e
```

This PR fixes the function to essentially just return the randomly
generated prefixes, and omit the suffix entirely. The fixed-suffix was
always a mistake and never intended, I just wasn't paying close
attention beyond "are the instance database names colliding".

I tested this change by running some tests that used `pgtestdb` and
intentionally failed, then checking the connection strings that were
printed as part of the failing test's logs. Here's an example:

```
testdbconf: postgres://pgtdbuser:pgtdbpass@localhost:5433/testdb_tpl_c2b33a22a68750f84b5af76e9774be0e_inst_a4a3ce6c?sslmode=disabl
```

The key part is `inst_a4a3ce6c`, which is now (a) shorter than before,
(b) just as random.
@peterldowns peterldowns added the bug something isn't working label Sep 25, 2024
@peterldowns peterldowns self-assigned this Sep 25, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants