Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Telemetry code executes and throws in container, even with telemetry disabled #233

Closed
monken opened this issue May 6, 2024 · 6 comments · Fixed by #237
Closed

Telemetry code executes and throws in container, even with telemetry disabled #233

monken opened this issue May 6, 2024 · 6 comments · Fixed by #237

Comments

@monken
Copy link

monken commented May 6, 2024

Hi team,

I'm running baybe in a container under a non-root user. I'm setting BAYBE_TELEMETRY_ENABLED=false. However, some telemetry code still seems to be executed.

File "/workdir/.venv/lib/python3.12/site-packages/baybe/telemetry.py", line 109, in <module>
--
hashlib.sha256(getpass.getuser().upper().encode()).hexdigest().upper()[:10]
^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/getpass.py", line 169, in getuser
return pwd.getpwuid(os.getuid())[0]
^^^^^^^^^^^^^^^^^^^^^^^^^
KeyError: 'getpwuid(): uid not found: 1000'

As a workaround, I will create a user in the container so that this call doesn't throw an exception. However, I would expect that no code is executed as part of loading the baybe module when telemetry is disabled.

In general, the telemetry code seems pretty heavy (making suspicious syscalls to retrieve hostname, uid, tries to make an HTTP request which will likely raise some flags when scanning code bases for malicious behavior or backdoors).

I'd also wanted to mention that this hash is irreversible and cannot identify the user or their machine is not necessarily true. You can easily pre-compute a rainbow table of all hashes for valid usernames and then do a reverse lookup.

@Scienfitz
Copy link
Collaborator

Hey @monken

  1. Yep, seems like a design flaw in telemetry. Possibly simply solved by further shifting where the user and hostname are hashed. Will look into it, doesnt seem urgent from your side

  2. Was quite tricky to test all cases for this telemetry module because we simply couldn't think about all machine/setup combos let alone have access to them, thanks for the report

  3. We need the HTTP request to find out whether the user is within the Merck VPN. If you have a more suitable idea how to test this please let me know (one slight complication was also that the actually telemetry endpoint is not a HTTP but an opentelemetry backend with GRPC protocol if I remember correctly)

  4. Didnt know about the rainbow tables, thats tricky indeed, we are shortening the hash which should further increase the amount of possible texts for a given hash, but I guess thats not worth a lot. Any idea how to solve that or use another method for the ideally fully-non-reversible indicator?

@monken
Copy link
Author

monken commented May 6, 2024

You could just do a DNS lookup instead

import socket
socket.gethostbyname('verkehrsnachrichten.merck.de')

Should fail immediately if you are not on the internal network and doesn't require a heavy library like requests. Would it be possible it have an install target like baybe[notelemetry] that doesn't ship with this code at all? I really dislike the idea that an import statement triggers any sort of network request.

  1. Don't collect user data :)

@AdrianSosic
Copy link
Collaborator

Hi @monken, thanks for the valuable input. I guess @Scienfitz can have a look into the socket thing and also see how we can further tweak the telemetry part.

Regarding baybe[notelemetry]: unfortunately, pip has no opt-out mechanism, so we really have no idea how we could handle the problem any better. Let us know if you have a better solution.

Regarding don't collect user data: @Scienfitz, I guess we could actually drop the hashes entirely? As far as I remember, it was only do disambiguate between devs and non-devs, but we could simply leave the hashes empty for non-devs?

@Scienfitz
Copy link
Collaborator

removing the hash would mean we cant track roughly the number of unique users, a main metric we wanted
How much is the technical reversibility of usernames for company-internal users actually a problem?

I would support removing requests but there were additional requirements for this to work on the premade sagemaker kernels on UPTIMIZE AWS which had some sort of issue with it. So Id have to check that again to verify. requests and opentelemetry* can already be uninstalled if desired after #194 is merged

Scienfitz added a commit that referenced this issue May 13, 2024
This fixes #233 in that iT moves some code parts that have been executed
even if telemetry was disabled (and thus could fail).

It doesn't touch the suggested replacement of `requests` because a
confirmation with the UPTIMIZE Sagemaker kernels would be required,
which in itself requires an update of the Sagemaker<->baybe connection
due to the Python 3.8 deprecation
@Scienfitz
Copy link
Collaborator

the telemetry execution order had beenc hanged to not execute anything when its disabled, this should fix this Issue
will possibly also move towards socket instead of requests but will have to test a bit

AdrianSosic added a commit that referenced this issue May 20, 2024
This ditches `requests` as suggested in #233
@Scienfitz
Copy link
Collaborator

@monken
now 1) telemetry execution order has been fixed and 2) requests usage has been replaced by socket, thanks for the suggestions!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants