Ability to hide or obfuscate IP addresses in logs #649

Open
joaquinrovira opened this issue Nov 23, 2023 · 17 comments · May be fixed by #785
Labels
enhancement New feature or request logging Related to what the tool outputs to the end user
Milestone

Comments

@joaquinrovira

Issue:

Hey there! 👋 I'm currently running favonia/cloudflare-ddns:1.11.0 container on my publicly accessible ArgoCD instance, and the logs display IP addresses. To amp up privacy (especially since it's behind a Cloudflare proxy), I'm suggesting a feature to hide/obfuscate those IPs.

Details:

I'm thinking a simple config option in settings would do the trick. An environment variable like HIDE_IP. Maybe offer methods like replacing the last octet with "XXX" or hashing the IP. Personally, I think IP hashing is preferable in order to maintain observability of IP changes.

🌐 Detected the IPvX address: <SHA_256_IP_HASH>

Note:

Currently loving the tool! 🙌 Thanks for maintaining it. Your work is much appreciated! 🚀

@favonia favonia added enhancement New feature or request needs-investigation labels Nov 27, 2023
@favonia
Owner

favonia commented Nov 27, 2023

@joaquinrovira Thanks for the idea and I am glad that the updater is working for you! I'm sorry I was offline during Thanksgiving. The updater still reveals too much information for my liking (#603), so thanks for pointing out another possibility.

I do want to understand more about the use cases, though, because IP addresses are not very "private," so to speak. There are many ways to uncover your IP: for example, if you have ever sent an email to a public mailing list or joined a public IRC channel, your IP might have already been recorded permanently. If your MX record ever contains your IP, or if you have once turned off the Cloudflare proxying, someone on the internet might have already permanently recorded that. (Many websites are keeping past DNS records.) The other part is, even if your IP is hidden, there are numerous bots scanning vulnerable servers, proxied or not. Overall, the internet is never designed to keep your IP a secret. Only a few protocols and services (e.g., Tor) try to hide your IP.

Therefore, I would like to understand more about the attacks you are trying to prevent. Currently, sensitive tokens and URLs are hidden so that someone looking over your shoulder cannot (easily) gain access to your account. However, anyone who is physically that close can often learn your IP directly. The only other case I could think of is copying and pasting your log into a GitHub issue. However, I am not yet convinced that the risk of revealing IPs outweighs the cost of making debugging more difficult.

Hashing is an interesting idea. Nonetheless, because the only case I could think of is to copy your IP into a GitHub issue, I don't think hashing will help---it might be easy to invert the hashing by enumerating plausible IPs, that is, a dictionary attack. In the case of IPv4, you can enumerate all possible addresses in no time.

The last thing I want to point out is that I wonder if you want HIDE_IP to affect messages sent to Healthchecks, Uptime Kuma, and/or the shoutrrr support I am currently adding (still a work in progress). In my opinion, the main difficulty of designing a good interface based on environment variables is to ensure all variables are as independent of each other as possible and all reasonable combinations have an intuitive meaning. This is why I am interested in learning more about your concerns and motivations to find a good design.

In any case, I am happy to implement something to address your concerns (at least after this busy semester), but I might need more information. Thank you!

@favonia favonia added the design? The next step is to reflect upon the information and come up with a good design label Nov 27, 2023
@favonia
Owner

favonia commented Feb 28, 2024

@joaquinrovira Hi, I'm sorry if my intimidating (?) long comment accidentally shut down the conversation. Please feel free to add anything that you might find useful. Thanks! I am eager to figure out what should be changed in the tool (and to implement the changes, probably this summer 🤩).

@joaquinrovira
Author

Sorry for the late response.

> I do want to understand more about the use cases, though, because IP addresses are not very "private," so to speak.

I deploy stuff in my homelab using ArgoCD. This instance is publicly accessible just for fun. My servers are behind Cloudflare, so my IP is AFAIK not trivially exposed. The issue is that the ArgoCD dashboard also gives access to the cloudflare-ddns pod logs. Not terrible, but not ideal. So I was wondering if we could obfuscate those IPs somehow.

> The other part is, even if your IP is hidden, there are numerous bots scanning vulnerable servers, proxied or not. Overall, the internet is never designed to keep your IP a secret. Only a few protocols and services (e.g., Tor) try to hide your IP.

Feel free to close the issue if you deem this unnecessary. The idea is to make it slightly harder to get the IP, not impossible.

> [...] easy to invert the hashing by enumerating plausible IPs, that is, a dictionary attack. In the case of IPv4, you can enumerate all possible addresses in no time.

This could be solved by salting before hashing. Maybe another env var or just random bytes.

> The last thing I want to point out is that I wonder if you want HIDE_IP to affect messages sent to Healthchecks, Uptime Kuma, and/or the shoutrrr support I am currently adding (still a work in progress).

I have not looked into the effects outside my use case. No clue regarding this.

I do not have the spare time right now, but if I find some I could try to push a proposal PR.

@joaquinrovira
Author

Just wanted to add that the program has been running flawlessly for several months now. Thanks again for the effort of building and maintaining this project.

@favonia
Owner

favonia commented Mar 1, 2024

> I deploy stuff in my homelab using ArgoCD. This instance is publicly accessible just for fun. My servers are behind Cloudflare, so my IP is AFAIK not trivially exposed. The issue is that the ArgoCD dashboard also gives access to the cloudflare-ddns pod logs. Not terrible, but not ideal. So I was wondering if we could obfuscate those IPs somehow.

I see... should we just remove the IPs from the logging, then?

@favonia favonia added logging Related to what the tool outputs to the end user and removed design? The next step is to reflect upon the information and come up with a good design needs-investigation labels Mar 3, 2024
@joaquinrovira
Author

Either one would work. Hashing makes it easy to see when the underlying value changes. However, simply hiding the IPs would be enough. I would not want to add more complexity than needed.

@favonia favonia changed the title [FEATURE] Hide/Obfuscate IP in logs Ability to hide or obfuscate IP addresses in logs Mar 9, 2024
@favonia favonia added this to the near future milestone Mar 9, 2024
@favonia
Owner

favonia commented Mar 21, 2024

@joaquinrovira Should we hide Cloudflare record/zone IDs as well?

@favonia
Owner

favonia commented Mar 21, 2024

Proposed Design

  • LOG_OBFUSCATION=1
      • Meaning: whether the logging is obfuscated enough so that the log can be viewed by the public without fear; suitable for copy-pasting the log into a GitHub issue
      • Actual effects: domain names, zone IDs, record IDs, custom URLs, timezone, IPs are hidden
      • Known bugs: Go standard library will sometimes expose the full URLs that should be secret
      • Exceptions: This does not affect messages to Healthchecks.io, Uptime Kuma, and shoutrrr (still being implemented)

Did I miss anything that should be hidden as well? I know this is hiding more than what you requested, but I was trying to brainstorm something that could be useful in more use cases.

@favonia
Owner

favonia commented Mar 21, 2024

The more refined version would be

LOG_OBFUSCATION=ip,timezone

@joaquinrovira
Author

This proposal would be much more extensive than the initial scope. It certainly covers my needs, and probably those of anyone else with public logs. (If there is anyone... 😅).

@favonia favonia modified the milestones: near future, 1.13.0 Jun 29, 2024
@favonia
Owner

favonia commented Jul 1, 2024

Update: I think my proposal should probably be called LOG_REDACTION because we are planning to hide, not obfuscate, private information.

@favonia favonia linked a pull request Jul 1, 2024 that will close this issue
19 tasks
@favonia
Owner

favonia commented Jul 3, 2024

@joaquinrovira I am thinking about these five kinds of "private" information:

  1. Tokens (token)
  2. IPs (ip)
  3. Domains (domain)
  4. IDs of records and zones (id?)
  5. Timezone (timezone)

By default only tokens are hidden. I think you want token,ip in your case... but I'm a bit reluctant to add all the complexity at once. The special value min shows everything and max hides all. As a starting point only three modes token, min, and max are supported and token is the default. Let me know if max does not work for you.
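
A rough Go sketch of how such a value could be interpreted. This is only an illustration, not the implementation in #785: `parseRedaction` is a hypothetical name, `id` is the tentative category name from the list above, and accepting a comma-separated list such as token,ip goes beyond the three modes planned as a starting point.

```go
package main

import (
	"fmt"
	"strings"
)

// categories lists the five kinds of "private" information discussed above.
var categories = []string{"token", "ip", "domain", "id", "timezone"}

// isCategory reports whether name is one of the known categories.
func isCategory(name string) bool {
	for _, c := range categories {
		if c == name {
			return true
		}
	}
	return false
}

// parseRedaction turns a LOG_REDACTION-style value into the set of hidden
// categories: "min" hides nothing, "max" hides everything, an empty value
// falls back to the default "token", and anything else is read as a
// comma-separated list of categories.
func parseRedaction(value string) (map[string]bool, error) {
	hidden := make(map[string]bool)
	switch value {
	case "min":
		return hidden, nil
	case "max":
		for _, c := range categories {
			hidden[c] = true
		}
		return hidden, nil
	case "":
		value = "token"
	}
	for _, name := range strings.Split(value, ",") {
		name = strings.TrimSpace(name)
		if !isCategory(name) {
			return nil, fmt.Errorf("unknown redaction category %q", name)
		}
		hidden[name] = true
	}
	return hidden, nil
}

func main() {
	hidden, err := parseRedaction("token,ip")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("hide IPs:", hidden["ip"], "hide domains:", hidden["domain"])
}
```

Keeping min and max as special values means the list form can be added later without changing what the three initial modes mean.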

@joaquinrovira
Author

Will do! Thank you very much. 😃

It will take me a couple of days at least as I'm AFK this week.

@favonia
Owner

favonia commented Jul 4, 2024

@joaquinrovira I haven't implemented the feature yet! Subscribe to #785 to monitor my (slow) progress.

PS: I felt the timezone information might be too difficult to hide due to various side channels, giving up hiding it 🙃

@favonia
Owner

favonia commented Jul 7, 2024

@joaquinrovira I have a design problem now: none of the libraries I use (including the Go standard library) were designed with obfuscation in mind, and they often generate very detailed error messages containing "private" information. To 100% block the leakage of IP addresses or other information, error messages outside my control cannot be shown at all, which could make debugging very difficult if something goes wrong.
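
To illustrate why this is hard, here is a best-effort Go sketch that scrubs IPv4 literals from an error message produced elsewhere. It is an assumption-laden illustration, not something the updater does: `scrubIPv4` is a hypothetical helper, the pattern is deliberately simple, and it does not cover IPv6 addresses, hostnames, or URLs, so it cannot guarantee that nothing leaks.

```go
package main

import (
	"errors"
	"fmt"
	"regexp"
)

// ipv4Pattern matches dotted-quad literals such as 203.0.113.42.
// It does not validate octet ranges and ignores IPv6 entirely.
var ipv4Pattern = regexp.MustCompile(`\b(?:\d{1,3}\.){3}\d{1,3}\b`)

// scrubIPv4 replaces every IPv4-looking literal in msg with a placeholder.
func scrubIPv4(msg string) string {
	return ipv4Pattern.ReplaceAllString(msg, "[redacted]")
}

func main() {
	// An error shaped like the ones network libraries tend to produce.
	err := errors.New("dial tcp 203.0.113.42:443: connect: connection refused")
	fmt.Println(scrubIPv4(err.Error()))
	// prints "dial tcp [redacted]:443: connect: connection refused"
}
```

Anything the pattern misses would still reach the log, which is the trade-off between full redaction and useful error messages.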

Are you fine with your IPs usually hidden, but then potentially revealed when something goes wrong?

I also wonder if there's another way to solve your problem. For example, what if you redirect the logging into a file?

@joaquinrovira
Author

I am okay with errors showing this kind of information. I have not seen any errors while running the application yet.

Furthermore, if errors are logged to stderr, one can always redirect the output with 2>/var/log/ddns.log.

@favonia favonia modified the milestones: 1.13.0, 1.14.0 Jul 16, 2024
@favonia
Owner

favonia commented Jul 18, 2024

@joaquinrovira I'm currently going back to the drawing board because implementing this is more challenging than I thought, partially because of the elaborate system to generate various messages. I have a counterproposal: would it be easier to redirect everything into a file with a new configuration? The current code makes it trivial to redirect logging (originally for testing the message printer). For example:

LOG_OUTPUT=/path/to/log

where the special value - means the standard output. I understand that the downside is you could not view the log directly in the ArgoCD dashboard.

By the way, it might not make much sense to send only errors to stderr because there is no inherent difference between errors and non-errors. The separation is perhaps more meaningful when a program is part of a pipeline to handle data; its standard output would be the input to the next program.

@favonia favonia modified the milestones: 1.14.0, near future Jul 19, 2024