New update for https://github.com/duckduckgo/tracker-radar/tree/main/entities #130
Comments
Hey Pouneh, thanks a lot for sharing your findings, we really appreciate it!
Not sure if I understand your question, but this repo is the source. You can reference it like this:
|
Hey Konrad, thanks for your reply.
Ah, sorry for misunderstanding. We use public WHOIS data and SSL cert data, and we do manual investigation (e.g. by reviewing privacy policies). We also do semi-automatic cleanup. A small portion of the data comes from outside contributors. LMK if that helps!
I see, that makes sense. Thank you!
You're welcome. I will have to go back into the database to see what source it was.
I was working on a project to identify tracking/advertising domains on the Alexa Echo device. I used https://github.com/duckduckgo/tracker-radar/tree/main/entities to find the parent companies behind each domain name. Thanks for sharing such a great dataset!
I found that several domain names were not available in your dataset, so I manually looked them up on ICANN, crunchbase.com, or their own websites. Since some are tracking/advertising websites, I think it would be good to update your database. Here is the update:
{'acsechocaptiveportal.com' : 'Amazon Technologies, Inc.',
'amazon-dss.com' : 'Amazon Technologies, Inc.',
'amazonalexa.com': 'Amazon Technologies, Inc.',
'amcs-tachyon.com' : 'Amazon Technologies, Inc.',
'fireoscaptiveportal.com' : 'Amazon Technologies, Inc.',
'chtbl.com' : 'Chartable Holding Inc',
'chrt.fm' : 'Chartable Holding Inc',
'dillilabs.com' : 'Dilli Labs LLC',
'megaphone.fm' : 'Spotify AB',
'omny.fm' : 'Triton Digital, Inc.',
'podtrac.com' : 'Podtrac Inc',
'voiceapps.com' : 'Voice Apps LLC',
'mittendorf.net' : 'individual',
'doctorpooch.com' : 'Dilli Labs LLC',
'kwimer.com' : 'Highwinds Network Group, Inc'}
I'm gonna cite this dataset in our paper. Can I ask what the source of this dataset is?
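For anyone wanting to repeat the lookup described in this issue, here is a minimal Python sketch. It assumes a local checkout of the entities directory with one JSON file per entity, each with a "name" string and a "properties" list of domains; that layout is an assumption and should be checked against the repo. The function names (build_domain_index, find_owner, missing_domains) are hypothetical helpers, not part of tracker-radar.

```python
import json
from pathlib import Path


def build_domain_index(entities_dir):
    """Map each domain to the entity that lists it under "properties".

    Assumes one JSON file per entity with "name" and "properties" keys.
    """
    index = {}
    for path in sorted(Path(entities_dir).glob("*.json")):
        with open(path, encoding="utf-8") as fh:
            entity = json.load(fh)
        for domain in entity.get("properties", []):
            # Fall back to the file name if the entity has no "name" field.
            index[domain.lower()] = entity.get("name", path.stem)
    return index


def find_owner(hostname, index):
    """Return the owner of hostname, also trying its parent domains."""
    labels = hostname.lower().rstrip(".").split(".")
    # Stop before the bare top-level label (e.g. "com").
    for i in range(len(labels) - 1):
        candidate = ".".join(labels[i:])
        if candidate in index:
            return index[candidate]
    return None


def missing_domains(proposed, index):
    """Return the entries of proposed (domain -> owner) absent from index."""
    return {d: o for d, o in proposed.items() if find_owner(d, index) is None}
```

Calling missing_domains with a dict like the one above and an index built from the checkout would return only the domains that still need an entry. The parent-domain walk in find_owner is a simplification and does not consult the Public Suffix List.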