Statistics: Estimate how common key rotation within the same selector is #100

Open
foolo opened this issue Jun 21, 2024 · 4 comments · Fixed by #102
Labels
documentation, low

Comments

@foolo
Contributor

foolo commented Jun 21, 2024

Using a large set of emails, for each domain-selector pair in the set, try to DKIM-verify each email against the current DNS record, going back in time, and check whether there is a pattern where emails older than some date cannot be verified while newer ones can. This would indicate that the DKIM key has been rotated for that selector.

In any specific case there may of course be other reasons for a failure, so the data will be noisy, but for a large enough set of emails it should be possible to obtain some useful statistics.

Implementation idea:
Loop through the mbox file(s) and try to verify each email with dkimpy (https://pypi.org/project/dkimpy/).
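A minimal sketch of that loop, for illustration only. It assumes `dkim.verify` from dkimpy is called with its defaults, which check the first signature in the message against whatever DNS currently returns. The helper names (`parse_dkim_tags`, `record_message`, `scan_mbox`) are hypothetical, and the verifier is passed in as a parameter so the bookkeeping can be tried without network access.

```python
import email
import mailbox
from email.utils import parsedate_to_datetime


def parse_dkim_tags(header_value):
    """Parse the 'k=v; k=v' tag list of a DKIM-Signature header into a dict."""
    tags = {}
    for part in header_value.split(";"):
        if "=" in part:
            key, value = part.split("=", 1)
            # Folded headers contain line breaks and spaces; drop them all.
            tags[key.strip()] = "".join(value.split())
    return tags


def dkimpy_verify(raw_message):
    """Verify the first DKIM signature against the current DNS record."""
    import dkim  # provided by the dkimpy package; imported lazily

    try:
        return bool(dkim.verify(raw_message))
    except Exception:
        # Assumption: DNS errors and malformed messages count as failures.
        return False


def record_message(results, raw_message, verify=dkimpy_verify):
    """Append (date, verified) under the message's (domain, selector) key."""
    msg = email.message_from_bytes(raw_message)
    signature = msg["DKIM-Signature"]  # first signature header, if any
    date_header = msg["Date"]
    if signature is None or date_header is None:
        return
    tags = parse_dkim_tags(str(signature))
    if "d" not in tags or "s" not in tags:
        return
    try:
        sent = parsedate_to_datetime(str(date_header))
    except (TypeError, ValueError):
        return  # unparsable Date header; skip the message
    key = (tags["d"].lower(), tags["s"])
    results.setdefault(key, []).append((sent, verify(raw_message)))


def scan_mbox(path, verify=dkimpy_verify):
    """Return {(domain, selector): [(date, verified), ...]} for an mbox file."""
    results = {}
    box = mailbox.mbox(path)
    for key in box.keys():
        # get_bytes returns the stored message bytes without re-serializing.
        record_message(results, box.get_bytes(key), verify)
    return results
```

Calling `scan_mbox("combined_emails.mbox")` would produce one list of `(date, verified)` observations per domain-selector pair. mbox storage can alter messages (for example "From " line escaping), which is one more source of noise in the failures.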

@Divide-By-0
Member

Divide-By-0 commented Jun 21, 2024

We can also backdate DKIM keys in our database to the date of the earliest email that verifies with that key, domain, and selector! Right now we just add the current date, right?
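As a rough illustration of the backdating idea, the sketch below picks the earliest email that verified for each domain-selector pair. The record layout `(domain, selector, sent_date, verified)` and the function name `earliest_verified` are assumptions made for this example, not the project's actual schema.

```python
from datetime import date


def earliest_verified(records):
    """Map each (domain, selector) to the date of its earliest verified email."""
    first_seen = {}
    for domain, selector, sent, verified in records:
        if not verified:
            continue  # failed verifications say nothing about the current key
        key = (domain, selector)
        if key not in first_seen or sent < first_seen[key]:
            first_seen[key] = sent
    return first_seen


# Hypothetical records, for illustration only.
records = [
    ("example.com", "sel1", date(2021, 5, 1), False),
    ("example.com", "sel1", date(2022, 3, 10), True),
    ("example.com", "sel1", date(2023, 8, 2), True),
]
print(earliest_verified(records))
# {('example.com', 'sel1'): datetime.date(2022, 3, 10)}
```

Since verification runs against the current DNS record, the resulting date is only a lower bound on how long the current key has been in use.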

@foolo
Contributor Author

foolo commented Jun 26, 2024

> We can also backdate DKIM keys in our database to the date of the earliest email that verifies with that key, domain, and selector! Right now we just add the current date, right?

It's a good idea! It should be quick to implement, but I created an issue for it anyhow: #101

@foolo foolo linked a pull request Jun 27, 2024 that will close this issue
@foolo foolo reopened this Jun 27, 2024
@foolo
Contributor Author

foolo commented Jun 27, 2024

I have collected some stats, but it's not entirely obvious how to interpret them.
I have handcrafted a measure of how likely it is that a certain DSP (domain-selector pair) has been rotated, based on the verification status of its emails at different dates.
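The measure itself is not spelled out in this thread. Purely as an illustration, here is one hypothetical way to score the pattern from the opening comment (older emails fail, newer ones verify): try every split point in the date-ordered observations and keep the one where the most observations fit "fail before, verify after". The name `rotation_score` and the scoring rule are assumptions, not the measure used for the image.

```python
from datetime import date


def rotation_score(observations):
    """Return (score, split_date) for a list of (date, verified) pairs.

    score is the best fraction of observations consistent with
    "fails before split_date, verifies from split_date on"; it is 0.0
    when no split has a failure before it and a success after it.
    """
    ordered = sorted(observations, key=lambda obs: obs[0])
    total = len(ordered)
    total_ok = sum(1 for _, ok in ordered if ok)
    best_score, best_date = 0.0, None
    fails_before = 0
    ok_before = 0
    for i in range(1, total):
        # ordered[:i] is the "before" side, ordered[i:] the "after" side.
        if ordered[i - 1][1]:
            ok_before += 1
        else:
            fails_before += 1
        ok_after = total_ok - ok_before
        if fails_before == 0 or ok_after == 0:
            continue  # the split needs evidence on both sides
        score = (fails_before + ok_after) / total
        if score > best_score:
            best_score, best_date = score, ordered[i][0]
    return best_score, best_date


# Hypothetical observations for one DSP.
example = [
    (date(2020, 1, 1), False),
    (date(2021, 1, 1), False),
    (date(2022, 1, 1), True),
    (date(2023, 1, 1), True),
]
print(rotation_score(example))
# (1.0, datetime.date(2022, 1, 1))
```

A score near 1.0 with several observations on each side suggests a clean switch. Scores from only a handful of emails are weak evidence, so a minimum count per side would probably be needed in practice.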

The crazy image below shows the DSPs sorted by this measure.
For example, the first line shows a DSP which started failing in 2022.
I'd be thankful for any additional ideas or clarifying questions.

[Image: output2]

@foolo
Contributor Author

foolo commented Jun 27, 2024

@Divide-By-0 This is for combined_emails.mbox:
[Image: output]

@Divide-By-0 added the documentation and low labels Sep 28, 2024