The first-ever large-scale, manually labelled password dataset. The complete Hotmail 2009 dataset is labelled.

sirvan3tr/passwordninja

PasswordNinja

Process of Slicing the Passwords

Ethical Considerations

We have used real passwords belonging to individuals who were phished and thereby tricked into revealing them. This raises a few ethical questions: Will this in-depth analysis hurt those users? Will it reveal any other secrets? Will it identify any individuals? The dataset dates back to 2009, so it is highly unlikely that the same users still use the same passwords, even if an individual's identity could be revealed. Identities are also highly unlikely to be revealed through our data: we have not used any PII or any other information that could link back to individuals. This dataset has also been used previously in other research papers.

This new dataset and repository have been created solely for academic research purposes. The data, sourced from publicly accessible repositories such as SecLists, does not reveal any novel information. Our analysis also does not expose any new details about the original users associated with these passwords. Importantly, this dataset contains no personally identifiable information (PII).

The use of breached and leaked password datasets is well-established in published research, with some studies even incorporating PII such as email addresses and dates of birth. As such, this work aligns with existing practices in the field and does not necessitate additional scrutiny. Analyzing leaked datasets is crucial for advancing our understanding of how human-chosen secrets are employed, ultimately enabling us to enhance their resilience against malicious actors.

If you have any concerns, please submit an issue through this GitHub repository.
