DGA-Prediction

Detecting Domain Names Generated by Non-classical Domain Generation Algorithms in Botnets

Alexa top 1m domains
The Open-Source Intelligence (OSINT) DGA feed from Bambenek Consulting provided the malicious domain names [31]. This feed was based on 50 DGAs that together contained 852,116 malicious domain names. The dataset was downloaded on May 23, 2018, and the DGA domains were generated on that day. On April 18, 2019, an additional dataset of 855,197 DGA-generated domains was downloaded from OSINT to test how model performance differs over time; it is treated as a separate test dataset.

DGAs to be implemented

Classical DGA domains for the following malware families: banjori, corebot, cryptolocker, dircrypt, kraken, lockyv2, pykspa, qakbot, ramdo, ramnit, and simda.
Word-based/dictionary DGA domains for the following classical malware families:
- gozi
- matsnu
- suppobox
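
The two groups above differ in how a domain label is built: character-based DGAs emit pseudo-random letter strings, while word-based DGAs concatenate dictionary words. The sketch below only illustrates that difference. The function names, the word list, the hashing scheme, and the TLDs are all assumptions made for this example, and none of it reproduces the algorithm of any malware family listed above.

```python
import hashlib
from datetime import date

# Hypothetical word list, for illustration only.
WORDS = ["garden", "table", "river", "window", "silver", "market"]


def _digest(seed, day):
    """Hash a seed string together with a date (ISO format)."""
    text = "{}-{}".format(seed, day.isoformat())
    return hashlib.sha256(text.encode("utf-8"))


def toy_classical_dga(seed, day, length=12):
    """Toy character-based generator: map hash bytes to letters a-z."""
    data = bytearray(_digest(seed, day).digest())  # 32 bytes, so length <= 32
    label = "".join(chr(ord("a") + b % 26) for b in data[:length])
    return label + ".com"


def toy_word_dga(seed, day, words, n_words=2):
    """Toy word-based generator: pick dictionary words using hash bytes."""
    data = bytearray(_digest(seed, day).digest())
    picked = [words[data[i] % len(words)] for i in range(n_words)]
    return "".join(picked) + ".net"


if __name__ == "__main__":
    day = date(2018, 5, 23)
    print(toy_classical_dga("demo", day))
    print(toy_word_dga("demo", day, WORDS))
```

The word-based output is built from ordinary words, so simple character-randomness features tend to separate it from benign domains less well than they separate the character-based output.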

Environment Setup Script

conda create -n <ENVIRONMENT_NAME> python=2.7 scikit-learn keras tensorflow-gpu matplotlib
source activate <ENVIRONMENT_NAME>
pip install tldextract