diff --git a/content/all_datasets.md b/content/all_datasets.md index e01b26e..16966e8 100644 --- a/content/all_datasets.md +++ b/content/all_datasets.md @@ -6,56 +6,57 @@ full-width: true before-content: gh_buttons.html --- -| Name | Network/Host Data | TL;DR | Year | Setting | OS Type | Labeled?ยน | Data Type/Source | Packed Size | Unpacked Size | -|----------------------------------------------------------------------------------------------------|:-----------------:|-------------------------------------------------------------------------------------------------------------------------------------------------------------------|----------:|---------------|-----------------------|:---------:|--------------------------------------------------------------------------------------|------------:|--------------:| -| [AIT Alert Dataset](../datasets/ait_alert_dataset) | Both | Alerts generated from the AIT log dataset, including labels. Only caveat is the lack of Windows machines | 2023 | Enterprise IT | Linux | ๐ŸŸฉ | Wazuh, Suricata and AMiner alerts | 96 MB | 2,9 GB | -| [OTFR Security Datasets - LSASS Campaign](../datasets/otfr_lsass_campaign) | Both | Very small simulation focusing on exploiting Windows' LSASS.exe. Lacking documentation, no labels and no user behavior | 2023 | Single OS | Windows | ๐ŸŸฅ | pcaps, Windows events, Zeek logs | 423 MB | 1 GB | -| [AIT Log Dataset](../datasets/ait_log_dataset) | Both | Huge variety of labeled logs collected from multiple simulation runs of an enterprise network under attack. With user emulation. but only Linux machines | 2022 | Enterprise IT | Linux | ๐ŸŸฉ | pcaps, Suricata alerts, misc. 
logs (Apache, auth, dns, vpn, audit, suricata, syslog) | 130 GB | 206 GB | -| [CLUE-LDS](../datasets/clue_lds) | Host | Database of real user behavior without known attacks, for evaluation of methods detecting shifts in user behavior | 2022 | Subsystem | Undisclosed | ๐ŸŸฅ | Custom event logs | 640 MB | 14,9 GB | -| [EVTX to MITRE ATT&CK](../datasets/evtx_to_mitre_attck) | Host | Small dataset providing various events corresponding to certain MITRE tactics/techniques | 2022 | Single OS | Windows | ๐ŸŸฉ | Windows events | <1 GB | <1 GB | -| [OTFR Security Datasets - Atomic](../datasets/otfr_atomic) | Both | Various small datasets, each corresponding to a specific MITRE tactic/technique. Lacks user simulation / underlying scenario and does not provide explicit labels | 2019-2022 | Single OS | Windows, Linux, Cloud | ๐ŸŸจ | pcaps, Windows events, auditd logs, AWS CloudTrail logs | 125 MB | - | -| [PWNJUTSU](../datasets/pwnjutsu) | Both | Rich collection of complex attacks executed by various red team participants each acting in a small network, but not labeled | 2022 | Miscellaneous | Windows, Linux | ๐ŸŸฅ | pcaps, Windows events, Sysmon, auditd, various logs (Apache, auth, dns, ssh, etc.) | 82 GB | - | -| [NF-UQ-NIDS](../datasets/nf_uq_nids) | Network | Combination of four distinct network datasets using a newly proposed set of standardized features | 2021 | Miscellaneous | Windows, Linux, MacOS | ๐ŸŸฉ | Custom NetFlows | 2 GB | 14,8 GB | -| [OTFR Security Datasets - Log4Shell](../datasets/otfr_log4shell) | Both | Very small simulation focusing on the Log4j vulnerability. Lacking documentation, no explicit labels and no user behavior | 2021 | Single OS | Linux | ๐ŸŸจ | pcaps, Ubuntu events | <1 MB | 1 MB | -| [OTFR Security Datasets - SimuLand Golden SAML](../datasets/otfr_golden_saml) | Host | Barely a dataset, only contains very few traces for some specific events. At most usable to test specific Windows detection rules. 
| 2021 | Enterprise IT | Windows | ๐ŸŸฉ | Windows Events | - | <1 MB | -| [SOCBED Example Dataset](../datasets/socbed_dataset) | Both | Generated using the SOCBED framework, demonstrating reproducible dataset creation, though current attacks are on the basic side | 2021 | Enterprise IT | Windows, Linux | ๐ŸŸฅ | Windows events, Linux events, packetbeat | 78 MB | 1,3 GB | -| [Unraveled](../datasets/unraveled) | Both | Large dataset with intricate labeling, though the focus seems to be on network flows. Mapping will be annoying. | 2021 | Enterprise IT | Windows, Linux | ๐ŸŸฉ | pcaps, misc. logs (syslog, audit, auth, Snort) | - | 22 GB | -| [DAPT 2020](../datasets/dapt2020) | Both | Focuses on attacks mimicking those of an APT group, executed in a rather small environment | 2020 | Enterprise IT | Undisclosed | ๐ŸŸฉ | NetFlows, misc. logs (DNS, syslog, auditd, apache, auth, various services) | 460 MB | - | -| [OpTC](../datasets/optc) | Both | Huge amount of data and interesting attacks, but possibly hard to use due to uncommon event format and requiring semi-manual labeling | 2020 | Enterprise IT | Windows | ๐ŸŸจ | Custom event logs, Zeek events | - | 1 TB | -| [OTFR Security Datasets - APT 29](../datasets/otfr_apt_29) | Both | Replication of APT29 evaluation developed by MITRE. Well made and documented, but without labels or user behavior | 2020 | Enterprise IT | Windows, Linux | ๐ŸŸฅ | pcaps, Windows events, Zeek events | 126 MB | 2 GB | -| [CICDDoS2019](../datasets/cic_ddos) | Network | Dataset focusing on various DDoS attacks, covering a broad range of categories. 
Includes benign behavior, but only for Pcaps, not NetFlows | 2019 | Enterprise IT | Windows, Linux | ๐ŸŸฉ | Pcaps, NetFlows, Windows events, Ubuntu events | 24,4 GB | - | -| [DARPA TC5](../datasets/darpa_tc5) | Host | Custom event logs from network under attack from APT groups, designed to facilitate provenance tracking | 2019 | Undisclosed | Undisclosed | ๐ŸŸจ | Custom event logs | - | - | -| [LID-DS 2019](../datasets/lids_ds_2019) | Host | Contains system calls + associated data/metadata for a variety of Linux exploits, includes normal behavior | 2019 | Single OS | Linux | ๐ŸŸจ | Sequences of syscalls with extended information | 13 GB | - | -| [OTFR Security Datasets - APT 3](../datasets/otfr_apt_3) | Host | Replication of APT3 evaluation developed by MITRE. Lacking documentation, no labels and no user behavior | 2019 | Enterprise IT | Windows, Linux | ๐ŸŸฅ | Windows events | 30 MB | 855 MB | -| [ASNM Datasets](../datasets/asnm_datasets) | Network | Specialized features extracted from instances of remote buffer overflow attacks for the purpose of anomaly-based detection | 2009-2018 | Miscellaneous | Windows, Linux | ๐ŸŸฉ | Custom NetFlows | 21 MB | 95 GB | -| [AWSCTD](../datasets/awsctd) | Host | Syscalls collected from ~10k malware samples running on Windows 7, no user emulation | 2018 | Single OS | Windows | ๐ŸŸฉ | Sequences of syscall numbers | 10 MB | 558 MB | -| [CSE-CIC-IDS2018](../datasets/cse_cic_ids2018) | Both | Simulation of large enterprise IT (450 machines) with user emulation and various attacks, includes host and network logs, but only the latter are labeled | 2018 | Enterprise IT | Windows, Linux, MacOS | ๐ŸŸฉ | pcaps, NetFlows, Windows events, Ubuntu events | 220 GB | - | -| [DARPA TC3](../datasets/darpa_tc3) | Host | Custom event logs from network under attack, designed to facilitate provenance tracking | 2018 | Undisclosed | Undisclosed | ๐ŸŸจ | Custom event logs | 115 GB | - | -| [NGIDS-DS](../datasets/nigds_dataset) | Both | Enterprise 
network undergoing variety of attacks using IXIA PerfectStorm hardware. Seems to lack host user behavior, does not provide raw host logs | 2018 | Enterprise IT | Linux | ๐ŸŸฉ | pcaps, custom host features | 941 MB | 13,4 GB | -| [CIC DoS](../datasets/cic_dos) | Network | Dataset focusing on different DoS attacks targeting the application layer (instead of network layer), but no longer available | 2017 | Enterprise IT | Linux | ๐ŸŸฉ | Network traffic (unknown format) | - | 4,6 GB | -| [CIC-IDS2017](../datasets/cic_ids2017) | Network | Simulation of medium-sized company network under attack, focuses solely on network traffic | 2017 | Enterprise IT | Windows, Linux | ๐ŸŸฉ | pcaps, NetFlows, custom network features | 48,4 GB | 50 GB | -| [Unified Host and Network Data Set](../datasets/unified_host_and_network_dataset) | Both | Selection of network and host events collected from operational environment, but without any attacks | 2017 | Enterprise IT | Windows, Linux | ๐ŸŸฅ | NetFlows, Windows events | - | - | -| [UGR'16](../datasets/ugr16) | Network | Network flows collected from real network over a long period of time, with some attack traffic injected | 2016 | Enterprise IT | Undisclosed | ๐ŸŸฉ | NetFlows | 236 GB | - | -| [Comprehensive, Multi-Source Cyber-Security Events](../datasets/comp_multi_source_cybersec_events) | Both | Various events from production network with red team activity, but extremely limited information per event | 2015 | Enterprise IT | Windows, Linux | ๐ŸŸฉ | Custom event logs (auth, proc, network flows, dns, redteam) | 12 GB | - | -| [Kyoto Honeypot](../datasets/kyoto_honeypot) | Network | Collection of features derived from attack traffic targeting honeypots over the span of 9 years | 2006-2015 | Miscellaneous | Windows, Unix, MacOS | ๐ŸŸฉ | Custom network features | 20 GB | - | -| [UNSW-NB15](../datasets/unsw_nb15) | Network | Custom network undergoing a variety of attacks using IXIA PerfectStorm hardware. 
Mostly geared towards anomaly-based NIDS | 2015 | Undisclosed | Undisclosed | ๐ŸŸฉ | pcaps, custom network features | >100 GB | - | -| [ADFA-WD](../datasets/adfa_wd) | Host | Mostly intended for anomaly-based stuff leveraging library calls, explores interesting concept of stealthy shellcode | 2014 | Single OS | Windows | ๐ŸŸจ | Sequences of dll calls, Windows events (dll calls only) | 403 MB | 13,6 GB | -| [Skopik 2014](../datasets/skopik_et_al) | Host | Focus on realistically emulating user behavior, does not include attacks | 2014 | Enterprise IT | Linux | ๐ŸŸฅ | misc. logs (Apache, database, mail server, bug tracker app) | - | - | -| [Twente 2014](../datasets/twente_2014) | Both | Anonymized network flows and host logs from real network, but only those related to ssh authentication, focusing on detecting related brute force attacks | 2014 | Enterprise IT | Undisclosed | ๐ŸŸฉ | NetFlows | 2,42 GB | 5,8 GB | -| [User-Computer Associations in Time](../datasets/user_computer_associations) | Host | Large number of authentication events over a period of 9 months, but with very little detail and without any attacks | 2014 | Enterprise IT | Undisclosed | ๐ŸŸฅ | Custom auth event logs | 2,3 GB | - | -| [ADFA-LD](../datasets/adfa_ld) | Host | Purely intended for anomaly-based approaches, provides only syscall numbers | 2013 | Single OS | Linux | ๐ŸŸฉ | Sequences of syscall numbers | 2 MB | 17 MB | -| [CIDD](../datasets/cidd) | Network | Spin on the DARPA'98 dataset, correlating user behavior over different systems/environments for behavior-based IDSs | 2012 | Military IT | Unix | ๐ŸŸฉ | Sequences of user "audits" | - | 22 GB | -| [ISCX IDS 2012](../datasets/iscx_ids_2012) | Network | Focus on realistic traffic generation in a company network, combined with some basic attacks | 2012 | Enterprise IT | Windows, Linux | ๐ŸŸฉ | pcaps | 84 GB | 87 GB | -| [TUIDS](../datasets/tuids) | Network | Dataset focusing on DoS attacks, but very poorly documented | 2012 | Enterprise IT | 
Undisclosed | ๐ŸŸฉ | pcaps, NetFlows | - | - | -| [VAST Challenge 2012](../datasets/vast_2012) | Network | Originated from a challenge about data analytics, focus an a large network being the victim of a botnet | 2012 | Enterprise IT | Undisclosed | ๐ŸŸจ | Snort alerts, firewall logs | 186 MB | 2,9 GB | -| [CTU 13](../datasets/ctu_13) | Network | Collection of various botnet behavior combined with loads of background traffic, but very limited feature space | 2011 | Enterprise IT | Windows, Undisclosed | ๐ŸŸฉ | pcaps, NetFlows, Bro logs | - | 697 GB | -| [VAST Challenge 2011](../datasets/vast_2011) | Both | Originated from a challenge about data analytics, focus on network but also contains host logs. Labeling is a bit lacking | 2011 | Enterprise IT | Windows | ๐ŸŸจ | pcaps, Windows events, misc. logs (firewall, Snort, Nessus) | 940 MB | 9,3 GB | -| [CDX CTF 2009](../datasets/cdx_2009) | Both | Dataset captured from a CTF event, generally intended to provide methods for reliable generating labeled datasets from such events | 2009 | Enterprise IT | Windows, Linux | ๐ŸŸจ | pcaps, Snort IDS alerts, Apache logs, Splunk logs | 12 GB | 15,3 GB | -| [NSL-KDD](../datasets/nsl_kdd_dataset) | Network | An improvement of the original KDD'99 dataset, but still outdated at its core | 2009 | Military IT | Unix | ๐ŸŸฉ | Connection records | 6 MB | 19 MB | -| [Twente 2009](../datasets/twente_2009) | Network | Intricately labeled network flows + alerts collected from a single honeypot over the span of 6 days | 2009 | Single OS | Linux | ๐ŸŸฉ | NetFlows | 303 MB | 1,9 GB | -| [gureKDDCup](../datasets/gure_kddcup) | Network | An extension of the KDDCup 1999 dataset, adding additional information about payloads to each connection record | 2008 | Military IT | Unix | ๐ŸŸฉ | Connection records with payload information | 10 GB | - | -| [KDD Cup 1999](../datasets/kdd_cup_1999) | Network | Network connection events derived from simulated U.S. Air Force network under attack. 
No longer appropriate to use for multiple reasons | 1999 | Military IT | Unix | ๐ŸŸฉ | Connection records | 18 MB | 743 MB | -| [DARPA'98 Intrusion Detection Program](../datasets/darpa98) | Both | Simulation of a small U.S. Air Force network under attack. No longer appropriate to use for a multiple reasons | 1998 | Military IT | Unix | ๐ŸŸจ | tcpdumps, host audit logs, file system dumps | 5 GB | - | +| Name | Network/Host Data | TL;DR | Year | Setting | OS Type | Labeled?ยน | Data Type/Source | Packed Size | Unpacked Size | +|----------------------------------------------------------------------------------------------------|:-----------------:|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------|----------:|---------------|-----------------------|:---------:|--------------------------------------------------------------------------------------|------------:|--------------:| +| [AIT Alert Dataset](../datasets/ait_alert_dataset) | Both | Alerts generated from the AIT log dataset, including labels. Only caveat is the lack of Windows machines | 2023 | Enterprise IT | Linux | ๐ŸŸฉ | Wazuh, Suricata and AMiner alerts | 96 MB | 2,9 GB | +| [OTFR Security Datasets - LSASS Campaign](../datasets/otfr_lsass_campaign) | Both | Very small simulation focusing on exploiting Windows' LSASS.exe. Lacking documentation, no labels and no user behavior | 2023 | Single OS | Windows | ๐ŸŸฅ | pcaps, Windows events, Zeek logs | 423 MB | 1 GB | +| [AIT Log Dataset](../datasets/ait_log_dataset) | Both | Huge variety of labeled logs collected from multiple simulation runs of an enterprise network under attack. With user emulation. but only Linux machines | 2022 | Enterprise IT | Linux | ๐ŸŸฉ | pcaps, Suricata alerts, misc. 
logs (Apache, auth, dns, vpn, audit, suricata, syslog) | 130 GB | 206 GB | +| [CLUE-LDS](../datasets/clue_lds) | Host | Database of real user behavior without known attacks, for evaluation of methods detecting shifts in user behavior | 2022 | Subsystem | Undisclosed | ๐ŸŸฅ | Custom event logs | 640 MB | 14,9 GB | +| [EVTX to MITRE ATT&CK](../datasets/evtx_to_mitre_attck) | Host | Small dataset providing various events corresponding to certain MITRE ATT&CK tactics/techniques | 2022 | Single OS | Windows | ๐ŸŸฉ | Windows events | <1 GB | <1 GB | +| [OTFR Security Datasets - Atomic](../datasets/otfr_atomic) | Both | Various small datasets, each corresponding to a specific MITRE ATT&CK tactic/technique. Lacks user simulation / underlying scenario and does not provide explicit labels | 2019-2022 | Single OS | Windows, Linux, Cloud | ๐ŸŸจ | pcaps, Windows events, auditd logs, AWS CloudTrail logs | 125 MB | - | +| [PWNJUTSU](../datasets/pwnjutsu) | Both | Rich collection of complex attacks executed by various red team participants each acting in a small network, but not labeled | 2022 | Miscellaneous | Windows, Linux | ๐ŸŸฅ | pcaps, Windows events, Sysmon, auditd, various logs (Apache, auth, dns, ssh, etc.) | 82 GB | - | +| [UWF-ZeekData22](../datasets/uwf_zeekdata22) | Network | Traffic collected from a university's wargaming course. Covers all MITRE ATT&CK tactics, though the overwhelming majority is simple recon and attacks are poorly documented | 2022 | Enterprise IT | Windows, Linux | ๐ŸŸฉ | pcaps, Zeek logs | - | 209 GB | +| [NF-UQ-NIDS](../datasets/nf_uq_nids) | Network | Combination of four distinct network datasets using a newly proposed set of standardized features | 2021 | Miscellaneous | Windows, Linux, MacOS | ๐ŸŸฉ | Custom NetFlows | 2 GB | 14,8 GB | +| [OTFR Security Datasets - Log4Shell](../datasets/otfr_log4shell) | Both | Very small simulation focusing on the Log4j vulnerability. 
Lacking documentation, no explicit labels and no user behavior | 2021 | Single OS | Linux | ๐ŸŸจ | pcaps, Ubuntu events | <1 MB | 1 MB | +| [OTFR Security Datasets - SimuLand Golden SAML](../datasets/otfr_golden_saml) | Host | Barely a dataset, only contains very few traces for some specific events. At most usable to test specific Windows detection rules. | 2021 | Enterprise IT | Windows | ๐ŸŸฉ | Windows Events | - | <1 MB | +| [SOCBED Example Dataset](../datasets/socbed_dataset) | Both | Generated using the SOCBED framework, demonstrating reproducible dataset creation, though current attacks are on the basic side | 2021 | Enterprise IT | Windows, Linux | ๐ŸŸฅ | Windows events, Linux events, packetbeat | 78 MB | 1,3 GB | +| [Unraveled](../datasets/unraveled) | Both | Large dataset with intricate labeling, though the focus seems to be on network flows. Mapping will be annoying. | 2021 | Enterprise IT | Windows, Linux | ๐ŸŸฉ | pcaps, misc. logs (syslog, audit, auth, Snort) | - | 22 GB | +| [DAPT 2020](../datasets/dapt2020) | Both | Focuses on attacks mimicking those of an APT group, executed in a rather small environment | 2020 | Enterprise IT | Undisclosed | ๐ŸŸฉ | NetFlows, misc. logs (DNS, syslog, auditd, apache, auth, various services) | 460 MB | - | +| [OpTC](../datasets/optc) | Both | Huge amount of data and interesting attacks, but possibly hard to use due to uncommon event format and requiring semi-manual labeling | 2020 | Enterprise IT | Windows | ๐ŸŸจ | Custom event logs, Zeek events | - | 1 TB | +| [OTFR Security Datasets - APT 29](../datasets/otfr_apt_29) | Both | Replication of APT29 evaluation developed by MITRE. Well made and documented, but without labels or user behavior | 2020 | Enterprise IT | Windows, Linux | ๐ŸŸฅ | pcaps, Windows events, Zeek events | 126 MB | 2 GB | +| [CICDDoS2019](../datasets/cic_ddos) | Network | Dataset focusing on various DDoS attacks, covering a broad range of categories. 
Includes benign behavior, but only for Pcaps, not NetFlows | 2019 | Enterprise IT | Windows, Linux | ๐ŸŸฉ | Pcaps, NetFlows, Windows events, Ubuntu events | 24,4 GB | - | +| [DARPA TC5](../datasets/darpa_tc5) | Host | Custom event logs from network under attack from APT groups, designed to facilitate provenance tracking | 2019 | Undisclosed | Undisclosed | ๐ŸŸจ | Custom event logs | - | - | +| [LID-DS 2019](../datasets/lids_ds_2019) | Host | Contains system calls + associated data/metadata for a variety of Linux exploits, includes normal behavior | 2019 | Single OS | Linux | ๐ŸŸจ | Sequences of syscalls with extended information | 13 GB | - | +| [OTFR Security Datasets - APT 3](../datasets/otfr_apt_3) | Host | Replication of APT3 evaluation developed by MITRE. Lacking documentation, no labels and no user behavior | 2019 | Enterprise IT | Windows, Linux | ๐ŸŸฅ | Windows events | 30 MB | 855 MB | +| [ASNM Datasets](../datasets/asnm_datasets) | Network | Specialized features extracted from instances of remote buffer overflow attacks for the purpose of anomaly-based detection | 2009-2018 | Miscellaneous | Windows, Linux | ๐ŸŸฉ | Custom NetFlows | 21 MB | 95 GB | +| [AWSCTD](../datasets/awsctd) | Host | Syscalls collected from ~10k malware samples running on Windows 7, no user emulation | 2018 | Single OS | Windows | ๐ŸŸฉ | Sequences of syscall numbers | 10 MB | 558 MB | +| [CSE-CIC-IDS2018](../datasets/cse_cic_ids2018) | Both | Simulation of large enterprise IT (450 machines) with user emulation and various attacks, includes host and network logs, but only the latter are labeled | 2018 | Enterprise IT | Windows, Linux, MacOS | ๐ŸŸฉ | pcaps, NetFlows, Windows events, Ubuntu events | 220 GB | - | +| [DARPA TC3](../datasets/darpa_tc3) | Host | Custom event logs from network under attack, designed to facilitate provenance tracking | 2018 | Undisclosed | Undisclosed | ๐ŸŸจ | Custom event logs | 115 GB | - | +| [NGIDS-DS](../datasets/nigds_dataset) | Both | Enterprise 
network undergoing variety of attacks using IXIA PerfectStorm hardware. Seems to lack host user behavior, does not provide raw host logs | 2018 | Enterprise IT | Linux | ๐ŸŸฉ | pcaps, custom host features | 941 MB | 13,4 GB | +| [CIC DoS](../datasets/cic_dos) | Network | Dataset focusing on different DoS attacks targeting the application layer (instead of network layer), but no longer available | 2017 | Enterprise IT | Linux | ๐ŸŸฉ | Network traffic (unknown format) | - | 4,6 GB | +| [CIC-IDS2017](../datasets/cic_ids2017) | Network | Simulation of medium-sized company network under attack, focuses solely on network traffic | 2017 | Enterprise IT | Windows, Linux | ๐ŸŸฉ | pcaps, NetFlows, custom network features | 48,4 GB | 50 GB | +| [Unified Host and Network Data Set](../datasets/unified_host_and_network_dataset) | Both | Selection of network and host events collected from operational environment, but without any attacks | 2017 | Enterprise IT | Windows, Linux | ๐ŸŸฅ | NetFlows, Windows events | - | - | +| [UGR'16](../datasets/ugr16) | Network | Network flows collected from real network over a long period of time, with some attack traffic injected | 2016 | Enterprise IT | Undisclosed | ๐ŸŸฉ | NetFlows | 236 GB | - | +| [Comprehensive, Multi-Source Cyber-Security Events](../datasets/comp_multi_source_cybersec_events) | Both | Various events from production network with red team activity, but extremely limited information per event | 2015 | Enterprise IT | Windows, Linux | ๐ŸŸฉ | Custom event logs (auth, proc, network flows, dns, redteam) | 12 GB | - | +| [Kyoto Honeypot](../datasets/kyoto_honeypot) | Network | Collection of features derived from attack traffic targeting honeypots over the span of 9 years | 2006-2015 | Miscellaneous | Windows, Unix, MacOS | ๐ŸŸฉ | Custom network features | 20 GB | - | +| [UNSW-NB15](../datasets/unsw_nb15) | Network | Custom network undergoing a variety of attacks using IXIA PerfectStorm hardware. 
Mostly geared towards anomaly-based NIDS | 2015 | Undisclosed | Undisclosed | ๐ŸŸฉ | pcaps, custom network features | >100 GB | - | +| [ADFA-WD](../datasets/adfa_wd) | Host | Mostly intended for anomaly-based stuff leveraging library calls, explores interesting concept of stealthy shellcode | 2014 | Single OS | Windows | ๐ŸŸจ | Sequences of dll calls, Windows events (dll calls only) | 403 MB | 13,6 GB | +| [Skopik 2014](../datasets/skopik_et_al) | Host | Focus on realistically emulating user behavior, does not include attacks | 2014 | Enterprise IT | Linux | ๐ŸŸฅ | misc. logs (Apache, database, mail server, bug tracker app) | - | - | +| [Twente 2014](../datasets/twente_2014) | Both | Anonymized network flows and host logs from real network, but only those related to ssh authentication, focusing on detecting related brute force attacks | 2014 | Enterprise IT | Undisclosed | ๐ŸŸฉ | NetFlows | 2,42 GB | 5,8 GB | +| [User-Computer Associations in Time](../datasets/user_computer_associations) | Host | Large number of authentication events over a period of 9 months, but with very little detail and without any attacks | 2014 | Enterprise IT | Undisclosed | ๐ŸŸฅ | Custom auth event logs | 2,3 GB | - | +| [ADFA-LD](../datasets/adfa_ld) | Host | Purely intended for anomaly-based approaches, provides only syscall numbers | 2013 | Single OS | Linux | ๐ŸŸฉ | Sequences of syscall numbers | 2 MB | 17 MB | +| [CIDD](../datasets/cidd) | Network | Spin on the DARPA'98 dataset, correlating user behavior over different systems/environments for behavior-based IDSs | 2012 | Military IT | Unix | ๐ŸŸฉ | Sequences of user "audits" | - | 22 GB | +| [ISCX IDS 2012](../datasets/iscx_ids_2012) | Network | Focus on realistic traffic generation in a company network, combined with some basic attacks | 2012 | Enterprise IT | Windows, Linux | ๐ŸŸฉ | pcaps | 84 GB | 87 GB | +| [TUIDS](../datasets/tuids) | Network | Dataset focusing on DoS attacks, but very poorly documented | 2012 | Enterprise IT | 
Undisclosed | ๐ŸŸฉ | pcaps, NetFlows | - | - | +| [VAST Challenge 2012](../datasets/vast_2012) | Network | Originated from a challenge about data analytics, focus an a large network being the victim of a botnet | 2012 | Enterprise IT | Undisclosed | ๐ŸŸจ | Snort alerts, firewall logs | 186 MB | 2,9 GB | +| [CTU 13](../datasets/ctu_13) | Network | Collection of various botnet behavior combined with loads of background traffic, but very limited feature space | 2011 | Enterprise IT | Windows, Undisclosed | ๐ŸŸฉ | pcaps, NetFlows, Bro logs | - | 697 GB | +| [VAST Challenge 2011](../datasets/vast_2011) | Both | Originated from a challenge about data analytics, focus on network but also contains host logs. Labeling is a bit lacking | 2011 | Enterprise IT | Windows | ๐ŸŸจ | pcaps, Windows events, misc. logs (firewall, Snort, Nessus) | 940 MB | 9,3 GB | +| [CDX CTF 2009](../datasets/cdx_2009) | Both | Dataset captured from a CTF event, generally intended to provide methods for reliable generating labeled datasets from such events | 2009 | Enterprise IT | Windows, Linux | ๐ŸŸจ | pcaps, Snort IDS alerts, Apache logs, Splunk logs | 12 GB | 15,3 GB | +| [NSL-KDD](../datasets/nsl_kdd_dataset) | Network | An improvement of the original KDD'99 dataset, but still outdated at its core | 2009 | Military IT | Unix | ๐ŸŸฉ | Connection records | 6 MB | 19 MB | +| [Twente 2009](../datasets/twente_2009) | Network | Intricately labeled network flows + alerts collected from a single honeypot over the span of 6 days | 2009 | Single OS | Linux | ๐ŸŸฉ | NetFlows | 303 MB | 1,9 GB | +| [gureKDDCup](../datasets/gure_kddcup) | Network | An extension of the KDDCup 1999 dataset, adding additional information about payloads to each connection record | 2008 | Military IT | Unix | ๐ŸŸฉ | Connection records with payload information | 10 GB | - | +| [KDD Cup 1999](../datasets/kdd_cup_1999) | Network | Network connection events derived from simulated U.S. Air Force network under attack. 
No longer appropriate to use for multiple reasons | 1999 | Military IT | Unix | ๐ŸŸฉ | Connection records | 18 MB | 743 MB | +| [DARPA'98 Intrusion Detection Program](../datasets/darpa98) | Both | Simulation of a small U.S. Air Force network under attack. No longer appropriate to use for a multiple reasons | 1998 | Military IT | Unix | ๐ŸŸจ | tcpdumps, host audit logs, file system dumps | 5 GB | - | ### Legend diff --git a/content/contributing.md b/content/contributing.md index 4ad010b..7db54a0 100644 --- a/content/contributing.md +++ b/content/contributing.md @@ -16,5 +16,7 @@ If you want to contribute a new dataset entry, please use this [template](https: A new entry should consist of said template filled out and named appropriately, placed in `/content/datasets/`. Additionally, a new row should be added to the list of all datasets in `/content/all_datasets.md`, adding information to each cell as needed. +You can find a list of datasets that we are aware of, but which do not have an entry yet, in [this issue](https://github.com/fkie-cad/intrusion-detection-datasets/issues/13) + On every page you will also find an "Edit Page" button at the bottom leading you to GitHub, where you will be prompted to fork this repository - saving you a few clicks when you want to edit an existing entry. While contributions should generally be aimed towards datasets, suggestions regarding the underlying structure (like the website itself) are of course also welcome. 
\ No newline at end of file
diff --git a/content/datasets/isot_botnet.md b/content/datasets/isot_botnet.md
new file mode 100644
index 0000000..7c92d28
--- /dev/null
+++ b/content/datasets/isot_botnet.md
@@ -0,0 +1,62 @@
+---
+title: ISOT BOTNET
+---
+
+- [Overview](#overview)
+- [Environment](#environment)
+- [Activity](#activity)
+- [Contained Data](#contained-data)
+- [Papers](#papers)
+- [Links](#links)
+
+| | |
+|--------------------------|--------------------------------------------------------------------------------|
+| **Network Data Source** | pcaps |
+| **Network Data Labeled** | Yes |
+| **Host Data Source** | - |
+| **Host Data Labeled** | - |
+| | |
+| **Overall Setting** | Enterprise IT |
+| **OS Types** | Undisclosed |
+| **Number of Machines** | 2000+ |
+| **Total Runtime** | n/a |
+| **Year of Collection** | 2004-2010 |
+| **Attack Categories** | Botnets (Storm, Waledac) |
+| **Benign Activity** | Real users |
+| | |
+| **Packed Size** | 3 GB |
+| **Unpacked Size** | 10,6 GB |
+| **Download Link** | [goto](https://drive.google.com/file/d/1X1zPBJFPHU1ToQbpyd1Is1tJJuz2BeRd/view) |
+
+***
+
+### Overview
+The ISOT Botnet dataset is an amalgamation of several individual datasets, two containing malicious botnet traffic and five consisting of benign traffic.
+Malicious data was taken from the "French Chapter" of the Honeynet project, while (anonymized) benign traces come from the LBNL Enterprise Trace Repository.
+The combination of these traces, after some preprocessing to make them appear as if they stemmed from the same network, is then used to test several botnet detection methods leveraging network behavior analysis and machine learning.
+However, we were unable to find any information regarding the source of the malicious traces, as the linked pages no longer exist and further searching proved fruitless.
+
+### Environment
+The merged dataset contains traces from 23 individual subnets, 22 with only benign traffic (stemming from the LBNL traces) and one with both malicious and benign traffic (merged traffic from both sources).
+The IPs of the latter subnet can be obtained from Table 2 of the linked documentation.
+Information regarding services, operating systems and so on is not available.
+
+### Activity
+Details regarding activity are not available;
+there might be some additional information hidden in LBNL publications, but we consider this to be out of scope.
+
+### Contained Data
+As a first step to merge benign and malicious traces, the IP addresses of infected machines were mapped to two of the machines providing benign background traffic.
+Then, the authors used the `TcpReplay` tool to replay all traces on the same network interface in order to homogenize the network behavior shown by individual datasets.
+These traces are available in the form of a single large pcap file with 1,675,424 unique flows, of which 3.33% are malicious.
+Labels are available via malicious traffic having a specific MAC, as per Table 2 of the linked documentation.
+
+It should be noted that applying machine learning methods to merged datasets bears some additional risks;
+researchers must ensure that results are not a byproduct of anomalies that remained after the merging process, which might not actually be caused by the malicious behavior, but rather by the simple fact that these traces stem from separate environments.
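The MAC-based labeling described above could be applied roughly as follows. This is a minimal sketch in Python, not the authors' tooling: the `Packet` record, the helper names and the MAC addresses are hypothetical placeholders (the real addresses are listed in Table 2 of the linked documentation), and it assumes a packet counts as malicious if either its source or destination MAC appears in that list.

```python
from dataclasses import dataclass

# Hypothetical placeholder values; the actual MAC addresses of the
# infected machines must be taken from Table 2 of the documentation.
MALICIOUS_MACS = {"aa:aa:aa:aa:aa:aa", "bb:bb:bb:bb:bb:bb"}


@dataclass(frozen=True)
class Packet:
    """Hypothetical minimal record of a packet parsed from the pcap."""
    src_mac: str
    dst_mac: str


def normalize_mac(mac: str) -> str:
    """Lower-case a MAC address and use ':' as the separator."""
    return mac.strip().lower().replace("-", ":")


def is_malicious(packet: Packet, malicious_macs=MALICIOUS_MACS) -> bool:
    """Assumption: traffic is malicious if either endpoint has a listed MAC."""
    return (
        normalize_mac(packet.src_mac) in malicious_macs
        or normalize_mac(packet.dst_mac) in malicious_macs
    )


def label_packets(packets):
    """Return (packet, label) pairs with label 'malicious' or 'benign'."""
    return [(p, "malicious" if is_malicious(p) else "benign") for p in packets]
```

Packet parsing itself is left out on purpose; any pcap reader that exposes source and destination MAC addresses could feed `label_packets`.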
+
+### Papers
+- [Detecting P2P botnets through network behavior analysis and machine learning (2011)](https://doi.org/10.1109/PST.2011.5971980)
+
+### Links
+- [Documentation](https://onlineacademiccommunity.uvic.ca/isot/wp-content/uploads/sites/7295/2023/03/ISOT-Dataset-Overview-v0.5.pdf)
+- [LBNL/ICSI Enterprise Tracing Project](https://www.icir.org/enterprise-tracing/download.html)
\ No newline at end of file
diff --git a/content/datasets/unibs.md b/content/datasets/unibs.md
new file mode 100644
index 0000000..2b495fd
--- /dev/null
+++ b/content/datasets/unibs.md
@@ -0,0 +1,61 @@
+---
+title: UNIBS
+---
+
+- [Overview](#overview)
+- [Environment](#environment)
+- [Activity](#activity)
+- [Contained Data](#contained-data)
+- [Papers](#papers)
+- [Links](#links)
+
+| | |
+|--------------------------|--------------------------------------------------------------------|
+| **Network Data Source** | NetFlows |
+| **Network Data Labeled** | No |
+| **Host Data Source** | - |
+| **Host Data Labeled** | - |
+| | |
+| **Overall Setting** | Enterprise IT |
+| **OS Types** | Undisclosed |
+| **Number of Machines** | 20 |
+| **Total Runtime** | 3 days |
+| **Year of Collection** | 2009 |
+| **Attack Categories** | None |
+| **Benign Activity** | Real users |
+| | |
+| **Packed Size** | - |
+| **Unpacked Size** | 2,7 GB |
+| **Download Link** | [must be requested](http://netweb.ing.unibs.it/~ntw/tools/traces/) |
+
+***
+
+### Overview
+The University of Brescia (UNIBS) dataset was created to showcase the capabilities of the "GT" software, an open source toolset facilitating the association of application-level ground truth with network traffic traces.
+This is done by probing a monitored host's kernel to gather ground truth at the application level, which can then later be assigned to any collected traces with minimal CPU overhead.
+Beyond this, the dataset does not seem to serve a greater purpose, as it does not contain any malicious activity (that the authors are aware of) and is also anonymized. + +### Environment +Traffic was collected from 20 workstations located in the campus network of the University of Brescia over the course of three consecutive days (2009-09-30 to 2009-10-02). +Each workstation ran a "GT client daemon"; information regarding network configuration or specific operating systems is not available. + +### Activity +(Presumably) real users used a variety of traffic-generating applications and protocols, namely: +- Web (HTTP, HTTPS) +- Mail (POP3, IMAP4, SMTP) +- Skype +- P2P (BitTorrent, eDonkey) +- Other (FTP, SSH, MSN) + +Further details are not available, most likely because the focus of this dataset was simply on correctly assigning flows to these services or protocols. +Intentional malicious activity is not present. + +### Contained Data +Traffic was collected from the central faculty router via `tcpdump` and enriched with ground truth from the GT tool (in the form of related protocol and application). +It is available in an anonymized and payload-stripped form, presumably as NetFlows, but has to be requested via mail. 
+ +### Papers +- [GT: picking up the truth from the ground for internet traffic (2009)](https://doi.org/10.1145/1629607.1629610) + +### Links +- [Homepage](http://netweb.ing.unibs.it/~ntw/tools/traces/) \ No newline at end of file diff --git a/content/datasets/uwf_zeekdata22.md b/content/datasets/uwf_zeekdata22.md new file mode 100644 index 0000000..6d822c0 --- /dev/null +++ b/content/datasets/uwf_zeekdata22.md @@ -0,0 +1,120 @@ +--- +title: UWF-ZeekData22 +--- + +- [Overview](#overview) +- [Environment](#environment) +- [Activity](#activity) +- [Contained Data](#contained-data) +- [Papers](#papers) +- [Links](#links) +- [Data Examples](#data-examples) + +| | | +|--------------------------|--------------------------------------------------------------------------------| +| **Network Data Source** | pcaps, Zeek logs | +| **Network Data Labeled** | Yes | +| **Host Data Source** | - | +| **Host Data Labeled** | - | +| | | +| **Overall Setting** | Enterprise IT | +| **OS Types** | Windows 10/2008 Metasploitable3, Debian 11, Ubuntu 14.04 Metasploitable3 | +| **Number of Machines** | 6 | +| **Total Runtime** | 64 days | +| **Year of Collection** | 2022 | +| **Attack Categories** | All MITRE ATT&CK tactics | +| **User Emulation** | n/a | +| | | +| **Packed Size** | - | +| **Unpacked Size** | 209 GB | +| **Download Link** | [goto](https://datasets.uwf.edu/data/UWF-ZeekData22/) | + +*** + +### Overview +The University of West Florida Zeek Dataset (UWF-ZeekData22) consists of 64 days of network traffic and related Zeek logs, collected from a "cyber wargaming course" held at the same university. +This course leveraged the UWF's cyber range, a virtualized and relatively diverse environment of different systems which participants were instructed to attack and defend. +The dataset's defining feature is the inclusion of MITRE ATT&CK tactic labels assigned to each packet or log, potentially allowing for attack chain detection or similar use cases. +However, the vast majority (>99.9%) of malicious traffic consists of simple reconnaissance, and, apart from statistics, there is very little information about individual attacks. +The authors also detail the process of collecting these large amounts of data with a dedicated solution (Apache Hadoop). + +### Environment +As mentioned, course participants leveraged the university's cyber range. +Although the authors state that their dataset contains thousands of distinct IP addresses, this is most likely caused by the fact that each group of students (81 in total) was assigned their own environment (as opposed to one genuinely large network). +Each individual network hosts, presumably, six machines with different versions of Windows and Linux operating systems, running various, partially vulnerable, services (presumably, because Section 4 of the underlying paper [1] is rather unclear in this regard). + +Traffic was captured on one of these VMs and sent to a Hadoop instance, a framework for the distributed storage and processing of large datasets. 
+The same VM also generated various Zeek logs, which were forwarded in the same manner. + +### Activity +The collection period lasted from 2021/12/12 to 2022/02/20, with a break of six days, for a total of 64 days. +While attacks cover the entire range of MITRE tactics (14 at the time of writing), no detail at all is provided regarding the way in which these attacks were executed; +only the number of instances per attack tactic is available: +- Reconnaissance: 9,278,722 +- Discovery: 2,086 +- Credential Access: 31 +- Privilege Escalation: 13 +- Exfiltration: 7 +- Lateral Movement: 4 +- Resource Development: 3 +- Initial Access: 1 +- Persistence: 1 +- Defense Evasion: 1 + +In other words, the vast majority of malicious traffic most likely consists of port scans and similar trivial operations. +Additionally, while there seems to be some form of benign activity, it is in no way documented. + +### Contained Data +Data is generally available in three different formats, all of which are labeled with the associated MITRE ATT&CK tactic: +- pcaps: Contain the captured traffic. +Note that these are in a [custom binary format](https://docs.securityonion.net/en/latest/stenographer.html) generated by Security Onion. +The capture is divided into thousands of smaller files, each covering roughly one minute of traffic. +- parquet: A binary column-oriented data storage format (basically a faster version of CSV when working with large files). +These contain the Zeek logs generated during the collection period and are identical to the CSV data in feature count and names (see example of CSV files below). +There are eight files in total, each covering eight days of traffic. +- CSV: A subset of the aforementioned parquet files, which, according to the authors, were mainly made available for people who do not have access to "Big Data" technologies. +These files contain data from 2022/02/10, 0300-0600, 0900-1000, and 1400-1500, with one file per hour, thus five files in total. 
+Each file contains one million entries with a benign/attack ratio of about 80/20. +For attacks, only the tactics "Reconnaissance" and "Discovery" are included. + +It should be noted that some of the field names commonly used in Zeek logs seem to differ from what can be found in the present data. +For example, `conn` Zeek logs use the `id.orig_h` field for storing the originating host's IP address; +here, this information is stored in `src_ip`. +This also does not match the authors' own information about collected [Attributes per Zeek log type](https://datasets.uwf.edu/tables/table2.html), as, again, the features found in the example CSV data below are also the ones used in the parquet files. + +The authors leverage what they call "mission logs" to perform labeling, though the nature of these mission logs is not further detailed. +Section 6.1 in [1] seems to suggest that these are manually created by participants, who document their current activity in the form of timestamps, ports, IPs, tactics, etc. + +### Papers +- [[1] Introducing UWF-ZeekData22: A Comprehensive Network Traffic Dataset Based on the MITRE ATT&CK Framework (2022)](https://doi.org/10.3390/data8010018) + +### Links +- [Homepage](https://datasets.uwf.edu/) + - [Types of collected Zeek logs](https://datasets.uwf.edu/tables/table1.html) + - [Attributes per Zeek log type](https://datasets.uwf.edu/tables/table2.html) + +### Data Examples +Traffic information in CSV format taken from `csv/part-00000-d32a9d5e-45b7-4e51-807e-1af297aba2df-c000.csv` + + +``` +resp_pkts,service,orig_ip_bytes,local_resp,missed_bytes,protocol,duration,conn_state,dest_ip,orig_pkts,community_id,resp_ip_bytes,dest_port,orig_bytes,local_orig,datetime,history,resp_bytes,uid,src_port,ts,src_ip,mitre_attack_tactics +2,dns,186,false,0,udp,0.002279996871948242,SF,143.88.5.1,2,1:Z2qpnUv+rxq4N1rn7Go962U/gi8=,186,53,130,false,2022-02-10T03:58:29.979Z,Dd,130,CwO2bA321vyBxBjtxb,36073,1.644465509979958E9,143.88.5.12,Reconnaissance 
+2,dns,186,false,0,udp,0.002279996871948242,SF,143.88.5.1,2,1:Z2qpnUv+rxq4N1rn7Go962U/gi8=,186,53,130,false,2022-02-10T03:58:29.979Z,Dd,130,CwO2bA321vyBxBjtxb,36073,1.644465509979958E9,143.88.5.12,Reconnaissance +2,dns,186,false,0,udp,0.002279996871948242,SF,143.88.5.1,2,1:Z2qpnUv+rxq4N1rn7Go962U/gi8=,186,53,130,false,2022-02-10T03:58:29.979Z,Dd,130,CwO2bA321vyBxBjtxb,36073,1.644465509979958E9,143.88.5.12,Reconnaissance +2,dns,186,false,0,udp,0.002279996871948242,SF,143.88.5.1,2,1:Z2qpnUv+rxq4N1rn7Go962U/gi8=,186,53,130,false,2022-02-10T03:58:29.979Z,Dd,130,CwO2bA321vyBxBjtxb,36073,1.644465509979958E9,143.88.5.12,Reconnaissance +2,dns,186,false,0,udp,0.002279996871948242,SF,143.88.5.1,2,1:Z2qpnUv+rxq4N1rn7Go962U/gi8=,186,53,130,false,2022-02-10T03:58:29.979Z,Dd,130,CwO2bA321vyBxBjtxb,36073,1.644465509979958E9,143.88.5.12,Reconnaissance +2,dns,186,false,0,udp,0.002279996871948242,SF,143.88.5.1,2,1:Z2qpnUv+rxq4N1rn7Go962U/gi8=,186,53,130,false,2022-02-10T03:58:29.979Z,Dd,130,CwO2bA321vyBxBjtxb,36073,1.644465509979958E9,143.88.5.12,Reconnaissance +2,dns,186,false,0,udp,0.002279996871948242,SF,143.88.5.1,2,1:Z2qpnUv+rxq4N1rn7Go962U/gi8=,186,53,130,false,2022-02-10T03:58:29.979Z,Dd,130,CwO2bA321vyBxBjtxb,36073,1.644465509979958E9,143.88.5.12,Reconnaissance +2,dns,186,false,0,udp,0.002279996871948242,SF,143.88.5.1,2,1:Z2qpnUv+rxq4N1rn7Go962U/gi8=,186,53,130,false,2022-02-10T03:58:29.979Z,Dd,130,CwO2bA321vyBxBjtxb,36073,1.644465509979958E9,143.88.5.12,Reconnaissance +2,dns,186,false,0,udp,0.002279996871948242,SF,143.88.5.1,2,1:Z2qpnUv+rxq4N1rn7Go962U/gi8=,186,53,130,false,2022-02-10T03:58:29.979Z,Dd,130,CwO2bA321vyBxBjtxb,36073,1.644465509979958E9,143.88.5.12,Reconnaissance +2,dns,186,false,0,udp,0.002279996871948242,SF,143.88.5.1,2,1:Z2qpnUv+rxq4N1rn7Go962U/gi8=,186,53,130,false,2022-02-10T03:58:29.979Z,Dd,130,CwO2bA321vyBxBjtxb,36073,1.644465509979958E9,143.88.5.12,Reconnaissance 
+2,dns,186,false,0,udp,0.002279996871948242,SF,143.88.5.1,2,1:Z2qpnUv+rxq4N1rn7Go962U/gi8=,186,53,130,false,2022-02-10T03:58:29.979Z,Dd,130,CwO2bA321vyBxBjtxb,36073,1.644465509979958E9,143.88.5.12,Reconnaissance +2,dns,186,false,0,udp,0.002279996871948242,SF,143.88.5.1,2,1:Z2qpnUv+rxq4N1rn7Go962U/gi8=,186,53,130,false,2022-02-10T03:58:29.979Z,Dd,130,CwO2bA321vyBxBjtxb,36073,1.644465509979958E9,143.88.5.12,Reconnaissance +2,dns,186,false,0,udp,0.002279996871948242,SF,143.88.5.1,2,1:Z2qpnUv+rxq4N1rn7Go962U/gi8=,186,53,130,false,2022-02-10T03:58:29.979Z,Dd,130,CwO2bA321vyBxBjtxb,36073,1.644465509979958E9,143.88.5.12,Reconnaissance +2,dns,186,false,0,udp,0.002279996871948242,SF,143.88.5.1,2,1:Z2qpnUv+rxq4N1rn7Go962U/gi8=,186,53,130,false,2022-02-10T03:58:29.979Z,Dd,130,CwO2bA321vyBxBjtxb,36073,1.644465509979958E9,143.88.5.12,Reconnaissance +2,dns,186,false,0,udp,0.002279996871948242,SF,143.88.5.1,2,1:Z2qpnUv+rxq4N1rn7Go962U/gi8=,186,53,130,false,2022-02-10T03:58:29.979Z,Dd,130,CwO2bA321vyBxBjtxb,36073,1.644465509979958E9,143.88.5.12,Reconnaissance +2,dns,186,false,0,udp,0.002279996871948242,SF,143.88.5.1,2,1:Z2qpnUv+rxq4N1rn7Go962U/gi8=,186,53,130,false,2022-02-10T03:58:29.979Z,Dd,130,CwO2bA321vyBxBjtxb,36073,1.644465509979958E9,143.88.5.12,Reconnaissance +[...] 
+``` + \ No newline at end of file diff --git a/content/related_work.md b/content/related_work.md index f353e0a..a9bd581 100644 --- a/content/related_work.md +++ b/content/related_work.md @@ -141,13 +141,13 @@ Referenced datasets: - [DARPA'98 Intrusion Detection Program](/intrusion-detection-datasets/content/datasets/darpa98) - [gureKDDCup](/intrusion-detection-datasets/content/datasets/gure_kddcup) - [ISCX IDS 2012](/intrusion-detection-datasets/content/datasets/iscx_ids_2012) -- ISOT +- [ISOT Botnet](/intrusion-detection-datasets/content/datasets/isot_botnet) - [KDD Cup 1999](/intrusion-detection-datasets/content/datasets/kdd_cup_1999) - [Kyoto Honeypot](/intrusion-detection-datasets/content/datasets/kyoto_honeypot) - [NSL-KDD](/intrusion-detection-datasets/content/datasets/nsl_kdd_dataset) - [Twente 2009](/intrusion-detection-datasets/content/datasets/twente_2009) - [UNSW NB15](/intrusion-detection-datasets/content/datasets/unsw_nb15) -- Mentioned, but not further detailed: Metrosec, UNIBS 2009, [TUIDS](/intrusion-detection-datasets/content/datasets/tuids), University of Napoli traffic dataset, CSIC 2010 HTTP dataset, UNM system call dataset +- Mentioned, but not further detailed: Metrosec, [UNIBS](/intrusion-detection-datasets/content/datasets/unibs), [TUIDS](/intrusion-detection-datasets/content/datasets/tuids), University of Napoli traffic dataset, CSIC 2010 HTTP dataset, UNM system call dataset Referenced collections: - CAIDA @@ -292,7 +292,7 @@ Referenced datasets: - DDoS 2016 - IRSC - [ISCX IDS 2012](/intrusion-detection-datasets/content/datasets/iscx_ids_2012) -- ISOT +- [ISOT Botnet](/intrusion-detection-datasets/content/datasets/isot_botnet) - [KDD Cup 1999](/intrusion-detection-datasets/content/datasets/kdd_cup_1999) - [Kent 2016](/intrusion-detection-datasets/content/datasets/comp_multi_source_cybersec_events) (alias for: Comprehensive, Multi-Source Cybersecurity Events) - [Kyoto Honeypot](/intrusion-detection-datasets/content/datasets/kyoto_honeypot) @@ -308,7 +308,7 @@ Referenced datasets: - TRAbID - [TUIDS](/intrusion-detection-datasets/content/datasets/tuids) - [Twente 2009](/intrusion-detection-datasets/content/datasets/twente_2009) -- UNIBS +- [UNIBS](/intrusion-detection-datasets/content/datasets/unibs) - [Unified Host and Network dataset](/intrusion-detection-datasets/content/datasets/unified_host_and_network_dataset) - [UNSW-NB15](/intrusion-detection-datasets/content/datasets/unsw_nb15) @@ -350,7 +350,7 @@ Referenced datasets: - [ISCX IDS 2012](/intrusion-detection-datasets/content/datasets/iscx_ids_2012) - [KDD Cup 1999](/intrusion-detection-datasets/content/datasets/kdd_cup_1999) - [NSL-KDD](/intrusion-detection-datasets/content/datasets/nsl_kdd_dataset) -- UNIBS +- [UNIBS](/intrusion-detection-datasets/content/datasets/unibs) Referenced collections: - CAIDA