From f5c3f502b906de4b42552c3b337e073274ff7dc0 Mon Sep 17 00:00:00 2001 From: Philipp Boenninghausen Date: Tue, 30 Jan 2024 17:44:02 +0100 Subject: [PATCH 01/19] Fix broken link --- content/all_datasets.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/all_datasets.md b/content/all_datasets.md index 1b12b3c..502b402 100644 --- a/content/all_datasets.md +++ b/content/all_datasets.md @@ -13,7 +13,7 @@ before-content: gh_buttons.html | [AIT Alert Dataset](../datasets/ait_alert_dataset) | Host & Network | Alerts generated from the AIT log dataset, including labels. Only caveat is the lack of Windows machines | 2023 | Enterprise IT | Linux | 🟩 | Wazuh, Suricata and AMiner alerts | 96 MB | 2,9 GB | | [AIT Log Dataset](../datasets/ait_log_dataset) | Host & Network | Huge variety of labeled logs collected from multiple simulation runs of an enterprise network under attack. With user emulation. but only Linux machines | 2023 | Enterprise IT | Linux | 🟩 | pcaps, Suricata alerts, misc. 
logs (Apache, auth, dns, vpn, audit, suricata, syslog) | 130 GB | 206 GB | | [ASNM Datasets](../datasets/asnm_datasets) | Network | Specialized features extracted from instances of remote buffer overflow attacks for the purpose of anomaly-based detection | 2009-2018 | Mixed | Windows, Linux | 🟩 | Custom NetFlows | 21 MB | 95 GB | -| [AWSCTD](/content/../datasets/awsctd) | Host | Syscalls collected from ~10k malware samples running on Windows 7, no user emulation | 2018 | Single OS | Windows | 🟩 | Sequences of syscall numbers | 10 MB | 558 MB | +| [AWSCTD](../datasets/awsctd) | Host | Syscalls collected from ~10k malware samples running on Windows 7, no user emulation | 2018 | Single OS | Windows | 🟩 | Sequences of syscall numbers | 10 MB | 558 MB | | [BotsV3](../datasets/botsv3) [_ON HOLD_] | | _Requires usage of Splunk + a bunch of extensions, postponed_ | 2020 | | | | | 17 GB | - | | [CDX CTF 2009](../datasets/cdx_2009) | Network | Dataset captured from a CTF event, generally intended to provide methods for reliable generating labeled datasets from such events | 2009 | Enterprise IT | Windows, Linux | 🟨 | pcaps, Snort IDS alerts, Apache logs, Splunk logs | 12 GB | 15,3 GB | | [CIC-IDS2017](../datasets/cic_ids2017) | Network | Simulation of medium-sized company network under attack, focuses solely on network traffic | 2017 | Enterprise IT | Windows, Linux | 🟩 | pcaps, NetFlows, custom network features | 48,4 GB | 50 GB | From 992db51312d6a0939b70f057d7dfd8c892a13a7d Mon Sep 17 00:00:00 2001 From: Philipp Boenninghausen Date: Tue, 30 Jan 2024 17:48:28 +0100 Subject: [PATCH 02/19] Add missing info field --- content/datasets/pwnjutsu.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/datasets/pwnjutsu.md b/content/datasets/pwnjutsu.md index 112e998..de5dc69 100644 --- a/content/datasets/pwnjutsu.md +++ b/content/datasets/pwnjutsu.md @@ -23,7 +23,7 @@ title: PWNJUTSU | **Total Runtime** | n/a | | **Year of Collection** | 2022 | | **Attack 
Categories** | Discovery
Lateral Movement
Credential Access
Privilege Escalation | -| **User Emulation** | | +| **User Emulation** | n/a | | | | | **Packed Size** | 82 GB | | **Unpacked Size** | n/a | From c53ce19281030a1335872555f6aceb3d2a9847e5 Mon Sep 17 00:00:00 2001 From: schlippe Date: Thu, 1 Feb 2024 17:43:38 +0100 Subject: [PATCH 03/19] Fix incorrect figure (swapped host and network data) --- assets/img/ngids_ds.svg | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/assets/img/ngids_ds.svg b/assets/img/ngids_ds.svg index 4891e90..9bc1bc6 100644 --- a/assets/img/ngids_ds.svg +++ b/assets/img/ngids_ds.svg @@ -1,4 +1,4 @@ -
Network 1
Network 2
Machine A
Machine B
IXIA Perfect Storm Traffic Generator
Internet
Host
Data
Network
Data
Ground
Truth
NGIDS-DS
\ No newline at end of file +
Network 1
Network 2
Machine A
Machine B
IXIA Perfect Storm Traffic Generator
Internet
Network
Data
Host
Data
Ground
Truth
NGIDS-DS
\ No newline at end of file From b617e1c0afec2e516fc2e8ae6e0b423d56d2441b Mon Sep 17 00:00:00 2001 From: schlippe Date: Thu, 1 Feb 2024 17:51:21 +0100 Subject: [PATCH 04/19] Update 'Year of Collection' --- content/datasets/optc.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/datasets/optc.md b/content/datasets/optc.md index 53980f7..cfeaea6 100644 --- a/content/datasets/optc.md +++ b/content/datasets/optc.md @@ -21,7 +21,7 @@ title: OpTC | **OS Types** | Windows 10 | | **Number of Machines** | 1000 (only data for 500 included) | | **Total Runtime** | 6 days | -| **Year of Collection** | 2020 | +| **Year of Collection** | 2019 | | **Attack Categories** | Powershell Empire
Malicious Upgrades | | **User Emulation** | Yes | | | | From 135935fec8249a6cc29e053c375269b0e7d7aa57 Mon Sep 17 00:00:00 2001 From: Philipp Boenninghausen Date: Wed, 14 Feb 2024 10:42:49 +0100 Subject: [PATCH 05/19] Add information about NGIDS-DS --- content/datasets/nigds_dataset.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/content/datasets/nigds_dataset.md b/content/datasets/nigds_dataset.md index 8b78052..3cccd79 100644 --- a/content/datasets/nigds_dataset.md +++ b/content/datasets/nigds_dataset.md @@ -21,7 +21,7 @@ title: NGIDS Dataset | **OS Types** | Ubuntu 14.04 | | **Number of Machines** | _n/a_ | | **Total Runtime** | ~5 days | -| **Year of Collection** | 2018 | +| **Year of Collection** | 2016 | | **Attack Categories** | DDoS
Shellcode
Worms
Reconnaissance
Exploits
"Generic" | | **User Emulation** | Yes, using IXIA PerfectStorm | | | | @@ -37,7 +37,7 @@ The Next-Generation Intrusion Detection System Dataset (NGIDS-DS) was created as It attempts to improve upon major datasets of its time (namely KDD'98 and ADFA-LD), following a set of "requirements" laid out in the paper, which are all aimed towards generating a more realistic dataset. It is a collection of host and network logs from a simulated enterprise environment, generally intended to be used with -anomaly-based detection methods, with the paper defining a novel "combined feature" for this purpose. +anomaly-based detection methods, with the paper defining a novel "combined feature" for this purpose, merging information about a system call and its execution time. Their requirements for a simulation are: - complete capture of OS audit logs and network packets @@ -79,7 +79,7 @@ which acts as an all-in-one solution: - generates ground truth for said attacks Further details regarding user behavior is not provided. -The entire simulation ran for a duration of approximately five days. +The entire simulation ran for a duration of approximately five days, from March 11, 2016, to March 16, 2016. ### Contained Data From 75e1537577656bebcf07357f892bd9d5c4f44ac4 Mon Sep 17 00:00:00 2001 From: Philipp Boenninghausen Date: Wed, 14 Feb 2024 14:03:41 +0100 Subject: [PATCH 06/19] Fix 'year of collection' --- content/datasets/ait_log_dataset.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/datasets/ait_log_dataset.md b/content/datasets/ait_log_dataset.md index fd89b75..17249c6 100644 --- a/content/datasets/ait_log_dataset.md +++ b/content/datasets/ait_log_dataset.md @@ -22,7 +22,7 @@ title: AIT Log Data Set | **OS Types** | Ubuntu 20.04 | | **Number of Machines** | 9-27 | | **Total Runtime** | 4-6 days per sim, 8 simulations total | -| **Year of Collection** | 2023 | +| **Year of Collection** | 2022 | | **Attack Categories** | Reconnaissance
Privilege Escalation
Data Exfiltration
Web-based Attacks | | **User Emulation** | Yes, models complex behavior | | | | From 2ed4831a6f371d7e3df9df4bb66597971404b0c1 Mon Sep 17 00:00:00 2001 From: schlippe Date: Sun, 18 Feb 2024 19:17:39 +0100 Subject: [PATCH 07/19] Add related-work page --- _config.yml | 1 + content/related_work.md | 140 ++++++++++++++++++++++++++++++++++++++++ 2 files changed, 141 insertions(+) create mode 100644 content/related_work.md diff --git a/_config.yml b/_config.yml index e8ad733..1b3efc5 100644 --- a/_config.yml +++ b/_config.yml @@ -23,6 +23,7 @@ navbar-links: All Datasets: "content/all_datasets" Contributing: "content/contributing" About: "content/about" + Related Work: "content/related_work" # Author's home: "https://deanattali.com" ################ diff --git a/content/related_work.md b/content/related_work.md new file mode 100644 index 0000000..9248a96 --- /dev/null +++ b/content/related_work.md @@ -0,0 +1,140 @@ +--- +title: Related Work +--- + +This page lists publications and other collections covering IDS datasets, sorted by their year of release. +Each entry consists of citation and a brief description of the surveys scope of selected datasets. +Additionally, for publications, all datasets contained in the survey are also listed, linking to their respective +entries on this website, if available. 
+ +## Contents + +- Publications + - [A Comprehensive Survey of Databases and Deep Learning Methods for Cybersecurity and Intrusion Detection Systems (2020)](#a-comprehensive-survey-of-databases-and-deep-learning-methods-for-cybersecurity-and-intrusion-detection-systems-2020) + - [Pillars of Sand: The current state of Datasets in the field of Network Intrusion Detection (2022)](#pillars-of-sand-the-current-state-of-datasets-in-the-field-of-network-intrusion-detection-2022) + - [Survey of Intrusion Detection Systems: Techniques, Datasets and Challenges (2019)](#survey-of-intrusion-detection-systems-techniques-datasets-and-challenges-2019) +- Other collections + - [Awesome Cybersecurity Datasets (2021)](#awesome-cybersecurity-datasets-2021) + - [Public Security Log Sharing Site](#public-security-log-sharing-site-2010) + - [SecRepo - Samples of Security Related Data](#secrepo---samples-of-security-related-data-2020) + +## Publications + +### A Comprehensive Survey of Databases and Deep Learning Methods for Cybersecurity and Intrusion Detection Systems (2020) + +``` +Gümüşbaş, D., Yıldırım, T., Genovese, A., & Scotti, F. (2020). A comprehensive survey of databases and deep learning methods for cybersecurity and intrusion detection systems. IEEE Systems Journal, 15(2), 1717-1731. +``` + +This survey focuses on machine learning methods for intrusion detection, especially those based on deep learning. +Alongside this, the authors present a list of datasets used to benchmark these approaches, which they categorize into +either host-based (system calls) or network-based (pcaps and NetFlows). +Each dataset is described in a couple of sentences, with the six most commonly used ones undergoing some more analysis +regarding properties like feature and sample count or attack types. 
+ +- [ASNM CDX](/intrusion-detection-datasets/content/datasets/asnm_datasets) +- CAIDA +- [CDX CTF 2009](/intrusion-detection-datasets/content/datasets/cdx_2009) +- [CIC-IDS 2017](/intrusion-detection-datasets/content/datasets/cic_ids2017) +- [CSE-CIC-IDS 2018](/intrusion-detection-datasets/content/datasets/cse_cic_ids2018) +- [CTU 13](/intrusion-detection-datasets/content/datasets/ctu_13) +- CIC DoS +- [DARPA'98 Intrusion Detection Program](/intrusion-detection-datasets/content/datasets/darpa98) +- DEFCON +- Gure-KDD-Cup +- [ISCX IDS 2012](/intrusion-detection-datasets/content/datasets/iscx_ids_2012) +- ISOT +- [KDD Cup 1999](/intrusion-detection-datasets/content/datasets/kdd_cup_1999) +- [Kyoto Honeypot](/intrusion-detection-datasets/content/datasets/kyoto_honeypot) +- Lawrence Berkeley National Laboratory +- MAWI +- [NSL-KDD](/intrusion-detection-datasets/content/datasets/nsl_kdd_dataset) +- [Twente 2009](/intrusion-detection-datasets/content/datasets/twente_2009) +- UMass +- [UNSW NB15](/intrusion-detection-datasets/content/datasets/unsw_nb15) +- Mentioned, but not further detailed:
Metrosec, UNIBS 2009, TUIDS, University of Napoli traffic dataset, CSIC 2010 + HTTP + dataset, UNM system call dataset + +### Pillars of Sand: The current state of Datasets in the field of Network Intrusion Detection (2022) + +``` +Gints Engelen, Robert Flood, Lisa Liu, Vera Rimmer, Henry Clausen, David Aspinall, & Wouter Joosen. (2022). Pillars of Sand: The current state of Datasets in the field of Network Intrusion Detection. Zenodo. https://doi.org/10.5281/zenodo.7068716 +``` + +An analysis of the five most commonly used datasets for anomaly-based NIDS evaluation, focusing on highlighting flaws +and errors within these datasets, and discussing the lack of variability in benign and malicious traffic. +They also offer an allegedly improved version of one of the surveyed datasets, CSE-CIC-IDS 2018. + +- [CIC-IDS 2017](/intrusion-detection-datasets/content/datasets/cic_ids2017) +- [CSE-CIC-IDS 2018](/intrusion-detection-datasets/content/datasets/cse_cic_ids2018) +- [UNSW NB15](/intrusion-detection-datasets/content/datasets/unsw_nb15) +- TON-IoT +- IoT-23 + +### Survey of Intrusion Detection Systems: Techniques, Datasets and Challenges (2019) + +``` +Khraisat, A., Gondal, I., Vamplew, P., & Kamruzzaman, J. (2019). Survey of intrusion detection systems: techniques, datasets and challenges. Cybersecurity, 2(1), 1-22. +``` + +Mainly focuses on commonly used detection methodology (especially anomaly-based), but also shortly describes eight +datasets commonly used to evaluate these approaches. 
+ +- [ADFA-LD](/intrusion-detection-datasets/content/datasets/adfa_ld) +- [ADFA-WD](/intrusion-detection-datasets/content/datasets/adfa_wd) +- CAIDA +- [CIC IDS 2017](/intrusion-detection-datasets/content/datasets/cic_ids_2017) +- [DARPA'98 Intrusion Detection Program](/intrusion-detection-datasets/content/datasets/darpa98) +- [ISCX IDS 2012](/intrusion-detection-datasets/content/datasets/iscx_ids_2012) +- [KDD Cup 1999](/intrusion-detection-datasets/content/datasets/kdd_cup_1999) +- [NSL-KDD](/intrusion-detection-datasets/content/datasets/nsl_kdd_dataset) + +### Cybersecurity Research Datasets: Taxonomy and Empirical Analysis + +``` +Zheng, M., Robbins, H., Chai, Z., Thapa, P., & Moore, T. (2018). Cybersecurity research datasets: taxonomy and empirical analysis. In 11th USENIX Workshop on Cyber Security Experimentation and Test (CSET 18). +``` + +Tries to construct a taxonomy of the types of created and shared cybersecurity data(sets) by inspecting 965 related +papers. +Does not provide an actual list, rather aims to describe general observations, like the fact that only 6% of the +surveyed papers created a dataset *and* made it publicly available. + +## Other collections + +### Awesome Cybersecurity Datasets (2021) + +``` +https://github.com/shramos/Awesome-Cybersecurity-Datasets +(accessed 18.02.2024, last updated 23.01.2021) +``` + +A "curated" personal collection of various cybersecurity-related datasets or collections, grouped into several +categories such as "Network", "Software" or "Fraud". +Each entry is described in only one or two sentences, and most datasets are not, or only partially, suitable for IDS +research. +The list is somewhat deprecated and does especially lack meaningful host-based datasets. 
+ +### SecRepo - Samples of Security Related Data (2020) + +``` +https://www.secrepo.com/ +(accessed 18.02.2024, last updated 01.10.2020) +``` + +An individuals effort to "keep a somewhat curated list of Security related data I've found, created, or was pointed to". +It contains several entries of the authors own creation, some of which are described in a bit more detail, as well as +121 "3rd party" entries from a broad range of topics, each described in a single sentence. +Some of them are usable for IDS related purposes. + +### Public Security Log Sharing Site (2010) + +``` +https://log-sharing.dreamhosters.com/ +(accessed 18.02.2024, last updated 11.08.2010) +``` + +A collection which started as an effort to collect various log samples, but seems to have been discontinued after +operating for about one year. +Currently, it consists of nine entries containing Linux syslogs, firewall logs, apache logs, and web proxy logs. \ No newline at end of file From cae39868dbd1dd691692239974c175d6ed2cb447 Mon Sep 17 00:00:00 2001 From: Philipp Boenninghausen Date: Mon, 19 Feb 2024 14:26:33 +0100 Subject: [PATCH 08/19] Add more related works --- content/related_work.md | 297 ++++++++++++++++++++++++++++++++++++---- 1 file changed, 273 insertions(+), 24 deletions(-) diff --git a/content/related_work.md b/content/related_work.md index 9248a96..ea9c7d6 100644 --- a/content/related_work.md +++ b/content/related_work.md @@ -7,19 +7,50 @@ Each entry consists of citation and a brief description of the surveys scope of Additionally, for publications, all datasets contained in the survey are also listed, linking to their respective entries on this website, if available. +Note that datasets are listed separately from collection, where a collection is any assembly of datasets that cannot +reasonably be grouped together by a common file type or scenario/origin, i.e., it cannot be adequately summarized in a +single entry on this website. 
+ ## Contents - Publications - - [A Comprehensive Survey of Databases and Deep Learning Methods for Cybersecurity and Intrusion Detection Systems (2020)](#a-comprehensive-survey-of-databases-and-deep-learning-methods-for-cybersecurity-and-intrusion-detection-systems-2020) - [Pillars of Sand: The current state of Datasets in the field of Network Intrusion Detection (2022)](#pillars-of-sand-the-current-state-of-datasets-in-the-field-of-network-intrusion-detection-2022) + - [A Comprehensive Survey of Databases and Deep Learning Methods for Cybersecurity and Intrusion Detection Systems (2020)](#a-comprehensive-survey-of-databases-and-deep-learning-methods-for-cybersecurity-and-intrusion-detection-systems-2020) + - [A Survey of Intrusion Detection Systems leveraging Host Data (2019)](#a-survey-of-intrusion-detection-systems-leveraging-host-data-2019) - [Survey of Intrusion Detection Systems: Techniques, Datasets and Challenges (2019)](#survey-of-intrusion-detection-systems-techniques-datasets-and-challenges-2019) + - [Cybersecurity Research Datasets: Taxonomy and Empirical Analysis (2018)](#cybersecurity-research-datasets-taxonomy-and-empirical-analysis-2018) + - [A Detail Analysis on Intrusion Detection Datasets (2014)](#a-detail-analysis-on-intrusion-detection-datasets-2014) - Other collections + - [Malware Traffic Analysis (2024)](#malware-traffic-analysis-2024) + - [NETRESEC (2024)](#netresec-2024) + - [Digital Corpora (2023)](#digital-corpora-2023) - [Awesome Cybersecurity Datasets (2021)](#awesome-cybersecurity-datasets-2021) - - [Public Security Log Sharing Site](#public-security-log-sharing-site-2010) - - [SecRepo - Samples of Security Related Data](#secrepo---samples-of-security-related-data-2020) + - [IMPACT (2021)](#impact-2021) + - [SecRepo - Samples of Security Related Data (2020)](#secrepo---samples-of-security-related-data-2020) + - [The Honeynet Project Challenges (2015)](#the-honeynet-project-challenges-2015) + - [Public Security Log Sharing Site 
(2010)](#public-security-log-sharing-site-2010) + - [The Internet Traffic Archive (2008)](#the-internet-traffic-archive-2008) ## Publications +### Pillars of Sand: The current state of Datasets in the field of Network Intrusion Detection (2022) + +``` +Gints Engelen, Robert Flood, Lisa Liu, Vera Rimmer, Henry Clausen, David Aspinall, & Wouter Joosen. (2022). Pillars of Sand: The current state of Datasets in the field of Network Intrusion Detection. Zenodo. https://doi.org/10.5281/zenodo.7068716 +``` + +An analysis of the five most commonly used datasets for anomaly-based NIDS evaluation, focusing on highlighting flaws +and errors within these datasets, and discussing the lack of variability in benign and malicious traffic. +They also offer an allegedly improved version of one of the surveyed datasets, CSE-CIC-IDS 2018. + +Referenced datasets: + +- [CIC-IDS 2017](/intrusion-detection-datasets/content/datasets/cic_ids2017) +- [CSE-CIC-IDS 2018](/intrusion-detection-datasets/content/datasets/cse_cic_ids2018) +- [UNSW NB15](/intrusion-detection-datasets/content/datasets/unsw_nb15) +- TON-IoT +- IoT-23 + ### A Comprehensive Survey of Databases and Deep Learning Methods for Cybersecurity and Intrusion Detection Systems (2020) ``` @@ -32,45 +63,173 @@ either host-based (system calls) or network-based (pcaps and NetFlows). Each dataset is described in a couple of sentences, with the six most commonly used ones undergoing some more analysis regarding properties like feature and sample count or attack types. 
+Referenced datasets: + - [ASNM CDX](/intrusion-detection-datasets/content/datasets/asnm_datasets) -- CAIDA - [CDX CTF 2009](/intrusion-detection-datasets/content/datasets/cdx_2009) - [CIC-IDS 2017](/intrusion-detection-datasets/content/datasets/cic_ids2017) - [CSE-CIC-IDS 2018](/intrusion-detection-datasets/content/datasets/cse_cic_ids2018) - [CTU 13](/intrusion-detection-datasets/content/datasets/ctu_13) - CIC DoS - [DARPA'98 Intrusion Detection Program](/intrusion-detection-datasets/content/datasets/darpa98) -- DEFCON - Gure-KDD-Cup - [ISCX IDS 2012](/intrusion-detection-datasets/content/datasets/iscx_ids_2012) - ISOT - [KDD Cup 1999](/intrusion-detection-datasets/content/datasets/kdd_cup_1999) - [Kyoto Honeypot](/intrusion-detection-datasets/content/datasets/kyoto_honeypot) -- Lawrence Berkeley National Laboratory -- MAWI +- Lawrence Berkeley National Laboratory Traces - [NSL-KDD](/intrusion-detection-datasets/content/datasets/nsl_kdd_dataset) - [Twente 2009](/intrusion-detection-datasets/content/datasets/twente_2009) -- UMass - [UNSW NB15](/intrusion-detection-datasets/content/datasets/unsw_nb15) - Mentioned, but not further detailed:
Metrosec, UNIBS 2009, TUIDS, University of Napoli traffic dataset, CSIC 2010 - HTTP - dataset, UNM system call dataset + HTTP dataset, UNM system call dataset -### Pillars of Sand: The current state of Datasets in the field of Network Intrusion Detection (2022) +Referenced collections: + +- CAIDA +- DEFCON CTF Archive +- MAWILab +- UMass Trace Repository + +### A Review of the Advancements in Intrusion Detection Datasets (2019) ``` -Gints Engelen, Robert Flood, Lisa Liu, Vera Rimmer, Henry Clausen, David Aspinall, & Wouter Joosen. (2022). Pillars of Sand: The current state of Datasets in the field of Network Intrusion Detection. Zenodo. https://doi.org/10.5281/zenodo.7068716 +Thakkar, A., & Lohiya, R. (2020). A review of the advancement in intrusion detection datasets. Procedia Computer Science, 167, 636-645. ``` -An analysis of the five most commonly used datasets for anomaly-based NIDS evaluation, focusing on highlighting flaws -and errors within these datasets, and discussing the lack of variability in benign and malicious traffic. -They also offer an allegedly improved version of one of the surveyed datasets, CSE-CIC-IDS 2018. +This work focuses only on datasets used for the evaluation of network-based IDSs (mostly anomaly-based), presenting a +brief overview in the +form of feature count, attack types and one sentence of description. +It (very shortly) discusses some methods used in this field and goes into a bit more detail for two of the surveyed +datasets, CIC IDS 2017 and CSE CIC IDS 2018. +Strangely, it also includes some datasets that are not really network-related, like ADFA-LD. 
+Referenced datasets: + +- [ADFA-LD](/intrusion-detection-datasets/content/datasets/adfa_ld), [ADFA-WD](/intrusion-detection-datasets/content/datasets/adfa_wd) +- [CDX CTF 2009](/intrusion-detection-datasets/content/datasets/cdx_2009) - [CIC-IDS 2017](/intrusion-detection-datasets/content/datasets/cic_ids2017) - [CSE-CIC-IDS 2018](/intrusion-detection-datasets/content/datasets/cse_cic_ids2018) -- [UNSW NB15](/intrusion-detection-datasets/content/datasets/unsw_nb15) -- TON-IoT -- IoT-23 +- [DARPA'98 Intrusion Detection Program](/intrusion-detection-datasets/content/datasets/darpa98) +- [ISCX IDS 2012](/intrusion-detection-datasets/content/datasets/iscx_ids_2012) +- [KDD Cup 1999](/intrusion-detection-datasets/content/datasets/kdd_cup_1999) +- [Kyoto Honeypot](/intrusion-detection-datasets/content/datasets/kyoto_honeypot) +- Lawrence Berkeley National Laboratory Traces +- [NSL-KDD](/intrusion-detection-datasets/content/datasets/nsl_kdd_dataset) +- [Twente 2009](/intrusion-detection-datasets/content/datasets/twente_2009) + +Referenced Collections: + +- CAIDA +- DEFCON CTF Archive + +### A Survey of Intrusion Detection Systems leveraging Host Data (2019) + +``` +Bridges, R. A., Glass-Vanderlan, T. R., Iannacone, M. D., Vincent, M. S., & Chen, Q. (2019). A survey of intrusion detection systems leveraging host data. ACM Computing Surveys (CSUR), 52(6), 1-35. +``` + +This survey focuses on host based IDSs designed to detect attacks on enterprise networks, dividing them into categories +based on their approach. +In addition to this, it provides an overview of several existing datasets and collections of datasets to "accommodate +current researchers", describing each of them briefly. +Surprisingly, it also features datasets like KDD and NSL KDD, which positively do not feature any host data. 
+ +Referenced datasets: + +- Active DNS Project +- [ADFA-LD](/intrusion-detection-datasets/content/datasets/adfa_ld), [ADFA-WD](/intrusion-detection-datasets/content/datasets/adfa_wd) +- [Comprehensive Multi-Source Cybersecurity Events](/intrusion-detection-datasets/content/datasets/comp_multi_source_cybersec_events) +- [CTU 13](/intrusion-detection-datasets/content/datasets/ctu_13) +- [DARPA'98 Intrusion Detection Program](/intrusion-detection-datasets/content/datasets/darpa98) +- GURE-KDD +- [KDD Cup 1999](/intrusion-detection-datasets/content/datasets/kdd_cup_1999) +- Malware Capture Facility Project +- [NSL-KDD](/intrusion-detection-datasets/content/datasets/nsl_kdd_dataset) +- UNM system call dataset +- [Unified Host and Network dataset](/intrusion-detection-datasets/content/datasets/unified_host_and_network_dataset) +- [UNSW-NB15](/intrusion-detection-datasets/content/datasets/unsw_nb15) +- User-Computer Authentication Associations in Time +- [Vast Challenge 2012](/intrusion-detection-datasets/content/datasets/vast_2012) +- Vast Challenge 2013 + +Referenced collections: + +- CAIDA +- [Digital Corpora Database](#digital-corpora-2023) +- [IMPACT](#impact-2021) +- [Malware Traffic Analysis](#malware-traffic-analysis-2024) +- [NETRESEC](#netresec-2024) +- [SecRepo](#secrepo---samples-of-security-related-data-2020) +- [The Honeynet Project](#the-honeynet-project-challenges-2015) + +### A Survey of Network-based Intrusion Detection Data Sets + +``` +Ring, M., Wunderlich, S., Scheuring, D., Landes, D., & Hotho, A. (2019). A survey of network-based intrusion detection data sets. Computers & Security, 86, 147-167. +``` + +This survey focuses on datasets used for the evaluation of anomaly-based network intrusion detection systems. +It does so by first defining 15 different properties to describe them, such as year of creation, format, duration, or type of +network, and then applying this methodology to 32 network datasets, along with a short description for each one. 
+Additionally, several existing collections of datasets are listed. + +Referenced datasets: + +- AWID +- Booters Dataset +- ISCX Botnet 2014 +- [CDX CTF 2009](/intrusion-detection-datasets/content/datasets/cdx_2009) +- CIC DoS +- [CIC-IDS 2017](/intrusion-detection-datasets/content/datasets/cic_ids2017) +- CIDDS-001 & 002 +- [CSE-CIC-IDS 2018](/intrusion-detection-datasets/content/datasets/cse_cic_ids2018) +- [DARPA'98 Intrusion Detection Program](/intrusion-detection-datasets/content/datasets/darpa98) +- DDoS 2016 +- IRSC +- [ISCX IDS 2012](/intrusion-detection-datasets/content/datasets/iscx_ids_2012) +- ISOT +- [KDD Cup 1999](/intrusion-detection-datasets/content/datasets/kdd_cup_1999) +- Kent 2016 +- [Kyoto Honeypot](/intrusion-detection-datasets/content/datasets/kyoto_honeypot) +- Lawrence Berkeley National Laboratory Traces +- NDSec-1 +- [NGIDS-DS](/intrusion-detection-datasets/content/datasets/ngids_dataset) +- [NSL-KDD](/intrusion-detection-datasets/content/datasets/nsl_kdd_dataset) +- PU-IDS +- PUF +- SANTA +- SSENET 2011 +- SSENET 2014 +- SSHCure +- TRAbID +- TUIDS +- [Twente 2009](/intrusion-detection-datasets/content/datasets/twente_2009) +- UNIBS +- [Unified Host and Network dataset](/intrusion-detection-datasets/content/datasets/unified_host_and_network_dataset) +- [UNSW-NB15](/intrusion-detection-datasets/content/datasets/unsw_nb15) + +Referenced collections: + +- AZSecure +- CAIDA +- Contagiodump +- covert.io +- DEFCON CTF archive +- [IMPACT](#impact-2021) +- [Internet Traffic Archive](#the-internet-traffic-archive-2008) +- Kaggle +- [Malware Traffic Analysis](#malware-traffic-analysis-2024) +- Mid-Atlantic CCDC +- MAWILab +- [NETRESEC](#netresec-2024) +- OpenML +- RIPE Data Repository +- [SecRepo](#secrepo---samples-of-security-related-data-2020) +- Simple Web +- UMass Trace Repository +- Vast Challenges +- Waikato Internet Traffic Storage ### Survey of Intrusion Detection Systems: Techniques, Datasets and Challenges (2019) @@ -81,16 +240,20 @@ 
Khraisat, A., Gondal, I., Vamplew, P., & Kamruzzaman, J. (2019). Survey of intru Mainly focuses on commonly used detection methodology (especially anomaly-based), but also shortly describes eight datasets commonly used to evaluate these approaches. -- [ADFA-LD](/intrusion-detection-datasets/content/datasets/adfa_ld) -- [ADFA-WD](/intrusion-detection-datasets/content/datasets/adfa_wd) -- CAIDA +Referenced datasets: + +- [ADFA-LD](/intrusion-detection-datasets/content/datasets/adfa_ld), [ADFA-WD](/intrusion-detection-datasets/content/datasets/adfa_wd) - [CIC IDS 2017](/intrusion-detection-datasets/content/datasets/cic_ids_2017) - [DARPA'98 Intrusion Detection Program](/intrusion-detection-datasets/content/datasets/darpa98) - [ISCX IDS 2012](/intrusion-detection-datasets/content/datasets/iscx_ids_2012) - [KDD Cup 1999](/intrusion-detection-datasets/content/datasets/kdd_cup_1999) - [NSL-KDD](/intrusion-detection-datasets/content/datasets/nsl_kdd_dataset) -### Cybersecurity Research Datasets: Taxonomy and Empirical Analysis +Referenced collections: + +- CAIDA + +### Cybersecurity Research Datasets: Taxonomy and Empirical Analysis (2018) ``` Zheng, M., Robbins, H., Chai, Z., Thapa, P., & Moore, T. (2018). Cybersecurity research datasets: taxonomy and empirical analysis. In 11th USENIX Workshop on Cyber Security Experimentation and Test (CSET 18). @@ -101,8 +264,59 @@ papers. Does not provide an actual list, rather aims to describe general observations, like the fact that only 6% of the surveyed papers created a dataset *and* made it publicly available. +### A Detail Analysis on Intrusion Detection Datasets (2014) + +``` +Sahu, S. K., Sarangi, S., & Jena, S. K. (2014, February). A detail analysis on intrusion detection datasets. In 2014 ieee international advance computing conference (IACC) (pp. 1348-1353). IEEE. 
+``` + +This paper briefly analyzes three datasets the authors deem suitable to test their novel preprocessing techniques, +which are supposed to improve the performance of various data mining algorithms. + +Referenced datasets: + +- GURE-KDD +- [KDD Cup 1999](/intrusion-detection-datasets/content/datasets/kdd_cup_1999) +- [NSL-KDD](/intrusion-detection-datasets/content/datasets/nsl_kdd_dataset) + ## Other collections +`Last updated` refers to the last time a new entry was added to the collection. + +### Malware Traffic Analysis (2024) + +``` +https://www.netresec.com/?page=PcapFiles +(accessed 19.02.2024, last updated 14.02.2024) +``` + +Various pcaps and malware samples stemming from individual campaigns or attack instances, but without any overall +categorization or even overview. +They are available as blog posts named something like "DarkGate activity" or "GootLoader Infection", with each one +listing some references and download links to any relevant files. + +### NETRESEC (2024) + +``` +https://www.netresec.com/?page=PcapFiles +(accessed 19.02.2024, last updated 04.01.2024) +``` + +A large collection of pcap files and other repositories which are hosting pcaps themselves. +They are categorized into CDX, Malware Traffic, Network Forensics, SCADA/ICS, CTF, Packet Injection/Man-on-the-Side, and +Uncategorized. + +### Digital Corpora (2023) + +``` +https://digitalcorpora.org/ +(accessed 19.02.2024, last updated 05.05.2023) +``` + +A collection of datasets mostly designed for use in forensics education. +It consists of various disk images, memory dumps and pcaps, as well as a bunch of benign and malicious files. +It does not seem to contain actual log data. + ### Awesome Cybersecurity Datasets (2021) ``` @@ -116,6 +330,20 @@ Each entry is described in only one or two sentences, and most datasets are not, research. The list is somewhat deprecated and does especially lack meaningful host-based datasets. 
+### IMPACT (2021) + +``` +https://www.impactcybertrust.org/search +(accessed 19.02.2024, last updated 13.07.2021) +``` + +The "Information Marketplace for Policy and Analysis of Cyber-Risk and Trust" (IMPACT, formerly PREDICT), maintained by +the US Department of Homeland Security, contains 70 datasets. +These are for the mostly made up of network related files (pcaps and DNS logs) from a wide variety of scenarios (CTF +events, IoT, corpo networks, etc.), as well as some miscellaneous things like network shapefiles. +55 of these datasets were created by IMPACT, 15 are external (mostly CAIDA). Many datasets require prior authorization +to access them. + ### SecRepo - Samples of Security Related Data (2020) ``` @@ -124,10 +352,20 @@ https://www.secrepo.com/ ``` An individuals effort to "keep a somewhat curated list of Security related data I've found, created, or was pointed to". -It contains several entries of the authors own creation, some of which are described in a bit more detail, as well as +It contains several entries of the authors own creation, some of which are described in a bit more detail, as well as 121 "3rd party" entries from a broad range of topics, each described in a single sentence. Some of them are usable for IDS related purposes. +### The Honeynet Project Challenges (2015) + +``` +https://www.honeynet.org/challenges/ +(accessed 19.02.2024, last updated 18.03.2015) +``` + +A collection of 14 forensic challenges related to pcaps, malware and log files. +However, most resources, except for the two newest challenges, are no longer available. + ### Public Security Log Sharing Site (2010) ``` @@ -137,4 +375,15 @@ https://log-sharing.dreamhosters.com/ A collection which started as an effort to collect various log samples, but seems to have been discontinued after operating for about one year. -Currently, it consists of nine entries containing Linux syslogs, firewall logs, apache logs, and web proxy logs. 
\ No newline at end of file +Currently, it consists of nine entries containing Linux syslogs, firewall logs, apache logs, and web proxy logs. + +### The Internet Traffic Archive (2008) + +``` +https://ita.ee.lbl.gov/ +(accessed 18.02.2024, last updated 11.08.2010) +``` + +An archive hosting 16 different network data from various sources, such as WWW servers, web clients, and some custom +networks. +Most data is in the form of traces, some include http logs or traceroute measurements. \ No newline at end of file From 9040e8ed22e66c63cdd9d6a6b7a555f48cbd340c Mon Sep 17 00:00:00 2001 From: Philipp Boenninghausen Date: Mon, 19 Feb 2024 15:11:03 +0100 Subject: [PATCH 09/19] Add more related works --- content/related_work.md | 64 ++++++++++++++++++++++++++++++++++++++++- 1 file changed, 63 insertions(+), 1 deletion(-) diff --git a/content/related_work.md b/content/related_work.md index ea9c7d6..6d61fa3 100644 --- a/content/related_work.md +++ b/content/related_work.md @@ -16,10 +16,15 @@ single entry on this website. 
- Publications - [Pillars of Sand: The current state of Datasets in the field of Network Intrusion Detection (2022)](#pillars-of-sand-the-current-state-of-datasets-in-the-field-of-network-intrusion-detection-2022) - [A Comprehensive Survey of Databases and Deep Learning Methods for Cybersecurity and Intrusion Detection Systems (2020)](#a-comprehensive-survey-of-databases-and-deep-learning-methods-for-cybersecurity-and-intrusion-detection-systems-2020) + - [A Review of the Advancements in Intrusion Detection Datasets (2019)](#a-review-of-the-advancements-in-intrusion-detection-datasets-2019) - [A Survey of Intrusion Detection Systems leveraging Host Data (2019)](#a-survey-of-intrusion-detection-systems-leveraging-host-data-2019) + - [A Survey of Network-based Intrusion Detection Data Sets (2019)](#a-survey-of-network-based-intrusion-detection-data-sets-2019) - [Survey of Intrusion Detection Systems: Techniques, Datasets and Challenges (2019)](#survey-of-intrusion-detection-systems-techniques-datasets-and-challenges-2019) - [Cybersecurity Research Datasets: Taxonomy and Empirical Analysis (2018)](#cybersecurity-research-datasets-taxonomy-and-empirical-analysis-2018) + - [A survey of deep learning-based network anomaly detection (2017)](#a-survey-of-deep-learning-based-network-anomaly-detection-2017) + - [A Survey of Data Mining and Machine Learning Methods for Cyber Security Intrusion Detection (2015)](#a-survey-of-data-mining-and-machine-learning-methods-for-cyber-security-intrusion-detection-2015) - [A Detail Analysis on Intrusion Detection Datasets (2014)](#a-detail-analysis-on-intrusion-detection-datasets-2014) + - [Network anomaly detection: Methods, systems and tools (2013)](#network-anomaly-detection-methods-systems-and-tools-2013) - Other collections - [Malware Traffic Analysis (2024)](#malware-traffic-analysis-2024) - [NETRESEC (2024)](#netresec-2024) @@ -163,7 +168,7 @@ Referenced collections: - [SecRepo](#secrepo---samples-of-security-related-data-2020) - 
[The Honeypot Project](#the-honeynet-project) -### A Survey of Network-based Intrusion Detection Data Sets +### A Survey of Network-based Intrusion Detection Data Sets (2019) ``` Ring, M., Wunderlich, S., Scheuring, D., Landes, D., & Hotho, A. (2019). A survey of network-based intrusion detection data sets. Computers & Security, 86, 147-167. @@ -248,6 +253,7 @@ Referenced datasets: - [ISCX IDS 2012](/intrusion-detection-datasets/content/datasets/iscx_ids_2012) - [KDD Cup 1999](/intrusion-detection-datasets/content/datasets/kdd_cup_1999) - [NSL-KDD](/intrusion-detection-datasets/content/datasets/nsl_kdd_dataset) +- UNIBS Referenced collections: @@ -264,6 +270,37 @@ papers. Does not provide an actual list, rather aims to describe general observations, like the fact that only 6% of the surveyed papers created a dataset *and* made it publicly available. +### A survey of deep learning-based network anomaly detection (2017) + +``` +Kwon, D., Kim, H., Kim, J., Suh, S. C., Kim, I., & Kim, K. J. (2019). A survey of deep learning-based network anomaly detection. Cluster Computing, 22, 949-961. +``` + +This survey features various deep learning approaches in the field of anomaly-based intrusion detections. +Datasets, while acknowledged as an important factor, are only described in one section. +Weirdly, the two chosen datasets are quite out-of-date for a survey that has been published in 2017. + +Referenced datasets: + +- [KDD Cup 1999](/intrusion-detection-datasets/content/datasets/kdd_cup_1999) +- [NSL-KDD](/intrusion-detection-datasets/content/datasets/nsl_kdd_dataset) + +### A Survey of Data Mining and Machine Learning Methods for Cyber Security Intrusion Detection (2015) + +``` +Buczak, A. L., & Guven, E. (2015). A survey of data mining and machine learning methods for cyber security intrusion detection. IEEE Communications surveys & tutorials, 18(2), 1153-1176. 
+``` + +This survey only considers machine learning and datamining (i.e., anomaly-based) approaches and what they entail. +Given the relative recency of this work, the choice of described datasets, which were pretty much deprecated at the time +of writing, is surprising. + +Referenced datasets: + +- [DARPA'98 Intrusion Detection Program](/intrusion-detection-datasets/content/datasets/darpa98) +- DARPA 1999 +- [KDD Cup 1999](/intrusion-detection-datasets/content/datasets/kdd_cup_1999) + ### A Detail Analysis on Intrusion Detection Datasets (2014) ``` @@ -279,6 +316,31 @@ Referenced datasets: - [KDD Cup 1999](/intrusion-detection-datasets/content/datasets/kdd_cup_1999) - [NSL-KDD](/intrusion-detection-datasets/content/datasets/nsl_kdd_dataset) +### Network anomaly detection: Methods, systems and tools (2013) + +``` +Bhuyan, M. H., Bhattacharyya, D. K., & Kalita, J. K. (2013). Network anomaly detection: methods, systems and tools. Ieee communications surveys & tutorials, 16(1), 303-336. +``` + +This survey mainly focuses on different approaches towards network anomaly detection, encountered attacks, patterns, +etc. +Datasets that are suitable for this purpose are mentioned as a secondary talking point, and described only in +brief. + +Referenced datasets: + +- [DARPA'98 Intrusion Detection Program](/intrusion-detection-datasets/content/datasets/darpa98) +- [ISCX IDS 2012](/intrusion-detection-datasets/content/datasets/iscx_ids_2012) +- Lawrence Berkeley National Laboratory Traces +- [KDD Cup 1999](/intrusion-detection-datasets/content/datasets/kdd_cup_1999) +- [NSL-KDD](/intrusion-detection-datasets/content/datasets/nsl_kdd_dataset) +- TUIDS + +Referenced collections: + +- CAIDA +- DEFCON CTF archive + ## Other collections `Last updated` refers to the last time a new entry was added to the collection. 
From 4949fbd9795a86294ae67d531ea930ca86e2fc9b Mon Sep 17 00:00:00 2001 From: schlippe Date: Wed, 21 Feb 2024 10:44:25 +0100 Subject: [PATCH 10/19] Fix minor typos --- content/related_work.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/content/related_work.md b/content/related_work.md index 6d61fa3..b1f1ba2 100644 --- a/content/related_work.md +++ b/content/related_work.md @@ -401,7 +401,7 @@ https://www.impactcybertrust.org/search The "Information Marketplace for Policy and Analysis of Cyber-Risk and Trust" (IMPACT, formerly PREDICT), maintained by the US Department of Homeland Security, contains 70 datasets. -These are for the mostly made up of network related files (pcaps and DNS logs) from a wide variety of scenarios (CTF +These are for the most part made up of network related files (pcaps and DNS logs) from a wide variety of scenarios (CTF events, IoT, corpo networks, etc.), as well as some miscellaneous things like network shapefiles. 55 of these datasets were created by IMPACT, 15 are external (mostly CAIDA). Many datasets require prior authorization to access them. 
@@ -443,7 +443,7 @@ Currently, it consists of nine entries containing Linux syslogs, firewall logs, ``` https://ita.ee.lbl.gov/ -(accessed 18.02.2024, last updated 11.08.2010) +(accessed 18.02.2024, last updated 09.04.2010) ``` An archive hosting 16 different network data from various sources, such as WWW servers, web clients, and some custom From 580ea6ce662fee082934094701b544e0ae162c14 Mon Sep 17 00:00:00 2001 From: schlippe Date: Wed, 21 Feb 2024 10:44:42 +0100 Subject: [PATCH 11/19] Add blog post for new subpage --- _posts/2024-02-21-related-work.md | 33 +++++++++++++++++++++++++++++++ 1 file changed, 33 insertions(+) create mode 100644 _posts/2024-02-21-related-work.md diff --git a/_posts/2024-02-21-related-work.md b/_posts/2024-02-21-related-work.md new file mode 100644 index 0000000..c52adcd --- /dev/null +++ b/_posts/2024-02-21-related-work.md @@ -0,0 +1,33 @@ +--- +layout: post +title: Related Work added +subtitle: A collection of related surveys and other, non-scientific collections of IDS datasets +gh-repo: fkie-cad/intrusion-detection-datasets +gh-badge: [star, fork, follow] +tags: [website] +comments: true +author: Philipp Bönninghausen +--- + +This update adds a new subpage for "Related Work", intended to provide additional source material and accessible via the navbar (or [this link](/intrusion-detection-datasets/content/related_work)). +Contents are divided into "Publications" and "Other collections", where the former is any academic work that at least partially covers the topic of available IDS datasets. +Entries of this category, which are usually surveys, consist of the following: +- Publication title +- Citation +- Short description of the publication +- List of referenced datasets +- List of referenced collections + +Referenced datasets link to their respective entries on this webpage, if available. +Those that are not (which are quite a few) will be looked at and possibly be added to the Intrusion Detection Datasets collection in the future. 
+ +The latter category, "Other collections", simply features dataset collections not backed by a scientific publication. +These are maintained by individuals or organizations, and cover different types of datasets, ranging from "only pcaps" to "anything cybersecurity-related". +Entries consist of: +- Collection name +- Link +- Date of last update, i.e., the last time a new entry was added +- Short description of the focus of this collection + +There is of course a significant overlap between the different publications/collections, for example, almost every survey references the age-old [KDD Cup 1999 dataset](/intrusion-detection-datasets/content/datasets/kdd_cup_1999). +The diversity of collections might nevertheless prove useful, as each resource provides a slightly different viewpoint upon the topic of IDS datasets. \ No newline at end of file From cfa49d418033fb9212c315a11e4d4a6847f6c58c Mon Sep 17 00:00:00 2001 From: schlippe Date: Wed, 21 Feb 2024 11:07:44 +0100 Subject: [PATCH 12/19] Add requested changes --- _posts/2024-02-21-related-work.md | 10 +- content/related_work.md | 148 ++++++++++-------------------- 2 files changed, 56 insertions(+), 102 deletions(-) diff --git a/_posts/2024-02-21-related-work.md b/_posts/2024-02-21-related-work.md index c52adcd..e73aec1 100644 --- a/_posts/2024-02-21-related-work.md +++ b/_posts/2024-02-21-related-work.md @@ -1,16 +1,16 @@ --- layout: post title: Related Work added -subtitle: A collection of related surveys and other, non-scientific collections of IDS datasets +subtitle: A collection of related surveys and non-scientific collections of IDS datasets gh-repo: fkie-cad/intrusion-detection-datasets -gh-badge: [star, fork, follow] -tags: [website] +gh-badge: [ star, fork, follow ] +tags: [ website ] comments: true author: Philipp Bönninghausen --- This update adds a new subpage for "Related Work", intended to provide additional source material and accessible via the navbar (or [this 
link](/intrusion-detection-datasets/content/related_work)). -Contents are divided into "Publications" and "Other collections", where the former is any academic work that at least partially covers the topic of available IDS datasets. +Contents are divided into "Publications" and "Collections", where the former is any academic work that at least partially covers the topic of available IDS datasets. Entries of this category, which are usually surveys, consist of the following: - Publication title - Citation @@ -21,7 +21,7 @@ Entries of this category, which are usually surveys, consist of the following: Referenced datasets link to their respective entries on this webpage, if available. Those that are not (which are quite a few) will be looked at and possibly be added to the Intrusion Detection Datasets collection in the future. -The latter category, "Other collections", simply features dataset collections not backed by a scientific publication. +The latter category, "Collections", simply features dataset collections not backed by a scientific publication. These are maintained by individuals or organizations, and cover different types of datasets, ranging from "only pcaps" to "anything cybersecurity-related". Entries consist of: - Collection name diff --git a/content/related_work.md b/content/related_work.md index b1f1ba2..d92e358 100644 --- a/content/related_work.md +++ b/content/related_work.md @@ -2,14 +2,12 @@ title: Related Work --- -This page lists publications and other collections covering IDS datasets, sorted by their year of release. -Each entry consists of citation and a brief description of the surveys scope of selected datasets. -Additionally, for publications, all datasets contained in the survey are also listed, linking to their respective -entries on this website, if available. +This page lists publications and collections covering IDS datasets. 
+Related publications, sorted by year or release, are any academic work that at least partially covers the topic of available IDS datasets. +Collections, sorted alphabetically, simply features agglomerations of IDS-related datasets not backed by a scientific publication. -Note that datasets are listed separately from collection, where a collection is any assembly of datasets that cannot -reasonably be grouped together by a common file type or scenario/origin, i.e., it cannot be adequately summarized in a -single entry on this website. +Each entry consists of citation and a brief description of the survey's scope of selected datasets. +Additionally, for publications, all datasets discussed in the survey are also listed, linking to their respective entries on this website, if available. ## Contents @@ -26,15 +24,15 @@ single entry on this website. - [A Detail Analysis on Intrusion Detection Datasets (2014)](#a-detail-analysis-on-intrusion-detection-datasets-2014) - [Network anomaly detection: Methods, systems and tools (2013)](#network-anomaly-detection-methods-systems-and-tools-2013) - Other collections - - [Malware Traffic Analysis (2024)](#malware-traffic-analysis-2024) - - [NETRESEC (2024)](#netresec-2024) - - [Digital Corpora (2023)](#digital-corpora-2023) - - [Awesome Cybersecurity Datasets (2021)](#awesome-cybersecurity-datasets-2021) - - [IMPACT (2021)](#impact-2021) - - [Public Security Log Sharing Site (2020)](#public-security-log-sharing-site-2010) - - [The Honeynet Project Challenges (2015)](#the-honeynet-project-challenges-2015) - - [SecRepo - Samples of Security Related Data (2010)](#secrepo---samples-of-security-related-data-2020) - - [The Internet Traffic Archive (2008)](#the-internet-traffic-archive-2008) + - [Awesome Cybersecurity Datasets](#awesome-cybersecurity-datasets) + - [Digital Corpora](#digital-corpora) + - [IMPACT](#impact) + - [Malware Traffic Analysis](#malware-traffic-analysis) + - [NETRESEC](#netresec) + - [Public Security Log Sharing 
Site](#public-security-log-sharing-site) + - [SecRepo - Samples of Security Related Data](#secrepo---samples-of-security-related-data) + - [The Honeynet Project Challenges](#the-honeynet-project-challenges) + - [The Internet Traffic Archive](#the-internet-traffic-archive) ## Publications @@ -44,12 +42,10 @@ single entry on this website. Gints Engelen, Robert Flood, Lisa Liu, Vera Rimmer, Henry Clausen, David Aspinall, & Wouter Joosen. (2022). Pillars of Sand: The current state of Datasets in the field of Network Intrusion Detection. Zenodo. https://doi.org/10.5281/zenodo.7068716 ``` -An analysis of the five most commonly used datasets for anomaly-based NIDS evaluation, focusing on highlighting flaws -and errors within these datasets, and discussing the lack of variability in benign and malicious traffic. +An analysis of the five most commonly used datasets for anomaly-based NIDS evaluation, focusing on highlighting flaws and errors within these datasets, and discussing the lack of variability in benign and malicious traffic. They also offer an allegedly improved version of one of the surveyed datasets, CSE-CIC-IDS 2018. Referenced datasets: - - [CIC-IDS 2017](/intrusion-detection-datasets/content/datasets/cic_ids2017) - [CSE-CIC-IDS 2018](/intrusion-detection-datasets/content/datasets/cse_cic_ids2018) - [UNSW NB15](/intrusion-detection-datasets/content/datasets/unsw_nb15) @@ -63,13 +59,10 @@ Gümüşbaş, D., Yıldırım, T., Genovese, A., & Scotti, F. (2020). A comprehe ``` This survey focuses on machine learning methods for intrusion detection, especially those based on deep learning. -Alongside this, the authors present a list of datasets used to benchmark these approaches, which they categorize into -either host-based (system calls) or network-based (pcaps and NetFlows). -Each dataset is described in a couple of sentences, with the six most commonly used ones undergoing some more analysis -regarding properties like feature and sample count or attack types. 
+Alongside this, the authors present a list of datasets used to benchmark these approaches, which they categorize into either host-based (system calls) or network-based (pcaps and NetFlows). +Each dataset is described in a couple of sentences, with the six most commonly used ones undergoing some more analysis regarding properties like feature and sample count or attack types. Referenced datasets: - - [ASNM CDX](/intrusion-detection-datasets/content/datasets/asnm_datasets) - [CDX CTF 2009](/intrusion-detection-datasets/content/datasets/cdx_2009) - [CIC-IDS 2017](/intrusion-detection-datasets/content/datasets/cic_ids2017) @@ -86,11 +79,9 @@ Referenced datasets: - [NSL-KDD](/intrusion-detection-datasets/content/datasets/nsl_kdd_dataset) - [Twente 2009](/intrusion-detection-datasets/content/datasets/twente_2009) - [UNSW NB15](/intrusion-detection-datasets/content/datasets/unsw_nb15) -- Mentioned, but not further detailed:
Metrosec, UNIBS 2009, TUIDS, University of Napoli traffic dataset, CSIC 2010 - HTTP dataset, UNM system call dataset +- Mentioned, but not further detailed:
Metrosec, UNIBS 2009, TUIDS, University of Napoli traffic dataset, CSIC 2010 HTTP dataset, UNM system call dataset Referenced collections: - - CAIDA - DEFCON CTF Archive - MAWILab @@ -102,15 +93,11 @@ Referenced collections: Thakkar, A., & Lohiya, R. (2020). A review of the advancement in intrusion detection datasets. Procedia Computer Science, 167, 636-645. ``` -This work focuses only on datasets used for the evaluation of network-based IDSs (mostly anomaly-based), presenting a -brief overview in the -form of feature count, attack types and one sentence of description. -It (very shortly) discusses some methods used in this field and goes into a bit more detail for two of the surveyed -datasets, CIC IDS 2017 and CSE CIC IDS 2018. +This work focuses only on datasets used for the evaluation of network-based IDSs (mostly anomaly-based), presenting a brief overview in the form of feature count, attack types and one sentence of description. +It (very shortly) discusses some methods used in this field and goes into a bit more detail for two of the surveyed datasets, CIC IDS 2017 and CSE CIC IDS 2018. Strangely, it also includes some datasets that are not really network-related, like ADFA-LD. Referenced datasets: - - [ADFA-LD](/intrusion-detection-datasets/content/datasets/adfa_ld), [ADFA-WD](/intrusion-detection-datasets/content/datasets/adfa_wd) - [CDX CTF 2009](/intrusion-detection-datasets/content/datasets/cdx_2009) - [CIC-IDS 2017](/intrusion-detection-datasets/content/datasets/cic_ids2017) @@ -124,7 +111,6 @@ Referenced datasets: - [Twente 2009](/intrusion-detection-datasets/content/datasets/twente_2009) Referenced Collections: - - CAIDA - DEFCON CTF Archive @@ -134,14 +120,11 @@ Referenced Collections: Bridges, R. A., Glass-Vanderlan, T. R., Iannacone, M. D., Vincent, M. S., & Chen, Q. (2019). A survey of intrusion detection systems leveraging host data. ACM Computing Surveys (CSUR), 52(6), 1-35. 
``` -This survey focuses on host based IDSs designed to detect attacks on enterprise networks, dividing them into categories -based on their approach. -In addition to this, it provides an overview of several existing datasets and collections of datasets to "accommodate -current researchers", describing each of them briefly. +This survey focuses on host based IDSs designed to detect attacks on enterprise networks, dividing them into categories based on their approach. +In addition to this, it provides an overview of several existing datasets and collections of datasets to "accommodate current researchers", describing each of them briefly. Surprisingly, it also features datasets like KDD and NSL KDD, which positively do not feature any host data. Referenced datasets: - - Active DNS Project - [ADFA-LD](/intrusion-detection-datasets/content/datasets/adfa_ld), [ADFA-WD](/intrusion-detection-datasets/content/datasets/adfa_wd) - [Comprehensive Multi-Source Cybersecurity Events](/intrusion-detection-datasets/content/datasets/comp_multi_source_cybersec_events) @@ -159,7 +142,6 @@ Referenced datasets: - Vast Challenge 2013 Referenced collections: - - CAIDA - [Digital Corpora Database](#digital-corpora-2023) - [IMPACT](#impact-2021) @@ -175,12 +157,10 @@ Ring, M., Wunderlich, S., Scheuring, D., Landes, D., & Hotho, A. (2019). A surve ``` This survey focuses on datasets used for the evaluation of anomaly-based. -It does so by first defining 15 different properties to describe, such as year of creation, format, duration, or type of -network, and then applying this methodology to 32 network datasets, along with a short description for each one. +It does so by first defining 15 different properties to describe, such as year of creation, format, duration, or type of network, and then applying this methodology to 32 network datasets, along with a short description for each one. Additionally, several existing collections of datasets are listed. 
Referenced datasets: - - AWID - Booters Dataset - ISCX Botnet 2014 @@ -215,7 +195,6 @@ Referenced datasets: - [UNSW-NB15](/intrusion-detection-datasets/content/datasets/unsw_nb15) Referenced collections: - - AZSecure - CAIDA - Contagiodump @@ -242,11 +221,9 @@ Referenced collections: Khraisat, A., Gondal, I., Vamplew, P., & Kamruzzaman, J. (2019). Survey of intrusion detection systems: techniques, datasets and challenges. Cybersecurity, 2(1), 1-22. ``` -Mainly focuses on commonly used detection methodology (especially anomaly-based), but also shortly describes eight -datasets commonly used to evaluate these approaches. +Mainly focuses on commonly used detection methodology (especially anomaly-based), but also shortly describes eight datasets commonly used to evaluate these approaches. Referenced datasets: - - [ADFA-LD](/intrusion-detection-datasets/content/datasets/adfa_ld), [ADFA-WD](/intrusion-detection-datasets/content/datasets/adfa_wd) - [CIC IDS 2017](/intrusion-detection-datasets/content/datasets/cic_ids_2017) - [DARPA'98 Intrusion Detection Program](/intrusion-detection-datasets/content/datasets/darpa98) @@ -256,7 +233,6 @@ Referenced datasets: - UNIBS Referenced collections: - - CAIDA ### Cybersecurity Research Datasets: Taxonomy and Empirical Analysis (2018) @@ -265,10 +241,8 @@ Referenced collections: Zheng, M., Robbins, H., Chai, Z., Thapa, P., & Moore, T. (2018). Cybersecurity research datasets: taxonomy and empirical analysis. In 11th USENIX Workshop on Cyber Security Experimentation and Test (CSET 18). ``` -Tries to construct a taxonomy of the types of created and shared cybersecurity data(sets) by inspecting 965 related -papers. -Does not provide an actual list, rather aims to describe general observations, like the fact that only 6% of the -surveyed papers created a dataset *and* made it publicly available. +Tries to construct a taxonomy of the types of created and shared cybersecurity data(sets) by inspecting 965 related papers. 
+Does not provide an actual list, rather aims to describe general observations, like the fact that only 6% of the surveyed papers created a dataset *and* made it publicly available. ### A survey of deep learning-based network anomaly detection (2017) @@ -281,7 +255,6 @@ Datasets, while acknowledged as an important factor, are only described in one s Weirdly, the two chosen datasets are quite out-of-date for a survey that has been published in 2017. Referenced datasets: - - [KDD Cup 1999](/intrusion-detection-datasets/content/datasets/kdd_cup_1999) - [NSL-KDD](/intrusion-detection-datasets/content/datasets/nsl_kdd_dataset) @@ -292,11 +265,9 @@ Buczak, A. L., & Guven, E. (2015). A survey of data mining and machine learning ``` This survey only considers machine learning and datamining (i.e., anomaly-based) approaches and what they entail. -Given the relative recency of this work, the choice of described datasets, which were pretty much deprecated at the time -of writing, is surprising. +Given the relative recency of this work, the choice of described datasets, which were pretty much deprecated at the time of writing, is surprising. Referenced datasets: - - [DARPA'98 Intrusion Detection Program](/intrusion-detection-datasets/content/datasets/darpa98) - DARPA 1999 - [KDD Cup 1999](/intrusion-detection-datasets/content/datasets/kdd_cup_1999) @@ -307,11 +278,9 @@ Referenced datasets: Sahu, S. K., Sarangi, S., & Jena, S. K. (2014, February). A detail analysis on intrusion detection datasets. In 2014 ieee international advance computing conference (IACC) (pp. 1348-1353). IEEE. ``` -This paper shortly analyzed three papers the authors deem suitable to test their novel preprocessing techniques, -which are supposed to improve the performance of various data mining algorithms. +This paper shortly analyzed three papers the authors deem suitable to test their novel preprocessing techniques, which are supposed to improve the performance of various data mining algorithms. 
Referenced datasets: - - GURE-KDD - [KDD Cup 1999](/intrusion-detection-datasets/content/datasets/kdd_cup_1999) - [NSL-KDD](/intrusion-detection-datasets/content/datasets/nsl_kdd_dataset) @@ -322,13 +291,10 @@ Referenced datasets: Bhuyan, M. H., Bhattacharyya, D. K., & Kalita, J. K. (2013). Network anomaly detection: methods, systems and tools. Ieee communications surveys & tutorials, 16(1), 303-336. ``` -This survey mainly focuses on different approaches towards network anomaly detection, encountered attacks, patterns, -etc. -Datasets that are suitable for this purpose are mentioned as a secondary talking point, and described only in -brief. +This survey mainly focuses on different approaches towards network anomaly detection, encountered attacks, patterns, etc. +Datasets that are suitable for this purpose are mentioned as a secondary talking point, and described only in brief. Referenced datasets: - - [DARPA'98 Intrusion Detection Program](/intrusion-detection-datasets/content/datasets/darpa98) - [ISCX IDS 2012](/intrusion-detection-datasets/content/datasets/iscx_ids_2012) - Lawrence Berkeley National Laboratory Traces @@ -337,7 +303,6 @@ Referenced datasets: - TUIDS Referenced collections: - - CAIDA - DEFCON CTF archive @@ -345,19 +310,17 @@ Referenced collections: `Last updated` refers to the last time a new entry was added to the collection. -### Malware Traffic Analysis (2024) +### Malware Traffic Analysis ``` -https://www.netresec.com/?page=PcapFiles +https://www.malware-traffic-analysis.net/ (accessed 19.02.2024, last updated 14.02.2024) ``` -Various pcaps and malware samples stemming from individual campaigns or attack instances, but without any overall -categorization or even overview. -They are available as blog posts named something like "DarkGate activity" or "GootLoader Infection", which each one -listing some references and download links to any relevant files. 
+Various pcaps and malware samples stemming from individual campaigns or attack instances, but without any overall categorization or even overview. +They are available as blog posts named something like "DarkGate activity" or "GootLoader Infection", which each one listing some references and download links to any relevant files. -### NETRESEC (2024) +### NETRESEC ``` https://www.netresec.com/?page=PcapFiles @@ -365,10 +328,9 @@ https://www.netresec.com/?page=PcapFiles ``` A large collection of pcap files and other repositories which are hosting pcaps themselves. -They are categorized into CDX, Malware Traffic, Network Forensics, SCADA/ICS, CTF, Packet Injection/Man-on-the-Side, and -Uncategorized. +They are categorized into CDX, Malware Traffic, Network Forensics, SCADA/ICS, CTF, Packet Injection/Man-on-the-Side, and Uncategorized. -### Digital Corpora (2023) +### Digital Corpora ``` https://digitalcorpora.org/ @@ -379,34 +341,29 @@ A collection of datasets mostly designed for the use in forensics education. It consists of various disk images, memory dumps and pcaps, as well as a bunch of benign and malicious files. It does not seem to contain actual log data. -### Awesome Cybersecurity Datasets (2021) +### Awesome Cybersecurity Datasets ``` https://github.com/shramos/Awesome-Cybersecurity-Datasets (accessed 18.02.2024, last updated 23.01.2021) ``` -A "curated" personal collection of various cybersecurity-related datasets or collections, grouped into several -categories such as "Network", "Software" or "Fraud". -Each entry is described in only one or two sentences, and most datasets are not, or only partially, suitable for IDS -research. +A "curated" personal collection of various cybersecurity-related datasets or collections, grouped into several categories such as "Network", "Software" or "Fraud". +Each entry is described in only one or two sentences, and most datasets are not, or only partially, suitable for IDS research. 
The list is somewhat deprecated and does especially lack meaningful host-based datasets. -### IMPACT (2021) +### IMPACT ``` https://www.impactcybertrust.org/search (accessed 19.02.2024, last updated 13.07.2021) ``` -The "Information Marketplace for Policy and Analysis of Cyber-Risk and Trust" (IMPACT, formerly PREDICT), maintained by -the US Department of Homeland Security, contains 70 datasets. -These are for the most part made up of network related files (pcaps and DNS logs) from a wide variety of scenarios (CTF -events, IoT, corpo networks, etc.), as well as some miscellaneous things like network shapefiles. -55 of these datasets were created by IMPACT, 15 are external (mostly CAIDA). Many datasets require prior authorization -to access them. +The "Information Marketplace for Policy and Analysis of Cyber-Risk and Trust" (IMPACT, formerly PREDICT), maintained by the US Department of Homeland Security, contains 70 datasets. +These are for the most part made up of network related files (pcaps and DNS logs) from a wide variety of scenarios (CTF events, IoT, corpo networks, etc.), as well as some miscellaneous things like network shapefiles. +55 of these datasets were created by IMPACT, 15 are external (mostly CAIDA). Many datasets require prior authorization to access them. -### SecRepo - Samples of Security Related Data (2020) +### SecRepo - Samples of Security Related Data ``` https://www.secrepo.com/ @@ -414,11 +371,10 @@ https://www.secrepo.com/ ``` An individuals effort to "keep a somewhat curated list of Security related data I've found, created, or was pointed to". -It contains several entries of the authors own creation, some of which are described in a bit more detail, as well as -121 "3rd party" entries from a broad range of topics, each described in a single sentence. 
+It contains several entries of the authors own creation, some of which are described in a bit more detail, as well as 121 "3rd party" entries from a broad range of topics, each described in a single sentence. Some of them are usable for IDS related purposes. -### The Honeynet Project Challenges (2015) +### The Honeynet Project Challenges ``` https://www.honeynet.org/challenges/ @@ -428,24 +384,22 @@ https://www.honeynet.org/challenges/ A collection of 14 forensic challenges related to pcaps, malware and log files. However, most resources, except for the two newest challenges, are no longer available. -### Public Security Log Sharing Site (2010) +### Public Security Log Sharing Site ``` https://log-sharing.dreamhosters.com/ (accessed 18.02.2024, last updated 11.08.2010) ``` -A collection which started as an effort to collect various log samples, but seems to have been discontinued after -operating for about one year. +A collection which started as an effort to collect various log samples, but seems to have been discontinued after operating for about one year. Currently, it consists of nine entries containing Linux syslogs, firewall logs, apache logs, and web proxy logs. -### The Internet Traffic Archive (2008) +### The Internet Traffic Archive ``` https://ita.ee.lbl.gov/ (accessed 18.02.2024, last updated 09.04.2010) ``` -An archive hosting 16 different network data from various sources, such as WWW servers, web clients, and some custom -networks. +An archive hosting 16 different network data from various sources, such as WWW servers, web clients, and some custom networks. Most data is in the form of traces, some include http logs or traceroute measurements. 
\ No newline at end of file From e6dc385b416af40dbfd29ec9bfd22f142cb36ad9 Mon Sep 17 00:00:00 2001 From: schlippe Date: Wed, 21 Feb 2024 11:23:17 +0100 Subject: [PATCH 13/19] Reduce spacing before MD lists --- assets/css/compact_lists.css | 9 +++++++++ content/related_work.md | 8 ++++++++ 2 files changed, 17 insertions(+) create mode 100644 assets/css/compact_lists.css diff --git a/assets/css/compact_lists.css b/assets/css/compact_lists.css new file mode 100644 index 0000000..d3c8959 --- /dev/null +++ b/assets/css/compact_lists.css @@ -0,0 +1,9 @@ +ul { + margin-block-start: 0.2em; + margin-block-end: 0.2em; +} + +p { + margin-block-start: 0.2em; + margin-block-end: 0.2em; +} \ No newline at end of file diff --git a/content/related_work.md b/content/related_work.md index d92e358..31bd2a7 100644 --- a/content/related_work.md +++ b/content/related_work.md @@ -1,7 +1,15 @@ --- title: Related Work +css: assets/css/compact.css --- + + This page lists publications and collections covering IDS datasets. Related publications, sorted by year or release, are any academic work that at least partially covers the topic of available IDS datasets. Collections, sorted alphabetically, simply features agglomerations of IDS-related datasets not backed by a scientific publication. 
From 61d71a1760a83e711248ac655ff9a108ea08ae9c Mon Sep 17 00:00:00 2001 From: schlippe Date: Wed, 21 Feb 2024 11:41:51 +0100 Subject: [PATCH 14/19] Modify list spacing --- assets/css/compact_lists.css | 14 +++++++------- content/related_work.md | 7 ------- 2 files changed, 7 insertions(+), 14 deletions(-) diff --git a/assets/css/compact_lists.css b/assets/css/compact_lists.css index d3c8959..3e79b8c 100644 --- a/assets/css/compact_lists.css +++ b/assets/css/compact_lists.css @@ -1,9 +1,9 @@ -ul { - margin-block-start: 0.2em; - margin-block-end: 0.2em; -} - +p.maps-to-line { + margin: 0px 0px 3px + } p { - margin-block-start: 0.2em; - margin-block-end: 0.2em; + margin: 0px 0px 3px + } +ul { + margin: 9px 0px 26px 25.5px } \ No newline at end of file diff --git a/content/related_work.md b/content/related_work.md index 31bd2a7..175f212 100644 --- a/content/related_work.md +++ b/content/related_work.md @@ -3,13 +3,6 @@ title: Related Work css: assets/css/compact.css --- - - This page lists publications and collections covering IDS datasets. Related publications, sorted by year or release, are any academic work that at least partially covers the topic of available IDS datasets. Collections, sorted alphabetically, simply features agglomerations of IDS-related datasets not backed by a scientific publication. 
From 7378d5af1f09d975687fe20bbfdf1e26a7343695 Mon Sep 17 00:00:00 2001 From: schlippe Date: Wed, 21 Feb 2024 11:59:29 +0100 Subject: [PATCH 15/19] Modify list spacing --- assets/css/beautifuljekyll.css | 4 ++++ assets/css/compact_lists.css | 9 --------- content/related_work.md | 1 - 3 files changed, 4 insertions(+), 10 deletions(-) delete mode 100644 assets/css/compact_lists.css diff --git a/assets/css/beautifuljekyll.css b/assets/css/beautifuljekyll.css index 3980141..fd21fdf 100644 --- a/assets/css/beautifuljekyll.css +++ b/assets/css/beautifuljekyll.css @@ -73,6 +73,10 @@ hr.small { border-color: inherit; border-radius: 0.1875rem; } +ul { + margin-block-start: 0.2em; + margin-block-end: 0.2em; +} /* fix in-page anchors to not be behind fixed header */ :target:before { diff --git a/assets/css/compact_lists.css b/assets/css/compact_lists.css deleted file mode 100644 index 3e79b8c..0000000 --- a/assets/css/compact_lists.css +++ /dev/null @@ -1,9 +0,0 @@ -p.maps-to-line { - margin: 0px 0px 3px - } -p { - margin: 0px 0px 3px - } -ul { - margin: 9px 0px 26px 25.5px -} \ No newline at end of file diff --git a/content/related_work.md b/content/related_work.md index 175f212..d92e358 100644 --- a/content/related_work.md +++ b/content/related_work.md @@ -1,6 +1,5 @@ --- title: Related Work -css: assets/css/compact.css --- This page lists publications and collections covering IDS datasets. 
From eb6fced1e90f803b8a2d95404e518f2b269d7cc9 Mon Sep 17 00:00:00 2001 From: schlippe Date: Wed, 21 Feb 2024 12:08:06 +0100 Subject: [PATCH 16/19] Modify list spacing --- assets/css/beautifuljekyll.css | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/assets/css/beautifuljekyll.css b/assets/css/beautifuljekyll.css index fd21fdf..7b79598 100644 --- a/assets/css/beautifuljekyll.css +++ b/assets/css/beautifuljekyll.css @@ -31,6 +31,8 @@ body > main { p { line-height: 1.5; margin: 1.875rem 0; + margin-block-start: 0.2em; + margin-block-end: 0.2em; } h1,h2,h3,h4,h5,h6 { font-family: 'Open Sans', 'Helvetica Neue', Helvetica, Arial, sans-serif; @@ -74,8 +76,8 @@ hr.small { border-radius: 0.1875rem; } ul { - margin-block-start: 0.2em; - margin-block-end: 0.2em; + margin-block-start: 0.2em; + margin-block-end: 0.2em; } /* fix in-page anchors to not be behind fixed header */ From ce859110255f8184187e9437b4b23f186e4e28ad Mon Sep 17 00:00:00 2001 From: schlippe Date: Wed, 21 Feb 2024 12:24:23 +0100 Subject: [PATCH 17/19] Adjust paragraph margins --- assets/css/beautifuljekyll.css | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/assets/css/beautifuljekyll.css b/assets/css/beautifuljekyll.css index 7b79598..d73fe4a 100644 --- a/assets/css/beautifuljekyll.css +++ b/assets/css/beautifuljekyll.css @@ -31,8 +31,8 @@ body > main { p { line-height: 1.5; margin: 1.875rem 0; - margin-block-start: 0.2em; - margin-block-end: 0.2em; + margin-block-start: 0.2rem; + margin-block-end: 0.8rem; } h1,h2,h3,h4,h5,h6 { font-family: 'Open Sans', 'Helvetica Neue', Helvetica, Arial, sans-serif; From c833a3ceeda0244671eef5007a8109008804ae3a Mon Sep 17 00:00:00 2001 From: schlippe Date: Wed, 21 Feb 2024 12:42:47 +0100 Subject: [PATCH 18/19] Adjust paragraph margins --- assets/css/beautifuljekyll.css | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/assets/css/beautifuljekyll.css b/assets/css/beautifuljekyll.css index d73fe4a..7cee5ff 100644 
--- a/assets/css/beautifuljekyll.css
+++ b/assets/css/beautifuljekyll.css
@@ -31,8 +31,8 @@ body > main {
 p {
   line-height: 1.5;
   margin: 1.875rem 0;
-  margin-block-start: 0.2rem;
-  margin-block-end: 0.8rem;
+  margin-block-start: 0.6rem;
+  margin-block-end: 0.6rem;
 }
 h1,h2,h3,h4,h5,h6 {
   font-family: 'Open Sans', 'Helvetica Neue', Helvetica, Arial, sans-serif;

From ad287298b61883ee6be5edce50f9a5fe8552f3a7 Mon Sep 17 00:00:00 2001
From: schlippe
Date: Wed, 21 Feb 2024 12:51:58 +0100
Subject: [PATCH 19/19] Adjust paragraph margins

---
 assets/css/beautifuljekyll.css | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/assets/css/beautifuljekyll.css b/assets/css/beautifuljekyll.css
index 7cee5ff..9d83bcb 100644
--- a/assets/css/beautifuljekyll.css
+++ b/assets/css/beautifuljekyll.css
@@ -32,7 +32,7 @@ p {
   line-height: 1.5;
   margin: 1.875rem 0;
   margin-block-start: 0.6rem;
-  margin-block-end: 0.6rem;
+  margin-block-end: 0.8rem;
 }
 h1,h2,h3,h4,h5,h6 {
   font-family: 'Open Sans', 'Helvetica Neue', Helvetica, Arial, sans-serif;
@@ -77,7 +77,7 @@ hr.small {
 }
 ul {
   margin-block-start: 0.2em;
-  margin-block-end: 0.2em;
+  margin-block-end: 0.6em;
 }
 
 /* fix in-page anchors to not be behind fixed header */