From 308606214243bd5b4d90bf362ed75edfe218d50c Mon Sep 17 00:00:00 2001
From: Maspital
Date: Tue, 4 Jun 2024 09:05:55 +0200
Subject: [PATCH 1/4] Remove deprecated files

---
 content/datasets/botsv3.md | 33 ---------------------------------
 1 file changed, 33 deletions(-)
 delete mode 100644 content/datasets/botsv3.md

diff --git a/content/datasets/botsv3.md b/content/datasets/botsv3.md
deleted file mode 100644
index 21811f5..0000000
--- a/content/datasets/botsv3.md
+++ /dev/null
@@ -1,33 +0,0 @@
----
-title: BOTSv3 [UNLISTED ENTRY]
----
-
-- [Overview](#overview)
-- [Environment](#environment)
-- [Activity](#activity)
-- [Contained Data](#contained-data)
-- [Links](#links)
-
-### Overview
-
-The Boss of the SOC (BOTS) dataset is associated with Splunk's BOTS competition.
-Splunk is a software platform that allows users to search, monitor, and analyze machine-generated big data, via a
-web-style interface.
-The dataset is provided for participants of the competition to answer questions and solve challenges.
-
-### Environment
-
-A description of the environment the dataset originated from, including networks, operating systems, running services,
-etc.
-
-### Activity
-
-What kind of activity, benign and malicious, was performed during the period of data collection.
-
-### Contained Data
-
-What kind of data was collected and how it is present in the dataset, including any processing and labeling.
-
-### Links
-
-[1] [List of included data sourcetypes](https://github.com/splunk/botsv2#data-sourcetypes-included)
\ No newline at end of file

From 9b2ce0e06e4031005f8b89a79c24be8143140993 Mon Sep 17 00:00:00 2001
From: Maspital
Date: Tue, 4 Jun 2024 09:10:10 +0200
Subject: [PATCH 2/4] Update blog post tags

---
 _posts/2024-01-23-initial-post.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/_posts/2024-01-23-initial-post.md b/_posts/2024-01-23-initial-post.md
index 49d4ed4..c5a618d 100644
--- a/_posts/2024-01-23-initial-post.md
+++ b/_posts/2024-01-23-initial-post.md
@@ -4,7 +4,7 @@ title: First release of Intrusion Detection Datasets
 subtitle: 43 datasets described in detail, with more to come!
 gh-repo: fkie-cad/intrusion-detection-datasets
 gh-badge: [star, fork, follow]
-tags: [datasets, webpage]
+tags: [datasets, features]
 comments: true
 author: Philipp Bönninghausen
 ---

From 3ff56839a404e55e9775d4e56b88ac296f6ba4bc Mon Sep 17 00:00:00 2001
From: Maspital
Date: Tue, 4 Jun 2024 09:42:39 +0200
Subject: [PATCH 3/4] Add blog post for v1.4.0

---
 _posts/2024-06-04-version-1-5.md | 47 ++++++++++++++++++++++++++++++++
 1 file changed, 47 insertions(+)
 create mode 100644 _posts/2024-06-04-version-1-5.md

diff --git a/_posts/2024-06-04-version-1-5.md b/_posts/2024-06-04-version-1-5.md
new file mode 100644
index 0000000..d237de7
--- /dev/null
+++ b/_posts/2024-06-04-version-1-5.md
@@ -0,0 +1,47 @@
+---
+layout: post
+title: News in v1.4.0
+subtitle: Plots, automated asset generation, three new dataset entries
+gh-repo: fkie-cad/intrusion-detection-datasets
+gh-badge: [star, fork, follow]
+tags: [dataset, features, HERE]
+comments: true
+author: Philipp Bönninghausen
+---
+
+TL;DR:
+- New "Statistics" subpage with plots
+- Assets generation is now part of automated deployment
+- Main dataset table is now ordered by year instead of alphabetically
+- Renamed "Ground Truth" to "Indirect Labeling" for clarity
+- Three new datasets: UNIBS, ISOT Botnet, UWF-ZeekData22
+- Added related work
+- Updated/improved some entries
+
+This update adds a new *Statistics* subpage, accessible via the navbar or [this link](/intrusion-detection-datasets/content/statistics).
+There, you can find plots visualizing various aspects of the surveyed datasets, along with detailed explanations.
+Plots are automatically generated from the CSV file added in v1.3.0.
+
+Speaking of "automatically", asset generation - meaning CSV data and plots - is now *actually* automated.
+While these files were already generated by scripts, they had to be executed manually.
+Now, this process is integrated into the deployment of the website itself, ensuring that all datasets are actually included in the generated files (removing the human element of potentially forgetting to do that).
+
+The main table itself has been updated in two ways:
+First, it is now ordered by year of creation as opposed to alphabetically.
+We feel like this makes more sense, as datasets do deprecate over time, which does not fit with the rigid structure imposed by the latter method.
+The new method also makes it easy to recognize any newly released datasets.
+Secondly, the three-class label for "Labeled?" has been changed from [Labeled, Ground Truth, No Labels] to [Direct, Indirect, No Labels] along with updated descriptions.
+The original naming was unclear, since labels itself are also a form of ground truth.
+
+New dataset entries:
+- [ISOT Botnet](/intrusion-detection-datasets/content/datasets/isot_botnet)
+- [UNIBS](/intrusion-detection-datasets/content/datasets/unibs)
+- [UWF-ZeekData22](/intrusion-detection-datasets/content/datasets/uwf_zeekdata22)
+
+Added related work:
+- [Kenyon et al.: Are public intrusion datasets fit for purpose characterising the state of the art in intrusion event datasets (2020)](/intrusion-detection-datasets/content/related_work/#are-public-intrusion-datasets-fit-for-purpose-characterising-the-state-of-the-art-in-intrusion-event-datasets-2020)
+- [Yang et al.: A systematic literature review of methods and datasets for anomaly-based network intrusion detection (2022)](/intrusion-detection-datasets/content/related_work/#a-systematic-literature-review-of-methods-and-datasets-for-anomaly-based-network-intrusion-detection-2022)
+
+Changed entries (major):
+- [All entries]: Normalized description of benign user activity
+- Completely overhauled entry for CSE-CIC-IDS2018
\ No newline at end of file

From f5bb1e3128ac8805a5dcf7af3fa0cb1585611c6b Mon Sep 17 00:00:00 2001
From: Maspital
Date: Wed, 5 Jun 2024 11:03:34 +0200
Subject: [PATCH 4/4] Fix minor errors

---
 _posts/{2024-06-04-version-1-5.md => 2024-06-04-version-1-4.md} | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
 rename _posts/{2024-06-04-version-1-5.md => 2024-06-04-version-1-4.md} (98%)

diff --git a/_posts/2024-06-04-version-1-5.md b/_posts/2024-06-04-version-1-4.md
similarity index 98%
rename from _posts/2024-06-04-version-1-5.md
rename to _posts/2024-06-04-version-1-4.md
index d237de7..1868217 100644
--- a/_posts/2024-06-04-version-1-5.md
+++ b/_posts/2024-06-04-version-1-4.md
@@ -4,7 +4,7 @@ title: News in v1.4.0
 subtitle: Plots, automated asset generation, three new dataset entries
 gh-repo: fkie-cad/intrusion-detection-datasets
 gh-badge: [star, fork, follow]
-tags: [dataset, features, HERE]
+tags: [dataset, features]
 comments: true
 author: Philipp Bönninghausen
 ---