Add blog post for v1.4.0

fkie-cad · Jun 4, 2024 · 3ff5683
1 parent 9b2ce0e
commit 3ff5683
Showing 1 changed file with 47 additions and 0 deletions.
diff --git a/_posts/2024-06-04-version-1-5.md b/_posts/2024-06-04-version-1-5.md
@@ -0,0 +1,47 @@
+---
+layout: post
+title: News in v1.4.0
+subtitle: Plots, automated asset generation, three new dataset entries
+gh-repo: fkie-cad/intrusion-detection-datasets
+gh-badge: [star, fork, follow]
+tags: [dataset, features, HERE]
+comments: true
+author: Philipp Bönninghausen
+---
+
+TL;DR:
+- New "Statistics" subpage with plots
+- Asset generation is now part of the automated deployment
+- Main dataset table is now ordered by year instead of alphabetically
+- Renamed "Ground Truth" to "Indirect Labeling" for clarity
+- Three new datasets: UNIBS, ISOT Botnet, UWF-ZeekData22
+- Added related work
+- Updated/improved some entries
+
+This update adds a new *Statistics* subpage, accessible via the navbar or [this link](/intrusion-detection-datasets/content/statistics).
+There, you can find plots visualizing various aspects of the surveyed datasets, along with detailed explanations.
+Plots are automatically generated from the CSV file added in v1.3.0.
+
+Speaking of "automatically", asset generation - meaning CSV data and plots - is now *actually* automated.
+These files were already generated by scripts, but the scripts had to be run manually.
+Now, this process is integrated into the deployment of the website itself, which ensures that every dataset is included in the generated files and removes the risk of someone forgetting to regenerate them.
+
+The main table itself has been updated in two ways:
+First, it is now ordered by year of creation as opposed to alphabetically.
+We think this makes more sense: datasets become outdated over time, and a rigid alphabetical order does not reflect that.
+Ordering by year also makes it easy to spot newly released datasets.
+Second, the three-class label for "Labeled?" has been changed from [Labeled, Ground Truth, No Labels] to [Direct, Indirect, No Labels], along with updated descriptions.
+The original naming was unclear, since labels themselves are also a form of ground truth.
+
+New dataset entries:
+- [ISOT Botnet](/intrusion-detection-datasets/content/datasets/isot_botnet)
+- [UNIBS](/intrusion-detection-datasets/content/datasets/unibs)
+- [UWF-ZeekData22](/intrusion-detection-datasets/content/datasets/uwf_zeekdata22)
+
+Added related work:
+- [Kenyon et al.: Are public intrusion datasets fit for purpose characterising the state of the art in intrusion event datasets (2020)](/intrusion-detection-datasets/content/related_work/#are-public-intrusion-datasets-fit-for-purpose-characterising-the-state-of-the-art-in-intrusion-event-datasets-2020)
+- [Yang et al.: A systematic literature review of methods and datasets for anomaly-based network intrusion detection (2022)](/intrusion-detection-datasets/content/related_work/#a-systematic-literature-review-of-methods-and-datasets-for-anomaly-based-network-intrusion-detection-2022)
+
+Changed entries (major):
+- [All entries]: Normalized description of benign user activity
+- Completely overhauled entry for CSE-CIC-IDS2018