Merge pull request #66 from fkie-cad/blog-posts
Add blog post for v1.4.0
ru37z authored Jun 5, 2024
2 parents be72871 + f5bb1e3 commit d75a5cd
Showing 3 changed files with 48 additions and 34 deletions.
2 changes: 1 addition & 1 deletion _posts/2024-01-23-initial-post.md
@@ -4,7 +4,7 @@ title: First release of Intrusion Detection Datasets
subtitle: 43 datasets described in detail, with more to come!
gh-repo: fkie-cad/intrusion-detection-datasets
gh-badge: [star, fork, follow]
-tags: [datasets, webpage]
+tags: [datasets, features]
comments: true
author: Philipp Bönninghausen
---
47 changes: 47 additions & 0 deletions _posts/2024-06-04-version-1-4.md
@@ -0,0 +1,47 @@
---
layout: post
title: News in v1.4.0
subtitle: Plots, automated asset generation, three new dataset entries
gh-repo: fkie-cad/intrusion-detection-datasets
gh-badge: [star, fork, follow]
tags: [dataset, features]
comments: true
author: Philipp Bönninghausen
---

TL;DR:
- New "Statistics" subpage with plots
- Asset generation is now part of the automated deployment
- Main dataset table is now ordered by year instead of alphabetically
- Renamed "Ground Truth" to "Indirect Labeling" for clarity
- Three new datasets: UNIBS, ISOT Botnet, UWF-ZeekData22
- Added related work
- Updated/improved some entries

This update adds a new *Statistics* subpage, accessible via the navbar or [this link](/intrusion-detection-datasets/content/statistics).
There, you can find plots visualizing various aspects of the surveyed datasets, along with detailed explanations.
Plots are automatically generated from the CSV file added in v1.3.0.
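
As a rough illustration of the kind of aggregation a plot could be built from, the sketch below counts dataset entries per year from CSV text. The column names (`name`, `year`) and the sample rows are hypothetical placeholders; they are not taken from the project's actual CSV file or scripts.

```python
import csv
import io
from collections import Counter

# Hypothetical stand-in for the generated CSV; the real columns may differ.
SAMPLE_CSV = """name,year
Dataset A,2010
Dataset B,2017
Dataset C,2017
"""


def datasets_per_year(csv_text):
    """Count dataset entries per year, e.g. as input for a bar plot."""
    reader = csv.DictReader(io.StringIO(csv_text))
    counts = Counter(int(row["year"]) for row in reader)
    # Sort by year so the result can be plotted left to right.
    return dict(sorted(counts.items()))


counts = datasets_per_year(SAMPLE_CSV)
print(counts)
```

The resulting mapping of year to count could then be handed to any plotting library as part of the build.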

Speaking of "automatically", asset generation - meaning CSV data and plots - is now *actually* automated.
While these files were already generated by scripts, those scripts had to be run manually.
Now, this process is part of the website's deployment, which ensures that all datasets are included in the generated files and removes the risk of someone forgetting to run the scripts.

The main table itself has been updated in two ways:
First, it is now ordered by year of creation instead of alphabetically.
We think this makes more sense: datasets become outdated over time, and an alphabetical ordering does not reflect that.
The new method also makes it easy to recognize any newly released datasets.
Second, the three-class label for "Labeled?" has been changed from [Labeled, Ground Truth, No Labels] to [Direct, Indirect, No Labels], along with updated descriptions.
The original naming was unclear, since labels themselves are also a form of ground truth.
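
The two table changes can be sketched as follows. This is only an illustration under stated assumptions: the rows are hypothetical placeholders, the old-to-new label mapping is inferred from the order of the two lists above, and the sort direction is left as a parameter because the post does not say whether the table runs oldest-first or newest-first.

```python
# Old-to-new label names, inferred from the order of the two lists above.
LABEL_RENAMES = {
    "Labeled": "Direct",
    "Ground Truth": "Indirect",
    "No Labels": "No Labels",
}

# Hypothetical rows; names and years are placeholders, not real entries.
rows = [
    {"name": "Dataset B", "year": 2017, "labeled": "Ground Truth"},
    {"name": "Dataset A", "year": 2010, "labeled": "Labeled"},
    {"name": "Dataset C", "year": 2022, "labeled": "No Labels"},
]


def prepare_table(rows, newest_first=False):
    """Rename the label values and order the rows by year."""
    renamed = [{**r, "labeled": LABEL_RENAMES[r["labeled"]]} for r in rows]
    return sorted(renamed, key=lambda r: r["year"], reverse=newest_first)


table = prepare_table(rows)
for row in table:
    print(row["year"], row["name"], row["labeled"])
```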

New dataset entries:
- [ISOT Botnet](/intrusion-detection-datasets/content/datasets/isot_botnet)
- [UNIBS](/intrusion-detection-datasets/content/datasets/unibs)
- [UWF-ZeekData22](/intrusion-detection-datasets/content/datasets/uwf_zeekdata22)

Added related work:
- [Kenyon et al.: Are public intrusion datasets fit for purpose characterising the state of the art in intrusion event datasets (2020)](/intrusion-detection-datasets/content/related_work/#are-public-intrusion-datasets-fit-for-purpose-characterising-the-state-of-the-art-in-intrusion-event-datasets-2020)
- [Yang et al.: A systematic literature review of methods and datasets for anomaly-based network intrusion detection (2022)](/intrusion-detection-datasets/content/related_work/#a-systematic-literature-review-of-methods-and-datasets-for-anomaly-based-network-intrusion-detection-2022)

Changed entries (major):
- [All entries]: Normalized description of benign user activity
- Completely overhauled entry for CSE-CIC-IDS2018
33 changes: 0 additions & 33 deletions content/datasets/botsv3.md

This file was deleted.
