Commit

Merge branch 'main' into issue-13-add-datasets
Maspital committed Apr 3, 2024
2 parents 83dbd06 + 8d85d16 commit 18cf960
Showing 4 changed files with 31 additions and 22 deletions.
4 changes: 2 additions & 2 deletions README.md
@@ -18,6 +18,6 @@ We intentionally omit datasets from very different environments such as industri
Any kind of contribution is most welcome, both in the form of adding new entries and improving existing ones!
For more information, please refer to the [Contribution Guide](https://fkie-cad.github.io/intrusion-detection-datasets/content/contributing/).

## Acknowledgments
## Further Information

The website was made using [Beautiful Jekyll](https://beautifuljekyll.com/).
For more information, please see the [About page](https://fkie-cad.github.io/intrusion-detection-datasets/content/about/).
17 changes: 16 additions & 1 deletion content/about.md
@@ -26,6 +26,21 @@ Additional information includes:
- Related published papers
- Related links (homepages, download sources, documentation, etc.)

### Credits
### Citing this Work

If you would like to cite this overview in your (academic) work, we recommend citing the exact release that the cited information refers to, e.g.:
<!-- {% raw %} -->
```
@misc{idd100,
author = {{Intrusion Detection Datasets} contributors},
title = {{Intrusion Detection Datasets v1.0.0 -- GitHub}},
year = {2024},
howpublished = {\url{https://github.com/fkie-cad/intrusion-detection-datasets/releases/tag/v1.0.0}},
note = {[Online; accessed DD-MMM-YYYY]},
}
```
<!-- {% endraw %} -->

### Acknowledgments

The webpage was made using [Beautiful Jekyll](https://beautifuljekyll.com/).
26 changes: 8 additions & 18 deletions content/datasets/nsl_kdd_dataset.md
@@ -28,7 +28,7 @@ title: NSL-KDD
| | |
| **Packed Size** | 6 MB |
| **Unpacked Size** | 19 MB |
| **Download Link** | [goto](http://205.174.165.80/CICDataset/NSL-KDD/Dataset/NSL-KDD.zip) |
| **Download Link** | [goto](https://github.com/HoaNP/NSL-KDD-DataSet) |

***

@@ -42,28 +42,15 @@ all.

### Environment

The simulated Air Force base consists of a small number of hosts, leveraging "custom software" to appear as if they were
1000s of hosts with different IP addresses.
Refer to the underlying [DARPA'98 Intrusion Detection Program](darpa98.md).

### Activity

Within the network, automated users perform an array of tasks such as sending mails, browsing, or using services like
FTP, telnet or SNMP.
The total duration of this simulation was nine weeks.
Any protective devices such as firewalls are omitted, as "the focus was on detecting attacks, and not preventing
attacks".
All attacks are performed from the outside of this network, and a sniffer is located at the entry point of the network
to capture this traffic.
Attacks belong to one of four categories:

- DoS
- Remote to Local
- User to Root
- Surveillance/Probing
Refer to the underlying [DARPA'98 Intrusion Detection Program](darpa98.md).

### Contained Data

The original version contained a large number of redundant/duplicate records, which was problematic for two reasons:
The original version - the KDD Cup 1999 dataset - contained a large number of redundant/duplicate records, which was problematic for two reasons:

- In the training set, it caused classifiers to be biased towards those more frequent records
- In the test set, it caused evaluation to be biased towards learners having better detection rates on artificially
@@ -78,6 +65,8 @@ group is inversely proportional to the number of records in the original dataset
evaluation.
The difficulty is available as a new feature of each event (the last one).
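
As a minimal sketch of how the trailing difficulty feature could be separated from the rest of a record, the snippet below splits off the last field of each row. It assumes the records are stored as comma-separated text lines and that the difficulty is an integer; the sample rows are made up and heavily shortened for illustration, and `split_difficulty` is a hypothetical helper, not part of the dataset or its tooling.

```python
import csv
import io


def split_difficulty(row):
    """Split one record into (all other fields, difficulty).

    Assumption: the difficulty is the last field and is an integer.
    """
    *rest, difficulty = row
    return rest, int(difficulty)


# Made-up, shortened records for illustration only;
# real records contain many more fields.
sample = io.StringIO(
    "0,tcp,http,SF,normal,21\n"
    "0,udp,private,SF,attack,15\n"
)

for row in csv.reader(sample):
    rest, difficulty = split_difficulty(row)
    # The remaining fields can be fed to a learner, while the
    # difficulty can be used to group or filter records.
    print(rest, difficulty)
```

Keeping the difficulty apart from the other fields avoids accidentally training on it as if it were an ordinary feature.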

Note that the original download source is no longer accessible; however, an unofficial copy is available via an individual's GitHub repository.

### Papers

- [A detailed analysis of the KDD CUP 99 data set (2009)](https://doi.org/10.1109/cisda.2009.5356528)
@@ -86,11 +75,12 @@ The difficulty is available as a new feature of each event (the last one).
### Links

- [Homepage](https://www.unb.ca/cic/datasets/nsl.html)
- [Download Page](http://205.174.165.80/CICDataset/NSL-KDD/Dataset/)
- [Unofficial Download Source](https://github.com/HoaNP/NSL-KDD-DataSet)

### Related Entries

- [KDD Cup 1999](kdd_cup_1999.md)
- [DARPA'98 Intrusion Detection Program](darpa98.md)

### Data Examples

6 changes: 5 additions & 1 deletion docs/new_entry_template.md
@@ -52,6 +52,10 @@ What kind of data was collected and how it is present in the dataset, including
### Data Examples
Snippet from the dataset, ideally one for each data type.
Note that multi-word annotations (like `json lines`) will not render properly on GitHub Pages.
Wrapping these snippets with `raw`/`endraw` is not strictly required, but prevents Liquid from parsing anything it shouldn't.

<!-- {% raw %} -->
```
data example
```
```
<!-- {% endraw %} -->
