This is the replication repository for the paper *SoK: LLM-based Log Parsing* (arXiv). For this systematization of knowledge (SoK), 30 papers concerning LLM-based log parsing were reviewed. The extracted features of each work can be found in the Excel sheet categories.xlsx. The general process of LLM-based log parsing, derived from the reviewed papers, can be depicted as follows:
Note: Most approaches do not provide a license. Especially for preprint papers, the license might change or, if absent, be added in the future (status: 7 March 2025).
Approach | License | Preprint? |
---|---|---|
Cui et al. (LogEval) | N/A | ❌ |
Ji et al. (SuperLog) | Apache License Version 2.0, January 2004 | ❌ |
Jiang et al. (LILAC) | N/A | |
Liu et al. (LogPrompt) | N/A | |
Ma et al. (LLMParser) | N/A | |
Ma et al. (OpenLogParser) | N/A | ❌ |
Mehrabi et al. | N/A | |
Pei et al. (SelfLog) | N/A | |
Sun et al. (Semirald) | N/A | ❌ |
Vaarandi et al. (LLM-TD) | GNU General Public License version 2 | ❌ |
Xiao et al. (LogBatcher) | MIT License 2024 | |
Xu et al. (DivLog) | Apache License Version 2.0, January 2004 | |
Yu et al. (LogGenius) | N/A | |
Zhang et al. (Lemur) | Apache License Version 2.0, January 2004 | ❌ |
Before you can run the parsers and the evaluation, you need to execute the setup script:

```bash
./setup.sh
```
Copy your API keys for OpenAI and TogetherAI into the corresponding files in keys/. CodeLlama was run via Ollama.
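A minimal sketch of placing the keys, assuming the key files are plain text; the file names below are hypothetical, so check keys/ for the names the scripts actually expect:

```python
from pathlib import Path

# Hypothetical file names -- verify the expected names in keys/ after running setup.sh.
Path("keys/openai.txt").write_text("sk-...your-OpenAI-key...")
Path("keys/together.txt").write_text("...your-TogetherAI-key...")
```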
To run all baseline (non-LLM) parsers on the LogHub-2k datasets, please execute:

```bash
python3 parser_run-no-LLM.py
```
To run all LLM-based parsers on the LogHub-2k datasets, please execute:

```bash
python3 parser_run.py
```
To run all parsers on the LogHub-2.0 datasets, please execute:

```bash
./download.sh
python3 parser_run-full.py
```
In each script, you can adjust which parsers, datasets, LLM, etc. are used.
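For illustration, a configuration block of this kind might sit near the top of the run scripts; the variable names and values below are hypothetical and may differ from the actual code:

```python
# Hypothetical configuration -- the real variable names in parser_run.py may differ.
PARSERS = ["LILAC", "LogBatcher", "DivLog"]   # LLM-based parsers to run
DATASETS = ["Apache", "HDFS", "Zookeeper"]    # LogHub-2k datasets to parse
MODEL = "gpt-3.5-turbo"                       # LLM backend used by the parsers
```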
The results can be found in zip files in the folders output/ and output-full/.
If you do not want to rerun all the parsers, you can unzip the existing output archives by executing:

```bash
unzip output.zip
unzip output-full.zip
```
The zip archives also contain the result files of the evaluation.
All plots are given in the notebook run_evaluation.ipynb and are produced from the CSV files within the output folders. To reproduce them, simply rerun the code in the notebook. If you do not want to re-evaluate (it takes some time), comment out all evaluate() calls to reuse the existing results.
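As a rough sketch of that pattern (the function and file names are illustrative, not the notebook's actual ones), assuming pandas and matplotlib are installed:

```python
import pandas as pd

# Slow step: recomputes the evaluation metrics. Comment out to reuse existing results.
# evaluate("output/")
# evaluate("output-full/")

# The plotting cells then read the existing CSV files from the output folders.
df = pd.read_csv("output/results.csv")  # hypothetical file name
df.plot(kind="bar")
```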
To evaluate everything and produce the result files and the plots, you can also run:

```bash
python3 run_evaluation.py
```