CLP core is the low-level component that performs compression, decompression, and search.
- We have built and tested CLP on the OSes listed below.
- If you have trouble building for another OS, file an issue, and we may be able to help.
- A compiler that supports C++17 (e.g., gcc-8)
- To build, we require some source dependencies, packages from package managers, and libraries built from source.
We use both git submodules and third-party source packages. To download all of them, run this script:
tools/scripts/deps-download/download-all.sh
This will download:
- ANTLR v4 (v4.11.1)
- Catch2 (v2.13.7)
- date (v3.0.1)
- json (v3.10.4)
- SQLite3 (v3.36.0)
- yaml-cpp (v0.7.0)
A handful of packages and libraries are required to build CLP. There are two ways to get them:
- Install them on your machine and build CLP natively
- Build CLP within a prebuilt Docker container that contains the libraries. However, this won't work if you need additional libraries that aren't already in the container.
See the relevant README for your OS:
Want to build natively on an OS not listed here? You can file a feature request.
You can use these commands to start a container in which you can build and run CLP:
# Make sure to change /path/to/clp/components/core and /path/to/my/logs below
docker run --rm -it \
--name 'clp-build-env' \
-u$(id -u):$(id -g) \
-v$(readlink -f /path/to/clp/components/core):/mnt/clp \
-v$(readlink -f /path/to/my/logs):/mnt/logs \
ghcr.io/y-scope/clp/clp-core-dependencies-x86-ubuntu-focal:main \
/bin/bash -l
cd /mnt/clp
Make sure to change /path/to/clp/components/core and /path/to/my/logs to the relevant paths on your machine.
- Configure the cmake project:
mkdir build
cd build
cmake ../
- Build:
make -j
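A bare `make -j` places no limit on the number of parallel jobs, which can exhaust memory on smaller machines. Below is a minimal sketch of a bounded alternative, assuming `nproc` (GNU coreutils) or `getconf` is available. The helper name `build_jobs` is hypothetical and not part of CLP.

```shell
# Hypothetical helper (not part of CLP): print a job count for make.
build_jobs() {
    # Prefer nproc (GNU coreutils); fall back to getconf, then to 1.
    if command -v nproc >/dev/null 2>&1; then
        nproc
    elif getconf _NPROCESSORS_ONLN >/dev/null 2>&1; then
        getconf _NPROCESSORS_ONLN
    else
        echo 1
    fi
}

# Usage, from inside the build directory:
#   make -j"$(build_jobs)"
```

Capping the job count at the number of online processors is a common default; lower it further if the compiler runs out of memory.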
CLP contains two core executables: clp and clg
- clp is used for compressing and extracting logs
- clg is used for performing wildcard searches on the compressed logs
To compress some logs without a schema file:
./clp c archives-dir /home/my/logs
- archives-dir is where compressed logs should be output
  - clp will create a number of files and directories within, so it's best if this directory is empty
  - You can use the same directory repeatedly and clp will add to the compressed logs within.
- /home/my/logs is any log file or directory containing log files
- In this mode, clp will use heuristics to determine which parts of each uncompressed message are variables.
  - The heuristics roughly correspond to the example schema file in config/schemas.txt.
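Since clp adds to whatever is already in the output directory, it can help to check whether that directory is fresh before compressing. The following is a minimal sketch of such a check. The helper name `is_fresh_archives_dir` is hypothetical and not part of CLP, and the sketch does not inspect the contents of existing archives.

```shell
# Hypothetical helper (not part of CLP): succeed if the directory does not
# exist yet, or exists and contains no entries.
is_fresh_archives_dir() {
    [ ! -e "$1" ] && return 0
    [ -d "$1" ] || return 1
    # ls -A lists all entries except . and ..; empty output means an empty dir.
    [ -z "$(ls -A "$1")" ]
}

# Usage (paths are the placeholders from the example above):
#   is_fresh_archives_dir archives-dir \
#       || echo "note: clp will add to the existing contents of archives-dir" >&2
#   ./clp c archives-dir /home/my/logs
```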
To compress with a user-defined schema file:
./clp c --schema-path path-to-schema-file archives-dir /home/my/logs
- path-to-schema-file is the location of a schema file. For more details on schema files, see README-Schema.md.
To decompress those logs:
./clp x archives-dir decompressed
- archives-dir is where the compressed logs were previously stored
- decompressed is a directory where they will be decompressed to
You can also decompress a specific file:
./clp x archives-dir decompressed /my/file/path.log
- /my/file/path.log is the uncompressed file's path (the one that was passed to clp for compression)
More usage instructions can be found by running:
./clp --help
To search the compressed logs:
./clg archives-dir " a *wildcard* search phrase "
- archives-dir is where the compressed logs were previously stored
- For archives compressed without a schema file:
  - The search phrase can contain the * wildcard which matches 0 or more characters, or the ? wildcard which matches any single character.
- For archives compressed using a schema file:
  - The * wildcard may only match non-delimiter characters.
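The search phrase should stay quoted, as in the command above, because the shell may expand an unquoted `*` or `?` into file names before clg ever sees it. The sketch below illustrates the two wildcards with the shell's own `case` pattern matching, which gives `*` and `?` the same meanings described for archives compressed without a schema file. It does not invoke clg or reproduce clg's matching, and the function name `wildcard_match` is hypothetical.

```shell
# Hypothetical illustration only: uses shell pattern matching, not clg.
# Prints "match" if the text matches the pattern, else "no match".
wildcard_match() {
    # $1 = pattern, $2 = text. Leaving $1 unquoted inside `case` makes the
    # shell treat * and ? in it as wildcards.
    case "$2" in
        $1) echo "match" ;;
        *)  echo "no match" ;;
    esac
}

# ?? requires exactly two characters between "job_" and " failed".
wildcard_match "*job_?? failed*" "task job_42 failed with code 1"   # match
wildcard_match "*job_?? failed*" "task job_4 failed with code 1"    # no match
```

Shell patterns also treat `[` and backslash specially, so this sketch is only a faithful illustration for phrases that stick to `*` and `?`.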
Similar to clp, clg can search a single file:
./clg archives-dir " a *wildcard* search phrase " /my/file/path.log
- /my/file/path.log is the uncompressed file's path (the one that was passed to clp for compression)
More usage instructions can be found by running:
./clg --help
If you'd like to convert the dictionaries of an individual archive into a human-readable form, you can use make-dictionaries-readable.
./make-dictionaries-readable archive-path <output dir>
- archive-path is a path to a specific archive (inside archives-dir)
See the make-dictionaries-readable README for details on the output format.
By default, clp uses an embedded SQLite database, so each directory containing archives can only be accessed by a single clp instance.
To enable parallel compression to the same archives directory, clp/clg can be configured to use a MySQL-type database (MariaDB) as follows:
- Install and configure MariaDB using the instructions for your platform
- Create a user that has privileges to create databases, create tables, insert records, and delete records.
- Copy and change config/metadata-db.yml, setting the type to mysql and uncommenting the MySQL parameters.
- Install the MariaDB and PyYAML Python packages:
pip3 install mariadb PyYAML
  - This is necessary to run the database initialization script. If you prefer, you can run the SQL statements in tools/scripts/db/init-db.py directly.
- Run tools/scripts/db/init-db.py with the updated config file. This will initialize the database CLP requires.
- Run clp or clg as before, with the addition of the --db-config-file option pointing at the updated config file.
- To compress in parallel, simply run another instance of clp concurrently.
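Running several clp instances concurrently can be scripted with background jobs. The following is a minimal sketch, not a tested CLP workflow. The wrapper name `compress_in_parallel` and the variables `CLP_BIN` and `DB_CONFIG` are hypothetical, and placing `--db-config-file` right after `c` mirrors the `--schema-path` example and is an assumption.

```shell
# Hypothetical wrapper (not part of CLP): compress each input path with its
# own clp instance, all writing to the same archives directory.
# CLP_BIN and DB_CONFIG are assumed names; override them for your setup.
CLP_BIN="${CLP_BIN:-./clp}"
DB_CONFIG="${DB_CONFIG:-metadata-db.yml}"

compress_in_parallel() {
    cip_archives_dir="$1"
    shift
    cip_pids=""
    for cip_input in "$@"; do
        # Start one clp instance per input in the background.
        "$CLP_BIN" c --db-config-file "$DB_CONFIG" "$cip_archives_dir" "$cip_input" &
        cip_pids="$cip_pids $!"
    done
    # Wait for every instance; report failure if any of them failed.
    cip_status=0
    for cip_pid in $cip_pids; do
        wait "$cip_pid" || cip_status=1
    done
    return "$cip_status"
}

# Usage (paths are placeholders):
#   compress_in_parallel archives-dir /home/my/logs/app1 /home/my/logs/app2
```

Waiting on each process ID separately lets the wrapper return a non-zero status when any single instance fails, which a bare `wait` would not report.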
Note that currently, decompression (clp x) and search (clg) can only be run with a single instance. We are in the process of open-sourcing parallelized versions of these as well.