Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Add more dev tools and update readme (#27)
This PR adds:
1. A gRPC code generation script (`make gen`)
2. A cleaner script (`make clean`)
3. A sample data downloader (`make get-dataset`)
4. A major update of README.md after OSPP and GSOC submission.
5. Log-analysis architecture/design diagrams.

Signed-off-by: Superskyyy <[email protected]>
1 parent eae743d, commit 8402746
Showing 14 changed files with 466 additions and 60 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,80 +1,88 @@ | ||
# SkyWalking AIOps Engine | ||
**An AIOps Engine for Observability.** | ||
|
||
A usable open-source AIOps framework for the domain of cloud computing observability. | ||
*A practical open-source AIOps engine for the | ||
era of cloud computing.* | ||
|
||
### Why this project matters? | ||
We could answer this from the following progressive questions: | ||
1. Are there existing algorithms for telemetry data? | ||
## Why do we build this project? | ||
|
||
**We strongly believe that this project will bring value | ||
to AIOps practitioners and researchers.** | ||
<details> | ||
<summary>Towards better Observability</summary> | ||
We can reason about this through the following progressive questions: | ||
|
||
1. Are there existing algorithms for telemetry data? | ||
- **Abundant.** | ||
|
||
2. Are the existing algorithms empirically verified? | ||
|
||
- **Most proposed algorithms are not empirically verified** | ||
|
||
3. Are there AIOps tools that embed machine learning algorithms? | ||
2. Are the existing algorithms empirically verified? | ||
|
||
- **Most algorithms are not verified in production** | ||
|
||
|
||
3. Are there practical AIOps frameworks? | ||
- **Limited, often out of maintenance or commercialized.** | ||
|
||
4. Are there open-source AIOps solutions that integrate with popular backends? | ||
|
||
|
||
4. Are there open-source AIOps solutions that offer out-of-the-box integrations? | ||
- **Hardly any.** | ||
|
||
|
||
5. Why would I need that? | ||
1. For developers & organizations curious about AIOps: | ||
- a. Just install and start using it, saves budget, saves head-scratching. | ||
- a. Just install and start using it, saves budget, prevents head-scratching. | ||
- b. Treat this project as a good (or bad) reference for your own AIOps pipeline. | ||
2. For researchers in the AIOps domain: | ||
- a. For software engineering researchers - sample for AIOps evolution and empirical study. | ||
- b. For algorithm researchers - playground for new algorithms, solid case studies. | ||
|
||
|
||
The above is where we place the value of this project, though our current aim is to become the official AIOps engine | ||
of [Apache SkyWalking](https://github.com/apache/skywalking), each component could be easily swapped given its | ||
pluggable design. | ||
</details> | ||
|
||
|
||
Expand the section above to see where we place the value of this project. | ||
Although our current aim is to become the official AIOps engine | ||
of [Apache SkyWalking](https://github.com/apache/skywalking), | ||
each component can easily be swapped, extended and scaled to fit your own needs. | ||
|
||
### Current Goal | ||
|
||
At the current stage, it serves as an **anomaly detection** engine. In the future, we will also explore root cause analysis and | ||
automatic problem recovery. | ||
At the current stage, it targets Logs and Metrics analysis. | ||
In the future, we will also explore root cause analysis and | ||
automatic problem recovery based on Traces. | ||
|
||
This is also the tentative repository for OSPP 2022 and GSOC 2022 student project outcomes. | ||
This is also the repository for | ||
OSPP 2022 and GSOC 2022 student research outcomes. | ||
|
||
Project `Exploration of Advanced Metrics Anomaly Detection & Alerts with Machine Learning in Apache SkyWalking` | ||
1. `Exploration of Advanced Metrics Anomaly Detection & Alerts with Machine Learning in Apache SkyWalking` | ||
|
||
Project `Log Outlier Detection in Apache SkyWalking` | ||
2. `Log Outlier Detection in Apache SkyWalking` | ||
|
||
### Architecture | ||
|
||
**TBA** | ||
**Log Clustering and Log Trend Analysis** | ||
|
||
**Data pulling:** | ||
 | ||
|
||
The current data pulling and retention rely on a common set of ingestion methods, with a | ||
first focus on SkyWalking OAP GraphQL and static file loader. We maintain a local storage for processed data. | ||
 | ||
|
||
**Alert component:** | ||
**Metric Anomaly Detection and Visualizations** | ||
|
||
An anomaly does not directly trigger an alert; it first | ||
goes through a tolerance mechanism. | ||
TBD - Soon to be added | ||
|
||
### Roadmap | ||
|
||
Phase 0 (current) | ||
1. [ ] Implement essential development infrastructure. | ||
2. [ ] Implement naive algorithms as baseline & pipeline POC (on existing datasets). | ||
3. [ ] Implement a SkyWalking `GraphQLDataLoaderProvider` to test data pulling. | ||
|
||
Phase 1 (summer -> fall 2022, OSPP & GSOC period) | ||
1. [ ] Implement the remaining core default providers. | ||
2. [ ] **Research and implement algorithms with OSPP & GSOC students.** | ||
3. [ ] Integrate with Apache Airflow for orchestration. | ||
5. [ ] Evaluation based on benchmark microservices systems (anomaly injection). | ||
6. [ ] MVP ready without UI-side changes. | ||
|
||
Phase 2 (fall -> end of 2022) | ||
1. [ ] Join as an Apache SkyWalking subproject. | ||
2. [ ] Integrate with SkyWalking Backend & rule-based alert module. | ||
3. [ ] Propose and request SkyWalking UI-side changes. | ||
4. [ ] First release for end-user testing. | ||
|
||
Phase Next | ||
For the details of our progress, please refer to our project dashboard | ||
[Here](https://github.com/SkyAPM/aiops-engine-for-skywalking/projects?query=is%3Aopen). | ||
|
||
Phase Current (fall -> end of 2022) | ||
|
||
0. [ ] Finish POC stage and start implementing dashboards for first stage users. (demo purposes) | ||
1. [ ] Real-world data testing and chaos engineering benchmark experiments. | ||
2. [ ] Join Apache Software Foundation as an Apache SkyWalking subproject. | ||
3. [ ] Integrate with SkyWalking Backend (Export analytics results to SkyWalking) | ||
4. [ ] Propose and request SkyWalking UI-side changes. | ||
5. [ ] First release for SkyWalking end-user testing. | ||
|
||
Phase Next | ||
|
||
1. [ ] Towards production-ready. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,33 @@ | ||
# Get a range of common datasets for testing and development | ||
|
||
At the root of the project, run `make get-dataset name=<name>` to get them; | ||
the datasets will be extracted to the `assets/datasets` folder. | ||
|
||
Use the following `names` to download the batch of datasets you need: | ||
|
||
1. `gaia`: the [GAIA](https://github.com/CloudWise-OpenSource/GAIA-DataSet) dataset. | ||
- 4+ GB with log, trace and metric data. | ||
2. `log_s`: small [LogHub](https://github.com/logpai/loghub) datasets. | ||
1. SSH.tar.gz: (Server) | ||
2. Hadoop.tar.gz: (Distributed system) | ||
3. Apache.tar.gz: (Server) | ||
4. HealthApp.tar.gz: (Mobile application) | ||
5. Zookeeper.tar.gz: (Distributed system) | ||
6. HPC.tar.gz: (Supercomputer) | ||
3. `log_m`: medium [LogHub](https://github.com/logpai/loghub) datasets. | ||
1. Android.tar.gz = 1,555,005 logs (183MB Mobile system) | ||
2. BGL.tar.gz = 4,747,963 logs (700MB Supercomputer) | ||
3. Spark.tar.gz = 33,236,604 logs (2.7GB Distributed system) | ||
4. `log_l`: large [LogHub](https://github.com/logpai/loghub) datasets. | ||
1. HDFS_2.tar.gz = 71,118,073 logs (16GB Distributed system) | ||
2. Thunderbird.tar.gz = 211,212,192 logs (30GB Supercomputer) | ||
|
||
**Note: large datasets require substantial disk space and memory to extract.** | ||
|
||
## To keep or remove the datasets/zip/tar files | ||
|
||
If you want to keep all zip/tar files after extracting, pass an additional `save=True`, | ||
for example `make get-dataset name=log_m save=True`. | ||
|
||
If you want to remove all datasets, run `make prune-dataset`. | ||
|
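As a rough illustration of the extract step described above, the sketch below unpacks a `.tar.gz` archive into a target folder and deletes the archive unless asked to keep it. The function name `extract_dataset`, its parameters, and the `keep_archive` flag (meant to mirror `save=True`) are assumptions made for this example; they are not taken from `get_data.py`.

```python
import tarfile
from pathlib import Path


def extract_dataset(archive: Path, target_dir: Path, keep_archive: bool = False) -> list:
    """Extract a .tar.gz archive into target_dir and return the member names.

    Hypothetical helper for illustration, not the project's implementation.
    """
    target_dir.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive, mode='r:gz') as tar:
        names = tar.getnames()
        # Only extract archives from sources you trust
        tar.extractall(path=target_dir)
    if not keep_archive:
        # Mirrors the default behaviour described above: drop the archive after extraction
        archive.unlink()
    return names
```

Calling `extract_dataset(Path('Apache.tar.gz'), Path('assets/datasets'), keep_archive=True)` would leave the archive in place, which is the behaviour one would expect from `save=True`.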
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,11 @@ | ||
# Convenient developer tools | ||
|
||
Tools in this folder should only be run via the `make <target>` command. | ||
|
||
1. grpc_gen.py => generates gRPC code from proto files at any depth. | ||
2. get_data.py => downloads and extracts some sample datasets from the web. | ||
3. cleaner.py => cleans up things like pycache and local installation manifests. | ||
|
||
## Future | ||
|
||
The above tools will be replaced with Poetry-based scripts (a dev CLI) in the future. |
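To make "proto files at any depth" concrete, here is a minimal sketch that walks a folder recursively and builds one `python -m grpc_tools.protoc` command per `.proto` file. The helper name `build_protoc_commands` and the folder layout are assumptions for illustration; this is not the content of `grpc_gen.py`, and it only builds the commands rather than running them.

```python
import sys
from pathlib import Path


def build_protoc_commands(proto_root: Path, out_dir: Path) -> list:
    """Build one protoc command per .proto file found at any depth under proto_root.

    Hypothetical helper for illustration, not the project's grpc_gen.py.
    """
    commands = []
    # rglob descends into every subfolder, so nesting depth does not matter
    for proto in sorted(proto_root.rglob('*.proto')):
        commands.append([
            sys.executable, '-m', 'grpc_tools.protoc',
            f'-I{proto_root}',
            f'--python_out={out_dir}',
            f'--grpc_python_out={out_dir}',
            str(proto),
        ])
    return commands
```

Each returned command could then be passed to `subprocess.run(cmd, check=True)`, provided the `grpcio-tools` package is installed and the output folder exists.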
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,13 @@ | ||
# Copyright 2022 SkyAPM org | ||
# | ||
# Licensed under the Apache License, Version 2.0 (the "License"); | ||
# you may not use this file except in compliance with the License. | ||
# You may obtain a copy of the License at | ||
# | ||
# http://www.apache.org/licenses/LICENSE-2.0 | ||
# | ||
# Unless required by applicable law or agreed to in writing, software | ||
# distributed under the License is distributed on an "AS IS" BASIS, | ||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
# See the License for the specific language governing permissions and | ||
# limitations under the License. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,36 @@ | ||
# Copyright 2022 SkyAPM org | ||
# | ||
# Licensed under the Apache License, Version 2.0 (the "License"); | ||
# you may not use this file except in compliance with the License. | ||
# You may obtain a copy of the License at | ||
# | ||
# http://www.apache.org/licenses/LICENSE-2.0 | ||
# | ||
# Unless required by applicable law or agreed to in writing, software | ||
# distributed under the License is distributed on an "AS IS" BASIS, | ||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
# See the License for the specific language governing permissions and | ||
# limitations under the License. | ||
import os | ||
import shutil | ||
|
||
|
||
def find_and_clean(folders_to_remove: list, root='.') -> None: | ||
    """ | ||
    Find and remove all folders whose name contains an entry of the given list | ||
    :param folders_to_remove: list of directory name patterns to remove | ||
    :param root: from which folder to start searching, default current | ||
    :return: | ||
    """ | ||
    exclude: set = {'.venv'} | ||
    for path, dirs, _ in os.walk(root): | ||
        dirs[:] = [d for d in dirs if d not in exclude] | ||
        # Iterate over a copy so matched folders can be pruned from the walk | ||
        for d in list(dirs): | ||
            if any(folder in d for folder in folders_to_remove): | ||
                # Remove the folder that actually matched, e.g. `foo.egg-info` | ||
                shutil.rmtree(removed := os.path.join(path, d)) | ||
                dirs.remove(d) | ||
                print(f'Removed {removed}') | ||
|
||
|
||
if __name__ == '__main__': | ||
    find_and_clean(folders_to_remove=['__pycache__', 'generated', 'build', 'dist', 'egg-info', 'pytest_cache', '.pyc'], | ||
                   root='.') |
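Because a cleaner like this deletes folders recursively, it can help to preview what would be removed first. The following is a minimal dry-run sketch of the same walk-and-prune idea; the name `find_matches` and its parameters are assumptions for illustration and are not part of the commit.

```python
import os


def find_matches(patterns: list, root: str = '.', exclude: frozenset = frozenset({'.venv'})) -> list:
    """Return folders under root whose name contains any pattern, without deleting them.

    Hypothetical dry-run helper for illustration, not part of cleaner.py.
    """
    matches = []
    for path, dirs, _ in os.walk(root):
        keep = []
        for d in dirs:
            if d in exclude:
                continue  # never descend into excluded folders
            if any(p in d for p in patterns):
                matches.append(os.path.join(path, d))  # report it and do not descend
            else:
                keep.append(d)
        # In-place assignment tells os.walk which folders to visit next
        dirs[:] = keep
    return sorted(matches)
```

Printing the result of `find_matches(['__pycache__', 'egg-info'])` before running the real cleaner shows whether a substring pattern such as `dist` or `build` would catch folders you want to keep.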