HALO: Hierarchy-aware Fault Localization for Cloud Systems
I am not the author of the paper. All of the code is based on the content of the paper. If you have any questions, you are welcome
to discuss them.
The paper is divided into two main parts:
- The first part uses the inclusion relationships among dimension attributes to build a DAG by computing entropy and conditional entropy, then generates search paths by random walk.
- The second part defines an exception score and a root cause score, traverses the dimension path from top to bottom, and obtains the root-cause dimension combination.
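As a rough illustration of the entropy step, the sketch below computes Shannon entropy and conditional entropy over two attribute columns. The attribute names (datacenter, node), the toy records, and the helper names are hypothetical and are not taken from the paper or from this repository. The idea is that a conditional entropy of 0 means one attribute fully determines the other, which is one way an inclusion relationship can be detected.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a list of hashable values."""
    counts = Counter(values)
    total = len(values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def conditional_entropy(target, given):
    """H(target | given), computed as H(target, given) - H(given)."""
    return entropy(list(zip(target, given))) - entropy(given)


# Hypothetical toy records: each node belongs to exactly one datacenter.
datacenter = ["dc1", "dc1", "dc1", "dc2", "dc2", "dc2"]
node = ["n1", "n1", "n2", "n3", "n4", "n4"]

# Knowing the node fixes the datacenter, so this is (numerically) zero.
h_dc_given_node = conditional_entropy(datacenter, node)
# Knowing the datacenter does not fix the node, so this is positive.
h_node_given_dc = conditional_entropy(node, datacenter)

print(h_dc_given_node, h_node_given_dc)
```

In this toy data the asymmetry suggests an edge from the coarser attribute (datacenter) to the finer one (node). How the paper thresholds these values and orders layers may differ from this sketch.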
The code has been tested on Python 3.8 and 3.9. Apart from its use of [typing and dataclasses], it should also work on earlier versions.
Most of the code follows the paper, except for the following:
- This may be a misunderstanding on my part, but I think the denominator of one formula in the paper can be 0 (when the layer has only one attribute), so I added 1 to it.
- By default, only one path is sampled for the search.
- [Reverse Truncation] is not implemented, because I think as much fault information as possible should be reported; it is not necessarily redundant.
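To make the path sampling concrete, here is a minimal sketch of a random walk over a DAG that samples one path by default. The graph, the attribute names, and the function names are hypothetical. The uniform choice among children is an assumption, and the transition probabilities used in the paper and in this repository may differ.

```python
import random


def sample_path(dag, start, rng):
    """Walk from start, picking a random child until a node with no children is reached."""
    path = [start]
    while dag.get(path[-1]):
        path.append(rng.choice(dag[path[-1]]))
    return path


def sample_paths(dag, start, n_paths=1, seed=None):
    """Sample n_paths search paths; the default of one path mirrors the note above."""
    rng = random.Random(seed)
    return [sample_path(dag, start, rng) for _ in range(n_paths)]


# Hypothetical attribute hierarchy: coarse attributes point to finer ones.
dag = {
    "Datacenter": ["Cluster", "API"],
    "Cluster": ["Node"],
    "API": [],
    "Node": [],
}

paths = sample_paths(dag, "Datacenter")
print(paths)
```

Sampling a single path keeps the search cheap. Passing a larger n_paths would explore more attribute orderings at the cost of more work.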
- Install the dependencies: cd HALO/ && pip install -r requirements.txt
- Run the test case: cd HALO/ && python3 main.py
Because the entropy relationships in the sample data provided in the paper are inconsistent with the AHG, several values of API Attr have been modified in the test data to increase its entropy. There are still some small differences between the final results and those reported in the paper.