LEMMA-RCA Datasets
This work is licensed under a Creative Commons Attribution-NoDerivatives 4.0 International License.
Datasets
To preserve anonymity during the review process, we have temporarily withheld access to the Hugging Face repository where the datasets were originally hosted. In the meantime, we provide the preprocessed data through anonymous Google Drive links. If you have any questions regarding the data, feel free to email us at lemma-rca@gmail.com.
Dataset | Domain | Modality | Original Size | # Faults | Avg. # Entities per Fault | Download Link |
---|---|---|---|---|---|---|
Product Review | Microservice | Multiple | 765G | 4 | 216 | [Raw Data][Preprocessed Data] |
Cloud Computing | Microservice | Multiple | 540G | 6 | 168 | [Raw Data][Preprocessed Data] |
SWaT | Water Treatment | Single | 4.47G | 16 | 51 | [Raw Data] |
WADI | Water Treatment | Single | 5.67G | 9 | 123 | [Raw Data] |
For detailed descriptions of the system architecture, fault types, feature extraction, etc., please check our paper and the GitHub repo.
About SWaT and WADI
The original SWaT and WADI data can be downloaded from the iTrust website. We provide step-by-step guidance, with code, on how to transform the original data to match the settings of the root cause analysis tasks; it is available on GitHub.