LEMMA-RCA Datasets


This work is licensed under a Creative Commons Attribution-NoDerivatives 4.0 International License.
Datasets

Due to the anonymity requirement of the review process, we are temporarily withholding access to the Hugging Face repository where the datasets were originally hosted. In the meantime, we provide pre-processed data through anonymous Google Drive links. If you have any questions regarding the data, feel free to email us at lemma-rca@gmail.com.

| Dataset | Domain | Modality | Original Size | # Faults | Avg. # Entities per Fault | Download Link |
|---|---|---|---|---|---|---|
| Product Review | Microservice | Multiple | 765 GB | 4 | 216 | [Raw Data][Preprocessed Data] |
| Cloud Computing | Microservice | Multiple | 540 GB | 6 | 168 | [Raw Data][Preprocessed Data] |
| SWaT | Water Treatment | Single | 4.47 GB | 16 | 51 | [Raw Data] |
| WADI | Water Treatment | Single | 5.67 GB | 9 | 123 | [Raw Data] |
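
The figures in the table above can be captured in a small script, for example to estimate how much disk space the raw data needs before downloading. The following is only a minimal sketch: the `DatasetInfo` class, the `CATALOG` list, and the helper functions are hypothetical names introduced here for illustration and are not part of the LEMMA-RCA code, and the "Original Size" column is assumed to be in gigabytes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetInfo:
    """One row of the dataset table (hypothetical helper, not part of LEMMA-RCA)."""
    name: str
    domain: str
    modality: str
    size_gb: float               # "Original Size" column, assumed to be gigabytes
    num_faults: int
    avg_entities_per_fault: int


# Values copied from the table above.
CATALOG = [
    DatasetInfo("Product Review", "Microservice", "Multiple", 765.0, 4, 216),
    DatasetInfo("Cloud Computing", "Microservice", "Multiple", 540.0, 6, 168),
    DatasetInfo("SWaT", "Water Treatment", "Single", 4.47, 16, 51),
    DatasetInfo("WADI", "Water Treatment", "Single", 5.67, 9, 123),
]


def total_size_gb(catalog):
    """Sum the raw-data sizes of all datasets in the catalog."""
    return sum(d.size_gb for d in catalog)


def by_domain(catalog, domain):
    """Return the datasets that belong to the given domain."""
    return [d for d in catalog if d.domain == domain]


if __name__ == "__main__":
    print(f"Total raw size: {total_size_gb(CATALOG):.2f} GB")
    for d in by_domain(CATALOG, "Water Treatment"):
        print(f"{d.name}: {d.num_faults} faults, {d.size_gb} GB")
```

Filtering by domain first is convenient when only the smaller water treatment datasets are needed, since the two microservice datasets account for almost all of the total size.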

For detailed descriptions of the system architecture, fault types, feature extraction, etc., please see our paper and the GitHub repo.

About SWaT and WADI

The original SWaT and WADI data can be downloaded from the iTrust website. We provide step-by-step guidance, with code, on how to transform the original data to match the settings of the root cause analysis tasks; it is available on GitHub.