KG-SaF: Data and Workflow Documentation
KG-SaF-Data and KG-SaF-JDeX are taken from: Diliso, I., Barile, R., d’Amato, C., & Fanizzi, N. (2025). KG-SaF: Building Complete and Curated Datasets for Machine Learning and Reasoning on Knowledge Graphs (Version 0.0.0.1) [Computer software]. https://doi.org/10.5281/zenodo.17817931
KG-SaF provides a workflow (KG-SaF-JDeX) and curated datasets (KG-SaF-Data) for knowledge graph refinement (KGR) research. Each dataset includes both a schema (ontology) and ground facts, so it can be used directly by machine learning models and by reasoning services.
Key Features
🗂️ Extracts datasets from RDF-based KGs with expressive schemas (RDFS/OWL2)
📦 Provides datasets in OWL and TSV formats, easily loadable in both PyTorch and Protégé
⚡ Handles inconsistencies and leverages reasoning to infer implicit knowledge
🤖 Provides ML-ready tensor representations compatible with PyTorch and PyKEEN
🧩 Offers schema decomposition into themed partitions (modularization of ontology components)
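To illustrate what an ML-ready representation of TSV triples can look like, here is a minimal sketch using only the Python standard library. It assumes a three-column, tab-separated layout of (head, relation, tail) per line; the sample triples, the `ex:` labels, and the helper names `read_triples`, `build_index`, and `encode` are hypothetical and not part of KG-SaF. The actual file layout and ID mappings shipped with KG-SaF-Data may differ.

```python
import csv
import io

# Hypothetical sample data: three tab-separated (head, relation, tail) triples.
# The real KG-SaF-Data files may use a different layout or naming.
SAMPLE_TSV = (
    "ex:Alice\tex:worksAt\tex:Org\n"
    "ex:Bob\tex:worksAt\tex:Org\n"
    "ex:Alice\tex:knows\tex:Bob\n"
)


def read_triples(handle):
    """Read (head, relation, tail) rows from a tab-separated stream."""
    reader = csv.reader(handle, delimiter="\t")
    # Keep only well-formed rows with exactly three columns.
    return [tuple(row) for row in reader if len(row) == 3]


def build_index(labels):
    """Map each distinct label to an integer ID, assigned in sorted order."""
    return {label: idx for idx, label in enumerate(sorted(set(labels)))}


def encode(triples):
    """Turn string triples into integer ID triples plus the two mappings."""
    entities = build_index([t[0] for t in triples] + [t[2] for t in triples])
    relations = build_index([t[1] for t in triples])
    ids = [(entities[h], relations[r], entities[t]) for h, r, t in triples]
    return ids, entities, relations


triples = read_triples(io.StringIO(SAMPLE_TSV))
ids, entity_to_id, relation_to_id = encode(triples)
print(ids)  # [(0, 1, 2), (1, 1, 2), (0, 0, 1)]
```

A list of integer triples like `ids` can be passed to `torch.tensor(ids)` to obtain a tensor with one row per triple and three columns. Sorting the labels before assigning IDs keeps the mapping reproducible across runs.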
Contents:
Appendix: