KG-SaF: Data and Workflow Documentation

KG-SaF-Data and KG-SaF-JDeX are from: Diliso, I., Barile, R., d’Amato, C., & Fanizzi, N. (2025). KG-SaF: Building Complete and Curated Datasets for Machine Learning and Reasoning on Knowledge Graphs (Version 0.0.0.1) [Computer software]. https://doi.org/10.5281/zenodo.17817931

KG-SaF provides a workflow (KG-SaF-JDeX) and curated datasets (KG-SaF-Data) for knowledge graph refinement (KGR) research. Each dataset includes both a schema (ontology) and ground facts, so it can be used directly by machine learning models and reasoning services.

Key Features

  • 🗂️ Extracts datasets from RDF-based KGs with expressive schemas (RDFS/OWL2)

  • 📦 Provides datasets in OWL and TSV formats, easily loadable in both PyTorch and Protégé

  • ⚡ Handles inconsistencies and leverages reasoning to infer implicit knowledge

  • 🤖 Provides ML-ready tensor representations compatible with PyTorch and PyKEEN

  • 🧩 Offers schema decomposition into themed partitions (modularization of ontology components)
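
To illustrate what "ML-ready" can mean for the TSV exports listed above, here is a minimal sketch that indexes tab-separated triples into integer IDs using only the Python standard library. The three-column head/relation/tail layout, the sample identifiers, and the function name `index_triples` are assumptions made for illustration; they are not taken from the KG-SaF distribution, whose actual file layout may differ.

```python
import csv
import io


def index_triples(tsv_file):
    """Map (head, relation, tail) string triples to integer ID triples.

    Assumes one triple per line, tab-separated, with no header row
    (a hypothetical layout, not confirmed by the KG-SaF documentation).
    """
    entity_to_id = {}
    relation_to_id = {}
    triples = []
    reader = csv.reader(tsv_file, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) != 3:
            continue  # skip blank or malformed lines
        head, relation, tail = row
        # setdefault assigns the next free ID the first time a name is seen
        h = entity_to_id.setdefault(head, len(entity_to_id))
        r = relation_to_id.setdefault(relation, len(relation_to_id))
        t = entity_to_id.setdefault(tail, len(entity_to_id))
        triples.append((h, r, t))
    return triples, entity_to_id, relation_to_id


# Made-up sample data standing in for a real TSV file.
SAMPLE_TSV = (
    "ex:alice\tex:worksAt\tex:uniba\n"
    "ex:bob\tex:worksAt\tex:uniba\n"
    "ex:alice\tex:knows\tex:bob\n"
)

triples, entity_to_id, relation_to_id = index_triples(io.StringIO(SAMPLE_TSV))
print(triples)
```

With a real file, the same function can be called on an open file handle instead of `io.StringIO`. If PyTorch is installed, the resulting list can be turned into a tensor with `torch.tensor(triples, dtype=torch.long)`, which is the usual input shape for knowledge graph embedding models.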

Appendix: