Quick Start Guide
Software Requirements
Before using this project, ensure the following software is installed:
Tip
It is highly recommended to use pyenv to manage Python versions. This allows easy switching between versions and avoids system conflicts.
Python 3.9 or later. If you already have Python 3.9+ installed system-wide, pyenv is optional but recommended.
Java JDK 21 or later. Make sure your JAVA_HOME environment variable is set; otherwise some tools may not work correctly.
ROBOT OBO Utility, required for reasoning and ontology manipulation. Install it by following the instructions in the official ROBOT guide. Version 1.9.x was used in the development of this tool. (Requires Java)
Installation Instructions
Clone the repository
git clone https://github.com/ivandiliso/kg-saf.git
cd kg-saf
Install Python dependencies
pip install -r requirements.txt
Verify requirements are installed
python --version
java -version
java -jar robot.jar --help
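The same checks can be scripted. The following is a minimal sketch and is not part of the repository: the function name check_environment is hypothetical, and it only tests the Python version, the JAVA_HOME variable, and whether a java executable is on the PATH. It does not check the Java or ROBOT version.

```python
import os
import shutil
import sys


def check_environment(min_python=(3, 9)):
    """Return a list of human-readable problems; an empty list means all checks passed.

    Hypothetical helper for illustration only, not shipped with the project.
    """
    problems = []

    # Compare only (major, minor) against the minimum stated in this guide.
    if sys.version_info[:2] < min_python:
        found = ".".join(str(part) for part in sys.version_info[:2])
        wanted = ".".join(str(part) for part in min_python)
        problems.append(f"Python {wanted} or later is required, found {found}")

    # JAVA_HOME must be set and non-empty.
    if not os.environ.get("JAVA_HOME"):
        problems.append("JAVA_HOME is not set")

    # shutil.which returns None when the executable cannot be found on PATH.
    if shutil.which("java") is None:
        problems.append("java executable not found on PATH")

    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print("WARNING:", problem)
```

An empty result only means these three checks passed; running the commands above is still the way to confirm the installed versions.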
System Requirements
Warning
Running this project with less than 8 GB of RAM may cause crashes or slow performance.
Memory: 8 GB minimum (16 GB recommended for reasoning services)
Disk Space: 15 GB for datasets and ontologies
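The disk space requirement can be checked before unpacking anything. This is a rough sketch using only the Python standard library; the helper names free_disk_gb and has_enough_disk are hypothetical, and a gigabyte is assumed to mean 10**9 bytes. Memory is not checked here because the standard library has no portable way to query it.

```python
import shutil

REQUIRED_GB = 15  # disk space stated in this guide


def free_disk_gb(path="."):
    """Return the free disk space at `path` in gigabytes (10**9 bytes)."""
    return shutil.disk_usage(path).free / 10**9


def has_enough_disk(path=".", required_gb=REQUIRED_GB):
    """Return True when the file system holding `path` has at least `required_gb` free."""
    return free_disk_gb(path) >= required_gb


if __name__ == "__main__":
    if not has_enough_disk():
        print(f"WARNING: only {free_disk_gb():.1f} GB free, {REQUIRED_GB} GB needed")
```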
Unpack Released Datasets
The released datasets and ontologies are distributed in compressed ZIP files due to GitHub storage limitations. Some secondary files were removed to reduce size, but they can be reconstructed using the provided unpacking utility.
Open the provided dataset unpacking notebook (kgsaf_jdex/utils/unpack.ipynb).
Execute all cells sequentially.
The notebook will automatically perform the following steps:
Unpack all compressed datasets and ontologies into the kgsaf_data/datasets/*/unpack and kgsaf_data/ontologies/*/unpack folders.
Re-merge the object property assertion files for each dataset.
Merge the full knowledge graph (TBox, RBox, and ABox) using a reasoner (ROBOT OBO Tool).
Convert N-Triples files to TSV format, ready for use with ML libraries such as PyKEEN.
Convert schema files to JSON (taxonomy, roles, class assertions) for easier loading and manipulation in Python.
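To illustrate what the N-Triples to TSV step amounts to, here is a simplified sketch. It is not the notebook's implementation: the function name ntriples_to_tsv and the example.org IRIs are made up for illustration, and the regular expression only handles triples whose subject, predicate, and object are all IRIs. Lines with literals, blank nodes, or comments are skipped rather than parsed.

```python
import csv
import io
import re

# Matches a triple whose subject, predicate, and object are all IRIs, e.g.
# <http://example.org/a> <http://example.org/p> <http://example.org/b> .
IRI_TRIPLE = re.compile(r"^\s*<([^>]*)>\s+<([^>]*)>\s+<([^>]*)>\s*\.\s*$")


def ntriples_to_tsv(lines, out):
    """Write IRI-only triples from `lines` to `out` as tab-separated rows.

    Returns the number of rows written. Hypothetical helper, for illustration.
    """
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    written = 0
    for line in lines:
        match = IRI_TRIPLE.match(line)
        if match is None:
            # Comments, blank lines, literals, and blank nodes are skipped.
            continue
        writer.writerow(match.groups())  # (head, relation, tail)
        written += 1
    return written


# Usage with a small in-memory sample (made-up IRIs).
sample = [
    "<http://example.org/alice> <http://example.org/knows> <http://example.org/bob> .\n",
    "# a comment line\n",
    '<http://example.org/alice> <http://example.org/name> "Alice" .\n',
    "<http://example.org/bob> <http://example.org/knows> <http://example.org/carol> .\n",
]
buffer = io.StringIO()
count = ntriples_to_tsv(sample, buffer)
print(count)
print(buffer.getvalue(), end="")
```

Each output row holds head, relation, and tail separated by tabs, which is the three-column layout that libraries such as PyKEEN read. For real data, a full RDF parser is the safer choice, since N-Triples allows escapes and literal forms that this pattern ignores.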