Quick Start Guide

Software Requirements

Before using this project, ensure the following software is installed:

Tip

It is highly recommended to use pyenv to manage Python versions. This allows easy switching between versions and avoids system conflicts.

  1. Python 3.9 or later. If you already have Python 3.9+ installed system-wide, pyenv is optional but recommended.

  2. Java JDK 21 or later. Make sure your JAVA_HOME environment variable is set, otherwise some tools may not work correctly.

  3. ROBOT OBO Utility is required for reasoning and ontology manipulation. Install following instruction from officila guide ROBOT. Version 1.9.x was used in the development of this tool. (Requires Java)

Installation Instructions

  1. Clone the repository

git clone https://github.com/ivandiliso/kg-saf.git
cd kg-saf
  1. Install Python dependencies

pip install -r requirements.txt
  1. Verify requirements are installed

python --version
java -version
java -jar robot.jar --help

System Requirements

Warning

Running this project with less than 8 GB of RAM may cause crashes or slow performance.

  • Memory: 8 GB minimum (16GB reccomended for reasoning services)

  • Disk Space: 15 GB for dataset and ontologies

Unpack Released Datasets

The released datasets and ontologies are distributed in compressed ZIP files due to GitHub storage limitations. Some secondary files were removed to reduce size, but they can be reconstructed using the provided unpacking utility.

  1. Open the provided dataset unpacking notebook (kgsaf_jdex/utils/unpack.ipynb).

  2. Execute all cells sequentially.

  3. The notebook will automatically perform the following steps:

    • Unpack all compressed datasets and ontologies into an kgsaf_data/datasets/*/unpack and kgsaf_data/ontologies/*/unpackfolder.

    • Re-merge object property assertion files for each dataset.

    • Merge the full knowledge graph (TBox, RBox, and ABox) using a reasoner (Robot OBO Tool).

    • Convert N-Triples files to TSV format, ready for use with ML libraries such as PyKEEN.

    • Convert Schema files to JSON (taxonomy, roles, class assertions) for easier loading and manipulation in Python.