Real-life data for object-centric process mining
View the project on GitHub LienBosmans/pystackt
Watch the demo on YouTube PyStack't Demo BPM 2025
In this tutorial, you’ll extract an object-centric event log from a real GitHub repository. You don’t need to know much about process mining or GitHub yet — we’ll walk through it step by step.
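By the end, you will have installed PyStack’t, extracted issue and pull request data from a GitHub repository into a local DuckDB file, exported that data to the OCEL 2.0 format, and opened the resulting log in Ocelot.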
Let’s get started!
Before we begin, make sure you have the following:
Python 3.12 or higher installed on your computer.
You can check this by running python --version in your terminal.
If you don’t have it yet, download and install it from python.org.
A GitHub account.
If you don’t already have one, you can create it for free at github.com.
We’ll need it to generate an access token later.
With these in place, you’re ready to proceed.
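If you have several Python installations, the version check from the prerequisites can also be done from inside the interpreter you plan to use. The snippet below is a minimal sketch; the helper name python_is_recent_enough is ours and is not part of PyStack’t.

```python
import sys

# Minimum version stated in the prerequisites above.
REQUIRED = (3, 12)


def python_is_recent_enough(version_info=sys.version_info, required=REQUIRED):
    """Return True when the (major, minor) pair meets the requirement."""
    return tuple(version_info[:2]) >= tuple(required)


current = ".".join(str(part) for part in sys.version_info[:3])
if python_is_recent_enough():
    print(f"Python {current} is recent enough.")
else:
    print(f"Python {current} is too old; 3.12 or higher is needed.")
```

Run it with the same python command you will use for the rest of the tutorial, so you check the right installation.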
First, we need to install the library we’ll be using: PyStack’t.
Open a terminal (or command prompt) and run:
pip install pystackt
This will download and install the package from PyPI.
If you see errors about pip not being found, try running python -m pip install pystackt instead.
Once installed, you can test it worked by running:
python -c "import pystackt; print('PyStackt is installed!')"
If you see the confirmation message, you’re ready to move on.
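If you also want to know which version was installed, the standard library can look it up for you. This is a small sketch using importlib.metadata; the helper name installed_version is an assumption made for illustration, not a PyStack’t function.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version string, or None if the package is missing."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


found = installed_version("pystackt")
if found:
    print(f"pystackt {found} is installed.")
else:
    print("pystackt is not installed in this environment.")
```

A "not installed" message here usually means pip installed the package into a different Python environment than the one running the script.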
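PyStack’t needs permission to read from GitHub repositories. For this, we use a personal access token. You can generate one in your GitHub account under Settings → Developer settings → Personal access tokens. Copy the token and keep it somewhere safe, because you will need it in the next step.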
Now let’s extract event data from a real repository.
Open a Python file (for example extract_log.py) or a Jupyter Notebook, and paste in the following code:
from pystackt import *
get_github_log(
    GITHUB_ACCESS_TOKEN="insert_your_github_access_token_here",
    repo_owner="LienBosmans",
    repo_name="stack-t",
    max_issues=None,  # None returns all issues; set an integer to extract a limited data set
    quack_db="./stackt.duckdb",
    schema="main"
)
👉 Replace insert_your_github_access_token_here with the token you generated earlier.
When you run this code, PyStack’t will connect to GitHub, fetch issue and pull request data from the stack-t repository, and store it locally in a DuckDB database file called stackt.duckdb.
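Pasting the token straight into a script makes it easy to leak, for example when you commit the file or share the notebook. One common alternative is to read it from an environment variable. The sketch below assumes a variable named GITHUB_ACCESS_TOKEN and a helper called load_github_token; both are our own choices for illustration and not something PyStack’t requires.

```python
import os


def load_github_token(env_var="GITHUB_ACCESS_TOKEN"):
    """Read the token from an environment variable (the name is an assumption)."""
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(
            f"Set the {env_var} environment variable before running the extraction."
        )
    return token
```

With this in place, you could pass GITHUB_ACCESS_TOKEN=load_github_token() to get_github_log instead of the literal string, and set the variable in your terminal before starting Python.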
Now that the raw data is stored, let’s export it into the OCEL 2.0 format. This is a common format for object-centric event logs.
Add the following code to your script or notebook:
export_to_ocel2(
    quack_db="./stackt.duckdb",
    schema_in="main",
    schema_out="ocel2",
    sqlite_db="./ocel2_stackt.sqlite"
)
This will create a new SQLite database file called ocel2_stackt.sqlite that contains the event log in OCEL 2.0 format.
You now have a portable log that can be used in other analysis tools.
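Before moving on, you may want a quick sanity check that the export actually contains data. The sketch below uses only Python’s built-in sqlite3 module to list every table in the file with its row count. The helper name table_row_counts is ours. In the OCEL 2.0 relational layout you would typically expect tables such as event and object, but the exact set depends on your data, so treat any specific table name as an assumption.

```python
import os
import sqlite3
from contextlib import closing


def table_row_counts(sqlite_path):
    """Return a {table_name: row_count} dict for every table in the SQLite file."""
    # sqlite3.connect would silently create an empty file, so check first.
    if not os.path.exists(sqlite_path):
        raise FileNotFoundError(sqlite_path)
    with closing(sqlite3.connect(sqlite_path)) as conn:
        names = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
            )
        ]
        # Table names are quoted because they may contain spaces or other symbols.
        return {
            name: conn.execute(f'SELECT COUNT(*) FROM "{name}"').fetchone()[0]
            for name in names
        }
```

Calling table_row_counts("./ocel2_stackt.sqlite") and printing the result gives a rough overview of the log. Empty tables across the board usually mean the extraction step returned no data.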
Finally, let’s open the log in the analysis tool Ocelot.
Load ocel2_stackt.sqlite in the Event Log Import window.