PyStack't Documentation

Real-life data for object-centric process mining


Read the BPM demo paper on CEUR: L. Bosmans, J. Peeperkorn, J. De Smedt, PyStack’t: Real-Life Data for Object-Centric Process Mining

View the project on GitHub: LienBosmans/pystackt

Watch the demo on YouTube: PyStack't Demo BPM 2025

View releases on PyPI: pip install pystackt

Consider contributing to PyStack't: Contributing guide

Tutorial: Extracting your first object-centric event log from a GitHub repository

In this tutorial, you’ll extract an object-centric event log from a real GitHub repository. You don’t need to know much about process mining or GitHub yet — we’ll walk through it step by step.

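By the end, you will have:

  • Installed PyStack’t
  • Generated a GitHub access token
  • Extracted object-centric event data from a GitHub repository
  • Exported that data to the OCEL 2.0 format
  • Loaded the OCEL 2.0 log in Ocelot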

Let’s get started!

Prerequisites

Before we begin, make sure you have the following:

  • A working Python installation with pip
  • A GitHub account

With these in place, you’re ready to proceed.

Install PyStack’t

First, we need to install the library we’ll be using: PyStack’t.
Open a terminal (or command prompt) and run:

pip install pystackt

This will download and install the package from PyPI. If you see errors about pip not being found, try running python -m pip install pystackt instead.

Once installed, you can check that it worked by running:

python -c "import pystackt; print('PyStackt is installed!')"

If you see the confirmation message, you’re ready to move on.
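If you also want to know which version was installed, the Python standard library can report it. The following is a minimal sketch; the helper name installed_version is only illustrative and is not part of PyStack’t.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version("pystackt")
    if version is None:
        print("pystackt is not installed in this environment")
    else:
        print(f"pystackt {version} is installed")
```

If this reports that the package is missing even though pip succeeded, pip and python may be pointing at different environments, which is the case python -m pip install pystackt is meant to avoid.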

Generate GitHub access token

PyStack’t needs permission to read from GitHub repositories. For this, we use a personal access token.

  1. Log in to your GitHub account.
  2. Go to GitHub Developer Settings.
  3. Click “Generate new token (classic)”.
  4. Don’t select any scopes (leave all checkboxes unchecked).
  5. Generate the token and copy it. GitHub will only show it once, so store it somewhere safe. We’ll use this token in the next step.
    • If you lose your token, you can always generate a new one.
    • If you accidentally share your token, for example by committing it to Git, it’s good practice to delete it and generate a new token.
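Before moving on, you may want to confirm that GitHub accepts the new token. One possible check, sketched here with only the Python standard library, is to call the rate-limit endpoint of GitHub's REST API with the token attached. The function names are illustrative and not part of PyStack’t, and PyStack’t itself may talk to GitHub in a different way.

```python
import json
import urllib.request

RATE_LIMIT_URL = "https://api.github.com/rate_limit"


def build_rate_limit_request(token: str) -> urllib.request.Request:
    """Build (but do not send) an authenticated request to the rate-limit endpoint."""
    return urllib.request.Request(
        RATE_LIMIT_URL,
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/vnd.github+json",
        },
    )


def fetch_rate_limit(token: str) -> dict:
    """Send the request and return the parsed JSON response.

    Raises urllib.error.HTTPError (status 401) if GitHub rejects the token.
    """
    request = build_rate_limit_request(token)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)
```

Calling fetch_rate_limit with your token and getting a response back without an error suggests the token was accepted; a mistyped or deleted token should produce an HTTP 401 error instead.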

Extract object-centric event data from GitHub repository

Now let’s extract event data from a real repository.

Open a Python file (for example extract_log.py) or a Jupyter Notebook, and paste in the following code:

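# Import only the two functions this tutorial uses instead of everything in the package
from pystackt import get_github_log, export_to_ocel2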

get_github_log(
    GITHUB_ACCESS_TOKEN="insert_your_github_access_token_here",
    repo_owner="LienBosmans",
    repo_name="stack-t",
    max_issues=None, # None returns all issues, can also be set to an integer to extract a limited data set
    quack_db="./stackt.duckdb",
    schema="main"
)

👉 Replace insert_your_github_access_token_here with the token you generated earlier.

When you run this code, PyStack’t will connect to GitHub, fetch issue and pull request data from the stack-t repository, and store it locally in a DuckDB database file called stackt.duckdb.
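Pasting the token directly into a script makes it easy to commit by accident. A common alternative, sketched here, is to keep the token in an environment variable and read it at run time. The variable name GITHUB_ACCESS_TOKEN and the helper load_github_token are assumptions made for this example; PyStack’t does not require either.

```python
import os


def load_github_token(env_var: str = "GITHUB_ACCESS_TOKEN") -> str:
    """Return the token stored in `env_var`, raising a clear error if it is missing."""
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(
            f"Set the {env_var} environment variable to your GitHub token first."
        )
    return token
```

With this helper in place, the call shown earlier can pass GITHUB_ACCESS_TOKEN=load_github_token() instead of a string literal, so the script itself never contains the secret.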

Export to OCEL 2.0

Now that the raw data is stored, let’s export it into the OCEL 2.0 format. This is a common format for object-centric event logs.

Add the following code to your script or notebook:

export_to_ocel2(
    quack_db="./stackt.duckdb",
    schema_in="main",
    schema_out="ocel2",
    sqlite_db="./ocel2_stackt.sqlite"
)

This will create a new SQLite database file called ocel2_stackt.sqlite that contains the event log in OCEL 2.0 format. You now have a portable log that can be used in other analysis tools.
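To get a quick impression of what the export produced, you can look inside the SQLite file with Python's built-in sqlite3 module. The sketch below lists every table with its row count. It makes no assumption about which tables PyStack’t writes, and the function name table_row_counts is purely illustrative.

```python
import sqlite3
from contextlib import closing
from pathlib import Path


def table_row_counts(sqlite_path: str) -> dict:
    """Map each table name in the SQLite file to its number of rows."""
    # Open read-only so that a mistyped path raises an error
    # instead of silently creating an empty database file.
    uri = Path(sqlite_path).resolve().as_uri() + "?mode=ro"
    with closing(sqlite3.connect(uri, uri=True)) as connection:
        names = [
            row[0]
            for row in connection.execute(
                "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
            )
        ]
        counts = {}
        for name in names:
            # Quote the identifier, since table names may contain spaces.
            quoted = '"' + name.replace('"', '""') + '"'
            counts[name] = connection.execute(
                f"SELECT COUNT(*) FROM {quoted}"
            ).fetchone()[0]
        return counts
```

For the file created above, table_row_counts("./ocel2_stackt.sqlite") returns a dictionary you can print or loop over. An empty dictionary would suggest that the export did not write any tables.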

Load OCEL 2.0 log in Ocelot

Finally, let’s open the log in the analysis tool Ocelot.

  1. Go to https://ocelot.pm/
  2. Drag and drop ocel2_stackt.sqlite into the Event Log Import window.
  3. You should now see the object-centric event log extracted from the GitHub repository! 🎉