# Explainable AI and Interpretable Machine Learning

**Last updated**: 2025-10-31

Stefan Szeider's research in explainable AI focuses on the computational complexity of generating explanations from machine learning models and on learning interpretable models from data.

## Core Research Questions

How can we make machine learning models transparent and interpretable? What are the computational limits of generating explanations? Can we learn models that are both accurate and inherently interpretable? This research addresses these questions through parameterized complexity analysis and exact methods based on SAT solving.

## Key Contributions

### Learning Interpretable Models

**Decision Trees**: SAT-based methods for learning smallest decision trees from data. The hybrid approach combines heuristic methods with exact SAT encodings, scaling to large datasets while minimizing tree depth and size.

**General Framework**: A theoretical framework showing that learning smallest symbolic models (decision trees, decision sets, decision lists, binary decision diagrams) is fixed-parameter tractable under structural restrictions.

**Structural Restrictions**: Algorithms that exploit rank-width, twin-width, and other graph-theoretic parameters of the data to make learning tractable.

### Generating Explanations

**Abductive Explanations**: Finding minimal feature sets that justify a model's classification decision.

**Contrastive Explanations**: Identifying features that distinguish one classification from another.

**Local vs Global**: Both instance-specific explanations (why was this example classified this way?) and model-wide explanations (how does the model work?).

### Complexity Analysis

**Hardness Results**: Showing which explanation problems are NP-hard and which admit efficient parameterized algorithms.

**Fixed-Parameter Algorithms**: Designing algorithms that are practical when certain parameters (explanation size, structural properties) are small.
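The abductive explanation task described above can be illustrated with a minimal brute-force sketch: given a classifier over binary features, find the smallest feature subset such that fixing the instance's values on that subset forces the same prediction under every completion of the remaining features. The toy model below is hypothetical, and this exhaustive search is exponential — precisely the hardness that the parameterized algorithms in this research aim to circumvent.

```python
from itertools import combinations, product

def model(x):
    # Hypothetical toy classifier over 4 binary features:
    # predicts 1 iff (x0 AND x1) OR x3.
    return int((x[0] and x[1]) or x[3])

def is_abductive(instance, subset, n=4):
    """Check whether fixing `instance` on `subset` forces the prediction."""
    target = model(instance)
    free = [i for i in range(n) if i not in subset]
    for values in product([0, 1], repeat=len(free)):
        x = list(instance)
        for i, v in zip(free, values):
            x[i] = v
        if model(x) != target:
            return False  # some completion flips the prediction
    return True

def minimal_abductive_explanation(instance, n=4):
    """Smallest feature set justifying the classification (brute force)."""
    for size in range(n + 1):
        for subset in combinations(range(n), size):
            if is_abductive(instance, set(subset), n):
                return set(subset)

print(minimal_abductive_explanation((1, 1, 0, 0)))  # → {0, 1}
```

For the instance `(1, 1, 0, 0)`, fixing features 0 and 1 to the value 1 already guarantees the prediction 1 regardless of the other features, so `{0, 1}` is a minimal abductive explanation.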
## Publications

### 2024

**SAT-based Decision Tree Learning for Large Data Sets**
Andre Schidler, Stefan Szeider
*Journal of Artificial Intelligence Research (JAIR)*, 2024, DOI: 10.1613/jair.1.15956
Hybrid approach combining SAT encodings with heuristics for learning smallest decision trees on datasets with thousands of samples.

**Explaining Decisions in ML Models: A Parameterized Complexity Analysis**
Sebastian Ordyniak, Giacomo Paesani, Mateusz Rychlicki, Stefan Szeider
*Proceedings of KR 2024*, DOI: 10.24963/kr.2024/53
Comprehensive study of explanation problems (abductive and contrastive, local and global) across decision trees, decision sets, decision lists, OBDDs, random forests, and Boolean circuits.

**Revisiting Causal Discovery from a Complexity-Theoretic Perspective**
Robert Ganian, Viktoriia Korchemna, Stefan Szeider
*Proceedings of IJCAI 2024*, DOI: 10.24963/ijcai.2024/374
Characterization of which causal graphs admit efficient constraint-based discovery and a new fixed-parameter algorithm for causal discovery.

**A General Theoretical Framework for Learning Smallest Interpretable Models**
Sebastian Ordyniak, Giacomo Paesani, Mateusz Rychlicki, Stefan Szeider
*Proceedings of AAAI 2024*, DOI: 10.1609/aaai.v38i9.28937
Fixed-parameter tractability for minimizing decision trees, decision sets, decision lists, binary decision diagrams, and ensembles.

**Learning Small Decision Trees for Data of Low Rank-Width**
Konrad K. Dabrowski, Eduard Eiben, Sebastian Ordyniak, Giacomo Paesani, Stefan Szeider
*Proceedings of AAAI 2024*, DOI: 10.1609/aaai.v38i9.28916
Algorithm using NLC decomposition for learning smallest decision trees when data has low rank-width.

### 2023

**The Computational Complexity of Concise Hypersphere Classification**
Eduard Eiben, Robert Ganian, Iyad A. Kanj, Sebastian Ordyniak, Stefan Szeider
*Proceedings of ICML 2023*, PMLR 202:9060-9070
Parameterized complexity analysis and algorithms for hypersphere classification of binary data.
**Learning Small Decision Trees with Large Domain**
Eduard Eiben, Sebastian Ordyniak, Giacomo Paesani, Stefan Szeider
*Proceedings of IJCAI 2023*, DOI: 10.24963/ijcai.2023/355
Fixed-parameter tractability for learning smallest decision trees without bounding domain size.

**The Parameterized Complexity of Finding Concise Local Explanations**
Sebastian Ordyniak, Giacomo Paesani, Stefan Szeider
*Proceedings of IJCAI 2023*, DOI: 10.24963/ijcai.2023/369
Complexity landscape for finding smallest anchors (local explanations) for black-box model classifications.

### 2021

**SAT-based Decision Tree Learning for Large Data Sets**
Andre Schidler, Stefan Szeider
*Proceedings of AAAI 2021*, DOI: 10.1609/aaai.v35i5.16509
Original SAT-based hybrid method for decision tree learning scaling to large datasets.

**The Parameterized Complexity of Clustering Incomplete Data**
Eduard Eiben, Robert Ganian, Iyad Kanj, Sebastian Ordyniak, Stefan Szeider
*Proceedings of AAAI 2021*, DOI: 10.1609/aaai.v35i8.16896
Fixed-parameter tractability for clustering with missing data.

**Learning Fast-Inference Bayesian Networks**
Vaidyanathan Peruvemba Ramaswamy, Stefan Szeider
*Proceedings of NeurIPS 2021*
Learning Bayesian networks optimized for fast probabilistic inference using maximum state space size.

## Research Themes

### Interpretability vs Accuracy Trade-off

Small decision trees are interpretable but may be less accurate. This research develops methods that find provably smallest models that correctly represent data, avoiding unnecessary complexity while maintaining accuracy.

### Parameterized Complexity Perspective

Many explanation and learning problems are NP-hard. Parameterized complexity asks: which parameters make these problems tractable? Parameters include model size, explanation size, and structural properties of data (rank-width, twin-width, maximum difference).

### SAT-Based Exact Methods

SAT solvers provide exact solutions but don't scale to large instances.
The hybrid approach applies SAT locally to parts of a heuristically generated solution, combining scalability with exactness.

### Multiple Model Types

Research covers decision trees, decision sets, decision lists, binary decision diagrams, Bayesian networks, random forests, Boolean circuits, and hypersphere classifiers. Each model type presents unique algorithmic challenges.

## Collaborators

This research involves extensive collaboration with:

- Sebastian Ordyniak (University of Leeds)
- Giacomo Paesani (Sapienza University of Rome, formerly Leeds)
- Eduard Eiben (Royal Holloway, University of London)
- Robert Ganian (TU Wien)
- Andre Schidler (TU Wien)
- Mateusz Rychlicki (University of Leeds)

## Applications

**Healthcare**: Interpretable models for medical diagnosis, where explanations are required.

**Finance**: Explainable credit scoring and fraud detection.

**Legal**: Models that can justify decisions for regulatory compliance.

**Science**: Discovering interpretable patterns in scientific data.

## Software

**SAT-based Decision Tree Learner**: Implementation of the hybrid SAT-heuristic method for learning smallest decision trees. GitHub: (available upon request)

## Future Directions

- Extending methods to deep learning models
- Interactive explanation systems
- Explanations for temporal and structured data
- Real-world deployment and evaluation

**Contact**: sz@ac.tuwien.ac.at | [DBLP](https://dblp.org/pid/s/StefanSzeider.html)