Seminar: (Auto-)ML for tabular data

What is tabular data? And which model would you use for it? Why is tabular data challenging for machine learning? And how would you compare learning approaches on tabular data?

TL;DR: Tabular data is omnipresent, and tabular ML offers many solutions.
This seminar will navigate the landscape of ML models for tabular data (which is an ideal playground for AutoML). We will read recent research papers in the field of tabular ML, with a focus on the large, pretrained neural networks that define modern tabular ML. To get excited, you can have a look at the position paper on why we need more tabular foundation models (see the reading list below).

Course Title: (Auto-)ML for tabular data
Course ID: ML4501f
Registration: ILIAS
ECTS: 3
Time: Thursdays, 14:15-15:45
Language: English
#Participants: max. 14
Location: in person at Maria-von-Linden-Straße 6, seminar room, ground floor
Organized by: Katharina Eggensperger, Amir Rezaei Balef, Mykhailo Koshil

Why should you attend this seminar?

Tabular data is everywhere, and you have probably heard about it in your first machine learning lecture. But what is tabular data? Why is it challenging for machine learning? And what are recent models for this modality?

In this seminar, we will discuss these and many more questions. Along the way, you will learn about the topic and practice your scientific communication skills.

Requirements

We strongly recommend that you know the foundations of machine learning and deep learning, including modern neural architectures and transformer models. Ideally, you also have some experience in applying ML, so you get the most out of this seminar.

Topics

The seminar focuses on understanding the challenges of learning from tabular representations. We will discuss research papers that try to understand what makes tabular data a challenging modality for some model classes, as well as state-of-the-art ML methods built to excel on this data modality.
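To make this concrete, here is a minimal, self-contained sketch (our illustration, not part of the seminar materials) of what makes tabular data awkward for many model classes: heterogeneous column types, missing values, and no spatial or sequential structure to exploit. It assumes scikit-learn >= 1.4 for the categorical_features="from_dtype" option; gradient-boosted trees are a strong tabular baseline precisely because they handle such data natively.

```python
# Illustrative only: heterogeneous columns, missing values, no spatial structure.
import numpy as np
import pandas as pd
from sklearn.ensemble import HistGradientBoostingClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 1000
df = pd.DataFrame({
    "age": rng.integers(18, 90, n).astype(float),         # numeric
    "income": rng.lognormal(10.0, 1.0, n),                # heavy-tailed numeric
    "city": rng.choice(["Tübingen", "Berlin", None], n),  # categorical, with missing values
})
df["city"] = df["city"].astype("category")
y = (df["income"] > np.median(df["income"])).astype(int)  # toy target

X_train, X_test, y_train, y_test = train_test_split(df, y, random_state=0)

# Gradient-boosted trees need no imputation or one-hot encoding here:
# missing values and pandas categorical columns are handled natively.
clf = HistGradientBoostingClassifier(categorical_features="from_dtype")
clf.fit(X_train, y_train)
print(f"test accuracy: {clf.score(X_test, y_test):.2f}")
```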

Date        Content
17.10.2024  Orga / How to give a good presentation
24.10.2024  no meeting
31.10.2024  Intro I
07.11.2024  no meeting
21.11.2024  #1 Tabular Foundation Models [Position / Elephant]
28.11.2024  no meeting
05.12.2024  #2 Interpretability [GAM X LLM / TabNet]
12.12.2024  #3 In-Context Learning [ForestPFN / MotherNet]
19.12.2024  no meeting
26.12.2024  🌲 no meeting
02.01.2025  🎆 no meeting
09.01.2025  no meeting
16.01.2025  #4 Wrap-Up
23.01.2025  buffer / no meeting
30.01.2025  buffer / no meeting
06.02.2025  buffer / no meeting
  1. [Position] Van Breugel et al., Why Tabular Foundation Models Should Be a Research Priority (ICML'24)
  2. [Elephant] Bordt et al., Elephants Never Forget: Memorization and Learning of Tabular Data in Large Language Models (arXiv'24)
  3. [GAM X LLM] Bordt et al., Data Science with LLMs and Interpretable Models (XAI@AAAI'24); Lou et al., Accurate Intelligible Models with Pairwise Interactions (KDD'13)
  4. [TabNet] Arik et al., TabNet: Attentive Interpretable Tabular Learning (AAAI'21)
  5. [ForestPFN] Breejen et al., Why In-Context Learning Transformers are Tabular Data Classifiers (arXiv'24)
  6. [MotherNet] Müller et al., MotherNet: A Foundational Hypernetwork for Tabular Classification (arXiv'23)
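To get a feeling for topic #3 before the session: [ForestPFN] and [MotherNet] both build on TabPFN, a transformer pretrained on synthetic tabular tasks that classifies a new dataset in a single forward pass, taking the labelled training set as context instead of updating any weights. Below is a minimal usage sketch, assuming the tabpfn package is installed (pip install tabpfn); exact constructor arguments differ between package versions.

```python
# Hedged sketch of in-context learning for tabular classification with a
# prior-data fitted network (PFN). fit() performs no gradient updates; it
# stores the labelled data, which is passed as context at prediction time.
from sklearn.datasets import load_breast_cancer
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from tabpfn import TabPFNClassifier  # assumes `pip install tabpfn`

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = TabPFNClassifier()   # pretrained weights are loaded, not trained here
clf.fit(X_train, y_train)  # near-instant: just conditions on the context set
pred = clf.predict(X_test)
print(f"test accuracy: {accuracy_score(y_test, pred):.2f}")
```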

What will the seminar look like?

We will meet each week (with a few exceptions). In the first few weeks, we will start with introductory lectures on ML for tabular data (why this is an exciting data modality and why we need AutoML for it) and on how to critically review and present research papers. After that, each week we will have presentations, followed by discussions.

Other important information

Registration: Please register on ILIAS. Signup will be kept open and unlimited until the first meeting; registration opens on September 30th at 12:00 (noon). In the first meeting, I will give an introduction to the topic and the papers. Afterward, we will do the final and binding registration and topic assignment. So, please come to the first lecture!

Grading/Presentations: Grades will be based on your presentation, your slides, active participation, and a short report. Further details will be discussed in the introductory sessions.