Seminar: Tabular Machine Learning
Tabular data is everywhere and often at the core of data science tasks, from healthcare to e-commerce and the natural sciences. Yet it comes with unique challenges and research questions for machine learning:
- What makes tabular data different from text or images?
- Which models work best, and why is it hard to beat simple baselines?
- How do recent advances in large and pre-trained models reshape the field?
- What role do LLMs play in tabular tasks?
In this seminar, we will explore the evolving landscape of ML for tabular data, with a special focus on predictive tasks and the rise of foundation models. We will read and discuss recent research papers and critically examine approaches. As I am new to the department, I am especially excited to use this seminar to dive into an active research area and get to know many of you.
Requirements. Familiarity with basic machine learning concepts (e.g., supervised learning, training/validation/test splits, overfitting), standard ML models, and modern DL architectures. Motivation to read (state-of-the-art) research papers in machine learning.
Interested in a teaser? Check out this position paper on why we need more tabular foundation models: https://proceedings.mlr.press/v235/van-breugel24a.html
What will the seminar look like?
We will meet regularly throughout the semester. In the first few weeks, we will start with introductory lectures on ML for tabular data and on how to critically review and present research papers. After that, we will have several sessions with presentations, followed by discussions.
Other important information
Grading/Presentations: Grades will be based on your presentation, slides, active participation, and a short report. Further details will be discussed in the introductory sessions.