Tilburg Algorithm Observatory, Assistant Professor at the Department of Intelligent Systems, Tilburg University.

🔗 Quick Links

🌍 Home Page

📚 Publications

🦣 Social

🦋 Social

Data Processing — Course Page

Language & AI — Course Page ‘24-’25

Reproducibility & Model Deployment — Course Page

Thesis Page

👋 Hi

I’m an Assistant Professor at the Department of Intelligent Systems, part of the Research Center for Cognitive Science & Artificial Intelligence of Tilburg University.

Tilburg Algorithm Observatory | Tilburg University

I work on algorithmic monitoring and auditing as part of the Tilburg Algorithm Observatory, and am interested in the (harmful) effects of intelligent systems on our lives: systems that uncover our personal information, monitor and change our behavior, subtly restrict our exposure to information, and treat us unfairly.

I defended my dissertation “User-centered Security in Natural Language Processing” in January 2023, supervised by Grzegorz Chrupała, Eric Postma, and Walter Daelemans.

I'm a member of the school council (previously List DCA.I., currently List TSHD) and the Data Science and Society program committee.

⚗️ Research

I have a multidisciplinary background in humanities and computer science. My primary area of expertise is algorithm monitoring and auditing; i.e., identifying, recording, and evaluating (harmful) inferences made through Machine Learning (ML, such as Large Language Models). During my PhD, I mainly worked on adversarial attacks on Deep Learning algorithms trained on language data (Natural Language Processing or NLP), with a focus on privacy and security. My work critically analyzes the current and more distant impact such algorithms have on society. I'm a strong advocate of a user-centered, open-source approach to ML and to the automation of society in general.

Within NLP, I have worked on various topics such as (adversarial) stylometry (or author profiling), cyberbullying/toxicity detection, bias, data augmentation, language generation, machine translation, and, more generally, the scientific development of reproducible research pipelines. Here are a few selected papers to give you an idea:

The Role of Search Engines in the Amplification and Suppression of...

SOBR: A Corpus for Stylometry, Obfuscation, and Bias on Reddit

Adversarial Stylometry in the Wild: Transferable Lexical Substitution Attacks on Author Profiling

Current limitations in cyberbullying detection: On evaluation criteria, reproducibility, and data scarcity

🏫 Teaching

I’m currently the course coordinator and designer for both Programming for Data Science (1sem) and Reproducibility & Model Deployment (1sem). Previously, I taught Data Mining (5y), Data Processing (4y), Text Mining (1sem), and Spatiotemporal Data Analysis (1sem) in the context of our Data Science master. I also coordinated and designed Language & AI (4y) for our joint Data Science bachelor with TU/e (JADS), which was given two excellent course evaluation certificates (2023-2025).

I focus on innovating the courses I am involved in, primarily by connecting theory to practice through problem-based learning. I believe this makes the lectures more fun and makes the utility of the material easier to grasp. I also actively promote the use of open-source and open-science practices to shape students’ future careers. An example is my EDUiLAB project to familiarize students with code versioning, repositories, and build servers using GitHub, which laid the foundation for Reproducibility & Model Deployment.

Here are the associated course pages (all on Notion):