Tilburg Algorithm Observatory, Assistant Professor at the Department of Cognitive Science & AI, Tilburg University

🔗 Quick Links

🌍 Home Page

📚 Publications

🦣 Social

🦋 Social

Data Processing — Course Page

Language & AI — Course Page

Thesis Page

👋 Hi

I’m an Assistant Professor at the Department of Cognitive Science and Artificial Intelligence of Tilburg University.

I work on algorithmic monitoring and auditing as part of the Tilburg Algorithm Observatory, and am interested in the (harmful) effects of intelligent systems on our lives; systems that uncover our personal information, monitor and change our behavior, subtly restrict our exposure to information, and treat us unfairly.

I defended my dissertation “User-centered Security in Natural Language Processing” in January 2023, supervised by Grzegorz Chrupała, Eric Postma, and Walter Daelemans.

I'm a member of the school council (previously List DCA.I., currently List TSHD) and chair the Data Science and Society program committee.

⚗️ Research

I have a multidisciplinary background in humanities and computer science. My primary area of expertise is algorithm monitoring and auditing; i.e., identifying, recording, and evaluating (harmful) inferences made through Machine Learning (ML), for example by Large Language Models. During my PhD, I mainly worked on adversarial attacks on Deep Learning algorithms trained on language data (Natural Language Processing or NLP), with a focus on privacy and security. My work critically analyzes the current and longer-term impact such algorithms have on society. I'm a strong advocate of a user-centered, open-source approach to ML and to the automation of society in general.

Within NLP, I have worked on various topics such as (adversarial) stylometry (or author profiling), cyberbullying/toxicity detection, bias, data augmentation, language generation, machine translation, and, more generally, the development of reproducible research pipelines. Here are a few selected papers to give you an idea:

SOBR: A Corpus for Stylometry, Obfuscation, and Bias on Reddit

Adversarial Stylometry in the Wild: Transferable Lexical Substitution Attacks on Author Profiling

Current limitations in cyberbullying detection: On evaluation criteria, reproducibility, and data scarcity

Towards Replication in Computational Cognitive Modeling: a Machine Learning Perspective

🏫 Teaching

I’m currently the course coordinator for both Programming for Data Science and Reproducibility & Model deployment. Previously, I taught Data Mining (five years), Data Processing (four years), Text Mining, and Spatiotemporal Data Analysis (both one semester), all in the context of our Data Science master, and Language & AI (four years) for our joint Data Science bachelor with TU/e (JADS).

I focus on innovating the courses I am involved in, primarily by connecting theory to practical use cases. I believe this makes the lectures more fun and makes the utility of the material easier to grasp. It also provides a soft introduction to applications students might see in their future careers. A recent example is my EDUiLAB project to familiarize Data Processing students with code versioning, repositories, and build servers using GitHub.

Here are the associated course pages (all on Notion):
