We welcome Master’s students who are looking for thesis opportunities in natural language processing and related fields to work with us. If you are enthusiastic about solving real-world challenges with NLP, particularly in the areas of multilingual and multimodal systems, or exploring societal impacts such as mental healthcare applications, consider joining our team.
What We Offer:
- Supervision and guidance from leading experts in NLP and AI.
- Access to state-of-the-art research tools, datasets, and computational resources.
- Opportunities to publish in high-impact venues and present at conferences.
Eligibility Criteria:
- Must be enrolled in a Master’s program at TU Darmstadt.
- Strong academic background in computer science, machine learning, natural language processing, or a related field.
- Experience or coursework in natural language processing is preferred.
How to Apply:
- A brief description of your research interests and how they align with our work.
- Your CV and academic transcript.
Please send your application, including the materials listed above, to shaoxiong.ji@tu-darmstadt.de.
Previous MSc Thesis Topics
- Henna Roinisto (MSc, University of Helsinki, jointly with Metsä Group)
  Integrating Open-Source Retrieval-Augmented Generation with Large Language Models for Business, Market and Responsibility Insights, 2024.
- Ya Gao (MSc, Aalto University, now PhD candidate at Aalto University)
  Joint entity and relation extraction via contrastive learning on knowledge-augmented graph embeddings, 2023.
- Tuulia Denti (MSc, Aalto University, jointly with HUS, now Data Analyst at HUS)
  Natural Language Processing with Topic Models for Clinical Texts of Prostate Cancer Patients, 2022.
- Wei Sun (MSc, Aalto University, jointly with HUS, now PhD candidate at KU Leuven, Belgium)
  Extracting Medical Entities from Radiology Reports with Ontology-based Distant Supervision, 2022.
Previous MSc Research Projects
- An empirical study of language modeling and translation as multilingual pretraining objectives (2023-24 at University of Helsinki)
- Deep learning for medical code assignment from clinical notes (2020-2022 at Aalto University)
- Deep model fusion in federated learning (2020-2021)
- Conversational/multimodal sentiment analysis (2020-2021 at Aalto University)
- NLP for mental health (e.g., depression detection and suicidal ideation detection) (2021-2023 at Aalto University)
- Adverse drug event detection and extraction (2021-2022 at Aalto University)
- Multilingual complex named entity recognition at SemEval shared tasks (2021-2022 at Aalto University)
- Risk adjustment for healthcare plan payment (2019-2020 at Aalto University)
Previous BSc Thesis Topics
- Risk adjustment for health plan payment (2019 Winter at Aalto University)
- Deep learning for cyberbullying detection (2020 Summer at Aalto University)
- Pretrained language models for diagnosis code prediction (2020 Summer at Aalto University)
- Federated learning (2020 Fall at Aalto University)
- Depression detection from social content (2021 Spring at Aalto University)
- Biomedical text classification (2022 Spring at Aalto University)
Published Project Reports
We are proud of our students’ exceptional research, which has played a key role in advancing language technology for societal impact. Through their dedication, they have co-authored papers in top scientific venues, developed novel NLP models, and contributed to impactful, real-world projects. Their work reflects the collaborative environment we foster. Below is a list of project reports that were published after revisions in scientific venues.
- Zihao Li, Shaoxiong Ji, Timothee Mickus, Vincent Segonne, and Jörg Tiedemann. A Comparison of Language Modeling and Translation as Multilingual Pretraining Objectives. In Proceedings of EMNLP, 2024.
- Ya Gao, Shaoxiong Ji, Tongxuan Zhang, Prayag Tiwari, and Pekka Marttinen. Contextualized Graph Embeddings for Adverse Drug Event Detection. ECML-PKDD, 2022.
- Aapo Pietiläinen and Shaoxiong Ji. AaltoNLP at SemEval-2022 Task 11: Ensembling Task-adaptive Pretrained Transformers for Multilingual Complex NER. Proceedings of the International Workshop on Semantic Evaluation (SemEval), 2022.
- Luna Ansari, Shaoxiong Ji, Qian Chen, and Erik Cambria. Ensemble Hybrid Learning Methods for Automated Depression Detection. IEEE Transactions on Computational Social Science, 2022.
- Wei Sun, Shaoxiong Ji, Erik Cambria, and Pekka Marttinen. Multitask Recalibrated Aggregation Network for Medical Code Prediction. ECML-PKDD, 2021.