Our Mission

The research group focuses on advancing human language technologies and fostering societal impact through outreach efforts. Our work bridges technical sophistication with real-world applications, exploring fundamental research questions in natural language processing (NLP) while building large language models.

We are dedicated to creating multilingual and multimodal NLP systems that extend beyond the boundaries of individual languages and modalities. Our research contributes to societal domains such as mental healthcare and diagnostic support, where advanced NLP methods can make a difference. In a broader sense, our outreach aims to extend help, knowledge, and resources to communities, fostering inclusion, education, and positive social impact.

Group Members

Shaoxiong Ji

Group Leader

  • Independent research group leader at TU Darmstadt
  • PhD in computer science from Aalto University, Finland, 2023
  • Natural language processing, health informatics, affective computing, and large language models

Doan Nam Long Vu

Doctoral Researcher (1.1.2025-)

  • MSc and BSc at TU Darmstadt
  • Natural language processing, machine translation, and differential privacy

Zihao Li

MSc student at University of Helsinki

  • Research assistant 1.7.2024 - 30.9.2024
  • Now working on MSc thesis
  • Multilingual NLP, large language models, and machine translation

Ongoing Projects

Project A: DYNAMIC for Mental Healthcare

The DYNAMIC project (Dynamic Network Approach of Mental Health to Stimulate Innovations for Change) aims to leverage advanced technologies, particularly natural language processing (NLP) and knowledge discovery, to foster innovation in mental healthcare. The initiative focuses on several key research topics, including the application of large language models (LLMs) for clinical purposes and the analysis of multimodal clinical data.

Join our Discord server for further information and collaboration as we strive to revolutionize mental healthcare with cutting-edge AI technology.

Project B: MaLA for Massive Language Adaptation of LLMs

MaLA focuses on adapting large language models (LLMs) to better understand and generate human language across diverse contexts and applications in massively multilingual scenarios. It aims to enhance the capabilities of existing LLMs by integrating massive datasets and employing advanced techniques to improve their performance on various linguistic tasks. The initiative recognizes the importance of adapting these models to different languages and dialects so that they can effectively serve a broader audience.

If you are interested in joining the conversation or learning more about the project, you can connect with the community on Discord: MaLA-LM Discord.

Alumni

University of Helsinki

Jaakko Paavola

  • Research assistant at University of Helsinki, 2024
  • Now Quantitative Analyst at OP Financial Group

Henna Roinisto

  • MSc thesis, University of Helsinki, jointly with Metsä Group
  • Thesis: Integrating Open-Source Retrieval-Augmented Generation with Large Language Models for Business, Market and Responsibility Insights, 2024.

Aalto University

Ya Gao

  • MSc thesis, Aalto University
  • Now PhD candidate at Aalto University
  • Thesis: Joint entity and relation extraction via contrastive learning on knowledge-augmented graph embeddings, 2023.

Tuulia Denti

  • MSc thesis, Aalto University, jointly with HUS
  • Now Data Analyst at HUS
  • Thesis: Natural Language Processing with Topic Models for Clinical Texts of Prostate Cancer Patients, 2022.

Wei Sun

  • MSc thesis, Aalto University, jointly with HUS
  • Now PhD candidate at KU Leuven, Belgium
  • Thesis: Extracting Medical Entities from Radiology Reports with Ontology-based Distant Supervision, 2022.

Join Us