ML Scientist

Connecting Scholars with the Latest Academic News and Career Paths


LLMs4Subjects: A Novel Shared Task for Developing Bilingual Language Modeling in Understanding Technical Documents

Join the first SemEval 2025 shared task, LLMs4Subjects, and develop cutting-edge large language model-based semantic solutions for subject tagging of technical documents in both German and English.

We are excited to announce the first call for participation in LLMs4Subjects, a shared task organized as part of SemEval 2025. The task invites the research community to develop large language model (LLM) based semantic solutions for subject tagging of the open-access collection of the TIB Leibniz Information Centre for Science and Technology University Library. It is an opportunity to demonstrate bilingual language modeling on technical documents in both German and English. Successful solutions may be integrated directly into the TIB’s operational workflows.

Participants will receive a human-readable form of a subject taxonomy, the GND (Gemeinsame Normdatei), and a large collection of technical records tagged with GND subjects from TIBKAT, the TIB’s open-access collection. The shared task defines three sub-tasks: Task 1 – Learn the GND, Task 2 – Align subject tagging to the TIBKAT collection, and (Optional) Task 3 – Develop Elegant Frontend Interfaces for Subject Tagging.

The shared task will have three separate evaluations: Evaluation 1 – Quantitative Metrics-based Evaluations, Evaluation 2 – Qualitative Evaluations by Human Subject Specialists, and (Optional) Evaluation 3 – Human-Computer Interaction (HCI) evaluations of the submitted subject indexing interfaces.
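
To give a feel for what a quantitative, metrics-based evaluation of subject tagging could look like, here is a minimal Python sketch. The announcement does not say which metrics the organizers will use; precision@k and recall@k are an assumption, chosen because they are common for ranked multi-label tagging. The function name `precision_recall_at_k` and the subject identifiers are hypothetical and are not real GND codes.

```python
def precision_recall_at_k(predicted, gold, k):
    """Compute precision@k and recall@k for a single record.

    predicted: ranked list of subject identifiers, best first
    gold: subject identifiers assigned by human indexers
    k: number of top-ranked predictions to score
    """
    if k <= 0:
        raise ValueError("k must be positive")
    top_k = predicted[:k]
    gold_set = set(gold)
    # Count how many of the top-k predictions are gold subjects.
    hits = sum(1 for subject in top_k if subject in gold_set)
    precision = hits / k
    # Recall is undefined for a record with no gold subjects; report 0.0 here.
    recall = hits / len(gold_set) if gold_set else 0.0
    return precision, recall


# Hypothetical identifiers for illustration only (assumed, not real GND codes).
predicted = ["gnd:A", "gnd:B", "gnd:C", "gnd:D"]
gold = ["gnd:B", "gnd:D", "gnd:E"]
p, r = precision_recall_at_k(predicted, gold, k=2)
```

In this example only one of the top two predictions is a gold subject, so the score depends on both the ranking and the cutoff. A system-level score would typically average these values over all records in a test set.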

Tags: LLMs4Subjects, SemEval 2025, shared task, large language models, bilingual language modeling, subject tagging, GND, TIBKAT, Technical Library, Leibniz University
