Mu-SHROOM: A Multilingual Shared Task on Hallucinations in Language Models
Join the Mu-SHROOM shared task on hallucination detection in language models in a multilingual context. Develop systems to identify and mitigate hallucinations and submit your results by 31.01.2025. Stay informed by joining our Google group or Slack workspace.
We are excited to announce the Mu-SHROOM shared task on hallucination detection in a multilingual context. This shared task aims to advance the state of the art in detecting hallucinated content in the outputs of instruction-tuned large language models (LLMs) in 10 different languages. Participants are invited to develop systems that can accurately identify and mitigate hallucinations in generated content. The task will be held in collaboration with the SemEval 2025 workshop.
Join our Google group mailing list or Slack workspace to stay informed about the latest updates and developments. We welcome participants from all backgrounds and look forward to your contributions to this exciting research area.
Register your team and submit your results on our platform before the evaluation phase ends on 31.01.2025. System description papers should be submitted by 28.02.2025 (TBC). Follow us on Twitter for the latest news and updates.
Key Dates:
- Dev set available by: 02.09.2024
- Test set available by: 01.01.2025
- Evaluation phase ends: 31.01.2025
- System description papers due: 28.02.2025 (TBC)
Evaluation Metrics:
Participants will be ranked on two character-level metrics:
- Intersection-over-union (IoU): the overlap between the characters marked as hallucinations in the gold reference and the characters predicted as hallucinations.
- Probability correlation: how well the probability that a participant's system assigns to a character being part of a hallucination correlates with the empirical probabilities observed in our annotations.
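To make the two metrics concrete, here is a minimal Python sketch of how they could be computed for a single output. It is an illustration, not the official scorer. The function names, the handling of empty or constant inputs, and the choice of Spearman rank correlation are all assumptions, since this announcement does not name the correlation coefficient or specify edge cases.

```python
from math import sqrt


def char_iou(gold, pred):
    """Intersection-over-union of two collections of character indices."""
    gold, pred = set(gold), set(pred)
    if not gold and not pred:
        # Assumption: no gold and no predicted hallucination counts as a perfect match.
        return 1.0
    return len(gold & pred) / len(gold | pred)


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        # Assumption: correlation is undefined for constant input; report 0.0.
        return 0.0
    return cov / sqrt(var_x * var_y)


def prob_correlation(gold_probs, pred_probs):
    """Spearman correlation (assumed) between per-character probabilities."""
    return pearson(average_ranks(gold_probs), average_ranks(pred_probs))


# Hypothetical example: gold marks characters 2-5, the system predicts 4-7.
iou = char_iou(range(2, 6), range(4, 8))
# Hypothetical per-character probabilities for a four-character output.
rho = prob_correlation([0.0, 0.2, 0.8, 1.0], [0.1, 0.3, 0.6, 0.9])
```

In this example the two spans share 2 of 6 distinct characters, so `iou` is 2/6, and the predicted probabilities are ordered exactly like the gold ones, so `rho` is 1.0. A rank-based coefficient rewards getting the ordering of characters right rather than matching the annotated probabilities exactly. A different coefficient would behave differently.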
How to Participate:
- Register: Please register your team on our platform before making a submission.
- Submit results: Use our platform to submit your results before 31.01.2025.
- Submit your system description: System description papers should be submitted by 28.02.2025 (TBC; further details will be announced at a later date).
Tags: Mu-SHROOM, shared task, hallucination detection, language models, multilingual, SemEval 2025, LLMs, intersection-over-union, probability correlation