FIRE 2024 Task: CoLI-Dravidian – Word-level Code-Mixed Language Identification in Dravidian Languages
The 16th meeting of the Forum for Information Retrieval Evaluation (FIRE 2024) is hosting a shared task, CoLI-Dravidian, on word-level code-mixed language identification in Dravidian languages. The task addresses the challenges of language identification in code-mixed text in Dravidian languages, which are widely spoken in southern India but remain under-resourced. It encourages the development of advanced language identification models for four languages: Kannada, Tamil, Malayalam, and Tulu.
Participants may make a maximum of 10 submissions in the training phase and 5 submissions in the testing phase. The evaluation will be conducted through CodaLab, and the best submission will be selected for ranking. The training and testing data are available for download from the task website.
Important dates for the shared task are as follows:
- 14th June 2024 – Track websites open and training data released
- 25th July – Run submission deadline
- 27th July – Results declared
- 27th August – Working notes due
- 10th September – Reviews
- 30th October – Camera-ready copies of working notes
The organizing committee includes researchers from various institutions, such as Mangalore University, Portland State University, CIC, IPN, and Tecnológico de Monterrey. For more information, visit the task website at https://sites.google.com/view/coli-dravidian-2024/datasets?authuser=0.