First Workshop on Data Contamination (CONDA) @ ACL 2024: New Submission Deadlines
We invite you to participate in and submit your work to the First Workshop on Data Contamination (CONDA), co-located with ACL 2024 in Bangkok, Thailand.
Data contamination, where evaluation data is inadvertently included in pre-training corpora, has become a growing concern. As both models and datasets have scaled up, segments of evaluation benchmarks have ended up in the pre-training data of language models (LMs). The scale of internet data makes this contamination difficult to prevent, or even to detect after the fact. Crucially, when evaluation data becomes part of pre-training data, it introduces biases and can artificially inflate the performance of LMs on specific tasks or benchmarks. This poses a challenge for fair and unbiased evaluation, as a model's measured performance may not accurately reflect its generalization capabilities.
The deadlines for paper submission and ARR commitment have been extended. Please find the updated deadlines below:
- Paper submission deadline: May 31, 2024
- ARR pre-reviewed commitment deadline: June 14, 2024
- Notification of acceptance: June 17, 2024
- Camera-ready paper due: July 1, 2024
- Workshop date: August 16, 2024
We welcome paper submissions on all topics related to data contamination, including but not limited to:
- Definitions, taxonomies, and gradings of contamination
- Contamination detection (both manual and automatic)
- Community efforts to discover, report, and organize contamination events
- Documentation frameworks for datasets or models
- Methods to avoid data contamination
- Methods to forget contaminated data
- Scaling laws and contamination
- Memorization and contamination
- Policies to avoid the impact of contamination in publication venues and open-source communities
- Reproducing and attributing results from previous work to data contamination
- Survey work on data contamination research
- Data contamination in other modalities
For more information, please visit the CONDA workshop website.