🇺🇸 Dishcovery Mission II Challenge
Location: United States
Develop a Vision Language Model for food image-text matching
The Dishcovery Mission II Challenge is part of the 3rd MetaFood Workshop at CVPR 2026. Participants are asked to develop a Vision Language Model that accurately understands food images and matches them to the correct textual descriptions.
- ~400,000 food image-caption pairs
- Realistic multi-modal noise and fine-grained dish ambiguity
- Focus on efficient and scalable VLM architectures
- Global leaderboard visibility
Top solutions will be invited to present at the MetaFood Workshop. Visit the challenge website and submit your solution through the submission portal.
Tags: CVPR 2026, MetaFood Workshop, Dishcovery Mission II Challenge, Vision Language Model, Multimodal AI, Food Computing, United States