ML Scientist

Connecting Scholars with the Latest Academic News and Career Paths

Universal Dependencies v2.17: Multilingual Treebank Collection Released

Universal Dependencies releases v2.17 with 339 treebanks across 186 languages, enhancing multilingual parser development and research.

The Universal Dependencies project has published its 23rd release, v2.17, which contains 339 treebanks covering 186 languages. The release is available at https://universaldependencies.org/.

The project aims to facilitate multilingual parser development and parsing research through cross-linguistically consistent treebank annotation.

  • The release comprises 2,313,529 sentences, 37,133,071 surface tokens, and 37,877,213 syntactic words.
  • The release includes significant updates to 44 treebanks, with new additions such as Central Kurdish Mukri and Old Occitan CorAG.
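
The gap between surface tokens and syntactic words in the figures above comes from multiword tokens: in the CoNLL-U format used by UD treebanks, one surface token (for example a contraction) can be split into several syntactic words. The following is a minimal sketch of how such counts could be derived from CoNLL-U text. The function name `count_conllu` and the sample sentence are illustrative assumptions, and the counting convention shown here is not confirmed to be the one used for the official release statistics.

```python
def count_conllu(text):
    """Count sentences, surface tokens, and syntactic words in CoNLL-U text.

    Assumed convention (illustrative, not the official UD statistics script):
      - a multiword-token line (ID like "1-2") is one surface token;
      - an integer ID is one syntactic word, and it is also a surface token
        unless it is covered by a preceding multiword-token range;
      - empty nodes (ID like "8.1") and comment lines are ignored.
    """
    sentences = tokens = words = 0
    covered_until = 0      # last word ID covered by the current multiword token
    in_sentence = False

    for line in text.splitlines():
        if not line.strip():
            # A blank line closes the current sentence.
            if in_sentence:
                sentences += 1
            in_sentence = False
            covered_until = 0
            continue
        if line.startswith("#"):
            continue  # sentence-level comment such as "# text = ..."

        token_id = line.split("\t", 1)[0]
        in_sentence = True

        if "-" in token_id:
            # Multiword token: one surface token spanning several words.
            _, end = token_id.split("-")
            tokens += 1
            covered_until = int(end)
        elif "." in token_id:
            continue  # empty node from enhanced dependencies
        else:
            words += 1
            if int(token_id) > covered_until:
                tokens += 1

    if in_sentence:
        sentences += 1  # file did not end with a blank line
    return sentences, tokens, words


# Hypothetical sample: "vámonos al mar", with only ID and FORM filled in.
SAMPLE = "\n".join([
    "# text = vámonos al mar",
    "1-2\tvámonos\t_\t_\t_\t_\t_\t_\t_\t_",
    "1\tvamos\t_\t_\t_\t_\t_\t_\t_\t_",
    "2\tnos\t_\t_\t_\t_\t_\t_\t_\t_",
    "3-4\tal\t_\t_\t_\t_\t_\t_\t_\t_",
    "3\ta\t_\t_\t_\t_\t_\t_\t_\t_",
    "4\tel\t_\t_\t_\t_\t_\t_\t_\t_",
    "5\tmar\t_\t_\t_\t_\t_\t_\t_\t_",
    "",
])

print(count_conllu(SAMPLE))  # (1, 3, 5)
```

In the sample, three surface tokens ("vámonos", "al", "mar") expand to five syntactic words, which is why a treebank's word count can exceed its token count.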

Tags: Universal Dependencies, multilingual treebank, NLP, natural language processing, computational linguistics