ML Scientist

Connecting Scholars with the Latest Academic News and Career Paths

New Named Entity Corpus for Occupational Substance Exposure Assessment Now Available

A new annotated corpus for occupational substance exposure assessment has been released, as announced on the Corpora-List mailing list. The corpus contains selected sections of scientific research articles on exposure to diesel exhaust and respirable crystalline silica. Domain experts annotated these sections with six categories of named entities relevant to the assessment of job exposure matrices. The corpus and its annotation guidelines can be downloaded from the provided link. Named entity recognition (NER) models trained on the corpus, along with the associated code, are available on GitHub.

The development of the corpus and NER models is described in detail in the following article: Thompson, P., Ananiadou, S., Basinas, I., Brinchmann, B. C., Cramer, C., Galea, K. S., Ge, C., Georgiadis, P., Kirkeleit, J., Kuijpers, E., Nguyen, N., Nuñez, R., Schlünssen, V., Stokholm, Z. A., Taher, E. A., Tinnerberg, H., Van Tongeren, M. and Xie, Q. (2024). Supporting the working life exposome: annotating occupational exposure for enhanced literature search.
