Please use this identifier to cite or link to this item:
https://hdl.handle.net/20.500.11851/8378
Title: | TOBB ETU at CheckThat! 2021: Data engineering for detecting check-worthy claims |
Authors: | Zengin, M.S.; Kartal, Y.S.; Kutlu, M. |
Keywords: | Check worthiness; Data engineering; Fact checking; Computer aided language translation; Cross-lingual; Data augmentation; Machine translations; Transformer models; Turkish; Under-sampling; Learning to rank |
Publisher: | CEUR-WS |
Abstract: | In this paper, we present our participation in CLEF 2021 CheckThat! Lab's Task 1 on check-worthiness estimation in tweets. We explore how to fine-tune transformer models effectively by changing the train set. The methods we explore include language-specific training, weak supervision, data augmentation by machine translation, undersampling, and cross-lingual training. As our primary model submitted for official results, we fine-tune language-specific BERT-based models using cleaned tweets for each language. Our models ranked 1st in Spanish and Turkish datasets. However, our rank in Arabic, Bulgarian, and English datasets is 6th, 4th, and 10th, respectively. © 2021 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). |
Description: | 2021 Working Notes of CLEF - Conference and Labs of the Evaluation Forum, CLEF-WN 2021 -- 21 September 2021 through 24 September 2021 -- 171327 |
URI: | https://hdl.handle.net/20.500.11851/8378 |
ISSN: | 1613-0073 |
Appears in Collections: | Bilgisayar Mühendisliği Bölümü / Department of Computer Engineering; Scopus İndeksli Yayınlar Koleksiyonu / Scopus Indexed Publications Collection |