Please use this identifier to cite or link to this item:
https://hdl.handle.net/20.500.11851/10978
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Zengin, Muhammed Said | - |
dc.contributor.author | Yenisey, Berk Utku | - |
dc.contributor.author | Kutlu, Mucahid | - |
dc.date.accessioned | 2024-01-21T09:24:24Z | - |
dc.date.available | 2024-01-21T09:24:24Z | - |
dc.date.issued | 2023 | - |
dc.identifier.issn | 1300-0632 | - |
dc.identifier.issn | 1303-6203 | - |
dc.identifier.uri | https://doi.org/10.55730/1300-0632.4043 | - |
dc.identifier.uri | https://hdl.handle.net/20.500.11851/10978 | - |
dc.description.abstract | Stance detection has garnered considerable attention from researchers due to its broad range of applications, including fact-checking and social computing. While state-of-the-art stance detection models are usually based on supervised machine learning methods, their effectiveness is heavily reliant on the quality of training data. This problem is more prevalent in stance detection task because the stance of a text is intimately tied to the target under consideration. While numerous datasets exist for stance detection, determining their suitability for a specific target can be challenging. In this work, we focus on Turkish stance detection and explore the impact of training data on the model performance. In particular, we fine-tune BERT model with various datasets and assess their performance when the test data is the same/different compared to the training data in terms of target and domain. In addition, given the scarcity of resources for Turkish stance detection, we investigate i) whether we can use existing datasets in other languages in a cross-lingual setup, and ii) the effectiveness of data augmentation with simple automatic labeling methods. In order to conduct our experiments, we also create new Turkish stance detection datasets for various targets in different domains. In our comprehensive experiments, our findings are as follows. 1) Using training data with multiple targets in the same domain yields high performance as the model is able to learn more characteristics of expressing stance with additional data. 2) The domain of the training data plays a crucial role in achieving high performance. 3) Automatically generated data enhances performance when combined with manually annotated data. 4) Training solely on Turkish data outperforms training with the combination of Turkish and English data. Overall, our study points out the importance of creating Turkish annotated datasets for different domains to achieve high performance in stance detection. | en_US |
dc.description.sponsorship | Scientific and Technological Research Council of Tuerkiye (TUBITAK) [ARDEB 3501, 120E514] | en_US |
dc.description.sponsorship | This study was funded by the Scientific and Technological Research Council of Tuerkiye (TUBITAK) ARDEB 3501 Grant No 120E514. The statements made herein are solely the responsibility of the authors. | en_US |
dc.language.iso | en | en_US |
dc.publisher | Tubitak Scientific & Technological Research Council Turkey | en_US |
dc.relation.ispartof | Turkish Journal of Electrical Engineering and Computer Sciences | en_US |
dc.rights | info:eu-repo/semantics/closedAccess | en_US |
dc.subject | Stance detection | en_US |
dc.subject | natural language processing | en_US |
dc.subject | Turkish | en_US |
dc.title | Exploring the Impact of Training Datasets on Turkish Stance Detection | en_US |
dc.type | Article | en_US |
dc.department | TOBB ETÜ | en_US |
dc.identifier.volume | 31 | en_US |
dc.identifier.issue | 7 | en_US |
dc.identifier.startpage | 1206 | en_US |
dc.identifier.endpage | 1222 | en_US |
dc.identifier.wos | WOS:001115009000006 | en_US |
dc.identifier.scopus | 2-s2.0-85180009303 | en_US |
dc.institutionauthor | … | - |
dc.identifier.doi | 10.55730/1300-0632.4043 | - |
dc.authorscopusid | 57226399864 | - |
dc.authorscopusid | 58769223300 | - |
dc.authorscopusid | 35299304300 | - |
dc.relation.publicationcategory | Article - International Refereed Journal - Institutional Faculty Member | en_US |
dc.identifier.trdizinid | 1220953 | en_US |
item.openairetype | Article | - |
item.languageiso639-1 | en | - |
item.grantfulltext | none | - |
item.fulltext | No Fulltext | - |
item.openairecristype | http://purl.org/coar/resource_type/c_18cf | - |
item.cerifentitytype | Publications | - |
crisitem.author.dept | 02.3. Department of Computer Engineering | - |
Appears in Collections: | Scopus İndeksli Yayınlar Koleksiyonu / Scopus Indexed Publications Collection; WoS İndeksli Yayınlar Koleksiyonu / WoS Indexed Publications Collection |
Items in GCRIS Repository are protected by copyright, with all rights reserved, unless otherwise indicated.