İddiaların teyit gerekliliğine göre önceliklendirilmesi

Kartal, Yavuz Selim

Please use this identifier to cite or link to this item: https://hdl.handle.net/20.500.11851/7842

Full metadata record

DC Field	Value	Language
dc.contributor.advisor	Kutlu, Mücahid	-
dc.contributor.author	Kartal, Yavuz Selim	-
dc.date.accessioned	2021-12-02T17:20:12Z	-
dc.date.available	2021-12-02T17:20:12Z	-
dc.date.issued	2021	-
dc.identifier.citation	Kartal, Yavuz Selim. (2021). İddiaların teyit gerekliliğine göre önceliklendirilmesi. (Yayınlanmamış Yüksek Lisans Tezi). TOBB Ekonomi ve Teknoloji Üniversitesi, Fen Bilimleri Enstitüsü, Bilgisayar Mühendisliği Ana Bilim Dalı	en_US
dc.identifier.uri	https://tez.yok.gov.tr/UlusalTezMerkezi/TezGoster?key=tqUiYt63sTQLTpozMJ92QuGmQ3xKNXz430zrPbFAITDSQdwbZ6AqXQWep2SdVXsW	-
dc.identifier.uri	https://hdl.handle.net/20.500.11851/7842	-
dc.description	YÖK Tez No: 691960	en_US
dc.description.abstract	Yanlış bilgiler, internette inanılmaz bir şekilde her gün yayılmaktadır ve toplumlar üzerindeki olumsuz etkileri tehlikeli seviyelere ulaşmıştır. Yanlış bilgilerin en önemli düşmanı doğruluk kontrolü yapanlardır. Ancak yanlış bilgilerin yayılma hızı göz önüne alındığında, doğruluk kontrolü yapmak yavaş olduğundan tüm iddiaların kontrol edilmesi mümkün olmamaktadır. Bu yüzden, iddiaları teyit gerekliliklerine göre önceliklendirerek doğruluk kontrolü yapanlara yardımcı olacak sistemlerin geliştirilmesi ve bu konuda farkındalık oluşturulması büyük önem taşımaktadır. Bu alandaki bir diğer problem ise, geliştirilecek sistemler için kullanılabilecek veri kaynaklarının çoğunlukla İngilizce olmak üzere sınırlı olmasıdır. Bu tez çalışmasında öncelikle Türkçe için ilk teyit gerektiren iddia veri kümesi olan TrClaim-19 hazırlanmıştır. TrClaim-19, 2287 tane etiketli tweet içermenin yanı sıra, teyit gerektirme özelliklerinin daha iyi anlaşılmasını sağlayacak olan teyit gerektirme gerekçelerini de sunmaktadır. Bu gerekçeler, iddiaların konularının ve muhtemel negatif etkilerinin teyit gerektirmeye sebep olan ana etkenler olduğunu öne sürmektedir. Tez çalışmasında ayrıca, iddiaları teyit gerekliliklerine göre önceliklendirmek için BERT modelinin ve çeşitli özniteliklerin kullanıldığı karma bir model de önerilmiştir. Kullanılan öznitelikler, yerel bölgeye özgü tartışmalı konular, kelime vektörleri, POS etiketleri ve daha fazlasını içermektedir. Buna ek olarak, teyit gerektiren verileri artırma, aktif öğrenme ve farklı dillerde verileri kullanma gibi veri kümesi boyutunu artırmanın farklı yolları üzerine çalışmalar yapılmıştır. Kapsamlı deneyler sonucunda, modelimizin, CLEF Check That! Lab 2018 ve 2019 test koleksiyonlarındaki en iyi modellerden daha başarılı olduğu gözlemlenmiştir. Modelimiz, eğitim verilerindeki teyit gerektiren örnekler artırıldığında, Check That! Lab 2020'nin test koleksiyonu için de şimdiye kadar bildirilen en iyi MAP puanını elde etmiştir. Çok dilli eğitimin ise Arapça ve Türkçe iddiaları önceliklendirmek için etkili olduğu, ancak bunun İngilizce için geçerli olmadığı gözlemlenmiştir.	en_US
dc.description.abstract	The massive amount of misinformation spreading on the Internet every day has enormous negative impacts on societies. To combat misinformation and its negative outcomes, fact-checking websites verify the veracity of claims. However, fact-checking is an extremely time-consuming process, and human fact-checkers are not able to verify all claims spread on the Internet. Therefore, we need systems to help fact-checkers combat misinformation and to raise public awareness of this important problem. Another problem is that the data resources available for developing effective systems are limited, and the vast majority of them are in English. In this thesis, we introduce TrClaim-19, the very first labeled dataset of Turkish check-worthy claims. TrClaim-19 consists of 2287 labeled Turkish tweets with annotator rationales, enabling us to better understand the characteristics of check-worthy claims. The rationales we collected suggest that claims' topics and their possible negative impacts are the main factors affecting their check-worthiness. We also propose a hybrid model that combines a BERT model with various features to prioritize claims based on their check-worthiness. The features we use include domain-specific controversial topics, word embeddings, POS tags, and others. In addition, we explore various ways of increasing the amount of labeled data to train the models effectively, such as increasing positive samples, active learning, and utilizing labeled data in other languages. In our extensive experiments, we show that our model outperforms all state-of-the-art models on the test collections of CLEF Check That! Lab 2018 and 2019. In addition, when positive samples are increased in the training set, our model achieves the best MAP score reported so far on the test collection of Check That! Lab 2020. Furthermore, we show that cross-lingual training is effective for prioritizing Arabic and Turkish claims, but not for English.	en_US
dc.language.iso	tr	en_US
dc.publisher	TOBB Ekonomi ve Teknoloji Üniversitesi	en_US
dc.rights	info:eu-repo/semantics/openAccess	en_US
dc.subject	Bilgisayar Mühendisliği Bilimleri-Bilgisayar ve Kontrol	en_US
dc.subject	Computer Engineering and Computer Science and Control	en_US
dc.title	İddiaların teyit gerekliliğine göre önceliklendirilmesi	en_US
dc.title.alternative	Prioritizing check-worthy claims	en_US
dc.type	Master Thesis	en_US
dc.department	Institutes, Graduate School of Engineering and Science, Computer Engineering Graduate Programs	en_US
dc.department	Enstitüler, Fen Bilimleri Enstitüsü, Bilgisayar Mühendisliği Ana Bilim Dalı	tr_TR
dc.identifier.startpage	1	en_US
dc.identifier.endpage	70	en_US
dc.institutionauthor	Kartal, Yavuz Selim	-
dc.relation.publicationcategory	Tez	en_US
item.fulltext	With Fulltext	-
item.openairecristype	http://purl.org/coar/resource_type/c_18cf	-
item.languageiso639-1	tr	-
item.cerifentitytype	Publications	-
item.openairetype	Master Thesis	-
item.grantfulltext	open	-
Appears in Collections:	Bilgisayar Mühendisliği Yüksek Lisans Tezleri / Computer Engineering Master Theses

Files in This Item:

File	Size	Format
691960.pdf	1.09 MB	Adobe PDF

Page view(s)

138

checked on Apr 22, 2024

Download(s)

42

checked on Apr 22, 2024
