Dinamik Sosyal Ağlarda Akan ve Çok Boyutlu Veri Üzerinden Analiz ve Tahmin Yapılması

Sert, Onur Can

Please use this identifier to cite or link to this item: https://hdl.handle.net/20.500.11851/7840

Full metadata record

DC Field	Value	Language
dc.contributor.advisor	Özyer, Tansel	-
dc.contributor.author	Sert, Onur Can	-
dc.date.accessioned	2021-12-02T17:20:11Z	-
dc.date.available	2021-12-02T17:20:11Z	-
dc.date.issued	2020	-
dc.identifier.citation	Sert, Onur Can. (2020). Dinamik sosyal ağlarda akan ve çok boyutlu veri üzerinden analiz ve tahmin yapılması. (Yayınlanmamış Doktora Tezi). TOBB Ekonomi ve Teknoloji Üniversitesi, Fen Bilimleri Enstitüsü, Bilgisayar Mühendisliği Ana Bilim Dalı	en_US
dc.identifier.uri	https://tez.yok.gov.tr/UlusalTezMerkezi/TezGoster?key=Eb5EkakJlp3olBdo_wNEGc58RlAVgbZwpAVVea0RA6TeOcRFB7tDBza_6Ly26jNp	-
dc.identifier.uri	https://hdl.handle.net/20.500.11851/7840	-
dc.description	YÖK Tez No: 629189	en_US
dc.description.abstract	Makine öğrenmesi teknikleri ve bu tekniklerin uygulanabilir olduğu alanlar, veri miktarının artması ve veriye ulaşımın kolaylaşması ile birlikte oldukça ön plana çıkmıştır. Veri kümeleri üzerinde bu yöntemler kullanılarak farklı alanlara yönelik tahmin modellerinin geliştirilmesi mümkündür. Bunun yanında doğal dil işleme yöntemleri, metin verisinin analiz edilmesi ve anlamlandırılması noktasında birçok farklı yöntemi içerisinde bulundurmaktadır. Yapılan çalışmada, doğal dil işleme yöntemleri kullanılarak, haber ve sosyal medya verisi analiz edilmiştir ve analiz sonuçlarından öznitelik kümeleri oluşturulmuştur. Oluşturulan öznitelik kümeleri ile sayısı fazla olan seyrek öznitelik kümeleri için ölçeklenebilir bir eğitim ve tahmin sistemi ortaya konmuştur. Sistemin geliştirilmesi için, 1 yıllık zaman aralığı içerisinde New York Times web sayfasından 12.560 adet makale ve 4 aylık zaman aralığı içerisinde Twitter isimli sosyal medya platformundan 2.854.333 adet paylaşım toplanmıştır. Toplanan veri üzerinden varlık isimleri tanımlanmış, düşünce analizi yapılmış ve konu modelleri oluşturulmuştur. Geliştirilen sistemin bir başka çıktısı olarak, analizi yapılan metin verileri üzerinden sosyal ağların oluşturulması sağlanmıştır ve üretilen sosyal ağların farklı zaman aralıklarındaki değişimleri gözlemlenmiştir. Elde edilen analiz sonuçları ve sosyal ağlar doğrultusunda öznitelik kümeleri oluşturulmuş ve bu öznitelik kümeleri ile elastik ağ regresyonu temelli bir eğitim yöntemi geliştirilmiştir. Önerilen bu sistem ile birçok farklı veri kümesinin analiz edilebileceği ve bu analizler doğrultusunda farklı değerleri tahmin etmeye yönelik tahmin modellerinin geliştirilebileceği görülmüştür. Bunun bir örneğini ortaya koymak adına Dow Jones endeksinin yönünün tahmini bir vaka olarak seçilmiştir. Önerilen eğitim yöntemi ile farklı modeller eğitilmiş ve eğitilen bu modeller ile Dow Jones endeksinin hareket yönünün tahmin edilmesine yönelik deneyler yapılmıştır. Bu deneyler sonucunda, önerilen eğitim yönteminin, umut vaat edici sonuçlar veren tahmin modelleri ortaya koyduğu gözlemlenmiştir. Farklı deney gruplarının sonucunda, yüksek oranda tutarlı (%70,90 değerine varan) sonuçlar elde edilmiştir. Elde edilen tahmin sonuçlarının aynı zamanda gerçek Dow Jones endeks değerleri ile pozitif bir korelasyon (0,2315 korelasyon katsayısı değerine varan) içerisinde olduğu da gözlemlenmiştir. Son kısımda, farklı öznitelik kümeleri ile eğitilen tahmin modellerinin sonuçları birbiri ile karşılaştırılmış ve öne çıkan zaman aralıkları ve öznitelik kümeleri analiz edilmiştir. Deney sonuçları, haber ve sosyal medya verisinin, doğal dil işleme yöntemleri ile analiz edilmesinin ve analiz sonuçlarının tahmin modellerinin eğitimi için kullanılmasının finans alanında tahminler yapmak için değerli olduğunu göstermiştir.	en_US
dc.description.abstract	Machine learning techniques and their applications have become very popular as the number of data sources has grown and data has become easier to access. Prediction models can be trained on datasets collected from these different sources. In addition, natural language processing techniques are very useful for data mining and information extraction on text-based data. In this study, a large collection of news and social media data is analysed using natural language processing techniques, and feature sets are created from the results. A scalable prediction system for sparse, high-dimensional feature sets is then built with these feature sets to predict stock market movements. To build the system, 12,560 articles from the New York Times covering a 1-year period and 2,854,333 tweets from Twitter covering a 4-month period were collected. The collected data are analysed with named entity recognition, sentiment analysis and topic modelling techniques. As another output of the designed system, social networks are created from the analysed text and their changes across different timeframes are observed. Feature sets are created from the natural language processing results, analysis results and social networks, and elastic net regression based prediction models are trained on them. With the proposed approach, different datasets can be analysed and different prediction systems can be created. As an example, predicting the direction of the Dow Jones Index is selected as a case study. Different prediction models are trained and used to predict the direction of movement of the Dow Jones Index. Across different sets of experiments, the models created with the proposed method made promising predictions, reaching up to 70.90% accuracy. The predicted values were also positively correlated with the real Dow Jones Index values (correlation coefficient up to 0.2315). Further, performance tests are made to show the scalability of the proposed method for various prediction models trained with different sets of features. The experiment results show that reasonable stock movement predictions can be made by integrating news and related social media data, analysing them with named entity recognition, sentiment analysis and topic modelling, and training prediction models on features created from these analysis results.	en_US
dc.language.iso	tr	en_US
dc.publisher	TOBB Ekonomi ve Teknoloji Üniversitesi	en_US
dc.rights	info:eu-repo/semantics/openAccess	en_US
dc.subject	Bilgisayar Mühendisliği Bilimleri-Bilgisayar ve Kontrol	en_US
dc.subject	Computer Engineering and Computer Science and Control	en_US
dc.subject	Büyük veri	en_US
dc.subject	Big data	en_US
dc.subject	Makine öğrenmesi	en_US
dc.subject	Machine learning	en_US
dc.subject	Veri madenciliği	en_US
dc.subject	Data mining	en_US
dc.title	Dinamik Sosyal Ağlarda Akan ve Çok Boyutlu Veri Üzerinden Analiz ve Tahmin Yapılması	en_US
dc.title.alternative	Analysis and Prediction in Sparse and High Dimensional Data With Using Dynamic Social Networks	en_US
dc.type	Doctoral Thesis	en_US
dc.department	Institutes, Graduate School of Engineering and Science, Computer Engineering Graduate Programs	en_US
dc.department	Enstitüler, Fen Bilimleri Enstitüsü, Bilgisayar Mühendisliği Ana Bilim Dalı	en_US
dc.identifier.startpage	1	en_US
dc.identifier.endpage	148	en_US
dc.institutionauthor	Sert, Onur Can	-
dc.relation.publicationcategory	Tez	en_US
dc.identifier.scopusquality	N/A	-
dc.identifier.wosquality	N/A	-
item.openairetype	Doctoral Thesis	-
item.fulltext	With Fulltext	-
item.languageiso639-1	tr	-
item.grantfulltext	open	-
item.cerifentitytype	Publications	-
item.openairecristype	http://purl.org/coar/resource_type/c_18cf	-
Appears in Collections:	Bilgisayar Mühendisliği Doktora Tezleri / Computer Engineering PhD Theses

Files in This Item:

File	Size	Format
629189.pdf	3.16 MB	Adobe PDF

Page view(s): 478 (checked on Aug 4, 2025)

Download(s): 124 (checked on Aug 4, 2025)
