Please use this identifier to cite or link to this item:
https://hdl.handle.net/20.500.11851/2004
Title: | High quality clustering of big data and solving empty-clustering problem with an evolutionary hybrid algorithm |
Authors: | Karimov, Jeyhun; Özbayoğlu, Ahmet Murat |
Keywords: | clustering; k-means; evolutionary algorithms; Cuckoo search; Fireworks algorithm; Hadoop MapReduce |
Publisher: | IEEE |
Source: | Karimov, J., & Ozbayoglu, M. (2015, October). High quality clustering of big data and solving empty-clustering problem with an evolutionary hybrid algorithm. In 2015 IEEE International Conference on Big Data (Big Data) (pp. 1473-1478). IEEE. |
Abstract: | Achieving high quality clustering is one of the most well-known problems in data mining. k-means is by far the most commonly used clustering algorithm. It converges fairly quickly, but a good solution is not guaranteed. Clustering quality depends heavily on the selection of the initial centroids. Moreover, as the number of clusters increases, k-means starts to suffer from "empty clustering". The motivation of this study is two-fold: we aim not only to improve k-means clustering quality but also to avoid being affected by the empty-cluster issue. To this end, we developed a hybrid model, H(EC)S-2, Hybrid Evolutionary Clustering with Empty Clustering Solution. First, it selects representative points to eliminate the empty-clustering problem. Then, the hybrid algorithm uses only these points during centroid selection. The proposed model combines a Fireworks and Cuckoo-search based evolutionary algorithm with some centroid-calculation heuristics. The model is implemented using Hadoop MapReduce to achieve scalability when faced with a Big Data clustering problem. The developed model is particularly attractive when the amount of data, the dimensionality, and the number of clusters increase. The results indicate that a considerable improvement in clustering quality is achieved using the proposed model. |
Description: | 3rd IEEE International Conference on Big Data, IEEE Big Data (2015 : Santa Clara; United States) |
URI: | https://ieeexplore.ieee.org/document/7363909 https://hdl.handle.net/20.500.11851/2004 |
ISBN: | 978-1-4799-9925-5 |
Appears in Collections: | Bilgisayar Mühendisliği Bölümü / Department of Computer Engineering; Scopus İndeksli Yayınlar Koleksiyonu / Scopus Indexed Publications Collection; WoS İndeksli Yayınlar Koleksiyonu / WoS Indexed Publications Collection |
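Illustrative note: the abstract refers to the "empty clustering" problem of k-means, where a centroid ends up with no assigned points. The sketch below shows how a badly placed initial centroid can leave a cluster empty, together with one common generic repair (reseeding the empty centroid at the point farthest from its current centroid). It is not the H(EC)S-2 model from the paper: the function names, the toy data, and the repair heuristic are all assumptions chosen for illustration, and the code runs on a single machine rather than on Hadoop MapReduce.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign(points, centroids):
    """Return, for each point, the index of its nearest centroid."""
    return [
        min(range(len(centroids)), key=lambda j: sq_dist(p, centroids[j]))
        for p in points
    ]


def kmeans(points, centroids, max_iter=100, repair_empty=True):
    """Plain Lloyd-style k-means (hypothetical helper, not the paper's method).

    Returns (centroids, labels, empty_events), where empty_events counts
    how many times a cluster was found empty during the iterations.
    """
    centroids = [tuple(c) for c in centroids]
    empty_events = 0
    for _ in range(max_iter):
        labels = assign(points, centroids)
        new_centroids = []
        for j in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                dim = len(members[0])
                new_centroids.append(
                    tuple(sum(m[d] for m in members) / len(members) for d in range(dim))
                )
            else:
                empty_events += 1
                if repair_empty:
                    # Assumed heuristic: reseed at the point that is farthest
                    # from the centroid it is currently assigned to.
                    far = max(
                        range(len(points)),
                        key=lambda i: sq_dist(points[i], centroids[labels[i]]),
                    )
                    new_centroids.append(tuple(points[far]))
                else:
                    # Leave the empty centroid where it is.
                    new_centroids.append(centroids[j])
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, assign(points, centroids), empty_events


# Toy data: two tight groups, plus a third initial centroid placed far away.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
init = [(0, 0), (10, 10), (100, 100)]

_, labels_raw, _ = kmeans(points, init, repair_empty=False)
_, labels_fixed, events = kmeans(points, init, repair_empty=True)
print("non-empty clusters without repair:", len(set(labels_raw)))
print("non-empty clusters with repair:", len(set(labels_fixed)))
```

On this toy input the unrepaired run never assigns a point to the third centroid, while the repaired run ends with all three clusters populated. If several clusters become empty in the same iteration, this simple heuristic reseeds them at the same point, so a production implementation would need a more careful strategy.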
Scopus™ citations: 10 (checked on Nov 2, 2024)
Web of Science™ citations: 13 (checked on Nov 2, 2024)
Page view(s): 94 (checked on Nov 4, 2024)
Items in GCRIS Repository are protected by copyright, with all rights reserved, unless otherwise indicated.