Please use this identifier to cite or link to this item: https://hdl.handle.net/20.500.11851/2004
Title: High quality clustering of big data and solving empty-clustering problem with an evolutionary hybrid algorithm
Authors: Karimov, Jeyhun
Özbayoğlu, Ahmet Murat
142991
Keywords: clustering
k-means
evolutionary algorithms
Cuckoo search
Fireworks algorithm
Hadoop
MapReduce
Issue Date: 2015
Publisher: IEEE
Source: Karimov, J., & Ozbayoglu, M. (2015, October). High quality clustering of big data and solving empty-clustering problem with an evolutionary hybrid algorithm. In 2015 IEEE International Conference on Big Data (Big Data) (pp. 1473-1478). IEEE.
Abstract: Achieving high-quality clustering is one of the best-known problems in data mining. k-means is by far the most commonly used clustering algorithm. It converges fairly quickly, but a good solution is not guaranteed: the clustering quality depends heavily on the selection of the initial centroids. Moreover, as the number of clusters increases, k-means starts to suffer from "empty clustering". The motivation of this study is twofold: we aim to improve k-means clustering quality while also remaining unaffected by the empty-cluster issue. To this end, we developed a hybrid model, H(EC)S-2, Hybrid Evolutionary Clustering with Empty Clustering Solution. First, it selects representative points to eliminate the empty-clustering problem. Then, the hybrid algorithm uses only these points during centroid selection. The proposed model combines a Fireworks- and Cuckoo-search-based evolutionary algorithm with some centroid-calculation heuristics. The model is implemented on Hadoop MapReduce to achieve scalability when faced with a Big Data clustering problem. The advantages of the developed model are particularly evident as the amount of data, its dimensionality, and the number of clusters increase. The results indicate that the proposed model achieves a considerable improvement in clustering quality.
Description: 3rd IEEE International Conference on Big Data, IEEE Big Data (2015 : Santa Clara; United States)
URI: https://ieeexplore.ieee.org/document/7363909
https://hdl.handle.net/20.500.11851/2004
ISBN: 978-1-4799-9925-5
Appears in Collections: Bilgisayar Mühendisliği Bölümü / Department of Computer Engineering
Scopus İndeksli Yayınlar Koleksiyonu / Scopus Indexed Publications Collection
WoS İndeksli Yayınlar Koleksiyonu / WoS Indexed Publications Collection
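
Illustration: the "empty clustering" problem named in the abstract can be reproduced with a few lines of plain k-means. The sketch below is not the H(EC)S-2 model and uses no Fireworks, Cuckoo search, or MapReduce component. It only shows, on a toy data set chosen for this example, how a poorly placed initial centroid ends up with no points, and applies one generic repair (moving the empty centroid onto the point farthest from its own centroid). All function names, the data, and the repair heuristic are assumptions made for illustration.

```python
from math import dist  # Euclidean distance, Python 3.8+


def assign(points, centroids):
    """Return, for each point, the index of its nearest centroid."""
    return [min(range(len(centroids)), key=lambda j: dist(p, centroids[j]))
            for p in points]


def update(points, labels, centroids):
    """Recompute each centroid as the mean of its members.

    A centroid with no members keeps its old position and its index is
    reported in the returned list of empty clusters.
    """
    new_centroids, empty = [], []
    for j, old in enumerate(centroids):
        members = [p for p, label in zip(points, labels) if label == j]
        if members:
            new_centroids.append(
                tuple(sum(axis) / len(members) for axis in zip(*members)))
        else:
            new_centroids.append(old)
            empty.append(j)
    return new_centroids, empty


def reseed_empty(points, labels, centroids, empty):
    """Generic repair (an assumption, not the paper's method): move each
    empty centroid onto the point that lies farthest from its own centroid."""
    centroids, labels = list(centroids), list(labels)
    for j in empty:
        i = max(range(len(points)),
                key=lambda k: dist(points[k], centroids[labels[k]]))
        centroids[j] = points[i]
        labels[i] = j
    return centroids


# Toy data: two tight groups and three initial centroids, one placed far away.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
initial = [(0, 0), (10, 10), (100, 100)]

labels = assign(points, initial)
centroids, empty = update(points, labels, initial)   # cluster 2 has no members
repaired = reseed_empty(points, labels, centroids, empty)
```

After the repair, a further call to assign(points, repaired) gives every cluster at least one point. The paper itself takes a different route, selecting representative points before centroid selection, as described in the abstract.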

Scopus citations: 10 (checked on Sep 23, 2022)

Page view(s): 26 (checked on Dec 26, 2022)

Items in GCRIS Repository are protected by copyright, with all rights reserved, unless otherwise indicated.