Title: Effective gene expression data generation framework based on multi-model approach
Authors: Şirin, Utku
Erdoğdu, Utku
Polat, Faruk
Tan, Mehmet
Alhajj, Reda
Keywords: Multi-Model Approach
Probabilistic Boolean Networks
Ordinary Differential Equations
Genetic Algorithm
Hierarchical Markov Models
Gene Expression Data Generation
Gene Regulation Network Modeling
Issue Date: Jun-2016
Publisher: Elsevier
Source: Sirin, U., Erdogdu, U., Polat, F., Tan, M., & Alhajj, R. (2016). Effective gene expression data generation framework based on multi-model approach. Artificial Intelligence in Medicine, 70, 41-61.
Abstract: Objective: Gene expression data sets typically contain thousands of genes but only a small number of samples, which challenges the computational methods that use them. This work aims to overcome that shortage of samples.
Methods and material: This paper introduces a multi-model artificial gene expression data generation framework in which different gene regulatory network (GRN) models contribute to the final set of samples according to the characteristics of their underlying paradigms. In the first stage, we build different GRN models and sample data from each of them separately. We then pool the generated samples into a rich set of gene expression samples, and finally select the best of them with a multi-objective selection method that measures sample quality from three aspects: compatibility, diversity and coverage. We use four alternative GRN models, namely ordinary differential equations, probabilistic Boolean networks, a multi-objective genetic algorithm and a hierarchical Markov model.
Results: We conducted a comprehensive set of experiments on both real-life biological and synthetic gene expression data sets. We show that our multi-objective sample selection mechanism effectively combines samples from different models, reaching up to 95% compatibility, 10% diversity and 50% coverage. The samples generated by our framework have up to 1.5x higher compatibility, 2x higher diversity and 2x higher coverage than the samples generated by the individual models that the multi-model framework uses. Moreover, the GRNs inferred from the samples generated by our framework can have 2.4x higher precision, 12x higher recall and 5.4x higher f-measure values than the GRNs inferred from the original gene expression samples.
Conclusions: We show that the quality of generated gene expression samples can be significantly improved by integrating different computational models into one unified framework, without dealing with the complex internal details of each individual model. Moreover, the rich set of artificial gene expression samples captures some biological relations that cannot be captured even by the original gene expression data set. (C) 2016 Elsevier B.V. All rights reserved.
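Illustration (not part of the published abstract): the pool-then-select stage described above can be sketched as follows. This is a minimal sketch under stated assumptions. The abstract does not say how compatibility, diversity and coverage are computed or how they are combined, so the scores here are hypothetical inputs, and a plain Pareto (non-dominated) filter stands in for the paper's multi-objective selection method, which may differ. The names `pool_samples`, `dominates` and `pareto_select` are invented for this example.

```python
from typing import Dict, List, Tuple

# Hypothetical per-sample scores: (compatibility, diversity, coverage).
# Higher is assumed to be better on every objective.
Scores = Tuple[float, float, float]
Candidate = Tuple[str, int, Scores]  # (model name, sample index, scores)


def pool_samples(per_model: Dict[str, List[Scores]]) -> List[Candidate]:
    """Merge the samples of all models into one pool, keeping their origin."""
    return [
        (model, index, scores)
        for model, samples in per_model.items()
        for index, scores in enumerate(samples)
    ]


def dominates(a: Scores, b: Scores) -> bool:
    """True if a is no worse than b on every objective and better on at least one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_select(pool: List[Candidate]) -> List[Candidate]:
    """Keep only candidates that no other candidate in the pool dominates."""
    return [
        cand
        for cand in pool
        if not any(dominates(other[2], cand[2]) for other in pool)
    ]


# Made-up scores for samples drawn from the four model families named in the abstract.
per_model = {
    "ode": [(0.9, 0.1, 0.3), (0.5, 0.05, 0.2)],
    "pbn": [(0.6, 0.4, 0.5)],
    "ga": [(0.9, 0.1, 0.2)],
    "hmm": [(0.4, 0.4, 0.5)],
}

selected = pareto_select(pool_samples(per_model))
selected_ids = [(model, index) for model, index, _ in selected]
print(selected_ids)
```

With these made-up scores, the selection keeps one sample from the ODE model and one from the PBN model. Samples from different models compete in a single pool, so the filter needs only their scores and no knowledge of how each model works internally.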
ISSN: 0933-3657
Appears in Collections:Bilgisayar Mühendisliği Bölümü / Department of Computer Engineering
PubMed İndeksli Yayınlar Koleksiyonu / PubMed Indexed Publications Collection
Scopus İndeksli Yayınlar Koleksiyonu / Scopus Indexed Publications Collection
WoS İndeksli Yayınlar Koleksiyonu / WoS Indexed Publications Collection

Items in GCRIS Repository are protected by copyright, with all rights reserved, unless otherwise indicated.