Fuzzy Discretization and Rough Set based Feature Selection for High-Dimensional Classification

Authors

  • Prema Ramasamy (1), Assistant Professor, New Horizon College of Engineering, Bangalore. E-mail: premabit@gmail.com
  • Premalatha Kandhasamy (2), Professor, Department of Computer Science and Engineering, Bannari Amman Institute of Technology, Sathyamangalam

(Received May 11 2018, accepted July 16 2018)

Abstract

Contemporary biological technologies such as gene expression microarrays produce extremely high-dimensional datasets with limited samples: a large number of genes but a small number of samples. Analyzing this data is essential for retrieving the required information in microarray gene expression studies. The complicated relations among the different genes make analysis more difficult, and removing irrelevant genes improves the quality of the results. In this regard, a new feature selection algorithm called 2-level MRMS, based on rough set theory, is presented. It selects a set of genes from microarray data by maximizing the relevance and significance of the selected genes. The paper also presents a novel discretization method, Gaussian Fuzzy Discretization, which uses fuzzy logic to discretize the continuous gene expression values. The performance of the proposed algorithm is compared with that of other related feature selection methods using the classification accuracy of k-Nearest Neighbor (kNN) and Support Vector Machine (SVM) on four microarray data sets. The experimental results show that the genes selected using 2-level MRMS feature selection give higher classification accuracy than those selected by the other methods.
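The abstract does not spell out how Gaussian Fuzzy Discretization works, so the following is only an illustrative sketch of the general idea of discretizing continuous expression values with Gaussian membership functions, not the authors' method. The function names, the choice of three fuzzy sets, and the placement of centers at the mean and one standard deviation either side are all assumptions made for this example.

```python
import numpy as np


def gaussian_membership(x, center, width):
    """Degree of membership of x in a Gaussian fuzzy set (value in (0, 1])."""
    return np.exp(-0.5 * ((x - center) / width) ** 2)


def fuzzy_discretize(values, n_sets=3):
    """Map continuous values of one gene to integer labels 0..n_sets-1.

    Assumption (not from the paper): fuzzy set centers are spread evenly
    between mean - std and mean + std, all sets share width = std, and each
    value receives the label of the set in which its membership is highest.
    """
    values = np.asarray(values, dtype=float)
    mu, sigma = values.mean(), values.std()
    if sigma == 0:
        # A constant gene carries no information; give every sample label 0.
        return np.zeros(values.shape, dtype=int)
    centers = np.linspace(mu - sigma, mu + sigma, n_sets)
    # Rows are fuzzy sets, columns are samples.
    memberships = np.stack(
        [gaussian_membership(values, c, sigma) for c in centers]
    )
    return memberships.argmax(axis=0)


# Hypothetical expression values for a single gene across nine samples.
labels = fuzzy_discretize([1, 2, 3, 4, 5, 6, 7, 8, 9])
```

Applied column by column to an expression matrix, a routine of this kind would produce the categorical gene values that a rough set based selector such as the 2-level MRMS algorithm needs as input.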

Published

1970-01-01

Issue

Section

Articles