Abstract
Microarray data sets characterize each sample with a large number of genes, many of which may be irrelevant
to the learning task. Reliable methods for gene representation, reduction, and selection are therefore needed to speed up
processing, improve classification accuracy, and avoid the incomprehensibility that comes with a high number of investigated genes.
Classifying multiclass microarray data sets is usually more difficult than classifying data sets with only two classes. In this paper,
we propose a new gene selection and classification strategy based on the Firefly Algorithm (FFA) and K-Nearest Neighbor (KNN),
suitable for multiclass microarray data sets. The approach is combined with a Kruskal-Wallis test pre-filtering step. The FFA is
used to evolve gene subsets whose fitness is evaluated by a KNN classifier under a leave-one-out cross-validation (LOOCV)
scheme. Experimental results on three multiclass high-dimensional data sets show that the proposed method simplifies gene
signatures effectively and obtains slightly higher classification accuracy than the best previously published results.
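
As a rough illustration of two building blocks named above, the sketch below scores individual genes with a Kruskal-Wallis H statistic and evaluates a candidate gene subset by the LOOCV accuracy of a KNN classifier. It is a minimal sketch under stated assumptions, not the implementation used in this work: the function names `kruskal_h` and `loocv_knn_accuracy`, the squared Euclidean distance, the omission of a tie correction, and the toy data are all hypothetical choices made for the example, and the firefly search that would propose the subsets is not shown.

```python
from collections import Counter


def kruskal_h(values, labels):
    """Kruskal-Wallis H statistic for one gene (assumption: no tie correction)."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the run of tied values starting at position i.
        j = i
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # average 1-based rank over the tied run
        for pos in range(i, j + 1):
            ranks[order[pos]] = avg_rank
        i = j + 1
    counts = Counter(labels)
    rank_sums = {}
    for r, c in zip(ranks, labels):
        rank_sums[c] = rank_sums.get(c, 0.0) + r
    between = sum(rank_sums[c] ** 2 / counts[c] for c in counts)
    return 12.0 / (n * (n + 1)) * between - 3 * (n + 1)


def loocv_knn_accuracy(X, y, subset, k=1):
    """LOOCV accuracy of a KNN classifier restricted to the genes in `subset`."""
    correct = 0
    for i in range(len(X)):
        # Squared Euclidean distance to every other sample (sample i is held out).
        dists = []
        for j in range(len(X)):
            if j != i:
                d = sum((X[i][g] - X[j][g]) ** 2 for g in subset)
                dists.append((d, j))
        dists.sort()
        votes = Counter(y[j] for _, j in dists[:k])
        if votes.most_common(1)[0][0] == y[i]:
            correct += 1
    return correct / len(X)


# Toy data (hypothetical): 6 samples, 3 classes, 2 genes.
# Gene 0 separates the classes; gene 1 does not.
X = [[0.0, 5.0], [0.1, 1.0], [1.0, 5.1], [1.1, 0.9], [2.0, 5.2], [2.1, 1.1]]
y = [0, 0, 1, 1, 2, 2]

# Pre-filter view: a higher H means the gene discriminates the classes better.
h_scores = [kruskal_h([row[g] for row in X], y) for g in range(2)]

# Fitness view: LOOCV accuracy of two candidate single-gene subsets.
fitness_gene0 = loocv_knn_accuracy(X, y, subset=[0], k=1)
fitness_gene1 = loocv_knn_accuracy(X, y, subset=[1], k=1)
```

In a search loop of this kind, the H scores would be used once to keep only the top-ranked genes, and the LOOCV accuracy would be recomputed for every candidate subset that the search proposes.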