Ensemble feature selection for breast cancer classification using microarray data
Electronic resource
MARC record
Tag  1  2  Value
LDR  00000cab a2200000 4500
001  MAP20200035718
003  MAP
005  20220911211109.0
008  201110e20200601esp|||p |0|||b|eng d
040  ‎$a‎MAP‎$b‎spa‎$d‎MAP
084  ‎$a‎922.13
100  ‎$0‎MAPA20200022053‎$a‎Hengpraprohm, Supoj
24510‎$a‎Ensemble feature selection for breast cancer classification using microarray data‎$c‎Supoj Hengpraprohm, Suwimol Jungjit
520  ‎$a‎This paper proposes an ensemble filter feature selection approach, EnSNR, for breast cancer data classification. The microarray dataset used in the experiments contains 50,739 features (genes) for each of 32 patients. The main idea of the EnSNR approach is to combine informative features which are obtained using two different sets of feature evaluation criteria. Features in the EnSNR subset are those features which are present in both sets of evaluation results. Entropy and SNR evaluation functions are used to generate the EnSNR feature subset. Entropy is a measure of the amount of uncertainty in the outcome of a random experiment, while SNR is an effective function for measuring feature discriminative power. Entropy and SNR functions provide some advantages for the EnSNR approach. For example, the number of features in the EnSNR subset is not user-defined (the EnSNR subset is generated automatically), and the operation of the EnSNR function is independent of the type of classification algorithm employed. Also, only a small amount of processing time is required to generate the EnSNR feature subset. A Genetic Algorithm (GA) generates the breast cancer classification model using the EnSNR feature subset. The efficiency of the model is validated using 10-Fold Cross-Validation re-sampling. When the EnSNR feature subset is used, as well as giving a high degree of prediction accuracy (the average prediction accuracy obtained in the experiments in this paper is 86.92 ± 5.47), the EnSNR approach significantly reduces the number of irrelevant features (genes) to be analyzed for cancer classification.
650 4‎$0‎MAPA20080540500‎$a‎Cáncer
650 4‎$0‎MAPA20080562236‎$a‎Enfermedades
650 4‎$0‎MAPA20080578848‎$a‎Análisis de datos
650 4‎$0‎MAPA20080553128‎$a‎Algoritmos
650 4‎$0‎MAPA20080576158‎$a‎Gestión de datos
7001 ‎$0‎MAPA20200022176‎$a‎Jungjit, Suwimol
7730 ‎$w‎MAP20200034445‎$t‎Revista Iberoamericana de Inteligencia Artificial‎$d‎IBERAMIA, Sociedad Iberoamericana de Inteligencia Artificial , 2018-‎$x‎1988-3064‎$g‎01/06/2020 Volumen 23 Número 65 - junio 2020 , p. 100-114
856  ‎$q‎application/pdf‎$w‎1108615‎$y‎Recurso electrónico / Electronic resource