by Keyword: Feature selection
Taghadomi-Saberi, S., Garcia, S. M., Masoumi, A. A., Sadeghi, M., Marco, S., (2018). Classification of bitter orange essential oils according to fruit ripening stage by untargeted chemical profiling and machine learning Sensors 18, (6), 1922
The quality and composition of bitter orange essential oils (EOs) strongly depend on the ripening stage of the citrus fruit. The concentration of volatile compounds, and consequently the organoleptic perception of the oil, varies with ripening. While this can be detected by trained humans, we propose an objective approach for assessing bitter oranges from the volatile composition of their EO. The method is based on the combined use of headspace gas chromatography–mass spectrometry (HS-GC-MS) and artificial neural networks (ANN) for predictive modeling. Data obtained from the HS-GC-MS analysis were preprocessed to select relevant peaks in the total ion chromatogram as input features for the ANN. Results showed that key volatile compounds have enough predictive power to accurately classify the EOs according to their ripening stage for different applications. A sensitivity analysis detected the key compounds for identifying the ripening stage. This study provides a novel strategy for the quality control of bitter orange EO without subjective methods.
JTD Keywords: Bitter orange essential oil, Headspace gas chromatography–mass spectrometry, Artificial neural network, Foodomics, Chemometrics, Feature selection
Garde, Ainara, Voss, Andreas, Caminal, Pere, Benito, Salvador, Giraldo, Beatriz F., (2013). SVM-based feature selection to optimize sensitivity-specificity balance applied to weaning Computers in Biology and Medicine, 43, (5), 533-540
Classification algorithms with unbalanced datasets tend to produce high predictive accuracy over the majority class, but poor predictive accuracy over the minority class. This problem is very common in biomedical data mining. This paper introduces a Support Vector Machine (SVM)-based optimized feature selection method, to select the most relevant features and maintain an accurate and well-balanced sensitivity–specificity result between unbalanced groups. A new metric called the balance index (B) is defined to implement this optimization. The balance index measures the difference between the misclassified data within each class. The proposed optimized feature selection is applied to the classification of patients' weaning trials from mechanical ventilation: patients with successful trials who were able to maintain spontaneous breathing after 48 h and patients who failed to maintain spontaneous breathing and were reconnected to mechanical ventilation after 30 min. Patients are characterized through cardiac and respiratory signals, applying joint symbolic dynamic (JSD) analysis to cardiac interbeat and breath durations. First, the most suitable parameters (C+,C−,σ) are selected to define the appropriate SVM. Then, the feature selection process is carried out with this SVM, to maintain B lower than 40%. The best result is obtained using 6 features with an accuracy of 80%, a B of 18.64%, a sensitivity of 74.36% and a specificity of 82.42%.
JTD Keywords: Support vector machines, Balance index, Sensitivity-specificity balance, Cardiorespiratory interaction, Joint symbolic dynamics, Feature selection, Weaning procedure
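The sensitivity–specificity balance discussed in the abstract above can be illustrated with a minimal sketch. The abstract does not give the formula for the balance index B, and the reported figures (sensitivity 74.36%, specificity 82.42%, B 18.64%) show that B is not simply the gap between the two rates, which would be 8.06 points. The function `error_rate_gap` below is therefore a hypothetical stand-in, not the paper's B, and the confusion-matrix counts are made up for illustration.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) as fractions in [0, 1]."""
    sensitivity = tp / (tp + fn)  # fraction of positives correctly classified
    specificity = tn / (tn + fp)  # fraction of negatives correctly classified
    return sensitivity, specificity


def error_rate_gap(tp, fn, tn, fp):
    """Absolute difference between the two per-class error rates.

    ASSUMPTION: an illustrative imbalance measure only; the paper's
    balance index B is defined differently and is not reproduced here.
    """
    sens, spec = sensitivity_specificity(tp, fn, tn, fp)
    return abs(sens - spec)


# Hypothetical counts for an unbalanced two-class problem.
sens, spec = sensitivity_specificity(tp=30, fn=10, tn=80, fp=20)
gap = error_rate_gap(tp=30, fn=10, tn=80, fp=20)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} gap={gap:.2f}")
```

A feature selection loop in this spirit would keep a candidate feature subset only if such an imbalance measure stays below a chosen threshold, in the same way the abstract constrains B to stay below 40%.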
Santano-Martínez, R., Leiva-González, R., Avazbeigi, M., Gutiérrez-Gálvez, A., Marco, S., (2013). Identification of molecular properties coding areas in rat's olfactory bulb by rank products Proceedings of the International Conference on Bio-Inspired Systems and Signal Processing BIOSIGNALS 2013, SciTePress (Barcelona, Spain), 383-387
Neural coding of chemical information is still under strong debate. It is clear that, in vertebrates, neural representation in the olfactory bulb is key to understanding a putative odour code. To explore this code, in this work we have studied a public dataset of radio images of 2-Deoxyglucose uptake (2-DG) in the olfactory bulb of rats in response to diverse odorants using univariate pixel selection algorithms: rank products and the Mann-Whitney U (MWU) test. Initial results indicate that some chemical properties of odorants preferentially activate certain areas of the rat olfactory bulb. While the non-parametric test (MWU) has difficulty detecting these regions, rank products provide higher detection power.
JTD Keywords: 2-Deoxyglucose uptake, Chemotopy, Feature selection, Odour coding, Olfaction, Olfactory bulb
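The rank-product statistic mentioned in the abstract above can be sketched as follows: each feature (here, a pixel) is ranked within every replicate, and its rank product is the geometric mean of those ranks. This is a minimal sketch of the general technique, not the authors' pipeline; the function names, the tie handling and the toy scores are assumptions, and the permutation step normally used to assess significance is omitted.

```python
import math


def ranks_descending(values):
    """Rank 1 goes to the largest value.

    Simplification: ties are broken by original order rather than averaged.
    """
    order = sorted(range(len(values)), key=lambda i: -values[i])
    ranks = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        ranks[idx] = rank
    return ranks


def rank_products(replicates):
    """Geometric mean of each feature's rank across replicates.

    replicates: list of k lists, each holding one score per feature.
    A small rank product means the feature ranks consistently high.
    """
    k = len(replicates)
    n = len(replicates[0])
    per_replicate = [ranks_descending(rep) for rep in replicates]
    return [
        math.exp(sum(math.log(r[i]) for r in per_replicate) / k)
        for i in range(n)
    ]


# Hypothetical activation scores for three pixels in two replicates.
scores = [[0.9, 0.1, 0.5], [0.8, 0.3, 0.2]]
print(rank_products(scores))
```

Because the statistic uses only ranks, a feature that is consistently near the top across replicates gets a low rank product even when the absolute score scale differs between replicates.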
Giraldo, B. F., Tellez, J. P., Herrera, S., Benito, S., (2013). Study of the oscillatory breathing pattern in elderly patients Engineering in Medicine and Biology Society (EMBC) 35th Annual International Conference of the IEEE, IEEE (Osaka, Japan), 5228-5231
Some of the most common clinical problems in elderly patients are related to diseases of the cardiac and respiratory systems. Elderly patients often have altered breathing patterns, such as periodic breathing (PB) and Cheyne-Stokes respiration (CSR), which may coincide with chronic heart failure. In this study, we used the envelope of the respiratory flow signal to characterize respiratory patterns in elderly patients. To study different breathing patterns in the same patient, the signals were segmented into windows of 5 min. In oscillatory breathing patterns, frequency and time-frequency parameters that characterize the discriminant band were evaluated to identify periodic and non-periodic breathing (PB and nPB). In order to evaluate the accuracy of this characterization, we used a feature selection process, followed by linear discriminant analysis. 22 elderly patients (7 patients with PB and 15 with nPB pattern) were studied. The following classification problems were analyzed: patients with either PB (with and without apnea) or nPB patterns, and patients with CSR versus PB, CSR versus nPB and PB versus nPB patterns. The results showed 81.8% accuracy in the comparisons of nPB and PB patients, using the power of the modulation peak. For the segmented signal, the power of the modulation peak, the frequency variability and the interquartile ranges provided the best results with 84.8% accuracy, for classifying nPB and PB patients.
JTD Keywords: cardiovascular system, diseases, feature extraction, geriatrics, medical signal processing, oscillations, pneumodynamics, signal classification, time-frequency analysis, Cheyne-Stokes respiration, apnea, cardiac systems, chronic heart failure, classification problems, discriminant band, diseases, elderly patients, feature selection process, frequency variability, interquartile ranges, linear discriminant analysis, nonperiodic breathing, oscillatory breathing pattern, periodic breathing, respiratory flow signal, respiratory systems, signal segmentation, time 5 min, time-frequency parameters, Accuracy, Aging, Frequency modulation, Heart, Senior citizens, Time-frequency analysis
Auffarth, B., Lopez, M., Cerquides, J., (2010). Comparison of redundancy and relevance measures for feature selection in tissue classification of CT images Lecture Notes in Artificial Intelligence 10th Industrial Conference on Data Mining (ed. Perner, P.), Springer-Verlag Berlin (Berlin, Germany) 6171, 248-262
In this paper we report on a study on feature selection within the minimum-redundancy maximum-relevance framework. Features are ranked by their correlations to the target vector. These relevance scores are then integrated with correlations between features in order to obtain a set of relevant and least-redundant features. Applied measures of correlation or distributional similarity for redundancy and relevance include the Kolmogorov-Smirnov (KS) test, Spearman correlations, Jensen-Shannon divergence, and the sign-test. We introduce a metric called "value difference metric" (VDM) and present a simple measure, which we call "fit criterion" (FC). We draw conclusions about the usefulness of different measures. While the KS-test and sign-test provided useful information, Spearman correlations are not fit for comparison of data of different measurement intervals. VDM was very good in our experiments as both a redundancy and a relevance measure. Jensen-Shannon and the sign-test are good redundancy measure alternatives and FC is a good relevance measure alternative.
JTD Keywords: Distributional similarity, Divergence measure, Feature selection, Relevance and redundancy
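The minimum-redundancy maximum-relevance idea described in the abstract above can be sketched as a greedy loop: score each candidate by its relevance to the target minus its mean redundancy with the features already chosen. The sketch below uses absolute Spearman correlation for both roles and a difference-based score; both are assumptions made for illustration, as are all function names and the toy data. It does not implement VDM or FC, which the abstract names but does not define.

```python
import math


def rankdata(xs):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    ranks = [0.0] * len(xs)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and xs[order[j + 1]] == xs[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def pearson(a, b):
    """Pearson correlation; returns 0.0 if either input is constant."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb) if va and vb else 0.0


def spearman(a, b):
    """Spearman correlation = Pearson correlation of the ranks."""
    return pearson(rankdata(a), rankdata(b))


def mrmr_select(features, target, k):
    """Greedy selection of k feature names from a dict name -> values."""
    relevance = {n: abs(spearman(c, target)) for n, c in features.items()}
    selected, remaining = [], list(features)

    def score(name):
        if not selected:
            return relevance[name]
        redundancy = sum(
            abs(spearman(features[name], features[s])) for s in selected
        ) / len(selected)
        return relevance[name] - redundancy

    while remaining and len(selected) < k:
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical data: "b" is a rescaled copy of "a", so it is fully redundant.
target = [1, 2, 3, 4, 5, 6]
features = {
    "a": [1, 2, 3, 4, 6, 5],
    "b": [10, 20, 30, 40, 60, 50],
    "c": [2, 1, 4, 3, 5, 6],
}
print(mrmr_select(features, target, 2))
```

In this toy example "b" is as relevant as "a" but adds nothing new once "a" is chosen, so the redundancy penalty pushes the second pick toward "c".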