Abstract
Breath analysis holds the promise of a non-invasive technique for the diagnosis of diverse respiratory conditions, including COPD and lung cancer. Breath contains small metabolites that may be putative biomarkers of these conditions. However, the discovery of reliable biomarkers is a considerable challenge in the presence of both clinical and instrumental confounding factors. Among the latter, instrumental time drifts are highly relevant, since they call into question the short- and long-term validity of predictive models. In this work we present a methodology to counter instrumental drifts using information from interleaved blanks, applied to a case study of GC-MS data from breath samples. The proposed method includes feature filtering and additive, multiplicative and multivariate drift corrections, the last being based on Component Correction. Biomarker discovery was based on Genetic Algorithms in a filter configuration, using Fisher's ratio computed in the Partial Least Squares Discriminant Analysis subspace as a figure of merit. Using our protocol, we were able to find nine peaks that provide a statistically significant Area Under the ROC Curve (AUC) of 0.75 for COPD discrimination. The method was successfully validated using blind samples in a short-term temporal validation. However, the attempt to use this model for patient screening six months later was not successful. This negative result highlights the importance of increasing validation rigour when reporting biomarker discovery results.
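As an illustration of the multivariate step, the following is a minimal sketch of Component Correction in its commonly described form: the main directions of variation among the blanks are estimated by PCA and then projected out of the samples. It is not the authors' implementation. The function name `component_correction`, the choice to centre on the blank mean, the number of components, and the synthetic data are all assumptions made for this example.

```python
import numpy as np


def component_correction(samples, blanks, n_components=1):
    """Project out of `samples` the main drift directions seen in `blanks`.

    Hypothetical helper: rows are measurements, columns are features
    (e.g. peak areas). Centring on the blank mean is an assumption.
    """
    blank_mean = blanks.mean(axis=0)
    # PCA of the centred blanks via SVD; rows of vt are principal directions.
    _, _, vt = np.linalg.svd(blanks - blank_mean, full_matrices=False)
    loadings = vt[:n_components].T  # shape: (n_features, n_components)
    # Remove the component of each sample lying along the drift directions.
    drift_part = (samples - blank_mean) @ loadings @ loadings.T
    return samples - drift_part, loadings


# Synthetic demonstration (made-up data, not from the study).
rng = np.random.default_rng(0)
drift_dir = np.array([1.0, 2.0, 0.0, -1.0])
drift_dir /= np.linalg.norm(drift_dir)

# Blanks drift linearly in time along drift_dir, with small noise.
t_blank = np.linspace(0.0, 1.0, 20)[:, None]
blanks = 5.0 + 10.0 * t_blank * drift_dir + rng.normal(scale=0.01, size=(20, 4))

# Samples carry biological variation plus the same instrumental drift.
t_samp = np.linspace(0.0, 1.0, 10)[:, None]
samples = 5.0 + rng.normal(size=(10, 4)) + 10.0 * t_samp * drift_dir

corrected, loadings = component_correction(samples, blanks, n_components=1)
```

Note that this projection also removes any genuine sample variance lying along the drift directions, so the number of components has to be chosen with care.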