Publications

by Keyword: Missing values

Pairo, E., Marco, S., Perera, A., (2010). A subspace method for the detection of transcription factor binding sites. Proceedings of the First International Conference on Bioinformatics BIOINFORMATICS 2010. First International Conference on Bioinformatics (ed. Fred, A., Filipe, J., Gamboa, H.), INSTICC Press (Valencia, Spain), 102-107

Transcription factor binding sites are short, degenerate sequences, located mostly in the promoter of a gene, where certain proteins bind to regulate transcription. Locating these sequences is an important problem, and many experimental and computational methods have been developed for it. Algorithms that search for binding sites are usually based on Position Specific Scoring Matrices (PSSM), in which each position is treated independently. Here, symbolic DNA is mapped to numerical sequences, and a detector is built from a Principal Component Analysis (PCA) of those sequences, which takes covariances between positions into account. When a treatment of missing values is incorporated, the Q-residuals detector, based on PCA, performs better than a PSSM algorithm. The performance of the detector depends on how missing values are estimated and on the percentage of missing values considered in the model.

JTD Keywords: Binding sites, BPCA, Missing values, Numerical DNA, Principal components analysis, Transcription factors
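
The idea of a PCA-based Q-residuals detector described in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the one-hot mapping of bases to numbers, the number of components, and the toy sequences are all assumptions made for illustration, and the missing-value treatment (BPCA) that the abstract identifies as important is left out. The sketch fits a PCA subspace to known sites and scores a candidate by its squared reconstruction error outside that subspace, so a low Q-residual suggests a site-like sequence.

```python
import numpy as np

BASES = "ACGT"

def one_hot(seq):
    """Map a DNA string to a flat 4*len(seq) numeric vector (assumed encoding)."""
    vec = np.zeros((len(seq), 4))
    for i, base in enumerate(seq):
        vec[i, BASES.index(base)] = 1.0
    return vec.ravel()

def fit_pca(sites, n_components):
    """Fit a PCA model on known binding sites; returns (mean, loadings)."""
    X = np.array([one_hot(s) for s in sites])
    mean = X.mean(axis=0)
    # Rows of vt are the principal directions of the centered data.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components].T

def q_residual(seq, mean, loadings):
    """Squared norm of the part of seq that lies outside the PCA subspace."""
    x = one_hot(seq) - mean
    residual = x - loadings @ (loadings.T @ x)
    return float(residual @ residual)

# Toy training set, made up for illustration (not real binding-site data).
sites = ["TATAAT", "TATAAT", "TATGAT", "TACAAT", "TATATT", "GATAAT"]
mean, loadings = fit_pca(sites, n_components=2)

q_site = q_residual("TATAAT", mean, loadings)        # resembles the training sites
q_background = q_residual("GCGCGC", mean, loadings)  # unrelated sequence
print(q_site, q_background)
```

In this sketch a detector would flag a sequence as a candidate site when its Q-residual falls below a threshold chosen on held-out data. Unlike a PSSM, the PCA loadings capture covariances between positions, which is the property the abstract highlights.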