Luca Pireddu, Brett Poulin, Duane Szafron, Paul Lu, and David Wishart, Pathway Analyst—Automated Metabolic Pathway Prediction, Proceedings of the 2005 IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology, November 2005, pp. 243-250.
Metabolic pathways are crucial to our understanding of biology. The speed at which new organisms are being sequenced is outstripping our ability to experimentally determine their metabolic pathway information. In recent years, several initiatives have succeeded in automating the annotation of individual proteins in these organisms, either experimentally or by prediction. However, to build on the success of metabolic pathways, we need to automate their identification in our rapidly growing list of sequenced organisms. We present a prototype system for predicting the catalysts of important reactions and for organizing the predicted catalysts and reactions into previously defined metabolic pathways. We compare a variety of predictors that incorporate sequence similarity (BLAST), hidden Markov models (HMMs), and support vector machines (SVMs). We found that there is an advantage to using different predictors for different reactions. We validate our prototype on 10 metabolic pathways across 13 organisms, obtaining a cross-validation precision of 71.5% and recall of 91.5% in predicting the catalyst proteins of all reactions.
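To make the idea of using different predictors for different reactions concrete, the following is a minimal Python sketch and not the authors' implementation. It assumes each predictor has already been cross-validated per reaction and summarized as counts of true positives, false positives, and false negatives. The reaction names, the counts, and the use of F1 as the selection criterion are all hypothetical; the abstract does not state how the best predictor is chosen.

```python
from typing import Dict, Tuple

# (true positives, false positives, false negatives) from cross-validation
Counts = Tuple[int, int, int]


def precision_recall(tp: int, fp: int, fn: int) -> Tuple[float, float]:
    """Return (precision, recall), using 0.0 when a denominator is zero."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def f1_score(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall (assumed selection criterion)."""
    p, r = precision_recall(tp, fp, fn)
    return 2 * p * r / (p + r) if p + r else 0.0


def select_predictors(cv_counts: Dict[str, Dict[str, Counts]]) -> Dict[str, str]:
    """For each reaction, pick the predictor with the highest F1."""
    return {
        reaction: max(by_predictor, key=lambda name: f1_score(*by_predictor[name]))
        for reaction, by_predictor in cv_counts.items()
    }


def pooled_precision_recall(
    cv_counts: Dict[str, Dict[str, Counts]], chosen: Dict[str, str]
) -> Tuple[float, float]:
    """Pool the counts of the chosen predictors over all reactions."""
    tp = fp = fn = 0
    for reaction, name in chosen.items():
        r_tp, r_fp, r_fn = cv_counts[reaction][name]
        tp, fp, fn = tp + r_tp, fp + r_fp, fn + r_fn
    return precision_recall(tp, fp, fn)


# Hypothetical counts for illustration only; not data from the paper.
CV_COUNTS: Dict[str, Dict[str, Counts]] = {
    "reaction_A": {"BLAST": (8, 2, 2), "HMM": (9, 6, 1), "SVM": (7, 1, 3)},
    "reaction_B": {"BLAST": (5, 5, 0), "HMM": (4, 1, 1), "SVM": (5, 2, 0)},
}

if __name__ == "__main__":
    chosen = select_predictors(CV_COUNTS)
    precision, recall = pooled_precision_recall(CV_COUNTS, chosen)
    print(chosen)
    print(f"precision={precision:.3f} recall={recall:.3f}")
```

Selecting a predictor and reporting its score on the same cross-validation folds, as this sketch does, gives an optimistic estimate; a careful evaluation would hold out separate data for the selection step.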