2003 ICML Bioinformatics D. Szafron, P. Lu, R. Greiner, D. Wishart, Z. Lu, B. Poulin, R. Eisner, J. Anvik and C. Macdonell, Proteome Analyst - Transparent High-throughput Protein Annotation: Function, Localization and Custom Predictors, International Conference on Machine Learning Workshop on Machine Learning in Bioinformatics (ICML Workshop - Bioinformatics), August 2003, Washington, U.S.A., pp. 2-10.
Abstract:

Modern sequencing technology permits sequencing of entire genomes, whose gene sequences require annotation. It is too time-consuming to predict the properties of each protein sequence manually and to organize the results of many prediction tools by hand. The prediction process must be automated, but the predictions must also be transparent. That is, the rationale for each prediction should be easily examinable by anyone who wishes to use the prediction. Proteome Analyst (PA) is a web-based system for predicting the properties of each protein in a proteome. PA has three interesting features. First, it is a single web-based system that allows the user to select from a wide range of analytic tools and automatically apply them to each protein in a proteome. In essence, PA provides one-stop automatic high-throughput analysis. Second, PA has the ability to explain its predictions to users. PA is based on established machine learning techniques, but makes every prediction transparent to its users. Third, PA allows users to create their own transparent custom predictors without programming.