Florencio Pazos
National Centre of Biotechnology CNB-CSIC
Madrid, Spain
Title: Quantifying the Biological Significance of Gene Ontology Biological Processes - Implications for the Analysis of Systems-wide data
Authors: Monica Chagoyen and Florencio Pazos

Gene Ontology (GO), the de facto standard for representing protein functional aspects, is being used beyond the primary goal for which it was designed: protein functional annotation. GO is increasingly used to evaluate large sets of relationships between genes and proteins, e.g. protein-protein interactions or mRNA co-expression, under the assumption that related proteins tend to have the same or similar GO terms. This assumption holds only for terms representing functional groups with biological significance ("classes"), and not for those representing human-imposed aggregations or conceptualizations lacking a biological rationale ("categories"). Nevertheless, this distinction between classes and categories is not used in any GO-based assessment of high-throughput data, and all GO terms are assumed to have biological meaning. Using a data-driven approach based on a set of high-quality functional associations, we quantified the functional coherence of GO biological process (GO:BP) terms, as well as the explicit and implicit relationships between them, in an attempt to distinguish classes from categories. We show that this quantification is, in general, in agreement with the biological significance one would intuitively assign to GO:BP terms, and hence that it allows classes and categories to be distinguished. Since not all GO:BP terms and relationships are equally supported by current functional associations, any detailed validation of experimental data using GO:BP, beyond whole-system statistics, should take this imbalance into account.
