Title: Pickle DB: Developing a knowledge base for the human protein interactome
Authors: Kalliopi Tsafou, Evangelos Theodoridis, Christos Makris, Maria I. Klapa, Athanasios Tsakalidis and Nikolaos K. Moschonas
Abstract: The elucidation of protein-protein interaction (PPI) networks (protein interactomes) is a major objective of systems biology. Given the fundamental role of proteins in cellular function, this task is crucial for understanding the dynamics of the cellular machinery. The development of high-throughput methods for PPI identification, including the widely used yeast two-hybrid (Y2H) assay and mass spectrometry of co-immunoprecipitated complexes, has drastically increased the PPI data available for model organisms and humans. However, these methods suffer from intrinsic detection biases and false-positive rates above 70%. Moreover, independent public databases (dbs) of experimentally derived PPIs exhibit remarkably limited overlap, mainly because their data validation and curation criteria are uncoordinated. These limitations are amplified in highly complex systems such as the human, for which only 8-10% of the estimated PPIs have been determined. Furthermore, the prediction of new PPIs is limited by the quality and quantity of the current data and by the inability of predictive algorithms to incorporate all the available protein information. Thus, the first need is a systematic curation of multiple datasets into one common db. We have integrated high- and low-throughput experimental data, ortholog protein interactions, protein domain interactions, and protein structure/function and expression data from twelve highly informative and up-to-date dbs into one local db using Microsoft SQL Server. The dbs were selected on the basis of their unique gene content, recorded PPIs, annotation depth, curation methodology and relational schema. These data will be enriched with PPI information mined from the literature. The final dataset will be scored to rank and validate the PPIs with increased confidence, forming the basis for a human interactome knowledge base named PICKLE_DB. Equipped with predictive and data mining algorithms, PICKLE_DB will be a valuable functional genomics tool in medical research and applications.
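The integration step described in the abstract can be illustrated with a minimal sketch. It assumes each source db can be exported as a list of protein identifier pairs, treats interactions as undirected, and ranks each PPI simply by the number of sources reporting it. The function names, the source names (`db_A`, `db_B`, `db_C`) and the identifiers (`P1` to `P4`) are hypothetical placeholders, and the support count is only a stand-in for a confidence score; this is not the PICKLE_DB schema or scoring scheme, which the abstract does not specify.

```python
from collections import defaultdict


def canonical_pair(a, b):
    """Return an order-independent key, so (A, B) and (B, A) are one PPI."""
    return tuple(sorted((a, b)))


def integrate(sources):
    """Merge PPI lists from several source databases.

    sources: dict mapping a source name to an iterable of (id_a, id_b) pairs.
    Returns a dict mapping each canonical pair to the set of source names
    that report it.
    """
    support = defaultdict(set)
    for name, pairs in sources.items():
        for a, b in pairs:
            support[canonical_pair(a, b)].add(name)
    return dict(support)


def rank_by_support(support):
    """Sort PPIs by number of supporting sources (descending), then by pair."""
    return sorted(support.items(), key=lambda kv: (-len(kv[1]), kv[0]))


def overlap_fraction(support):
    """Fraction of merged PPIs reported by at least two sources."""
    if not support:
        return 0.0
    shared = sum(1 for names in support.values() if len(names) >= 2)
    return shared / len(support)


# Toy input with hypothetical source names and identifiers (not real data).
sources = {
    "db_A": [("P1", "P2"), ("P2", "P3")],
    "db_B": [("P2", "P1"), ("P3", "P4")],
    "db_C": [("P1", "P2")],
}

support = integrate(sources)
for pair, names in rank_by_support(support):
    print(pair, sorted(names))
print("overlap fraction:", overlap_fraction(support))
```

Canonicalizing each pair before merging is what lets the same interaction recorded in opposite orders by two dbs count as one entry. In practice the identifiers would also need to be mapped to a common namespace first, a step this sketch leaves out.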