Monday 12 November 2012 at 10:00
Center for Biological Sequence Analysis, room 62 in building 208
We aim to accurately predict biologically relevant protein-protein interactions mediated by peptide recognition domains, such as PDZ, WW and SH3 domains, directly from the genome. First, we need to predict the binding motif preference of a domain given its amino acid sequence, which we do using machine learning techniques. We can then scan a proteome to find potential protein partners of the domain that contain a recognized binding motif. Second, we need to predict whether a potential protein interaction actually occurs in the cell. We use available biological context information (e.g. gene expression, gene function annotation, conserved binding sites across species) to infer the likelihood that two given proteins interact. Our computational methods require a set of binding preferences for each domain family to learn from. This information is now available for multiple domain families across a few species, determined using phage-display and peptide-chip experimental methods. Protein-protein interactions involving these domains are also available from the literature and from large-scale protein-protein interaction mapping experiments (e.g. yeast two-hybrid). Interestingly, the networks generated using this approach include binding sites for many of the predicted protein interactions and are available across species, enabling the study of how protein interaction networks are rewired through binding site evolution as function evolves.
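As a rough illustration of the proteome-scanning step mentioned in the abstract, the Python sketch below scores every window of a protein sequence against a small position weight matrix and reports the windows above a cutoff. It is only a sketch under stated assumptions: the matrix values, the cutoff, the protein names and sequences, and the function names `score_window`, `scan` and `scan_proteome` are all hypothetical and are not taken from the speaker's method, which would learn binding preferences from phage-display or peptide-chip data.

```python
from typing import Dict, List, Tuple

# Hypothetical 3-position scoring matrix (illustrative values only).
# An empty dict marks a wildcard position that contributes 0 to the score.
DEFAULT = -1.0  # assumed penalty for a residue not listed at a position
PWM: List[Dict[str, float]] = [
    {"S": 2.0, "T": 2.0},
    {},
    {"V": 2.0, "L": 1.5, "I": 1.5},
]

Hit = Tuple[int, str, float]  # (start index, matched window, score)


def score_window(window: str) -> float:
    """Sum the per-position scores of one window the same length as PWM."""
    total = 0.0
    for position, residue in zip(PWM, window):
        if position:  # non-wildcard position
            total += position.get(residue, DEFAULT)
    return total


def scan(sequence: str, threshold: float) -> List[Hit]:
    """Return every window of the sequence scoring at or above threshold."""
    width = len(PWM)
    hits: List[Hit] = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = score_window(window)
        if score >= threshold:
            hits.append((start, window, score))
    return hits


def scan_proteome(proteome: Dict[str, str], threshold: float) -> Dict[str, List[Hit]]:
    """Map each protein with at least one hit to its list of hits."""
    results: Dict[str, List[Hit]] = {}
    for name, sequence in proteome.items():
        hits = scan(sequence, threshold)
        if hits:
            results[name] = hits
    return results


if __name__ == "__main__":
    # Made-up sequences, used only to exercise the functions.
    toy_proteome = {"protA": "MKTSAVQ", "protB": "GGGGGG"}
    print(scan_proteome(toy_proteome, threshold=3.0))
```

The candidate partners returned by such a scan would still need the second step described in the abstract, where context information is used to estimate whether the interaction actually occurs in the cell.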
Everybody is welcome. Registration is not necessary.