Proteome-Wide Discovery of Evolutionarily Conserved Sequences in Disordered Regions

Sci. Signal., 13 March 2012
Vol. 5, Issue 215, p. rs1
DOI: 10.1126/scisignal.2002515

Alex N. Nguyen Ba (1,2), Brian J. Yeh (3), Dewald van Dyk (4,5), Alan R. Davidson (6), Brenda J. Andrews (4,5), Eric L. Weiss (3), and Alan M. Moses (1,2,7,8,*)

  1. Department of Cell and Systems Biology, University of Toronto, Toronto, Ontario M5S 3B2, Canada.
  2. Centre for the Analysis of Genome Evolution and Function, University of Toronto, Toronto, Ontario M5S 3B2, Canada.
  3. Department of Molecular Biosciences, Northwestern University, Evanston, IL 60208, USA.
  4. The Terrence Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Ontario M3S 3E1, Canada.
  5. Banting and Best Department of Medical Research, University of Toronto, Toronto, Ontario M5G 1L6, Canada.
  6. Department of Biochemistry, University of Toronto, Toronto, Ontario M5S 1A8, Canada.
  7. Department of Computer Science, University of Toronto, Toronto, Ontario M5S 2E4, Canada.
  8. Department of Ecology and Evolutionary Biology, University of Toronto, Toronto, Ontario M5S 3B2, Canada.

* To whom correspondence should be addressed. E-mail: alan.moses@utoronto.ca

Abstract

At least 30% of human proteins are thought to contain intrinsically disordered regions, which lack stable structural conformation. Despite lacking enzymatic functions and having few protein domains, disordered regions are functionally important for protein regulation and contain short linear motifs (short peptide sequences involved in protein-protein interactions), but in most disordered regions, the functional amino acid residues remain unknown. We searched for evolutionarily conserved sequences within disordered regions according to the hypothesis that conservation would indicate functional residues. Using a phylogenetic hidden Markov model (phylo-HMM), we made accurate, specific predictions of functional elements in disordered regions even when these elements are only two or three amino acids long. Among the conserved sequences that we identified were previously known and newly identified short linear motifs, and we experimentally verified key examples, including a motif that may mediate interaction between protein kinase Cbk1 and its substrates. We also observed that hub proteins, which interact with many partners in a protein interaction network, are highly enriched in these conserved sequences. Our analysis enabled the systematic identification of the functional residues in disordered regions and suggested that at least 5% of amino acids in disordered regions are important for function.

Citation:

A. N. Nguyen Ba, B. J. Yeh, D. van Dyk, A. R. Davidson, B. J. Andrews, E. L. Weiss, and A. M. Moses, Proteome-Wide Discovery of Evolutionarily Conserved Sequences in Disordered Regions. Sci. Signal. 5, rs1 (2012).

Science Signaling. ISSN 1937-9145 (online), 1945-0877 (print). Pre-2008: Science's STKE. ISSN 1525-8882