Paul R. Graves1 and Timothy A. J. Haystead1,2*
Department of Pharmacology and Cancer Biology, Duke University,1 and Serenex Inc.,2 Durham, North Carolina 27710
Microbiology and Molecular Biology Reviews, March 2002, p. 39-63, Vol. 66, No. 1
This review is intended to give the molecular biologist a rudimentary understanding of the technologies behind proteomics and their application to biological questions. Entry of our laboratory into proteomics 5 years ago was driven by a need to define a complex mixture of proteins (36 proteins) that we had affinity isolated and that bound specifically to the catalytic subunit of protein phosphatase 1 (PP-1, a serine/threonine protein phosphatase that regulates multiple dephosphorylation events in cells) (26). We were faced with the task of trying to understand the significance of these proteins, and the only obvious way to begin was to identify them by sequencing. We then bought an Applied Biosystems automated Edman sequencer (not having the budget for a mass spectrometer at the time). Since the majority of intact eukaryotic proteins are not immediately accessible to Edman sequencing due to posttranslational N-terminal modifications, we invented mixed-peptide sequencing (38). This method, described in detail later, essentially enables internal peptide sequence information to be derived from proteins electroblotted onto hydrophobic membranes. Using the mixed-peptide sequencing strategy, we identified all 36 proteins in about a week. The mixture contained at least two known PP-1 regulatory subunits, but most were identified in the expressed sequence tag or unannotated DNA databases and were novel proteins of unknown function. Since that time, we have been using various molecular biological approaches to determine the functions of some of these proteins. Herein lies the lesson of proteomics: identifying long lists of potentially interesting proteins often generates more questions than it answers.
Despite learning this obvious lesson, our early sequencing experiences were an epiphany that has subsequently altered our whole scientific strategy for probing protein function in cells. The sequencing of the 36 proteins has opened new avenues to further explore the functions of PP-1 in intact cells. Because of increased sensitivity, our approaches now routinely use state-of-the-art mass spectrometry (MS) techniques. However, rather than using proteomics to simply characterize large numbers of proteins in complex mixtures, we see the real application of this technology as a tool to enhance the power of existing approaches currently used by the modern molecular biologist, such as classical yeast and mouse genetics, tissue culture, protein expression systems, and site-directed mutagenesis. Importantly, the one message we would want the reader to take away from this review is that one should always let the biological question drive the application of proteomics rather than simply engaging in an orgy of protein sequencing. From our experiences, we believe that if the appropriate controls are performed, proteomics is an extremely powerful approach for addressing important physiological questions. One should always design experiments to define a selected number of relevant proteins in the mixture of interest. Examples of such experiments that we routinely perform include defining early phosphorylation events in complex protein mixtures after hormone treatment of intact cells or comparing patterns of protein derived from a stimulated versus nonstimulated cell in an affinity pull-down experiment. Only the proteins in the complex mixtures that were specifically phosphorylated or bound in response to the stimulus are sequenced. Sequencing proteins that are regulated then has a meaningful outcome and directs all subsequent biological investigation.
The term "proteomics" was first coined in 1995 and was defined as the large-scale characterization of the entire protein complement of a cell line, tissue, or organism (13, 163, 167). Today, two definitions of proteomics are encountered. The first is the more classical definition, restricting the large-scale analysis of gene products to studies involving only proteins. The second and more inclusive definition combines protein studies with analyses that have a genetic readout, such as mRNA analysis, genomics, and yeast two-hybrid analysis (123). Under either definition, the goal of proteomics remains the same, i.e., to obtain a more global and integrated view of biology by studying all the proteins of a cell rather than each one individually.