Submitted on May 6, 2008
Revised on July 14, 2008
Accepted on July 20, 2008
Significance analysis of spectral count data in label-free shotgun proteomics
Hyungwon Choi, Damian Fermin, and Alexey I. Nesvizhskii
Pathology, University of Michigan Medical School, Ann Arbor, MI 48109
Corresponding Author: nesvi@med.umich.edu
Spectral counting has become a commonly used approach for measuring protein abundance in label-free shotgun proteomics. At the same time, the development of data analysis methods has lagged behind. Currently, most studies utilizing spectral counts rely on simple data transforms and post-hoc corrections of the conventional signal-to-noise ratio statistic. However, these adjustments can neither handle the bias toward high-abundance proteins nor overcome the drawbacks of a limited number of replicates. We present a novel statistical framework (QSpec) for the significance analysis of differential expression, with extensions to a variety of experimental design factors and adjustments for protein properties. Using synthetic and real experimental datasets, we show that the proposed method outperforms conventional statistical methods that search for differential expression in individual proteins. We illustrate the flexibility of the model by analyzing a dataset with a complicated experimental design involving cellular localization and time course.