Book contents
- Frontmatter
- Contents
- List of Contributors
- Preface
- 1 An Introduction to High-Throughput Bioinformatics Data
- 2 Hierarchical Mixture Models for Expression Profiles
- 3 Bayesian Hierarchical Models for Inference in Microarray Data
- 4 Bayesian Process-Based Modeling of Two-Channel Microarray Experiments: Estimating Absolute mRNA Concentrations
- 5 Identification of Biomarkers in Classification and Clustering of High-Throughput Data
- 6 Modeling Nonlinear Gene Interactions Using Bayesian MARS
- 7 Models for Probability of Under- and Overexpression: The POE Scale
- 8 Sparse Statistical Modelling in Gene Expression Genomics
- 9 Bayesian Analysis of Cell Cycle Gene Expression Data
- 10 Model-Based Clustering for Expression Data via a Dirichlet Process Mixture Model
- 11 Interval Mapping for Expression Quantitative Trait Loci
- 12 Bayesian Mixture Models for Gene Expression and Protein Profiles
- 13 Shrinkage Estimation for SAGE Data Using a Mixture Dirichlet Prior
- 14 Analysis of Mass Spectrometry Data Using Bayesian Wavelet-Based Functional Mixed Models
- 15 Nonparametric Models for Proteomic Peak Identification and Quantification
- 16 Bayesian Modeling and Inference for Sequence Motif Discovery
- 17 Identification of DNA Regulatory Motifs and Regulators by Integrating Gene Expression and Sequence Data
- 18 A Misclassification Model for Inferring Transcriptional Regulatory Networks
- 19 Estimating Cellular Signaling from Transcription Data
- 20 Computational Methods for Learning Bayesian Networks from High-Throughput Biological Data
- 21 Bayesian Networks and Informative Priors: Transcriptional Regulatory Network Models
- 22 Sample Size Choice for Microarray Experiments
- Plate section
15 - Nonparametric Models for Proteomic Peak Identification and Quantification
Published online by Cambridge University Press: 23 November 2009
Abstract
We present model-based inference for proteomic peak identification and quantification from mass spectroscopy data, focusing on nonparametric Bayesian models. Using experimental data generated by matrix-assisted laser desorption/ionization time-of-flight (MALDI-TOF) mass spectroscopy, we model the observed intensities in spectra with a hierarchical nonparametric model for expected intensity as a function of time-of-flight. We express the unknown intensity function as a sum of kernel functions, a natural choice of basis functions for modeling spectral peaks. We discuss how to place prior distributions on the unknown functions using Lévy random fields and describe posterior inference via a reversible jump Markov chain Monte Carlo algorithm.
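To make the kernel representation concrete, the following is a minimal sketch of an expected-intensity function built as a sum of kernels over time-of-flight. It is an illustration only, not the chapter's model: the Gaussian kernel shape, the function name `expected_intensity`, and all numerical values are assumptions chosen for the example, and no prior or posterior sampling step is included.

```python
import numpy as np

def expected_intensity(t, locations, heights, widths):
    """Evaluate a sum of Gaussian-shaped kernels at time-of-flight values t.

    Assumption: Gaussian kernels stand in for whatever kernel family the
    model actually uses. Each peak j has a location, a height, and a width.
    """
    t = np.asarray(t, dtype=float)[:, None]          # shape (n, 1)
    locations = np.asarray(locations, dtype=float)   # shape (J,)
    heights = np.asarray(heights, dtype=float)       # shape (J,)
    widths = np.asarray(widths, dtype=float)         # shape (J,)
    # Broadcasting yields an (n, J) matrix of kernel evaluations.
    kernels = np.exp(-0.5 * ((t - locations) / widths) ** 2)
    # Weight each kernel by its peak height and sum over the J peaks.
    return kernels @ heights

# Hypothetical example: two well-separated peaks on an arbitrary time grid.
t_grid = np.linspace(0.0, 100.0, 1001)
f = expected_intensity(
    t_grid,
    locations=[30.0, 70.0],
    heights=[5.0, 2.0],
    widths=[1.5, 3.0],
)
```

In this sketch the number of peaks is fixed in advance. In the setting described above the number of peaks is unknown, which is why a transdimensional sampler such as reversible jump Markov chain Monte Carlo is used to add and remove kernels.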
Introduction
The advent of matrix-assisted laser desorption/ionization time-of-flight (MALDI-TOF) mass spectroscopy and the related surface-enhanced laser desorption/ionization time-of-flight (SELDI-TOF) technique allows the simultaneous assay of thousands of proteins and has transformed research in protein regulation underlying complex physiological processes. This technology provides the means to detect large proteins in a range of biological samples, from serum and urine to complex tissues such as tumors and muscle. With appropriate statistical analysis, one may explore patterns of protein expression on a large scale in high-throughput studies without prior knowledge of which proteins may be present (Baldwin et al., 2001; Diamandis, 2003; Martin and Nelson, 2001; Petricoin and Liotta, 2003; Petricoin et al., 2002). As such, it becomes a discovery tool, identifying proteins and pathways that are linked to a biological process. In applications, tens to thousands of spectra may be collected, leading to massive volumes of data. Each spectrum contains on the order of tens of thousands of intensity measurements, with an unknown number of peaks representing proteins of specific mass-to-charge ratios.
- Type: Chapter
- Information: Bayesian Inference for Gene Expression and Proteomics, pp. 293-308
- Publisher: Cambridge University Press
- Print publication year: 2006