MSACL 2016 US Abstract

Scaling Discovery Proteomics to Large Lung Cancer Cohorts Using Data Independent Acquisition

John Koomen (Presenter)
Moffitt Cancer Center

Authorship: Bin Fang,1 Melissa A. Hoffman,1 Amol Prakash,2 Scott M. Peterman,3 Paul A. Stewart,1 Guolin Zhang,1 Richard Z. Liu,1 Matthew A. Smith,1 Joseph Johnson,1 Steven A. Eschrich,1 Eric B. Haura,1 & John M.
1. Moffitt Cancer Center, 2. OPTYS Tech., 3. Thermo

Short Abstract

Discovery proteomics using data independent acquisition (DIA) provides the maximum content from a single LC-MS/MS analysis. After a pilot project to compare DIA to discovery proteomics using traditional data dependent acquisition techniques, DIA strategies have been optimized and applied to 2 cohorts of lung cancer patients. The biology of the proteome detected and quantified in DIA experiments has been explored, and the resulting data have been used to classify lung cancer patients by their proteomic phenotypes. Feasibility has also been demonstrated for analysis of tissue microarrays using this technique, producing quantitative data for >3,000 proteins from a single section of a lung tumor core (0.6 mm in diameter and 5 microns thick). These data indicate the potential utility of DIA for assessment of tumor biology in situ using archived tumor specimens.

Long Abstract

Introduction: Previous discovery proteomics projects have been limited in patient cohort size by the complexity and cost of peptide fractionation and liquid chromatography-tandem mass spectrometry (LC-MS/MS), which are necessary to provide sufficient depth to analyze the proteome. The current benchmark experiments from the NCI CPTAC group have produced discovery datasets for ~100 tumors using label-free or chemical labeling strategies; the depth of these experiments (i.e., relative quantification of 8,000-10,000 proteins) comes at a high cost in terms of domain expertise and instrument time, which makes these strategies challenging to apply routinely to studies of tumor biology. Current commercial instruments enable novel scan types, including data independent acquisition (DIA), which can maximize the ability to identify and quantify peptides in the discovery dataset. DIA can evaluate a significant population of proteins (n = 3,000-5,000) in a single LC-MS/MS experiment, which makes it compatible with routine analysis of larger cohorts of tumor specimens. Here, we explore the available biology in the proteome detected by DIA in lung squamous cell carcinoma (LUSC) and lung adenocarcinoma (LUAD) and discuss translation of the method to tissue microarrays.

Methods: A pilot project was used to compare label-free discovery proteomics, tandem mass tag (TMT) chemical labeling, and DIA approaches. Then, a combined spectral library was created for pools of LUSC, LUAD, and their corresponding adjacent control tissues using basic pH reversed phase chromatography separation of tryptic peptides, fraction concatenation, and UPLC-MS/MS with data dependent acquisition (RSLC & QExactive Plus, Thermo). Using the same instrument, DIA LC-MS/MS was used to evaluate each sample individually, generating a total dataset for 159 samples (50 LUSC, 46 LUAD, and 63 adjacent control lung tissues). Study design was blocked (tumor and adjacent controls together) and then randomized to mix LUSC and LUAD specimens. Twelve tissue specimens were analyzed in each batch, with pooled LUSC and LUAD samples analyzed at the beginning and end of each batch (n = 16). Instrument parameters included injection of 500 ng of total peptide digest per sample, C18-PepMap100 columns (Thermo), and a 60-minute gradient from 15% to 38.5% solvent B (90% aqueous ACN, 0.1% formic acid). MS data were acquired at 70,000 resolving power for up to 3E6 ions accumulated over 200 ms. DIA MS/MS data were then acquired using 28 a.u. normalized collision energy for peptide ions from m/z 450 to m/z 1084 at 17,500 resolving power, accumulating 1E6 ions for up to 100 ms (which matches accumulation time to mass analysis time). Isolation windows from m/z 450-900 were set to 7 Th with 1 Th overlap; from m/z 900-1084, 15 Th windows with 1 Th overlap were used. Raw data were analyzed with Pinnacle (OPTYS Tech); pathway mapping and enrichment analysis were completed using MetaCore (GeneGo). DIA analysis was also applied to lung tumor TMA cores excised using UV laser capture microdissection to assess feasibility.
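
The isolation window scheme described in the Methods can be sketched in a few lines of Python. This is an illustrative sketch only, not the instrument method file: it assumes that "1 Th overlap" means each window begins 1 Th before the previous window ends, and that the last window of each segment may extend past the segment's upper bound. The function name build_windows is hypothetical, and the resulting window counts depend entirely on these assumptions.

```python
def build_windows(start, stop, width, overlap):
    """Return (low, high) isolation windows, in Th, covering [start, stop].

    Assumption: consecutive windows share `overlap` Th, so each window
    begins (width - overlap) Th after the previous one.
    """
    step = width - overlap
    windows = []
    low = start
    # Keep adding windows until the upper bound of the segment is covered.
    while not windows or windows[-1][1] < stop:
        windows.append((low, low + width))
        low += step
    return windows


# Segment boundaries, widths, and overlap are taken from the Methods text.
narrow = build_windows(450, 900, width=7, overlap=1)
wide = build_windows(900, 1084, width=15, overlap=1)

print(len(narrow), narrow[0], narrow[-1])
print(len(wide), wide[0], wide[-1])
```

A different overlap convention (for example, 0.5 Th added to each side of a nominal window) would shift the window edges and could change the counts, so the output should be read as one possible layout rather than the acquired one.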

Results: The pilot project indicated a favorable return for DIA experiments (>4,000 proteins) as compared with label-free discovery proteomics (>8,000 proteins) and TMT (>6,000 proteins), considering the amount of instrument time dedicated to each experiment. For the top 50 pathways from protein set enrichment analysis, DIA datasets contained ~70% of the proteins of the label-free experiment, which had the highest total data content. Analysis of the proteins from the spectral library that were not detected by DIA identified several cancer-related proteins, which may be good candidates for targeted tandem mass spectrometry or parallel reaction monitoring to increase detection of biologically relevant signaling.

Optimization of DIA using cell lysates and tissue homogenates led to the methods described above. The critical components are determination of the mass selection windows and elimination of MS/MS of regions that contained few peptides. As an example, the dataset was queried for unique proteins identified by peptides in each DIA mass selection window; high m/z MS/MS acquisition could be eliminated because few proteins were detected solely by peptides in that region and, furthermore, these proteins did not have known relevance to cancer or lung disease based on literature review.
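
The window-level query described above can be illustrated with a short Python sketch. The records, protein names, the m/z bounds of the candidate region, and the function proteins_unique_to_region are all hypothetical and invented for illustration; they are not data or code from the study. The idea is to group the detected precursor m/z values by protein and report the proteins whose peptides all fall within the candidate region, since only those proteins would be lost if that region were no longer sampled.

```python
from collections import defaultdict


def proteins_unique_to_region(peptides, low, high):
    """Return proteins whose detected peptides all have precursor m/z in [low, high).

    `peptides` is an iterable of (protein_id, precursor_mz) pairs.
    """
    mz_by_protein = defaultdict(list)
    for protein_id, mz in peptides:
        mz_by_protein[protein_id].append(mz)
    return sorted(
        protein_id
        for protein_id, mzs in mz_by_protein.items()
        if all(low <= mz < high for mz in mzs)
    )


# Hypothetical toy records, not study data.
toy_peptides = [
    ("PROT_A", 512.3), ("PROT_A", 1100.6),   # also seen at low m/z
    ("PROT_B", 1120.4),                      # seen only at high m/z
    ("PROT_C", 640.8), ("PROT_C", 702.1),    # seen only at low m/z
    ("PROT_D", 1090.2), ("PROT_D", 1150.9),  # seen only at high m/z
]

# The 1084-1200 region is an arbitrary example of a "high m/z" range.
only_high = proteins_unique_to_region(toy_peptides, 1084, 1200)
print(only_high)
```

In this toy example, PROT_A is still detectable through its low m/z peptide, so only PROT_B and PROT_D would be lost; in practice such a list would then be reviewed for biological relevance, as the text describes.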

After method development and evaluation, acquisition of the entire DIA dataset took ~25 days of instrument time (3 hrs/sample), including 13 replicate analyses for quality control and evaluation of reproducibility between batches. Data were compared to determine differences between LUSC, LUAD, and adjacent control tissue, which were then compared to previous literature. Hierarchical clustering was used to evaluate potential tumor classification schemes based on proteomic phenotypes.
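
As a rough orientation for the figures above, the stated instrument time can be converted into an approximate number of three-hour runs. This sketch assumes continuous, around-the-clock operation, which the abstract does not state; the variable names are illustrative.

```python
DAYS = 25            # approximate total instrument time, from the text
HOURS_PER_RUN = 3    # instrument time per sample, from the text
SAMPLES = 159        # individual tissue samples, from the Methods
REPLICATES = 13      # replicate analyses for quality control, from the text

# Assumption: the instrument runs 24 hours per day.
run_slots = DAYS * 24 // HOURS_PER_RUN
itemized_runs = SAMPLES + REPLICATES
itemized_days = itemized_runs * HOURS_PER_RUN / 24

print(run_slots)      # 200
print(itemized_runs)  # 172
print(itemized_days)  # 21.5
```

Under that assumption, ~25 days corresponds to roughly 200 runs, of which the individual samples and replicates account for 172; the abstract does not itemize how the remaining time was used.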

Analysis of the TMA cores produced quantitative data for 3,097 to 3,605 proteins, indicating that the technology can be applied to minimal amounts of formalin fixed paraffin embedded tumor tissue. Because TMAs are usually assembled to address specific clinical questions, further analysis of these cohorts can be an extremely valuable use for DIA.

Conclusions: Data independent acquisition offers a discovery proteomics method that retains sufficient depth of coverage while enabling analysis of large patient cohorts (n ~ 100) within 1 month on a single instrument. Initial data have been produced for lung squamous cell carcinoma and lung adenocarcinoma, which indicate its utility for assessment of patient groups. A proof-of-principle experiment has also demonstrated feasibility for analysis of tissue microarray cores; future application to additional TMAs can provide insights into the biology of disease using specimens that have been annotated with extensive clinical detail. Further development of the mass spectrometry measurements to insert biomarker quantification into the context of the DIA experiment will also be important to better understand tumor biology.


References & Acknowledgements:

Lesur A, Domon B. Proteomics 2015, 15, 880-90.

Schubert OT, et al. Nat Protoc. 2015, 10, 426-41.

Gillet LC, et al. Mol Cell Proteomics 2012, 11, O111.01671.

Guo T, et al. Nat Med. 2015, 21, 407-13.

Prakash A, et al. J Proteome Res. 2014, 13, 5415-30.

Grant support has been received from the American Lung Association Lung Cancer Discovery Award (LCD-257857 to JK) and the National Cancer Institute (R21-CA169980 to JK and R21-CA169979 to EH) and Moffitt’s Lung Cancer Center of Excellence. Proteomics, Analytic Microscopy, and Tissue Core are supported by the National Cancer Institute (Cancer Center Support Grant P30-CA076292), and the Moffitt Foundation.


Financial Disclosure

Description     Y/N    Source
Grants          yes    Proteome Sciences/Electrophoretics
Salary          yes    John Matthew Koomen
Board Member    no
Stock           no
Expenses        yes    Thermo

IP Royalty: no

Planning to mention or discuss specific products or technology of the company(ies) listed above:

yes