MSACL 2018 US Abstract

Topic: Metabolomics

Targeted Full-Scan Data Analysis Using the ChromXtractor Tool Suite

Adam Rosebrock (Presenter)
Stony Brook University

Bio: Dr. Adam Rosebrock is an assistant professor in the Department of Pathology at Stony Brook School of Medicine and the Stony Brook University Cancer Center. He has a longstanding interest in using “big data” to address fundamental biological questions. His research focuses on understanding the regulation of biochemical activities that underlie cell division, growth, and survival across diverse external states. The Rosebrock lab actively develops new experimental and analytical methods and builds genetic, hardware, and computational tools to enable high-throughput and high-content biology, with particular emphasis on quantitative mass-spectrometry metabolomics.

Authorship: Adam P Rosebrock (1)
(1) Department of Pathology, Stony Brook University School of Medicine

Short Abstract

Ongoing advances in sensitivity, resolution, and acquisition speed of mass spectrometers are driving adoption in a growing range of research and clinical applications. New approaches for data storage, analysis, and review are necessary to deal with the increased volume and information density of data from new and emerging platforms. I will discuss how our open-source ChromXtractor software suite enables targeted analysis of full-scan LC/GC/CE-MS data, including the ability to (1) quickly analyze full-scan mass spectrometry data in a feature-centric fashion, (2) perform robust feature-level local chromatographic/electrophoretic alignment, (3) streamline review of individual compounds within a sample or group, (4) enable project-wide consensus-bounded integration, and (5) correct data to spiked-in mass-labeled standards, while (6) passing data to and from other open-source and commercial tools.

Long Abstract


Mass spectrometry is increasingly applied to both clinical diagnostics and clinical research. Full scan mass spectrometry produces data which can be queried for the presence of multiple compounds, but such analyses have in the past been challenged by chromatographic and instrument drift, by variation in sensitivity between runs, and by the wide range of signal intensities observed in typical metabolomic analyses.

We have developed a set of software tools, collectively termed ChromXtractor, to process data collected in profile mode. Analysis of profile mode data expands the dynamic range of analysis by avoiding the dropout artifacts that commonly occur in centroid mode data. We have implemented a range of methods for visualization, inspection, and quality control of large datasets. In addition to quantitation of known compounds, our tools enable recursive data mining for discovery and analysis of new mass spectral features.
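As an illustration of what targeted extraction from profile mode data involves, the following is a minimal sketch in Python. It is not ChromXtractor code: the scan layout, the function name `extract_ion_chromatogram`, and the ppm-window summation are assumptions chosen only to show the general idea of pulling one feature's chromatogram out of full-scan data.

```python
from typing import List, Sequence, Tuple

# Hypothetical scan layout: (retention time, m/z values, intensities).
Scan = Tuple[float, Sequence[float], Sequence[float]]


def extract_ion_chromatogram(
    scans: List[Scan], target_mz: float, ppm: float = 20.0
) -> List[Tuple[float, float]]:
    """Sum profile-mode intensity inside a ppm window around target_mz, per scan."""
    half_width = target_mz * ppm * 1e-6
    low, high = target_mz - half_width, target_mz + half_width
    trace = []
    for rt, mzs, intensities in scans:
        # Keep every profile point inside the window, not only a centroid.
        total = sum(i for mz, i in zip(mzs, intensities) if low <= mz <= high)
        trace.append((rt, total))
    return trace


# Toy data: two profile points near m/z 100 and an unrelated ion at m/z 200.
scans = [
    (1.0, [100.000, 100.001, 200.000], [10.0, 5.0, 99.0]),
    (2.0, [100.000, 100.001, 200.000], [40.0, 20.0, 98.0]),
    (3.0, [100.000, 100.001, 200.000], [8.0, 4.0, 97.0]),
]
eic = extract_ion_chromatogram(scans, target_mz=100.0005, ppm=20.0)
```

Because the full scans are retained, the same function can be rerun later with a different `target_mz` to look for a newly discovered compound in archived data.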


ChromXtractorPRO is a software package that integrates with the ProteoWizard pipeline to access raw profile mode mass spectral data from a range of platforms. ChromXtractorPRO has been implemented as a standalone software toolkit in R with the added ability to seamlessly and bidirectionally extend the functionality of XCMS, including addition of local feature alignment and manual integration.

Our group routinely employs several liquid chromatographies, including conventional reverse phase, ion-paired reverse phase, and HILIC methods. We developed the ChromXtractor suite to deal with the diverse needs of robustly aligning and analyzing these differing data types as well as CE and GC hyphenated mass spectrometry analyses. The suite includes intuitive, user-friendly tools for rapid inspection and comparison of retention times of spike-ins, allowing run-to-run and within-run tracking of method performance parameters. These tools enable the analyst to identify issues such as column lifetime and system performance with a few mouse clicks.

I will discuss how ChromXtractor streamlines analysis of sample sets and enables rapid processing of large data sets composed of many thousands of samples, including our use of a biologically derived, mass-labeled internal standard. ChromXtractorPRO enables simultaneous peak alignment, integration, and analysis of this internal standard using the same parameters as for the analyte of interest. Importantly, this internal standard can be used for peak alignment, ensuring that all samples are aligned to the constant signal provided by the spike-in.
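The idea of integrating the analyte and its mass-labeled standard with identical parameters can be sketched as follows. This is an illustrative Python example under stated assumptions, not the ChromXtractorPRO implementation: the function names, the trapezoidal integration, and the toy traces are all hypothetical.

```python
from typing import List, Tuple

Trace = List[Tuple[float, float]]  # (retention time, intensity) pairs


def integrate(trace: Trace, rt_start: float, rt_end: float) -> float:
    """Trapezoidal peak area between rt_start and rt_end (inclusive)."""
    pts = [(rt, y) for rt, y in trace if rt_start <= rt <= rt_end]
    return sum((t2 - t1) * (y1 + y2) / 2.0 for (t1, y1), (t2, y2) in zip(pts, pts[1:]))


def ratio_to_standard(analyte: Trace, standard: Trace, rt_start: float, rt_end: float) -> float:
    """Integrate both traces over the SAME bounds and return analyte/standard."""
    standard_area = integrate(standard, rt_start, rt_end)
    if standard_area <= 0:
        raise ValueError("internal standard not detected within the integration bounds")
    return integrate(analyte, rt_start, rt_end) / standard_area


# Toy co-eluting traces: the labeled standard has half the analyte signal.
analyte = [(0.0, 0.0), (1.0, 10.0), (2.0, 20.0), (3.0, 10.0), (4.0, 0.0)]
standard = [(0.0, 0.0), (1.0, 5.0), (2.0, 10.0), (3.0, 5.0), (4.0, 0.0)]
ratio = ratio_to_standard(analyte, standard, rt_start=1.0, rt_end=3.0)
```

Using one pair of bounds for both traces means that any choice about where the peak starts and ends affects numerator and denominator equally.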

Extracted features are locally aligned and consensus-bound integrated prior to statistical analysis. For unbiased feature discovery, samples were analyzed using MassHunter Profinder (Agilent Technologies) in a recursive approach to generate mass spectral features for subsequent extraction and integration by Profinder. We additionally have a large set of credentialed biological features


Consistent integration of non-symmetric chromatographic peaks is one of the major challenges of LC-MS data analysis. In addition to automated integration, ChromXtractorPRO uniquely provides a set of tools to display aligned sample data in a compound-by-compound manner for individual integration. This hand inspection can be performed with just two mouse clicks, and problematic peaks can be flagged for subsequent follow-up. Targeted mode analysis of profile data using ChromXtractorPRO requires 5-10% of the hands-on user time required by commercially available software packages. This improvement in data workup has enabled our group to successfully take on large projects encompassing hundreds and even thousands of samples. For samples with stable isotope internal reference material, the integration parameters are identical for the analyte peak and the reference peak, ensuring faithful quantitation.
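Local alignment followed by consensus-bound integration can be pictured with a small Python sketch. The approach shown here (an integer scan shift chosen by maximizing overlap with a reference trace, then median peak bounds across samples) is an assumption made for illustration and is not a description of the algorithms used in ChromXtractorPRO; all function names are hypothetical.

```python
from statistics import median
from typing import List, Sequence, Tuple


def best_shift(reference: Sequence[float], sample: Sequence[float], max_shift: int) -> int:
    """Integer shift (in scans) to add to sample indices to best overlap the reference."""
    def score(shift: int) -> float:
        return sum(
            reference[i + shift] * sample[i]
            for i in range(len(sample))
            if 0 <= i + shift < len(reference)
        )
    return max(range(-max_shift, max_shift + 1), key=score)


def apply_shift(sample: Sequence[float], shift: int) -> List[float]:
    """Move each point by `shift` scans, padding vacated positions with zero."""
    shifted = [0.0] * len(sample)
    for i, y in enumerate(sample):
        if 0 <= i + shift < len(sample):
            shifted[i + shift] = y
    return shifted


def consensus_bounds(per_sample_bounds: Sequence[Tuple[float, float]]) -> Tuple[float, float]:
    """One (start, end) pair for the whole project: the median of per-sample bounds."""
    return (
        median(start for start, _ in per_sample_bounds),
        median(end for _, end in per_sample_bounds),
    )


# Toy traces: the sample peak elutes two scans later than the reference peak.
reference = [0.0, 1.0, 5.0, 9.0, 5.0, 1.0, 0.0, 0.0, 0.0]
sample = [0.0, 0.0, 0.0, 1.0, 5.0, 9.0, 5.0, 1.0, 0.0]
shift = best_shift(reference, sample, max_shift=3)
aligned = apply_shift(sample, shift)
bounds = consensus_bounds([(1, 5), (2, 6), (1, 5)])
```

Restricting the search to a small `max_shift` keeps the alignment local to one feature, so a drifting peak is not matched to an unrelated neighbor.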

Conclusions & Discussion

Our ChromXtractor software permits robust, rapid quantitation of mass spectrometric data collected using TOF and qTOF instrumentation hyphenated with any separation. For routine applications, the vast majority of full-scan metabolomics data are collected but ignored during subsequent analysis; these tools, which we designed specifically for this "targeted full-scan" approach, streamline that process. The presence of full-scan data in archive storage provides a powerful resource for re-analysis and data mining of existing files for newly discovered compounds.

References & Acknowledgements:

K. U. Laverty, J. Yuan, and A. Lunyov for coding and debugging of software tools.

A. A. Caudy for early access to data and extensive discussion.

Rosebrock, A. P. (2017). Targeted full-scan LC-MS metabolomics: simultaneous quantitation of knowns and feature discovery provide the best of both worlds. Bioanalysis, 9(1), 5–8.

Financial Disclosure

Grants: yes, Agilent Technologies
Salary: yes, Maple Flavored Solutions LLC
Board Member: yes, MSACL

IP Royalty: no

Planning to mention or discuss specific products or technology of the company(ies) listed above: