MSACL 2017 US Abstract

Metabolomics Guided Systems Biology in Clinical Applications

Tao Huan (Presenter)
The Scripps Research Institute

Bio: I am a Research Associate in Gary Siuzdak’s lab at the Center for Metabolomics and Mass Spectrometry at The Scripps Research Institute (La Jolla, CA). My research focuses on the development and application of mass spectrometry-based technologies for metabolomics. One important aspect of my research is inventing new bioinformatic tools for convenient metabolomic data processing and multi-omics integration. Before joining the Siuzdak lab, I received my Ph.D. in Analytical Chemistry from the University of Alberta under the supervision of Dr. Liang Li; my thesis topic was chemical isotope labeling LC-MS-based metabolomics.

Authorship: Tao Huan, Duane Rinehart, H Paul Benton, Erica Forsberg, Jose Rafael Montenegro Burke, Mingliang Fang, Aries Aisporna, and Gary Siuzdak
The Scripps Research Institute, La Jolla, CA

Short Abstract

Over the last 15 years, metabolomics has emerged as a powerful technology to interrogate cellular biochemistry, perform diagnostic testing, and characterize biochemical mechanisms of disease. Owing to innovative developments in informatics, analytical technologies, and the integration of orthogonal biological approaches, it is now possible to expand metabolomic analyses toward understanding the systems-level effects of metabolites. In this work, we incorporated systems-level technologies into XCMS, a widely used metabolomic platform, to gain insight into the mechanisms of disease progression in clinical applications. Our platform allows users to map metabolomic data directly onto metabolic pathways with “one click” and to carry out multi-omic integration with self-uploaded and/or database-archived epigenomic, genetic variation, genomic, transcriptomic, and proteomic data in a user-friendly manner.

Long Abstract

Introduction

While the success of metabolomics has been driven by analytical advances in mass spectrometry and NMR, developments in bioinformatic resources for data processing have been equally important. For example, the widely used metabolomic software XCMS Online(1), developed by our lab, has been a cornerstone of the field and is used by thousands of investigators worldwide. Currently, XCMS Online has over 13,000 registered users in 180+ countries, and its user base grows daily. These statistics reflect the rapid growth of the metabolomics field and our commitment to developing easy-to-use, intuitive tools for analyzing comprehensive metabolomic data.

In this work, we extend the capabilities of the XCMS Online platform to perform multi-omic integrative analysis. To achieve this goal, we first implemented a metabolic pathway prediction algorithm that maps metabolomic data directly onto metabolic pathways before the time-consuming step of metabolite identification. We then incorporated transcriptomic and proteomic databases to allow automatic integrative analysis of the dysregulated metabolic pathways identified from the metabolomic results. Further, we constructed libraries of epigenomic data (DNA methylation) and genetic variations (single nucleotide polymorphisms (SNPs) and trait-associated SNPs) within XCMS Online, which allow users to associate these gene regulation elements with each specific gene, displayed by pathway in an interactive format and linked to the analysis results in XCMS Online. To demonstrate its performance, we applied this systems biology platform to a colon cancer study to understand how genetic regulation influences colon cancer progression and cancer metabolism.

Methods

Species-specific pathway information was archived with pathways and genes from BioCyc, proteins from UniProt, and metabolites from KEGG and METLIN. Over 7600 species are provided in the platform, including human, mouse, and yeast.

With respect to epigenomic data, we have archived DNA methylation data for 26 cancer types from The Cancer Genome Atlas (TCGA) and for human aging from publicly available datasets in the NCBI Gene Expression Omnibus (GEO). We are also actively adding DNA methylation data for other common diseases (diabetes, Alzheimer's disease, etc.) and phenotypes (such as drug resistance or addiction), identified through our own searches and user requests.

SNP data were downloaded from the UCSC Genome Browser and include all known SNPs in both human and mouse. The current version of the SNP database contains >120 million entries for human and >81 million entries for mouse. In addition, trait-associated SNPs from Genome Wide Association Studies (GWAS), obtained from NCBI, are included as a separate category. As with the DNA methylation data, we are actively adding these SNPs from the NCBI data repository.

To perform pathway analysis, a metabolic pathway enrichment analysis algorithm, mummichog(2), was modified and implemented in XCMS Online. This tool operates directly on the XCMS feature table to reveal the biological relevance of dysregulated metabolites in the form of metabolic networks and pathways. Further, to perform multi-omic integration, users can upload a list of differentially expressed genes and proteins. The multi-omic analysis tool then performs gene and/or protein matching, mapping the overlapping genes and/or proteins from the user-uploaded data onto the pathways previously predicted by the metabolic pathway analysis.
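The gene and/or protein matching step described above can be illustrated with a minimal sketch. This is not the XCMS Online implementation; the function name `match_to_pathways`, the case-insensitive comparison, and all pathway and gene identifiers below are hypothetical placeholders chosen only to show the idea of overlapping an uploaded list with the members of each predicted pathway.

```python
from typing import Dict, List, Set


def match_to_pathways(
    uploaded: Set[str], pathways: Dict[str, Set[str]]
) -> Dict[str, List[str]]:
    """For each pathway, list the uploaded identifiers that belong to it.

    Identifiers are compared case-insensitively (an assumption made for
    this sketch). Pathways with no overlap are omitted from the result.
    """
    normalized = {identifier.upper() for identifier in uploaded}
    hits: Dict[str, List[str]] = {}
    for name, members in pathways.items():
        overlap = sorted(normalized & {m.upper() for m in members})
        if overlap:
            hits[name] = overlap
    return hits


# Hypothetical pathways predicted from the metabolomic data.
predicted_pathways = {
    "pathway_1": {"GENE_A", "GENE_B", "GENE_C"},
    "pathway_2": {"GENE_D", "GENE_E"},
}
# Hypothetical user-uploaded list of differentially expressed genes.
uploaded_genes = {"gene_a", "GENE_B", "GENE_X"}

matches = match_to_pathways(uploaded_genes, predicted_pathways)
print(matches)
```

Only pathways that share at least one identifier with the uploaded list are returned, which mirrors the idea of restricting the multi-omic view to pathways already implicated by the metabolomic analysis.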

Both the epigenome and genetic variations play important roles in gene regulation and significantly influence downstream metabolic pathways. We therefore implemented our epigenetic and genetic variation databases in the multi-omic analysis platform to allow these gene regulation elements to be associated with each specific dysregulated gene from the user-uploaded data. We further augmented our data visualization tools to graphically display the number and identity of these results by pathway in an interactive format, linked to the analysis results in XCMS Online. This systems-level integration also provides hyperlinks to additional detailed information about each DNA methylation site and SNP, providing a comprehensive multi-omic analysis.
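As a rough illustration of how regulation elements might be associated with dysregulated genes and summarized by pathway, consider the sketch below. The record layout (dictionaries with `gene` and `id` keys), the function name `annotate_pathway_genes`, and every identifier are assumptions made for this example; they do not reflect the actual database schema in XCMS Online.

```python
from collections import defaultdict
from typing import Dict, List


def annotate_pathway_genes(
    pathway_genes: Dict[str, List[str]],
    methylation: List[dict],
    snps: List[dict],
) -> Dict[str, Dict[str, dict]]:
    """Attach methylation and SNP record ids to each gene, grouped by pathway."""
    # Index the regulation records by gene symbol for quick lookup.
    meth_by_gene = defaultdict(list)
    for record in methylation:
        meth_by_gene[record["gene"]].append(record["id"])
    snp_by_gene = defaultdict(list)
    for record in snps:
        snp_by_gene[record["gene"]].append(record["id"])

    annotated: Dict[str, Dict[str, dict]] = {}
    for pathway, genes in pathway_genes.items():
        annotated[pathway] = {
            gene: {
                "methylation": meth_by_gene.get(gene, []),
                "snps": snp_by_gene.get(gene, []),
            }
            for gene in genes
        }
    return annotated


# Hypothetical inputs: dysregulated genes per pathway and archived records.
pathway_genes = {"pathway_1": ["GENE_A", "GENE_B"]}
methylation = [{"gene": "GENE_A", "id": "meth_site_1"}]
snps = [
    {"gene": "GENE_A", "id": "snp_1"},
    {"gene": "GENE_B", "id": "snp_2"},
    {"gene": "GENE_Z", "id": "snp_3"},  # not in any pathway, so ignored
]

annotation = annotate_pathway_genes(pathway_genes, methylation, snps)
# Count of regulation elements per pathway, e.g. for a summary display.
counts = {
    pathway: sum(len(v["methylation"]) + len(v["snps"]) for v in genes.values())
    for pathway, genes in annotation.items()
}
print(annotation)
print(counts)
```

The per-pathway counts correspond to the "number" of results mentioned above, while the record ids stand in for the entries that would carry hyperlinks to detailed information.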

Paired colon tissue samples (tumor vs. normal) from 60 colon cancer patients were received and stored in a -80 °C freezer. Detailed clinical information is also available for the patients and tumors (size, metastases, locations). After metabolites were extracted from the tissue samples with organic solvents, comprehensive metabolomic data were acquired using HPLC-MS in ESI positive mode and HILIC-MS in ESI negative mode. Metabolomic data were processed in XCMS Online. Comprehensive transcriptomic and proteomic data were downloaded from The Cancer Genome Atlas (TCGA) and the Clinical Proteomic Tumor Analysis Consortium (CPTAC), respectively. Multi-omic integrative analysis was performed in XCMS Online.

Results

Traditionally in metabolomics studies, significant metabolites are selected from the entire metabolomic dataset using subjectively defined thresholds for fold change, p-value, and signal intensity, followed by manual identity confirmation. The pathways in which the dysregulated metabolites are involved are then determined and compared with differentially expressed genes and proteins, using either bioinformatic tools or manual examination; overall, this is a tedious and time-consuming process. In our strategy, we developed a one-step approach within XCMS Online that conveniently links metabolomic data directly to their biological context in the form of metabolic pathways. Further, integrative analysis of these metabolic pathways is achieved by correlating the details of the metabolic pathways with epigenetic, genetic variation, transcriptomic, and proteomic data to decipher the metabolic network at the systems level.
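To make the idea of pathway-level significance concrete, the sketch below computes a simple hypergeometric over-representation p-value. This is a simplified stand-in and not the mummichog algorithm used in the platform, which works on features before metabolite identification; the function name and the numbers in the example are assumptions for illustration only.

```python
from math import comb


def overrepresentation_pvalue(
    total: int, in_pathway: int, selected: int, overlap: int
) -> float:
    """Probability of seeing at least `overlap` pathway members by chance.

    total      -- number of metabolites in the reference set
    in_pathway -- number of those metabolites that belong to the pathway
    selected   -- number of significant metabolites drawn from the set
    overlap    -- number of significant metabolites that are in the pathway
    """
    upper = min(in_pathway, selected)
    # Sum the hypergeometric probabilities for k = overlap .. upper.
    favorable = sum(
        comb(in_pathway, k) * comb(total - in_pathway, selected - k)
        for k in range(overlap, upper + 1)
    )
    return favorable / comb(total, selected)


# Hypothetical example: 4 metabolites in total, 2 in the pathway,
# 2 selected as significant, both of them in the pathway.
p = overrepresentation_pvalue(total=4, in_pathway=2, selected=2, overlap=2)
print(p)
```

A small p-value indicates that the significant metabolites fall in the pathway more often than expected from random selection.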

We demonstrated this platform using a colon cancer study to examine the metabolic differences between patient-derived samples of colon cancer and normal tissues (paired analyses with n=30). Over 7,000 metabolic features were detected (XCMS Online public job ID# 1100254), and 10% of them were statistically significant, with p-values less than 0.01. These features were then used to predict associated metabolic pathways with the mummichog algorithm. Comprehensive RNAseq transcriptomic and shotgun proteomic data were acquired from The Cancer Genome Atlas (TCGA) and the Clinical Proteomic Tumor Analysis Consortium (CPTAC) on separate samples (n=44). In total, over 10,000 differentially expressed mRNAs (fold change ≥ 1.2, p-value ≤ 0.01) and over 2,500 differentially expressed proteins (fold change ≥ 1.2, p-value ≤ 0.01) were used to correlate genes and proteins with metabolites. Ten metabolic pathways were identified with statistical significance (p-value ≤ 0.01), five of which have previously been implicated in the progression of cancer. Specifically, we noticed that the spermine and spermidine degradation pathway was dysregulated not only in metabolite concentrations, but also at the gene expression and protein synthesis levels. This demonstrates the power of performing integrative analysis with real clinical samples, which gives us a systems-level view of cancer metabolism. More importantly, integrative analysis with the colon cancer-specific epigenetic and genetic variation data archived in XCMS Online reveals several important DNA methylation sites and colon cancer-associated SNP sites that have not been reported before. The detailed study of their biological and clinical importance is ongoing.
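The selection thresholds quoted above (fold change ≥ 1.2, p-value ≤ 0.01) can be expressed as a short filter. The sketch assumes that fold change is a positive ratio and that changes in either direction count, so a ratio of 0.5 is treated as a two-fold change; the abstract does not state this, and the function name and example values are hypothetical.

```python
def passes_thresholds(
    fold_change: float,
    p_value: float,
    min_fold_change: float = 1.2,
    max_p_value: float = 0.01,
) -> bool:
    """Return True if an entry meets both the fold-change and p-value cutoffs."""
    if fold_change <= 0:
        raise ValueError("fold_change must be a positive ratio")
    # Assumption: up- and down-regulation are treated symmetrically.
    magnitude = max(fold_change, 1.0 / fold_change)
    return magnitude >= min_fold_change and p_value <= max_p_value


# Hypothetical entries: (name, fold change, p-value).
entries = [
    ("entry_1", 1.5, 0.001),  # passes both cutoffs
    ("entry_2", 0.5, 0.005),  # two-fold decrease, passes
    ("entry_3", 1.1, 0.001),  # fold change too small
    ("entry_4", 2.0, 0.05),   # p-value too large
]
selected = [name for name, fc, p in entries if passes_thresholds(fc, p)]
print(selected)
```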

Conclusion

In this study, we developed a metabolomics-guided systems biology platform and implemented it within the XCMS interactive interfaces to address the need for new bioinformatic developments in pathway mapping and integrative multi-omic analysis for clinical applications. This interface streamlines the interpretation of metabolomic data to provide results that can immediately be put into biological context. The platform is designed as a free cloud-based resource and is readily available to the online community, which now includes over 13,000 registered researchers. Meanwhile, this systems biology platform is being tested in an ongoing colon cancer research project that aims to address the role of genetic regulation in colon cancer progression. This application allows us to systematically understand how cancer progression and cancer metabolism are affected by genetic regulation factors, such as DNA methylation and SNPs occurring in gene promoter regions.


References & Acknowledgements:

1. R. Tautenhahn, G. J. Patti, D. Rinehart, G. Siuzdak, XCMS Online: a web-based platform to process untargeted metabolomic data. Anal. Chem. 84, 5035-5039 (2012).

2. S. Li et al., Predicting network activity from high throughput metabolomics. PLoS Comput. Biol. 9, e1003123 (2013).


Financial Disclosure

Description     Y/N    Source
Grants          no
Salary          no
Board Member    no
Stock           no
Expenses        no

IP Royalty: no

Planning to mention or discuss specific products or technology of the company(ies) listed above:

no