MSACL 2020 US : Burla

MSACL 2020 US Abstract

Topic: Data Science

Podium Presentation in Room 5 on Thursday at 11:00 (Chair: Irene van den Broek)

Supervised Data Processing and Quality Control of Mass Spectrometry-based Lipidomics Analyses Using LIDAR, a Novel Toolbox Implemented in R/Shiny

Bo Burla (Presenter)
National University of Singapore

Presenter Bio(s): I studied biology at the University of Zurich, Switzerland, and completed my PhD in Molecular Plant Physiology in the lab of Prof. Enrico Martinoia, focusing on ABC transporters and comparative molecular phylogenetics. Subsequently, I became a researcher at the University Hospital of Zurich, where I was involved in establishing an LC/MS-based assay for the peptide hormone hepcidin and in projects studying Fabry disease mechanisms. Biological processes, bioanalysis, clinical applications and the use of informatics to improve workflows have been my constant research interests. With this background, I am now working as a senior researcher at the Singapore Lipidomics Incubator (SLING), heading our new data team, which focuses on developing workflows and software pipelines for analytical data processing, QA/QC and exploration of the diverse lipidomics datasets generated by our lab.

Authors: Bo Burla (1), Jeremy John Selva (2), Shanshan Ji (1), Gao Liang (2), Peter Benke (2), Anne K. Bendt (1), Federico Torta (2), Markus R. Wenk (1,2)
(1) Singapore Lipidomics Incubator (SLING), Life Sciences Institute and (2) Department of Biochemistry, YLL School of Medicine, National University of Singapore, Singapore

Abstract

Introduction

Data processing and quality control are integral elements of quantitative omics workflows. They can have a major impact on the results and can themselves be sources of variability, bias and artefacts. Furthermore, the data processing methods used in lipidomics/metabolomics analyses are often insufficiently documented, limiting the transparency and reusability of published datasets.

Objectives

The developed pipeline and software aim to provide a structured, supervised and reproducible data processing and quality control workflow for quantitative lipidomics raw datasets.

Methods

The data processing and analytical quality control pipeline was developed based on published procedures (e.g. Broadhurst et al., Metabolomics, 2018) and on insights obtained from processing the lipidomics analyses in our lab. The core of the LIDAR software pipeline is implemented as an R package with defined data structures and classes. The LIDAR user interface is implemented in R/Shiny with interactive plots and R Markdown. The tool is deployed internally using Docker containers.

Results

The LIDAR toolbox comprises modules for system suitability monitoring, validation of integrated peaks based on retention times and ion ratios, internal standard-based normalization and quantification, correction for isotopic interferences, testing for matrix effects, drift and batch correction, and standardization with reference materials. Functions to filter datasets according to defined QC criteria (e.g. RSD, S/N) are implemented as well. LIDAR reports and visualizes the effect of each data processing step by comparing the data before and after that step. It thereby helps to identify analytical issues and artefacts introduced by data processing steps, which is valuable in the method development/validation phase but also for QA/QC of established assays. Using this tool, we show examples of artefacts that can result from data processing, e.g. that ISTD-based normalization can bias the results or inflate their variability. Furthermore, a few lipidomics-specific data exploration tools are provided in LIDAR, e.g. considering fatty acid compositions and lipid pathways.
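
As an illustration of the ISTD-based normalization and RSD-based QC filtering mentioned above, the following is a minimal Python sketch. It is not LIDAR code (LIDAR is an R package and its API is not described in this abstract). The function names, the peak areas, the ISTD concentration and the RSD thresholds are all hypothetical. The example data are chosen so that the analyte signal is stable while the internal standard signal is noisy, a case in which normalization inflates the variability rather than reducing it.

```python
from statistics import mean, stdev


def istd_normalize(analyte_areas, istd_areas, istd_conc):
    """Single-point ISTD quantification: (analyte area / ISTD area) * ISTD concentration."""
    return [a / i * istd_conc for a, i in zip(analyte_areas, istd_areas)]


def rsd_percent(values):
    """Relative standard deviation (sample SD / mean) in percent."""
    return stdev(values) / mean(values) * 100.0


def passes_qc(values, max_rsd=20.0):
    """True if the RSD across repeated QC injections is at or below max_rsd (hypothetical cutoff)."""
    return rsd_percent(values) <= max_rsd


# Hypothetical peak areas from four pooled-QC injections of one lipid species.
# The analyte signal is stable, while the internal standard signal is noisy.
analyte_areas = [1000.0, 1010.0, 990.0, 1005.0]
istd_areas = [500.0, 600.0, 420.0, 550.0]

# Assumed spiked ISTD concentration (arbitrary units).
concs = istd_normalize(analyte_areas, istd_areas, istd_conc=10.0)

raw_rsd = rsd_percent(analyte_areas)
norm_rsd = rsd_percent(concs)

print(f"RSD of raw areas:        {raw_rsd:.2f}%")
print(f"RSD after normalization: {norm_rsd:.2f}%")
print(f"Passes 20% RSD filter:   {passes_qc(concs, max_rsd=20.0)}")
```

Comparing the RSD before and after the step, as in the last lines, mirrors the before/after reporting described above and shows how an unstable internal standard can be spotted.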

Conclusions

The presented workflow, accompanied by a software toolbox, enables automated, supervised and reproducible data processing of large-scale complex LC/shotgun MS-based lipidomics datasets by lab analysts. This toolbox, implemented in R with defined data structures/interfaces, should facilitate addition of new/improved functionalities by the community. This workflow may also be useful for targeted metabolomics assays. We hope that this workflow and toolbox will contribute to the ongoing efforts towards harmonization and more reproducible research in the lipidomics field.


Financial Disclosure

Description     Y/N   Source
Grants          no
Salary          no
Board Member    no
Stock           no
Expenses        no
IP Royalty      no

Planning to mention or discuss specific products or technology of the company(ies) listed above:

no