Topic: Data Science
Podium Presentation in Room 3 on Wednesday at 11:20 (Chair: Will Slade)
Authors: Abed Pablo, Patrick Mathias
Opioid use and addiction are significant public health challenges across the US. Our laboratory supports providers who manage chronic pain patients by performing a quantitative LC-MS/MS method to monitor opioids and their glucuronide metabolites. To improve the accuracy and consistency of data analysis, an in-house software application, 'smack', was previously built to implement a quality control (QC) algorithm and its calculations. The QC calculations include detection limits that were obtained during validation of the assay and have been updated over time. However, updating QC parameters frequently is challenging because past assay data are difficult to review. Here we describe a combination of tools used to aggregate and visualize historical assay data to drive improvements in the implementation of our custom algorithms.
The objective of this project was to develop a toolkit using open source software to aggregate and visualize historical mass spectrometry metadata.
The opioid assay is performed on Waters LCMS systems, with Waters software controlling data acquisition and processing; results are exported in XML format. The toolkit was written in Python 3.7 (www.python.org). First, we collected the data and built a database of assay results and metadata that included, but was not limited to, peak area, signal-to-noise, retention time, instrument, sample ID, and batch number. From this database, we built interactive histograms using the Plotly library to display data for each parameter by instrument.
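The aggregation step can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the toolkit's actual code: the XML element and attribute names, the table layout, and the use of SQLite are all assumptions made for the example and do not reflect the vendor's real export schema.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical export layout: element and attribute names are invented for
# illustration and do not reflect the vendor's actual XML schema.
SAMPLE_XML = """
<batch number="B001" instrument="LCMS-1">
  <sample id="S1">
    <compound name="morphine" peak_area="15230.5" signal_to_noise="48.2" retention_time="1.92"/>
    <compound name="codeine" peak_area="8120.0" signal_to_noise="31.7" retention_time="2.41"/>
  </sample>
  <sample id="S2">
    <compound name="morphine" peak_area="220.1" signal_to_noise="4.3" retention_time="1.95"/>
  </sample>
</batch>
"""


def parse_batch(xml_text):
    """Flatten one exported batch into (batch, instrument, sample, ...) rows."""
    root = ET.fromstring(xml_text.strip())
    rows = []
    for sample in root.iter("sample"):
        for compound in sample.iter("compound"):
            rows.append((
                root.get("number"),
                root.get("instrument"),
                sample.get("id"),
                compound.get("name"),
                float(compound.get("peak_area")),
                float(compound.get("signal_to_noise")),
                float(compound.get("retention_time")),
            ))
    return rows


def build_database(rows, path=":memory:"):
    """Store the flattened rows in a single SQLite table (assumed layout)."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS results (
               batch_number TEXT, instrument TEXT, sample_id TEXT,
               compound TEXT, peak_area REAL, signal_to_noise REAL,
               retention_time REAL)"""
    )
    conn.executemany("INSERT INTO results VALUES (?, ?, ?, ?, ?, ?, ?)", rows)
    conn.commit()
    return conn


rows = parse_batch(SAMPLE_XML)
conn = build_database(rows)

# Per-instrument summary; the same grouping could feed a histogram per parameter.
per_instrument = conn.execute(
    "SELECT instrument, COUNT(*), AVG(retention_time) "
    "FROM results GROUP BY instrument"
).fetchall()
```

Storing only the selected numeric fields and identifiers, rather than the full exported files, is one plausible reason a database would occupy far less space than the raw exports.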
To test the tool, we examined data from the past six months (>6000 patient samples), visualizing each QC component and observing how the variables evolved over time. Creating a database reduced the storage footprint of the data from 11 GB to 170 MB. The interactive figures let users investigate the data by selecting ranges and zooming into sections of the plots, and they display current QC parameters for comparison against historical data. For example, the data revealed that the relative retention time (RRT) range used in our algorithm was very wide, and that making it more stringent would improve the rate at which we capture false positives. After updating the QC parameters of the 'smack' application, we compared the updated version against the version in use and found improved performance.
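The effect of tightening an RRT acceptance window can be illustrated with a small sketch. All values below are invented for the example, and the rule used to derive the narrower window (mean plus or minus four standard deviations of historical confirmed peaks) is an assumption; the abstract does not state how the updated range was chosen.

```python
from statistics import mean, stdev

# Invented RRT values for illustration only; not data from the assay.
confirmed_rrt = [0.998, 1.001, 1.003, 0.999, 1.002, 1.000, 0.997, 1.004]
interference_rrt = [0.952, 1.041, 0.968, 1.036]  # peaks that should be rejected


def count_accepted(values, low, high):
    """Count how many RRT values fall inside the acceptance window."""
    return sum(low <= v <= high for v in values)


# A deliberately wide window, standing in for the original QC parameter.
wide = (0.95, 1.05)

# Assumed rule: derive a tighter window from historical confirmed peaks.
center = mean(confirmed_rrt)
spread = stdev(confirmed_rrt)
narrow = (center - 4 * spread, center + 4 * spread)

false_pos_wide = count_accepted(interference_rrt, *wide)
false_pos_narrow = count_accepted(interference_rrt, *narrow)
true_pos_narrow = count_accepted(confirmed_rrt, *narrow)

print(f"narrow window: {narrow[0]:.4f} to {narrow[1]:.4f}")
print(f"interferences accepted: wide={false_pos_wide}, narrow={false_pos_narrow}")
print(f"confirmed peaks retained with narrow window: {true_pos_narrow}")
```

In this toy data set the wide window accepts every interfering peak, while the narrower window rejects them without losing any confirmed peak; historical data is what makes it possible to check both sides of that trade-off before changing a QC parameter.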
We describe a Python-based toolkit developed to gather and visualize LCMS metadata. Building a database places the information in a form that is easy to access, share, and visualize. Moreover, interactive visualization packages convey information that is challenging to capture with static images. With this toolkit, we reviewed large volumes of historical data and improved the performance of our custom interpretation algorithms. These tools can be used more broadly to visualize metadata longitudinally from our vendor platform, which is commonly used across clinical labs.