Helge Hecht, RECETOX, Masaryk University
Martin Čech, RECETOX, Masaryk University
Matej Troják, RECETOX, Masaryk University
Maksym Skoryk, RECETOX, Masaryk University
Jiří Novotný, Institute of Computer Science, Masaryk University
Karolína Trachtová, RECETOX, Masaryk University
Aleš Křenek, Institute of Computer Science, Masaryk University
Elliott James Price, RECETOX, Masaryk University
Jana Klánová, RECETOX, Masaryk University
⇒ Abstract
Various tools for processing mass spectrometry (MS) data exist, but few of them offer a user-friendly graphical interface, are built with scalability and robustness in mind, or deliver reproducible results. Many of them are also tied to specific frameworks, e.g., Bioconductor. There is a need for harmonization in data processing for MS-based -omics.

The Galaxy framework enables user-friendly access to an expandable set of tools and workflows hosted on compute infrastructure, bringing FAIR data principles to -omics data processing. We extended existing Galaxy resources with R and Python packages (i.e., apLCMS, xMSannotator, RIAssigner, matchms, WaveICA, MSMetaEnhancer and RAMClustR) and compiled them into comprehensive end-to-end workflows for non-target analysis (NTA). These tools perform several steps (e.g., peak detection & deconvolution, spectral matching & annotation, batch correction & normalization, and metadata curation) to enable end-to-end pre-processing of data acquired via gas chromatography (GC) coupled with MS.

In the demo, we will show how to operate the various tools in a data processing pipeline to obtain reproducible and shareable results, using an example GC-HRMS dataset. We will show how to choose parameters for peak detection & deconvolution specifically for GC-MS, in contrast to LC-MS data, and how these parameter choices affect the outputs. Subsequently, we will show how to incorporate retention index information from RIAssigner into the annotation performed with matchms to improve the chemical annotation. Besides the workflow for compound annotation, we also include tools enabling automated curation and expansion of metadata in the associated mass spectral libraries.

By applying the knowledge gained from the topics covered in this demo, participants will be able to process their own MS data using Galaxy. All developments covered are open source and hosted on GitHub, providing immediate access and creating a platform for scientific collaboration. Access to these resources is provided to the community through a public Galaxy instance and by uploading the tools to the main Galaxy Tool Shed.
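
To illustrate the idea of combining retention index information with spectral matching, the following is a minimal, self-contained Python sketch. It does not use the RIAssigner or matchms APIs. All function names, the dictionary layout of spectra, and the tolerance and threshold values are assumptions made for illustration only. The sketch computes a linear retention index from bracketing n-alkane reference retention times, discards library candidates whose retention index differs too much from the query, and ranks the rest by a simple binned cosine similarity.

```python
import math
from bisect import bisect_right


def retention_index(rt, alkane_rts, alkane_carbons):
    """Linear retention index interpolated between bracketing n-alkanes.

    alkane_rts must be sorted ascending and aligned with alkane_carbons.
    Retention times outside the reference range are extrapolated linearly.
    """
    i = bisect_right(alkane_rts, rt) - 1
    i = max(0, min(i, len(alkane_rts) - 2))  # keep i and i + 1 valid
    t0, t1 = alkane_rts[i], alkane_rts[i + 1]
    c0, c1 = alkane_carbons[i], alkane_carbons[i + 1]
    return 100.0 * (c0 + (c1 - c0) * (rt - t0) / (t1 - t0))


def cosine_score(peaks_a, peaks_b, bin_width=1.0):
    """Cosine similarity of two peak lists after fixed-width m/z binning."""

    def binned(peaks):
        vec = {}
        for mz, intensity in peaks:
            key = round(mz / bin_width)
            vec[key] = vec.get(key, 0.0) + intensity
        return vec

    a, b = binned(peaks_a), binned(peaks_b)
    dot = sum(v * b.get(k, 0.0) for k, v in a.items())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def annotate(query, library, ri_tolerance=20.0, min_score=0.7):
    """Rank library entries by cosine score, keeping only RI-compatible ones.

    query and each library entry are dicts with keys "ri" and "peaks";
    library entries additionally carry a "name" (hypothetical layout).
    """
    hits = []
    for entry in library:
        if abs(entry["ri"] - query["ri"]) > ri_tolerance:
            continue  # spectrum may match, but retention behaviour does not
        score = cosine_score(query["peaks"], entry["peaks"])
        if score >= min_score:
            hits.append((entry["name"], score))
    return sorted(hits, key=lambda hit: hit[1], reverse=True)
```

In this sketch the retention index acts as a hard pre-filter before scoring, so two library compounds with near-identical spectra but different retention behaviour are no longer both reported. Folding the retention index difference into a combined score would be an equally plausible design.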