Rapidly link COVID-19 Multiomic data
Reliable discoveries with knowing01
Linking fragmented data, whether it comes from different technologies or different omics levels, and extracting something meaningful from it is a time-consuming and sometimes frustrating task. This is exactly where we at knowing01 can help. With our years of experience as data scientists and our unified data model, CellMap technology, we can quickly and easily link multi-omics and other heterogeneous datasets. Our software is designed to analyze any kind of data without the need for a data scientist, to uncover hidden information and deliver reliable results.
Here we would like to share one of our success stories, a scientific collaboration, where we supported the analysis of RNA-binding proteins (RBPs) in COVID-19.
Why are RBPs of interest for COVID-19?
SARS-CoV-2, the virus that causes COVID-19, is a betacoronavirus with a single-stranded RNA genome that, upon infection, relies on the host cell's protein synthesis machinery to translate the viral proteins it needs for replication. Recently, host cell RBPs have been reported to interact with viral RNA elements and influence several steps of the viral life cycle, such as viral RNA recruitment to the membrane and synthesis of genomic viral RNAs (Horlacher et al.).
Rapid harmonization of nearly a hundred Multiomics datasets using knowing01 solutions
The goal of the research study (preprint available on bioRxiv: Horlacher et al.) was to create an interaction map of RNA-binding protein (RBP) binding sites on the genome of SARS-CoV-2, the virus that causes COVID-19, and to understand the functions of these RBPs during infection. It is the result of a collaboration between knowing01, the Marsico Lab at Helmholtz Munich (knowing01's pre-incubation academic host institution) and the Ohler Lab at the Max Delbrück Center for Molecular Medicine, Berlin.
Using two deep learning methods (pysster and DeepRiPe) trained on a large CLIP-seq dataset, the authors created the first in silico map of interactions between human RBPs and viral RNA for SARS-CoV-2 and six other human coronaviruses. In total, they obtained high-quality predictive models for more than 80 RBPs and identified a number of gain- and loss-of-binding events, that is, protein-RNA interactions altered by variants in eleven different SARS-CoV-2 strains.
Excerpt from Figure 6 of Horlacher et al. showing RBP model predictions (rows) annotated with publicly available predicted and experimental SARS-CoV-2 RNA research results (columns).
One of the remaining challenges was to understand the functional effects of these RBPs during SARS-CoV-2 infection and molecular regulation during COVID-19 disease.
Therefore, 97 datasets from 22 publicly available omics research studies on COVID-19 were collected to find evidence supporting the obtained RBP models. knowing01’s CellMap unified data model helped to quickly harmonize and integrate these datasets. Specifically, the 88 RBP models were annotated by automated mapping to gene, variant and protein identifiers. Among other findings, this led to the identification of the protein-coding genes Muscleblind Like Splicing Regulator 1 (MBNL1), FTO Alpha-Ketoglutarate Dependent Dioxygenase (FTO) and FMR1 Autosomal Homolog 2 (FXR2) as potential clinical biomarkers for COVID-19.
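To give a feel for what automated identifier mapping involves, here is a minimal Python sketch. It is not the CellMap implementation: the alias table, the function names and every identifier other than the three gene symbols named above are hypothetical placeholders chosen for illustration. The idea is simply that each dataset labels the same gene in its own way, so every label is first resolved to one canonical gene symbol and the supporting datasets are then collected per RBP model.

```python
from collections import defaultdict

# Hypothetical alias table: every identifier a dataset might use, mapped to one
# canonical gene symbol. All non-symbol identifiers are made-up placeholders.
ALIAS_TO_SYMBOL = {
    "MBNL1": "MBNL1", "GENE_ID_001": "MBNL1", "PROT_ID_001": "MBNL1",
    "FTO": "FTO", "GENE_ID_002": "FTO",
    "FXR2": "FXR2", "PROT_ID_003": "FXR2",
}


def canonical_symbol(identifier):
    """Return the canonical gene symbol for an identifier, or None if unknown."""
    # Normalize whitespace and case before the lookup.
    return ALIAS_TO_SYMBOL.get(identifier.strip().upper())


def annotate_models(rbp_models, dataset_records):
    """Collect, per RBP model, the datasets that mention the same gene."""
    evidence = defaultdict(set)
    for dataset_name, identifier in dataset_records:
        symbol = canonical_symbol(identifier)
        if symbol is not None:  # identifiers that cannot be mapped are skipped
            evidence[symbol].add(dataset_name)
    return {model: sorted(evidence.get(model, set())) for model in rbp_models}


# Toy records: (dataset name, identifier as written in that dataset).
records = [
    ("proteomics_study", "PROT_ID_001"),
    ("transcriptomics_study", "mbnl1 "),
    ("transcriptomics_study", "GENE_ID_002"),
    ("variant_study", "UNKNOWN_ID"),
]
annotations = annotate_models(["MBNL1", "FTO", "FXR2"], records)
print(annotations)
```

In this toy setup, an RBP model with no matching records simply ends up with an empty evidence list, which makes gaps in the collected studies easy to spot.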
Are you interested in value-driven Multiomics data? Then book a meeting with us or use the contact form below.