Document Type
Dissertation
Degree
Doctor of Philosophy (PhD)
Major/Program
Computer Science
First Advisor's Name
Giri Narasimhan
First Advisor's Committee Title
Committee chair
Second Advisor's Name
Ziv Bar-Joseph
Second Advisor's Committee Title
Committee member
Third Advisor's Name
Kalai Mathee
Third Advisor's Committee Title
Committee member
Fourth Advisor's Name
Ananda Mondal
Fourth Advisor's Committee Title
Committee member
Fifth Advisor's Name
Fahad Saeed
Fifth Advisor's Committee Title
Committee member
Keywords
Longitudinal microbiome analysis, Multi-omic integration, Microbial composition prediction, Dynamic Bayesian networks, Temporal alignment, Causal inference
Date of Defense
11-12-2020
Abstract
Microbiomes are communities of microbes inhabiting an environmental niche. Thanks to next-generation sequencing technologies, it is now possible to study microbial communities, their impact on the host environment, and their role in specific diseases and health. Technology has also triggered the increased generation of multi-omics microbiome data, including metatranscriptomics (quantitative survey of the complete metatranscriptome of the microbial community), metabolomics (quantitative profile of the entire set of metabolites present in the microbiome's environmental niche), and host transcriptomics (gene expression profile of the host). Consequently, a major challenge in microbiome data analysis is the integration of multi-omics data sets and the construction of unified models. Finally, since microbiomes are inherently dynamic, longitudinal studies are critical to fully understanding the complex interactions that take place within these communities. Although the analysis of longitudinal microbiome data has been attempted, existing approaches do not probe interactions between taxa, do not offer holistic analyses, and do not investigate causal relationships.
In this work, we propose approaches to address all of the above challenges. We propose novel analysis pipelines to analyze multi-omic longitudinal microbiome data, and to infer temporal and causal relationships between the different entities involved. As a first step, we showed how to deal with longitudinal metagenomic data sets by building a pipeline, PRIMAL, which takes microbial abundance data as input and outputs a dynamic Bayesian network model that is highly predictive, suggests significant interactions between the different microbes, and proposes important connections from clinical variables. A significant innovation of our work is its ability to deal with differential rates of the internal biological processes in different individuals. Second, we showed how to analyze longitudinal multi-omic microbiome data sets. Our pipeline, PALM, significantly extends the previous state of the art by allowing for the integration of longitudinal metatranscriptomics, host transcriptomics, and metabolomics data in addition to longitudinal metagenomics data. PALM achieves predictive power comparable to that of the PRIMAL pipeline while discovering a web of interactions between the entities of far greater complexity. An important innovation of PALM is the use of a multi-omic Skeleton framework that incorporates prior knowledge in the learning of the models. Another major innovation of this work is devising a suite of validation methods, both in silico and in vitro, enhancing the utility and validity of PALM. Finally, we propose a suite of novel methods (unrolling and de-confounding), called METALICA, consisting of tools and techniques that make it possible to uncover significant details about the nature of microbial interactions. We also show methods to validate such interactions using ground-truth databases. The proposed methods were tested using an inflammatory bowel disease (IBD) multi-omics data set.
Identifier
FIDC009230
ORCID
https://orcid.org/0000-0002-5622-560X
Previously Published In
- Lugo-Martinez, J., Ruiz-Perez, D., Narasimhan, G., & Bar-Joseph, Z. (2019). Dynamic interaction network inference from longitudinal microbiome data. Microbiome, 7(1), 54.
- Ruiz-Perez, D., Lugo-Martinez, J., Bourguignon, N., Mathee, K., Lerner, B., Bar-Joseph, Z., & Narasimhan, G. (2019). Dynamic Bayesian networks for integrating multi-omics time-series microbiome data. bioRxiv, 835124.
Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 License.
Recommended Citation
Ruiz-Perez, Daniel, "Temporal and Causal Inference with Longitudinal Multi-omics Microbiome Data" (2020). FIU Electronic Theses and Dissertations. 4557.
https://digitalcommons.fiu.edu/etd/4557
Rights Statement
In Copyright. URI: http://rightsstatements.org/vocab/InC/1.0/
This Item is protected by copyright and/or related rights. You are free to use this Item in any way that is permitted by the copyright and related rights legislation that applies to your use. For other uses you need to obtain permission from the rights-holder(s).