Document Type
Dissertation
Degree
Doctor of Philosophy (PhD)
Major/Program
Computer Science
First Advisor's Name
Sundaraja Sitharama Iyengar
First Advisor's Committee Title
Committee Chair
Second Advisor's Name
Nelson Lopez-Jimenez
Third Advisor's Name
Shu-Ching Chen
Fourth Advisor's Name
Naphtali Rishe
Fifth Advisor's Name
Malek Adjouadi
Keywords
Computer Science, Bioinformatics, Mutations, Cancer, Framework, Prediction, Genome, Sequences, DNA, Next Generation Sequencing
Date of Defense
11-15-2013
Abstract
Bio-systems are inherently complex information processing systems. Furthermore, the physiological complexity of biological systems limits both the formation of hypotheses about their behavior and the ability to test those hypotheses. More importantly, the identification and classification of mutations in patients are central topics in today’s cancer research.
Next-generation sequencing (NGS) technologies can provide genome-wide coverage at single-nucleotide resolution and at reasonable speed and cost. The unprecedented molecular characterization provided by NGS offers the potential for an individualized approach to treatment. These advances in cancer genomics have enabled scientists to interrogate cancer-specific genomic variants and compare them with the normal variants in the same patient. Analysis of these data provides a catalog of somatic variants, which are present in the tumor genome but not in the normal tissue DNA.
In this dissertation, we present a new computational framework for the problem of predicting the number of mutations on a chromosome for a given patient, which is a fundamental problem in clinical and research fields. We begin this dissertation with the development of a framework system that is capable of utilizing published data from a longitudinal study of patients with acute myeloid leukemia (AML), whose DNA from both normal and malignant tissues was subjected to NGS analysis at various points in time. By processing the sequencing data from the time of cancer diagnosis using the components of our framework, we tested it by predicting the genomic regions to be mutated at the time of relapse and, later, by comparing our results with the actual regions that showed mutations (discovered at relapse time). We demonstrate that this coupling of the algorithm pipeline can drastically improve the predictive abilities of searching for a reliable molecular signature. Arguably, the most important result of our research is its superior performance compared with other methods such as Radial Basis Function Network, Sequential Minimal Optimization, and Gaussian Process. In the final part of this dissertation, we present a detailed significance, stability, and statistical analysis of our model. A performance comparison of the results is also presented. This work clearly lays a good foundation for future research on other types of cancer.
Identifier
FI13120420
Recommended Citation
Martinez, Juan Carlos, "Towards the Prediction of Mutations in Genomic Sequences" (2013). FIU Electronic Theses and Dissertations. 987.
https://digitalcommons.fiu.edu/etd/987
Rights Statement
In Copyright. URI: http://rightsstatements.org/vocab/InC/1.0/
This Item is protected by copyright and/or related rights. You are free to use this Item in any way that is permitted by the copyright and related rights legislation that applies to your use. For other uses you need to obtain permission from the rights-holder(s).
Comments
Additional Committee Member
Jinpeng Wei