Date of this Version


Document Type



This work aims to predict the symptom severity and contagiousness of a person infected with a respiratory virus, using time-series gene expression data. Four respiratory viruses were studied: RSV, H1N1, H3N2, and Rhinovirus. A separate predictive model was built for each virus at each time point. Partial least squares discriminant analysis (PLS-DA) was used for feature selection, and a random forest was used for classification. Certain genes were identified as biomarkers for distinguishing between subjects. Gene enrichment analysis was performed on the differentially expressed genes. Prediction accuracy was high even when only expression data from early time points were analyzed. Significant genes were detected as early as 5 and 10 hours post infection, compared with 29 hours post infection in prior work. The potential biomarkers obtained with the proposed approach need to be investigated further.
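The feature-selection step can be illustrated with a minimal sketch. It ranks genes by the magnitude of their weight on the first partial least squares component, computed separately at each time point. This is only one common way to derive a ranking from partial least squares discriminant analysis; the abstract does not state which criterion was actually used. The gene names, expression values, labels, and function names below are all hypothetical toy choices, and the random forest classification step is not shown.

```python
from math import sqrt


def first_pls_weights(X, y):
    """Unit-norm weight vector of the first PLS component, w ∝ Xc^T yc.

    X is a list of samples (rows) by genes (columns); y holds 0/1 class labels.
    Both are mean-centered before the weights are computed.
    """
    n, p = len(X), len(X[0])
    col_means = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    w = [
        sum((X[i][j] - col_means[j]) * (y[i] - y_mean) for i in range(n))
        for j in range(p)
    ]
    norm = sqrt(sum(v * v for v in w))
    if norm == 0:
        return [0.0] * p  # no gene covaries with the label
    return [v / norm for v in w]


def top_genes(X, y, gene_names, k):
    """Return the k genes with the largest absolute first-component weight."""
    w = first_pls_weights(X, y)
    ranked = sorted(zip(gene_names, w), key=lambda gw: abs(gw[1]), reverse=True)
    return [name for name, _ in ranked[:k]]


# Hypothetical toy data: hours post infection -> (expression matrix, labels).
# Labels are an assumed binary outcome, e.g. 0 = asymptomatic, 1 = symptomatic.
gene_names = ["geneA", "geneB", "geneC"]
expression_by_time = {
    5: ([[1.0, 5.0, 2.0], [1.2, 5.1, 2.1], [3.0, 5.0, 2.0], [3.1, 4.9, 2.2]],
        [0, 0, 1, 1]),
    10: ([[2.0, 5.0, 1.0], [2.1, 5.0, 1.1], [2.0, 5.1, 4.0], [2.1, 5.0, 4.2]],
         [0, 0, 1, 1]),
}

# One ranking per time point, mirroring the per-time-point modeling.
selected = {
    hours: top_genes(X, y, gene_names, k=1)
    for hours, (X, y) in expression_by_time.items()
}
print(selected)
```

In a fuller pipeline, the selected genes at each time point would then be passed to a classifier; with scikit-learn, for example, `PLSRegression` and `RandomForestClassifier` could fill these two roles, though that pairing is an assumption here and not a description of the authors' implementation.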


Presented at F1000Research 2016.



Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.



Rights Statement

In Copyright. URI:
This Item is protected by copyright and/or related rights. You are free to use this Item in any way that is permitted by the copyright and related rights legislation that applies to your use. For other uses you need to obtain permission from the rights-holder(s).