Document Type
Thesis
Degree
Master of Science (MS)
Major/Program
Statistics
First Advisor's Name
Hassan Zahedi
First Advisor's Committee Title
Committee Chair
Second Advisor's Name
Sneh Gulati
Second Advisor's Committee Title
Committee Member
Third Advisor's Name
Golam Kibria
Third Advisor's Committee Title
Committee Member
Fourth Advisor's Name
Zhenmin Chen
Fourth Advisor's Committee Title
Committee Member
Keywords
Linear Regression, Subset Selection, forward selection, backward elimination, regression trees, random forest, best subset selection, high dimensional data, regression
Date of Defense
4-11-2019
Abstract
Regression is a statistical technique for modeling the relationship between a dependent variable Y and one or more predictor variables, also known as regressors. Within the broad field of regression, there is a special case in which the relationship between the dependent variable and the regressor(s) is linear. This is known as linear regression.
The purpose of this paper is to develop a method that effectively selects a subset of regressors when dealing with high dimensional data and/or collinearity in linear regression. As the name suggests, high dimensional data occurs when the number of predictor variables is too large for commonly used methods to be applied. Collinearity, on the other hand, occurs when there is a linear relationship between two or more of the independent variables.
This paper is divided into three main sections: an introduction, which reviews the key concepts needed for a full understanding of the paper; the methodology, which guides the reader, step by step, through the newly devised method; and the results, which explain and analyze the findings and propose ideas for further study.
Identifier
FIDC007704
Recommended Citation
Nodarse, Elieser, "Best Probable Subset: A New Method for Reducing Data Dimensionality in Linear Regression" (2019). FIU Electronic Theses and Dissertations. 4280.
https://digitalcommons.fiu.edu/etd/4280
Included in
Applied Statistics Commons, Multivariate Analysis Commons, Probability Commons, Social Statistics Commons, Statistical Methodology Commons, Statistical Models Commons
Rights Statement
In Copyright. URI: http://rightsstatements.org/vocab/InC/1.0/
This Item is protected by copyright and/or related rights. You are free to use this Item in any way that is permitted by the copyright and related rights legislation that applies to your use. For other uses you need to obtain permission from the rights-holder(s).