Document Type

Thesis

Degree

Master of Science (MS)

Major/Program

Statistics

First Advisor's Name

Sneh Gulati

First Advisor's Committee Title

Committee Chair

Second Advisor's Name

Zhenmin Chen

Second Advisor's Committee Title

Committee Member

Third Advisor's Name

Jie Mi

Third Advisor's Committee Title

Committee Member

Keywords

Sabermetrics, Statistics, Baseball, Runs, Sports, Offense, Defense, Regression, Estimation

Date of Defense

3-30-2018

Abstract

This thesis investigated which baseball metrics are most conducive to run creation and prevention. Stepwise regression and Liu estimation were used to formulate two models, one for each dependent variable, and were also applied in cross-validation. Finally, the predicted values were fed into the Pythagorean Expectation formula to predict a team’s most important goal: winning.

Both models fit the data strongly, and collinearity among the offensive predictors was assessed using variance inflation factors. The significant defensive predictors were hits, walks, and home runs allowed; infield putouts; errors; defense-independent earned run average ratio; defensive efficiency ratio; saves; runners left on base; shutouts; and walks per nine innings. The significant offensive regressors were doubles, home runs, walks, batting average, and runners left on base. Both models produced error rates below 3% for run prediction, and together they did an excellent job of estimating a team’s per-season win ratio.
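
The last step described above, turning predicted runs into a win ratio, can be illustrated with a minimal sketch. It assumes the classic form of the Pythagorean Expectation with an exponent of 2; the abstract does not state which exponent the thesis used, and the function name and the sample run totals are hypothetical.

```python
def pythagorean_expectation(runs_scored, runs_allowed, exponent=2.0):
    """Estimate a team's win ratio from runs scored and runs allowed.

    Assumption: exponent=2.0 is the classic value; the thesis may use
    a different one.
    """
    if runs_scored < 0 or runs_allowed < 0:
        raise ValueError("run totals must be non-negative")
    if runs_scored == 0 and runs_allowed == 0:
        raise ValueError("win ratio is undefined when both totals are zero")
    scored = runs_scored ** exponent
    allowed = runs_allowed ** exponent
    # Share of "powered" runs that belong to the team.
    return scored / (scored + allowed)


if __name__ == "__main__":
    # Hypothetical season totals, not data from the thesis.
    ratio = pythagorean_expectation(800, 700)
    print(f"Estimated win ratio: {ratio:.3f}")
```

With these made-up totals the estimated win ratio is about 0.566. In the workflow the abstract describes, the two inputs would be the run totals predicted by the offensive and defensive models rather than observed totals.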

Identifier

FIDC006540

Available for download on Sunday, November 08, 2020

Rights Statement

In Copyright. URI: http://rightsstatements.org/vocab/InC/1.0/
This Item is protected by copyright and/or related rights. You are free to use this Item in any way that is permitted by the copyright and related rights legislation that applies to your use. For other uses you need to obtain permission from the rights-holder(s).