A Comparative Study on Support Vector Machines
Author: Ohemeng, Matthew A.
Mathematics and Statistics
In this thesis, we study Support Vector Machines (SVMs) for binary classification. We review the literature on SVMs and other classification methods. We perform simulations to compare the kernel functions available in selected R packages and to investigate the variable selection property of penalized SVMs. We consider three kinds of data: a mostly linearly separable data set, a mostly linearly non-separable data set, and a linearly non-separable data set that requires nonlinear SVMs. In addition, traditional classification methods, including Linear Discriminant Analysis, Quadratic Discriminant Analysis, K-Nearest Neighbors, and Logistic Regression, are fit to the same data sets and compared with the SVM models. The simulation results indicate that the choice of kernel function is key to obtaining a good fit to a particular data set. Moreover, where nonlinear SVMs are not required (as with the linearly separable data set), fitting a nonlinear SVM is likely to result in overfitting. Finally, we apply SVMs and the other classification techniques to Alzheimer's disease data.
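To make concrete what "choosing a kernel function" means, the following is a minimal sketch of three commonly used SVM kernels in plain Python. It is an illustration only, not the thesis's code: the thesis works with R packages, and the function names, the sample points, and the parameter defaults (`degree`, `coef0`, `gamma`) here are assumptions chosen for the example.

```python
import math


def linear_kernel(x, z):
    """Linear kernel: the ordinary inner product <x, z>."""
    return sum(a * b for a, b in zip(x, z))


def polynomial_kernel(x, z, degree=2, coef0=1.0):
    """Polynomial kernel: (<x, z> + coef0) ** degree (defaults are assumptions)."""
    return (linear_kernel(x, z) + coef0) ** degree


def rbf_kernel(x, z, gamma=0.5):
    """Radial basis function kernel: exp(-gamma * ||x - z||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)


def gram_matrix(points, kernel):
    """Matrix of pairwise kernel values; this is what an SVM solver works with."""
    return [[kernel(p, q) for q in points] for p in points]


# Hypothetical sample points, used only to show how the kernels differ.
points = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)]

for name, kernel in [
    ("linear", linear_kernel),
    ("polynomial", polynomial_kernel),
    ("rbf", rbf_kernel),
]:
    print(name, gram_matrix(points, kernel))
```

Each kernel induces a different notion of similarity between observations, which is why the same data set can be fit well by one kernel and poorly, or too flexibly, by another.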