R Programming Complete Tutorial

Support Vector Machine

A support vector machine (SVM) is a type of model used to analyze data and discover patterns in classification and regression analysis. In its basic form, an SVM is used when the data has exactly two classes (for example, labels 0 and 1). An SVM classifies data by finding the hyperplane that best separates all data points of one class from those of the other class, where "best" means the hyperplane with the largest margin between the two classes. The larger the margin, the better the model. The data points that lie on the boundary of the margin are called the support vectors.

Support vector machines map the training data into a feature space by means of a kernel function. Many kernels are in common use: linear (a plain dot product), radial basis function (RBF), multilayer perceptron, quadratic, polynomial, and others. In addition, there are several methods for training an SVM, such as least squares, quadratic programming, and sequential minimal optimization.

Because the Heart Disease data has a large number of features as well as instances, it is debatable whether a linear or an RBF kernel is the better choice. Although the relation between the attributes and the class labels is non-linear, the large number of features means that the RBF kernel may not improve performance. It is recommended to test both kernels and select the one that performs better.
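The recommendation to test both kernels can be sketched as follows. This is a minimal illustration, not the tutorial's own procedure: it assumes the e1071 package is installed, uses two species from R's built-in iris data as a stand-in for the Heart Disease data (which is not reproduced here), and takes 5-fold cross-validated accuracy as the selection criterion.

```r
# Assumption: the e1071 package is installed (install.packages("e1071")).
library(e1071)

# Stand-in two-class data set (NOT the Heart Disease data): two iris species.
two_class <- droplevels(subset(iris, Species != "setosa"))

set.seed(42)  # cross-validation folds are drawn at random

# Fit one SVM per kernel and record its 5-fold cross-validated accuracy (%).
cv_accuracy <- sapply(c(linear = "linear", rbf = "radial"), function(k) {
  fit <- svm(Species ~ ., data = two_class, kernel = k, cross = 5)
  fit$tot.accuracy
})

print(cv_accuracy)

# Keep whichever kernel scored higher on this data set.
best_kernel <- names(which.max(cv_accuracy))
print(best_kernel)
```

On a real data set the comparison would normally also tune the `cost` parameter (and `gamma` for the RBF kernel) before the two kernels are compared.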

The idea behind an SVM is simple: the algorithm constructs a line or hyperplane that separates the data into classes.
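To make the separating hyperplane concrete, the sketch below classifies a few points with a fixed hyperplane in base R. The weight vector `w`, the offset `b`, and the toy points are hypothetical values chosen for illustration; a real SVM would learn `w` and `b` from training data. It assumes the usual scaling in which the margin boundaries lie at w·x + b = ±1, so the margin width is 2 / ||w||.

```r
# Hypothetical hyperplane w . x + b = 0 in two dimensions.
w <- c(1, 1)
b <- -3

# Toy points, one per row.
X <- rbind(c(0, 1), c(1, 1), c(2, 2), c(3, 2), c(1, 3))

# Signed score of each point; its sign gives the predicted class.
scores <- as.vector(X %*% w + b)
predicted <- ifelse(scores >= 0, 1, -1)

# Width of the margin between the two boundaries w . x + b = -1 and +1.
margin_width <- 2 / sqrt(sum(w^2))

# Points lying exactly on a margin boundary act as support vectors.
support_idx <- which(abs(abs(scores) - 1) < 1e-9)

print(predicted)
print(margin_width)
print(X[support_idx, ])
```

Only the points on the margin boundaries determine where the hyperplane sits; moving any of the other points slightly would leave it unchanged.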

More technically, a support vector machine constructs a hyperplane or a set of hyperplanes in a high- or infinite-dimensional space, which can be used for classification, regression, or other tasks such as outlier detection.
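The high-dimensional space never has to be built explicitly, because a kernel function returns the dot product in that space directly. The base R sketch below illustrates this for a quadratic kernel on two-dimensional inputs. The function names `phi`, `quadratic_kernel`, and `rbf_kernel` and the sample vectors are illustrative assumptions, not part of any package.

```r
# Explicit feature map for the homogeneous quadratic kernel on 2-D inputs.
phi <- function(x) c(x[1]^2, sqrt(2) * x[1] * x[2], x[2]^2)

# The same quantity computed directly in the original 2-D space.
quadratic_kernel <- function(x, z) sum(x * z)^2

# RBF kernel; its implicit feature space is infinite-dimensional,
# so only the kernel form can be evaluated.
rbf_kernel <- function(x, z, gamma = 0.5) exp(-gamma * sum((x - z)^2))

x <- c(1, 2)
z <- c(3, -1)

explicit <- sum(phi(x) * phi(z))    # dot product in the 3-D feature space
implicit <- quadratic_kernel(x, z)  # kernel evaluated on the raw inputs

print(c(explicit = explicit, implicit = implicit))
print(rbf_kernel(x, z))
```

Both routes give the same value for the quadratic kernel, which is why an SVM can work in a very large feature space at the cost of a single kernel evaluation per pair of points.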