Exploring R
Evolution of R
Programming Features of R
R for Machine Learning
R for Data Analysis
Application of R
R vs. Python vs. SAS
R vs. Excel vs. Tableau
Install R base on Windows
Install RStudio on Windows
Install R base on Ubuntu
Install RStudio on Ubuntu
R Starter
First R Program
Working with R Packages
R Workplace and R Sessions
Manage working directory
Customize RStudio
RStudio Debugger
RStudio History and Environment variables
R Syntax
R Variables
R Data Types & Structures
R Arithmetic Operators
R Logical Operators
R If Statement
R - If…Else Statement
If…else if…Else Statement
R for loop
R while loop
R repeat loop
R String Construction
R String Manipulation Functions
Creating Character Strings
R Functions
R built-in functions
Working with Vector
R Vector Indexing
R Vector Modification
R Arithmetic Vector Operations
R Lists
Access List elements (List Slicing)
List modification
R Matrix construction
Access Matrix elements
R Matrix Modification
R Matrix Operations
R Array Construction
Accessing Array Elements
Manipulating Array Elements
R Data Frames
Data Extraction
Data Frame Expansion
R Built-in Data frames
R Factors
Manage Factor levels
Factor Functions
R Contingency Tables
R Data Visualization
R – Charts and Graphs
R Density Plot
R Strip Charts
R Boxplots
R Violin Plots
R Bar Charts
R Pie Charts
R Area Plots
R Time Series
Graphics with ggplot2
ggplot2 Structure
ggplot2 Bar Charts
ggplot2 Pie Chart
ggplot2 Area Plot
ggplot2 Histogram
ggplot2 Scatter Plot
ggplot2 Box Plot
Mean & Median
Standard Deviation
Normal Distribution
Correlation
T-Tests
Chi-Square Test
ANOVA Test
Survival Analysis
Data Pre-processing and Missing Value Analysis
Missing data treatment
Missing value analysis with mice package
Outlier Analysis
Problems with outliers
Outlier Detection
Outlier Treatment
Simple Linear Regression
Mathematical Computation
Linear Regression in R
A complete Simple Regression Analysis
Multiple Linear Regression
Mathematical Analysis
Model Interpretation
A complete Multiple Regression Analysis
Logistic Regression
Mathematical Computation in R
Logistic Regression in R
Heart Risk Analysis using LR
Support Vector Machine
Heart Risk Analysis using SVM
Decision Trees
Random Forest
K means Clustering
Big data Analytics using R-Hadoop
RHADOOP Packages:
rJava: Low-Level R to Java Interface
rhdfs: Integrate R with HDFS
rmr2: MapReduce job in R
plyrmr: Data Manipulation with MapReduce job
rhbase: Integrate HBase with R
Environment setup for RHADOOP
Getting Started with RHADOOP
Coronary heart disease remains one of the leading causes of death worldwide. One of the biggest contributors to coronary disease is a lack of commitment to a heart-healthy lifestyle and the consequences associated with it.
This project aims at early detection, that is, finding out whether a patient is at risk of coronary heart disease within the next ten years. We use the heartdata data set for this analysis.
Step 1: Split the data set into a training set and a test set.
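As a minimal sketch of such a split in base R: the heartdata set itself is not reproduced here, so the data frame below is a simulated stand-in, and the column names (age, sysBP, TenYearCHD) and the 70/30 ratio are assumptions made only for illustration.

```r
# Hypothetical stand-in for heartdata; the columns are assumptions.
set.seed(123)
n <- 200
heartdata <- data.frame(
  age        = round(runif(n, 30, 70)),
  sysBP      = round(rnorm(n, 130, 20)),
  TenYearCHD = factor(rbinom(n, 1, 0.15), levels = c(0, 1))
)

# Draw 70% of the row indices at random for training.
train_idx <- sample(seq_len(n), size = round(0.7 * n))

training <- heartdata[train_idx, ]   # rows used to fit the model
testing  <- heartdata[-train_idx, ]  # remaining rows, held out for evaluation

dim(training)
dim(testing)
```

Fixing the seed with set.seed() makes the split reproducible, so the same rows end up in the training set on every run.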
Step 2: Model Training using the training set
In this section, we will train our model using the svm() function. The syntax is given below.
svm() Syntax:
We will use the same training data set that we used for the logistic regression model. The command below trains the SVM model.
In the svm() function:
In our experiment, we used the svm() function with a polynomial kernel. We tested different cost values (.001, .01, 1, and 10) and observed that svm() with a cost of .001 gave us the best result. We have stored that model in the svm_Linear variable.
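The experiment described above can be sketched as follows, assuming svm() comes from the e1071 package. The training data here is simulated (the column names are assumptions), and the accuracy comparison is only one possible way to judge the cost values; the text does not say which criterion was actually used.

```r
library(e1071)  # assumed source of svm()

# Hypothetical stand-in for the training set; column names are assumptions.
set.seed(42)
n <- 140
training <- data.frame(
  age   = runif(n, 30, 70),
  sysBP = rnorm(n, 130, 20)
)
# Let the outcome depend loosely on age so there is something to learn.
training$TenYearCHD <- factor(
  as.integer(training$age + rnorm(n, 0, 10) > 60),
  levels = c(0, 1)
)

costs <- c(0.001, 0.01, 1, 10)

# Fit one polynomial-kernel SVM per cost value.
fits <- lapply(costs, function(cost_value) {
  svm(TenYearCHD ~ ., data = training,
      type = "C-classification", kernel = "polynomial", cost = cost_value)
})

# Accuracy on the training data, for brevity; held-out data gives a fairer comparison.
accuracy <- sapply(fits, function(fit) {
  mean(predict(fit, training) == training$TenYearCHD)
})
data.frame(cost = costs, accuracy = accuracy)

# Keep the model fitted with cost = 0.001 under the name used in the text.
svm_Linear <- fits[[1]]
```

A small cost tolerates more misclassified training points in exchange for a wider margin, while a large cost penalizes them more heavily.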
Detail of svm_Linear:
Step 3: Model Prediction using the test data set
We use the predict() function to generate predictions, passing it two arguments. The first is our trained model, and the second, "newdata", holds our testing data frame. For a classification model, predict() returns the predicted class for each row of the test data. We save the result in the predict_svm variable.
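A minimal sketch of this prediction step, assuming the model was fitted with svm() from the e1071 package. The data is simulated and the column names are assumptions; only the names svm_Linear, predict_svm, and newdata come from the text.

```r
library(e1071)  # assumed source of svm() and its predict() method

# Hypothetical stand-in data; column names are assumptions.
set.seed(42)
n <- 200
heart_sim <- data.frame(
  age   = runif(n, 30, 70),
  sysBP = rnorm(n, 130, 20)
)
heart_sim$TenYearCHD <- factor(
  as.integer(heart_sim$age + rnorm(n, 0, 10) > 60),
  levels = c(0, 1)
)

train_idx <- sample(seq_len(n), size = 140)
training  <- heart_sim[train_idx, ]
testing   <- heart_sim[-train_idx, ]

svm_Linear <- svm(TenYearCHD ~ ., data = training,
                  type = "C-classification", kernel = "polynomial",
                  cost = 0.001)

# First argument: the trained model; newdata: the held-out data frame.
predict_svm <- predict(svm_Linear, newdata = testing)

class(predict_svm)   # the predictions come back as a factor of class labels
head(predict_svm)
```

There is one predicted label per test row, in the same order as the rows of the testing data frame.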
Step 4: Model Evaluation using Confusion Matrix and ROC Curve:
Confusion Matrix for SVM
For the Support Vector Machine, we will use the confusionMatrix() function from the caret package. We pass it a table built with the table() function from the predicted and actual classes. It reports the confusion matrix together with measurements such as sensitivity, specificity, and many more. In our test, we got only 1 true positive and no false-positive values.
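To show what these measurements mean, the sketch below builds a confusion matrix with table() and computes sensitivity, specificity, and accuracy by hand in base R. The ten labels are made up for illustration (they mimic a model with one true positive and no false positives) and are not the results of this analysis.

```r
# Made-up labels for ten patients (1 = at risk, 0 = not at risk).
actual    <- factor(c(1, 1, 1, 1, 0, 0, 0, 0, 0, 0), levels = c(0, 1))
predicted <- factor(c(1, 0, 0, 0, 0, 0, 0, 0, 0, 0), levels = c(0, 1))

# Rows are the predictions, columns are the true classes.
cm <- table(Predicted = predicted, Actual = actual)
print(cm)

tp <- cm["1", "1"]  # at-risk patients correctly flagged
fn <- cm["0", "1"]  # at-risk patients the model missed
fp <- cm["1", "0"]  # healthy patients wrongly flagged
tn <- cm["0", "0"]  # healthy patients correctly cleared

sensitivity <- tp / (tp + fn)       # 1 / 4  = 0.25
specificity <- tn / (tn + fp)       # 6 / 6  = 1
accuracy    <- (tp + tn) / sum(cm)  # 7 / 10 = 0.7
```

With caret loaded, passing the same table to confusionMatrix(cm, positive = "1") should report these measures along with several others. The example also shows why accuracy alone can mislead: the model misses three of the four at-risk patients yet still reaches 70% accuracy.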
AUC-ROC Curve
The AUC-ROC curve is a performance measurement for classification problems at various threshold settings. ROC is a probability curve, and AUC represents the degree or measure of separability: it shows how well the model can differentiate between classes. The higher the AUC, the better the model is at predicting 1s as 1s and 0s as 0s. In our case, the higher the AUC, the better the model distinguishes patients at risk of heart disease from patients at no risk.
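One way to read the AUC is as the probability that a randomly chosen at-risk patient receives a higher score than a randomly chosen patient who is not at risk. The base R sketch below computes it that way; the scores and labels are made up for illustration and are not output from the SVM model.

```r
# Made-up model scores (higher = more likely at risk) and true labels.
scores <- c(0.9, 0.8, 0.35, 0.6, 0.4, 0.3, 0.2, 0.1)
labels <- c(1,   1,   1,    0,   0,   0,   0,   0)

pos <- scores[labels == 1]  # scores of the at-risk patients
neg <- scores[labels == 0]  # scores of the patients not at risk

# Compare every positive with every negative: 1 for a win, 0.5 for a tie.
wins <- outer(pos, neg, ">") + 0.5 * outer(pos, neg, "==")

# AUC is the share of positive/negative pairs that are ranked correctly.
auc <- mean(wins)
auc  # 13 of the 15 pairs are ordered correctly, so 13 / 15
```

An AUC of 0.5 corresponds to random ranking and 1 to perfect separation. Packages such as pROC provide roc() and auc() to compute and plot the curve directly. Note that this requires a continuous score for each patient rather than only a predicted class label.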
ROC-AUC Curve for SVM: