Exploring R
Evolution of R
Programming Features of R
R for Machine Learning
R for Data Analysis
Application of R
R vs. Python vs. SAS
R vs. Excel vs. Tableau
Install R base on Windows
Install RStudio on Windows
Install R base on Ubuntu
Install RStudio on Ubuntu
R Starter
First R Program
Working with R Packages
R Workspace and R Sessions
Manage working directory
Customize RStudio
RStudio Debugger
RStudio History and Environment variables
R Syntax
R Variables
R Data Types & Structures
R Arithmetic Operators
R Logical Operators
R If Statement
R - If…Else Statement
If…else if…Else Statement
R for loop
R while loop
R repeat loop
R String Construction
R String Manipulation Functions
Creating Character Strings
R Functions
R built-in functions
Working with Vector
R Vector Indexing
R Vector Modification
R Arithmetic Vector Operations
R Lists
Access List elements (List Slicing)
List modification
R Matrix construction
Access Matrix elements
R Matrix Modification
R Matrix Operations
R Array Construction
Accessing Array Elements
Manipulating Array Elements
R Data Frames
Data Extraction
Data Frame Expansion
R Built-in Data frames
R Factors
Manage Factor levels
Factor Functions
R Contingency Tables
R Data Visualization
R – Charts and Graphs
R Density Plot
R Strip Charts
R Boxplots
R Violin Plots
R Bar Charts
R Pie Charts
R Area Plots
R Time Series
Graphics with ggplot2
ggplot2 Structure
ggplot2 Bar Charts
ggplot2 Pie Chart
ggplot2 Area Plot
ggplot2 Histogram
ggplot2 Scatter Plot
ggplot2 Box Plot
Mean & Median
Standard Deviation
Normal Distribution
Correlation
T-Tests
Chi-Square Test
ANOVA Test
Survival Analysis
Data Pre-processing and Missing Value Analysis
Missing data treatment
Missing value analysis with mice package
Outlier Analysis
Problems with outliers
Outlier Detection
Outlier Treatment
Simple Linear Regression
Mathematical Computation
Linear Regression in R
A complete Simple Regression Analysis
Multiple Linear Regression
Mathematical Analysis
Model Interpretation
A complete Multiple Regression Analysis
Logistic Regression
Mathematical Computation in R
Logistic Regression in R
Heart Risk Analysis using LR
Support Vector Machine
Heart Risk Analysis using SVM
Decision Trees
Random Forest
K-means Clustering
Big data Analytics using R-Hadoop
RHADOOP Packages:
rJava: Low-Level R to Java Interface
rhdfs: Integrate R with HDFS
rmr2: MapReduce job in R
plyrmr: Data Manipulation with MapReduce job
rhbase: Integrate HBase with R
Environment setup for RHADOOP
Getting Started with RHADOOP
Data frame
A data frame is used for storing data tables. Unlike an array, a data frame can hold a different type of data in each column: one column might be a numeric variable, another a factor, and a third a character variable. All columns must have the same length.
Data Frame construction
We can construct a data frame with the data.frame() function. It creates data frames: tightly coupled collections of variables that share many of the properties of matrices and of lists, and that most of R's modeling software uses as its fundamental data structure.
In this example, we pass four vectors of six elements each as the data arguments. The first and third are numeric vectors; the second, Cust_name, is a character vector; and the last is a character vector that we convert to dates. When we run this code, it produces a table-like structure with four columns and six rows.
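The construction described above could be sketched as follows. The column names and types follow the text, and the Cust_id range (543 to 1020) matches the summary discussed later. The individual values and the name bank_data are hypothetical, chosen only for illustration.

```r
# Hypothetical sample values: only the column names, the column types,
# and the Cust_id minimum (543) and maximum (1020) come from the text.
cust_id   <- c(543, 612, 705, 811, 934, 1020)
cust_name <- c("Asha", "Ben", "Chen", "Dina", "Elif", "Farid")
balance   <- c(2500.50, 130.00, 9875.25, 460.75, 3200.00, 15000.00)

# The last vector starts as character data and is converted to Date.
last_trn  <- as.Date(c("2021-01-15", "2021-02-03", "2021-02-20",
                       "2021-03-11", "2021-03-28", "2021-04-05"))

# stringsAsFactors = FALSE keeps Cust_name as character on every R
# version (it has been the default only since R 4.0.0).
bank_data <- data.frame(
  Cust_id   = cust_id,
  Cust_name = cust_name,
  balance   = balance,
  last_trn  = last_trn,
  stringsAsFactors = FALSE
)

print(bank_data)   # table-like output, one row per customer
dim(bank_data)     # number of rows, then number of columns
```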
When we create a data frame, we have to keep three rules in mind: column names should be non-empty, every column must contain the same number of data items, and different columns may store different types of data.
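A small sketch of the equal-length rule, using made-up column names (id, name). Here data.frame() is given a three-element vector and a two-element vector, and the call fails. tryCatch() is used only so the script keeps running and the error message can be inspected. Note that base R recycles a shorter vector when its length divides the longer one exactly, so the example deliberately uses lengths 3 and 2.

```r
# Columns of unequal length (3 and 2): data.frame() raises an error.
# The column names id and name are hypothetical.
result <- tryCatch(
  data.frame(id = 1:3, name = c("a", "b")),
  error = function(e) conditionMessage(e)
)
print(result)  # the error message, returned as a character string

# Columns of equal length but different types are accepted.
ok <- data.frame(id = 1:3, name = c("a", "b", "c"),
                 stringsAsFactors = FALSE)
sapply(ok, class)  # class of each column
```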
Data frame Inspection:
We have already constructed a data frame with four columns and six rows. The str() function displays the structure of a data frame. Applied to the bank data frame, it reports six observations of four variables: Cust_id and balance contain numeric data, Cust_name contains character data, and last_trn contains date data.
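A minimal sketch of the inspection step. So that it runs on its own, the data frame is rebuilt here with hypothetical values; only the column names and types are taken from the text, and bank_data is an assumed name.

```r
# Rebuild the example data frame (values are hypothetical).
bank_data <- data.frame(
  Cust_id   = c(543, 612, 705, 811, 934, 1020),
  Cust_name = c("Asha", "Ben", "Chen", "Dina", "Elif", "Farid"),
  balance   = c(2500.50, 130.00, 9875.25, 460.75, 3200.00, 15000.00),
  last_trn  = as.Date(c("2021-01-15", "2021-02-03", "2021-02-20",
                        "2021-03-11", "2021-03-28", "2021-04-05")),
  stringsAsFactors = FALSE
)

# str() prints the number of observations and variables, then one line
# per column with its type and the first few values.
str(bank_data)

# The column classes can also be collected programmatically.
col_classes <- sapply(bank_data, class)
print(col_classes)
```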
Next, the summary() function, which describes each column in detail. Cust_id contains numeric data with a minimum of 543 and a maximum of 1020; summary() also reports its median, mean, and first and third quartiles. The second column, Cust_name, contains character data of length six. For balance and last_trn, summary() provides the same statistics as for Cust_id.
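The summary step could look like the sketch below. The data frame is rebuilt with hypothetical values so the block is self-contained; only the Cust_id minimum (543) and maximum (1020) are taken from the text.

```r
# Rebuild the example data frame (values are hypothetical, except the
# Cust_id minimum and maximum, which follow the text).
bank_data <- data.frame(
  Cust_id   = c(543, 612, 705, 811, 934, 1020),
  Cust_name = c("Asha", "Ben", "Chen", "Dina", "Elif", "Farid"),
  balance   = c(2500.50, 130.00, 9875.25, 460.75, 3200.00, 15000.00),
  last_trn  = as.Date(c("2021-01-15", "2021-02-03", "2021-02-20",
                        "2021-03-11", "2021-03-28", "2021-04-05")),
  stringsAsFactors = FALSE
)

# Per-column summary: six-number summaries for numeric and Date
# columns, length/class/mode for the character column.
summary(bank_data)

# Summary of a single numeric column, kept for further use.
id_summary <- summary(bank_data$Cust_id)
print(id_summary)
```

The names of id_summary are "Min.", "1st Qu.", "Median", "Mean", "3rd Qu.", and "Max.", so a single statistic can be pulled out with, for example, id_summary["Median"].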