2 Introduction to R
These notes introduce core R concepts you’ll use throughout the course.
Each section includes short explanations and runnable code.
Tip: You can run code line-by-line in RStudio with Ctrl+Enter (Windows) / Cmd+Enter (Mac).
2.1 Basic R Operations and Concepts
R works like a powerful calculator and a programming language. You type commands into the Console, and R evaluates them.
2.2 Arithmetic
R supports standard arithmetic operators:
- Addition: +
- Subtraction: -
- Multiplication: *
- Division: /
- Exponentiation: ^
- Integer division: %/%
- Remainder (mod): %%
## [1] 8
## [1] 6
## [1] 42
## [1] 4
## [1] 32
## [1] 5
## [1] 2
These commands return numeric output immediately in the console.
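The commands that produced the output above are not shown. The sketch below uses assumed operands, chosen only so that each operator reproduces the corresponding result.

```r
# Assumed operands: the original inputs are not shown in these notes.
5 + 3      # addition          -> 8
10 - 4     # subtraction       -> 6
6 * 7      # multiplication    -> 42
20 / 5     # division          -> 4
2^5        # exponentiation    -> 32
17 %/% 3   # integer division  -> 5
17 %% 3    # remainder (mod)   -> 2
```

Integer division and remainder go together: 17 equals 3 * 5 + 2, so `17 %/% 3` gives the 5 and `17 %% 3` gives the 2.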
2.3 Assignment, Object Names, and Data Types
2.3.2 Object Names
Good names are readable and informative.
Valid examples
- height
- exam_score
- x1
Avoid
- spaces (exam score)
- starting with numbers (1st)
- reserved words (if, for, TRUE, FALSE)
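The naming rules above can be checked in R itself. This is a minimal sketch: the assigned values are made up for illustration, and `make.names()` is used as one convenient validity check, since it returns its input unchanged only when the input is already a valid name.

```r
# Valid names can be assigned to directly (the values here are hypothetical).
height <- 170
exam_score <- 88.5
x1 <- 3

# Compare each candidate with what make.names() would turn it into.
candidates <- c("height", "exam_score", "x1", "exam score", "1st", "if", "TRUE")
is_valid <- make.names(candidates) == candidates
names(is_valid) <- candidates
is_valid
```

The first three candidates come back TRUE. The name with a space, the name starting with a digit, and the two reserved words come back FALSE.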
2.4 Vectors
Definition 2.1 (Vector in R) A vector is a collection of values of the same type stored in a single object in R.
Vectors are R's basic building blocks. Each one is an ordered collection of values, and even a single number is a vector of length 1.
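As a small illustration of the definition, the sketch below builds vectors with `c()` and shows what happens when types are mixed. All object names and values are made up for the example.

```r
# Hypothetical example vectors, one per basic type.
scores <- c(90, 85, 72)          # numeric
students <- c("Ana", "Ben")      # character
passed <- c(TRUE, FALSE, TRUE)   # logical

length(scores)   # number of elements: 3
class(scores)    # "numeric"

# A vector holds one type only, so mixed input is converted to a common type.
mixed <- c(1, "a", TRUE)
class(mixed)     # "character"
mixed            # "1" "a" "TRUE"
```

Because every element must share a type, R silently converts the number and the logical value to character strings in `mixed`.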
2.4.3 Vector Arithmetic (Vectorized Operations)
Operations apply element-by-element.
## [1] 3 5 7 9
## [1] 4 8 12 16
## [1] 4 16 36 64
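The code behind the three results above is not shown. One set of inputs that reproduces them, assuming a vector `x` holding 2, 4, 6, 8, is sketched here.

```r
# Assumed input vector: chosen to match the printed results.
x <- c(2, 4, 6, 8)

x + 1   # adds 1 to every element:  3  5  7  9
x * 2   # doubles every element:    4  8 12 16
x^2     # squares every element:    4 16 36 64

# Two vectors of equal length are combined position by position.
y <- c(1, 2, 3, 4)   # hypothetical second vector
x + y                # 3 6 9 12
```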
2.4.4 Sequences with : and seq()
## [1] 1 2 3 4 5 6 7 8 9 10
## [1] 0.0 0.2 0.4 0.6 0.8 1.0
## [1] 1.00 3.25 5.50 7.75 10.00
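The calls that generated these sequences are not shown. The following calls would produce the same three results, so they are a plausible reconstruction rather than the original code.

```r
# Colon operator: consecutive integers from 1 to 10.
1:10

# seq() with a fixed step size: 0.0 0.2 0.4 0.6 0.8 1.0
seq(0, 1, by = 0.2)

# seq() with a fixed number of points: 1.00 3.25 5.50 7.75 10.00
seq(1, 10, length.out = 5)
```

Use `by` when the spacing matters and `length.out` when the number of values matters.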
2.5 Functions and Expressions
Definition 2.2 (Function) A function is a set of instructions that takes inputs and returns an output.
R has many built-in functions: mean(), sum(), sd(), etc.
## [1] 30
## [1] 6
## [1] 3.162278
## [1] 2
## [1] 10
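The inputs for the five results above are not shown. Assuming a vector `x` holding 2, 4, 6, 8, 10, the built-in functions below return exactly those values. The last line is an added illustration of an expression that nests several function calls.

```r
# Assumed input vector: chosen to match the printed results.
x <- c(2, 4, 6, 8, 10)

sum(x)    # 30
mean(x)   # 6
sd(x)     # 3.162278
min(x)    # 2
max(x)    # 10

# Functions can be nested inside an expression: this recomputes sd(x) by hand.
manual_sd <- sqrt(sum((x - mean(x))^2) / (length(x) - 1))
manual_sd
```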
2.5.3 Creating Your Own Function
Use function() to define reusable code.
# A simple function: compute z-scores
zscore <- function(x) {
  (x - mean(x)) / sd(x)
}
zscore(c(10, 12, 15, 20))
## [1] -0.9771621 -0.5173211 0.1724404 1.3220429
# A named summary vector; every statistic except n skips missing values
quick_summary <- function(x) {
  c(
    n = length(x),  # counts every element, including NA
    mean = mean(x, na.rm = TRUE),
    sd = sd(x, na.rm = TRUE),
    min = min(x, na.rm = TRUE),
    max = max(x, na.rm = TRUE)
  )
}
quick_summary(c(1, 2, 3, NA, 5))
##        n     mean       sd      min      max
## 5.000000 2.750000 1.707825 1.000000 5.000000
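In the output above, `n` is 5 because `length()` counts the `NA`, while the mean and standard deviation use only the four observed values. If both counts are useful, one possible variant is sketched below. The function name `quick_summary_obs` and its output labels are hypothetical and not part of the original notes.

```r
# Hypothetical variant: report total length and non-missing count separately.
quick_summary_obs <- function(x) {
  c(
    n_total = length(x),        # all elements, including NA
    n_obs   = sum(!is.na(x)),   # elements that are not NA
    mean    = mean(x, na.rm = TRUE)
  )
}

result <- quick_summary_obs(c(1, 2, 3, NA, 5))
result
```

For this input the function returns 5 for `n_total`, 4 for `n_obs`, and 2.75 for `mean`.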
2.6 Getting Help
R has excellent built-in help tools.
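A few standard ways to reach those help tools are sketched below. The choice of `mean` and `sd` as topics is only for illustration.

```r
# Open the help page for a function you know by name.
?mean
help("sd")

# List objects whose names match a pattern, for when you only recall part of a name.
matches <- apropos("^mean")
matches

# Show a function's arguments without opening the full help page.
args(sd)
```

In RStudio the first two calls open the page in the Help pane. A double question mark, as in `??"standard deviation"`, searches the help pages by keyword.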
2.6.2 Examples in Help Files
Many help pages include examples you can run:
##
## mean> x <- c(0:10, 50)
##
## mean> xm <- mean(x)
##
## mean> c(xm, mean(x, trim = 0.10))
## [1] 8.75 5.50
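The transcript above has the form that `example()` prints when it runs the Examples section of a help page. The call itself is not shown, so the sketch below assumes it was `example(mean)`.

```r
# Run the examples from the help page for mean(); each line is echoed with a
# "mean>" prompt, followed by its result.
example(mean)

# The same computation written out by hand.
x_demo <- c(0:10, 50)                       # hypothetical name for the example data
c(mean(x_demo), mean(x_demo, trim = 0.10))  # 8.75 5.50
```

The trimmed mean drops the most extreme values at each end before averaging, which is why the outlier 50 pulls the plain mean up to 8.75 but leaves the trimmed mean at 5.50.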
2.6.4 Inspecting Objects
## num [1:5] 0.607 1.119 1.254 -1.22 1.21
## [1] "numeric"
## [1] "double"
## [1] "a" "A" "actD" "Admission" "age"
## [6] "aLin" "aLog" "alp" "alpha" "alt"
## [11] "aQua" "aSqr" "b" "B" "bac"
## [16] "biaSam" "bLin" "bLog" "booD" "booPva"
## [21] "bQua" "bSqr" "c" "C" "calories"
## [26] "cards" "chi_stat" "ci_bounds" "conInt" "control"
## [31] "cov" "cQua" "d" "D" "dat"
## [36] "deck" "den" "denLow" "densityChiSquare" "denUpp"
## [41] "df" "die1" "die2" "dif" "differences"
## [46] "E" "eduPar" "eduPer" "erI" "errors"
## [51] "exam_score" "f" "final_grade" "firstAce" "fitLin"
## [56] "fitLog" "fitQua" "fitSqr" "Fl" "flag"
## [61] "fLin" "fLog" "fQua" "Fr" "Fs"
## [66] "fSqr" "Gender" "gra" "grades" "graPer"
## [71] "hei" "heiDisBiaSam" "heiDisPop" "heiDisRanSam" "heiDisSimRanSam"
## [76] "heiRacMea" "heiRacSd" "i" "incLev" "incPer"
## [81] "int" "j" "k" "knoVar" "l"
## [86] "len" "lm1" "lm2" "lm3" "lm4"
## [91] "lower" "M" "mat" "matSco" "maxScoMat"
## [96] "maxScoVer" "meaBiaSam" "meaDisBiaSam" "meaDisRanSam" "meaDisSimRanSam"
## [101] "means" "meaPop" "meaPro" "meaRanSam" "meaSimRanSam"
## [106] "minScoMat" "minScoVer" "mu" "mu_null" "mu_pop"
## [111] "mu0" "mu1" "mu2" "muA" "n"
## [116] "N" "n_required" "n1" "n2" "name"
## [121] "namEduPar" "namIncLev" "namSch" "nP" "null_means"
## [126] "num_rep" "numGra" "numPeoRac" "numRep" "numSam"
## [131] "numSch" "numSel" "numStu" "numStuClu" "numStuGra"
## [136] "p" "p_value" "p1" "p2" "pA"
## [141] "pA_exact" "pAandB" "pAandB_exact" "pAc" "pAorB"
## [146] "pB" "pB_exact" "pBgivenA" "pC" "pCorD"
## [151] "pD" "pop" "population" "powTte" "powZte"
## [156] "pro" "pro50" "proInt" "proRej" "pval"
## [161] "pVal" "qua" "quick_summary" "r" "R"
## [166] "rac" "racNam" "ranSam" "reaAlc" "reaBac"
## [171] "rej" "rejBoo" "rejMat" "rejPer" "rejPoo"
## [176] "rejTre" "rep" "resLin" "resLog" "resQua"
## [181] "resSqr" "s" "S" "s1" "s2"
## [186] "s21" "s22" "sample_data" "sample_means" "sample1"
## [191] "sample2" "samSiz" "samVar" "sch" "schDat"
## [196] "schPer" "sd" "sd_pop" "sd1" "sd2"
## [201] "sdPro" "se" "secondAce" "sel" "selGra"
## [206] "selInc" "selPar" "selSch" "shape1" "shape2"
## [211] "sigma" "sigma0" "sigma02" "sigma2" "sim"
## [216] "simRanSam" "sk" "sk1" "sk2" "sp"
## [221] "stdPop" "t" "t_crit" "t_stat" "ta"
## [226] "tab" "tabCol" "tabMar" "tabRow" "testStat"
## [231] "toss1" "toss2" "treatment" "true_mean" "u"
## [236] "u1" "u2" "upper" "v" "va1"
## [241] "va2" "varKno" "vecPro" "verSco" "weight"
## [246] "winklerIntervalScore" "x" "x_future" "x_sorted" "x1"
## [251] "x2" "x3" "x4" "xbar" "xBar1"
## [256] "xBar2" "xm" "xmax" "xmin" "y"
## [261] "y_bar" "y1" "y2" "y3" "y4"
## [266] "yLin" "yLog" "ymax" "ymin" "yQua"
## [271] "ySqr" "z" "za" "zscore"
The second listing is identical to the first except that "x" no longer appears (273 names instead of 274).
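The output in this section has the shape produced by `str()`, `class()`, `typeof()`, and `ls()`, with `x` removed from the workspace between two listings. The commands are not shown, so the sketch below is a reconstruction under that assumption. The seed is also assumed, so the random values will differ from those printed above.

```r
set.seed(1)      # assumed seed, for reproducibility only
x <- rnorm(5)    # five random draws from a standard normal distribution

str(x)           # compact structure: type, length, first values
class(x)         # "numeric"
typeof(x)        # "double"

x_listed_before <- "x" %in% ls()   # TRUE: x is in the workspace
rm(x)                              # remove x
x_listed_after <- "x" %in% ls()    # FALSE: x is gone

c(before = x_listed_before, after = x_listed_after)
```

Calling `ls()` on its own prints every object name in the workspace, which is why the listing in a long-running session can be very long.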
2.1.2 Comments
Use # to write comments. R ignores everything from # to the end of the line.
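A short illustration of both comment placements follows. The object name `total` is made up for the example.

```r
# A comment can take up a whole line.
total <- 2 + 3   # Or it can follow code on the same line.
total            # 5

# Commenting out a line stops it from running:
# total <- 100
total            # still 5
```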