2 Introduction to R
These notes introduce core R concepts you’ll use throughout the course.
Each section includes short explanations and runnable code.
Tip: You can run code line-by-line in RStudio with Ctrl+Enter (Windows) / Cmd+Enter (Mac).
2.1 Basic R Operations and Concepts
R works both as a powerful calculator and as a full programming language. You type commands into the Console, and R evaluates them.
2.2 Arithmetic
R supports standard arithmetic operators:
- Addition: +
- Subtraction: -
- Multiplication: *
- Division: /
- Exponentiation: ^
- Integer division: %/%
- Remainder (mod): %%
## [1] 8
## [1] 6
## [1] 42
## [1] 4
## [1] 32
## [1] 5
## [1] 2
These commands return numeric output immediately in the console.
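The inputs that produced the output above are not shown. As a sketch, the following expressions give the same seven results, one per operator. The operands are chosen here for illustration and are not taken from the original notes.

```r
# Illustrative operands (assumed; the original inputs are not shown)
5 + 3      # addition          -> 8
10 - 4     # subtraction       -> 6
6 * 7      # multiplication    -> 42
20 / 5     # division          -> 4
2^5        # exponentiation    -> 32
17 %/% 3   # integer division  -> 5
17 %% 3    # remainder (mod)   -> 2
```

Integer division and remainder go together: 17 equals 3 * 5 + 2, so `17 %/% 3` is 5 and `17 %% 3` is 2.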
2.3 Assignment, Object Names, and Data Types
2.3.2 Object Names
Good names are readable and informative.
Valid examples
height, exam_score, x1
Avoid
- spaces (exam score)
- starting with numbers (1st)
- reserved words (if, for, TRUE, FALSE)
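One way to check these naming rules is with base R's make.names(), which returns a name unchanged only when it is already syntactically valid. The sketch below applies it to the names listed above; the assigned values are made up for illustration.

```r
# Names from the lists above: the first three are valid, the rest are not
candidates <- c("height", "exam_score", "x1", "exam score", "1st", "if", "TRUE")

# A name is syntactically valid if make.names() leaves it unchanged
is_valid <- make.names(candidates) == candidates
print(data.frame(name = candidates, valid = is_valid))

# Assignment with valid names (values are hypothetical)
height <- 170
exam_score <- 88
x1 <- 3
```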
2.4 Vectors
Definition 2.1 (Vector in R) A vector is a collection of values of the same type stored in a single object in R.
Vectors are the basic building blocks in R: an ordered collection of values.
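A minimal sketch of the definition, using c() to combine values into a vector. The object names and values are hypothetical. The last example shows what "same type" means in practice: when types are mixed, R converts every element to a common type.

```r
ages  <- c(21, 35, 42)          # numeric vector
pets  <- c("cat", "dog")        # character vector
flags <- c(TRUE, FALSE, TRUE)   # logical vector

# Mixing types forces coercion to a single type (here: character)
mixed <- c(1, "a", TRUE)
class(mixed)   # "character"

# Elements are ordered, so they can be selected by position
ages[2]        # 35
length(ages)   # 3
```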
2.4.3 Vector Arithmetic (Vectorized Operations)
Operations apply element-by-element.
## [1] 3 5 7 9
## [1] 4 8 12 16
## [1] 4 16 36 64
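The code behind the three results above is not shown. One set of inputs that reproduces them, assumed here purely for illustration, is the vector c(2, 4, 6, 8) with a different operation applied each time:

```r
x <- c(2, 4, 6, 8)   # assumed input; the original vector is not shown

x + 1   # adds 1 to every element      -> 3 5 7 9
x * 2   # doubles every element        -> 4 8 12 16
x^2     # squares every element        -> 4 16 36 64
```

No loop is needed: each operator is applied to every element of the vector in turn.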
2.4.4 Sequences with : and seq()
## [1] 1 2 3 4 5 6 7 8 9 10
## [1] 0.0 0.2 0.4 0.6 0.8 1.0
## [1] 1.00 3.25 5.50 7.75 10.00
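The calls that produced these sequences are missing. The following sketch reproduces the three results with the : operator and two forms of seq(); the exact arguments are an assumption inferred from the output.

```r
1:10                         # integers from 1 to 10
seq(0, 1, by = 0.2)          # from 0 to 1 in steps of 0.2
seq(1, 10, length.out = 5)   # 5 equally spaced values from 1 to 10
```

Use by when the step size matters, and length.out when the number of values matters.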
2.5 Functions and Expressions
Definition 2.2 (Function) A function is a set of instructions that takes inputs and returns an output.
R has many built-in functions: mean(), sum(), sd(), etc.
## [1] 30
## [1] 6
## [1] 3.162278
## [1] 2
## [1] 10
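The input for the five results above is not shown. As a sketch, the vector c(2, 4, 6, 8, 10) reproduces them when passed to sum(), mean(), sd(), min(), and max() in that order. Both the vector and the choice of functions are assumptions inferred from the output.

```r
scores <- c(2, 4, 6, 8, 10)   # assumed input; the original is not shown

sum(scores)    # 30
mean(scores)   # 6
sd(scores)     # 3.162278 (sample standard deviation)
min(scores)    # 2
max(scores)    # 10
```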
2.5.3 Creating Your Own Function
Use function() to define reusable code.
# A simple function: compute z-scores
zscore <- function(x) {
  (x - mean(x)) / sd(x)
}
zscore(c(10, 12, 15, 20))
## [1] -0.9771621 -0.5173211 0.1724404 1.3220429
# Summarize a numeric vector; missing values are dropped from the statistics
quick_summary <- function(x) {
  c(
    n = length(x),  # counts every element, including NA
    mean = mean(x, na.rm = TRUE),
    sd = sd(x, na.rm = TRUE),
    min = min(x, na.rm = TRUE),
    max = max(x, na.rm = TRUE)
  )
}
quick_summary(c(1, 2, 3, NA, 5))
##        n     mean       sd      min      max
## 5.000000 2.750000 1.707825 1.000000 5.000000
2.6 Getting Help
R has excellent built-in help tools.
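A brief sketch of some of these tools, using only base R. The help-page calls are shown as comments because they open a viewer in an interactive session; the choice of mean() and sd() as lookup targets is just an example.

```r
# In an interactive session, either of these opens the help page for mean():
#   ?mean
#   help("mean")

# List objects whose names start with "mean"
mean_matches <- apropos("^mean")
print(mean_matches)

# Show the arguments a function accepts
print(args(sd))
```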
2.6.2 Examples in Help Files
Many help pages include examples you can run:
##
## mean> x <- c(0:10, 50)
##
## mean> xm <- mean(x)
##
## mean> c(xm, mean(x, trim = 0.10))
## [1] 8.75 5.50
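Output with a `mean>` prompt like the one above is what example() prints when it runs the examples from a help page. The call itself is not shown, so the sketch below assumes it was example(mean), and then repeats the same steps by hand.

```r
# Run the examples from the help page for mean() (assumed original call)
example(mean)

# The same steps, typed directly
x <- c(0:10, 50)
xm <- mean(x)                          # ordinary mean: 8.75
trimmed_mean <- mean(x, trim = 0.10)   # drops 10% of values from each end: 5.5
c(xm, trimmed_mean)
```

The trimmed mean is smaller because trimming removes the outlying value 50 before averaging.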
2.6.4 Inspecting Objects
## num [1:5] 3.441 -2.343 1.17 -0.499 -0.159
## [1] "numeric"
## [1] "double"
## [1] "a" "A" "age" "alp" "alpha" "altVar" "appMea" "appMeaLin"
## ... (listing abridged; 253 object names in total, including "x")
## [249] "z" "za" "zlow" "zscore" "zupp"
## [1] "a" "A" "age" "alp" "alpha" "altVar" "appMea" "appMeaLin"
## ... (listing abridged; 252 object names in total, "x" no longer present)
## [249] "za" "zlow" "zscore" "zupp"
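The commands behind the output in this section are not shown. The pattern of the output (a structure summary, a class, a storage type, then two workspace listings that differ only by "x") suggests the sequence sketched below. Both the sequence and the values stored in x are assumptions made for illustration.

```r
x <- c(3.4, -2.3, 1.2, -0.5, -0.2)   # hypothetical values

str(x)      # compact structure: type, length, first values
class(x)    # "numeric"
typeof(x)   # "double" (how R stores the values internally)

before <- ls()   # names of all objects in the current workspace
rm(x)            # remove x from the workspace
after <- ls()

setdiff(before, after)   # "x": the only name that disappeared
```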
2.1.2 Comments
Use # to write comments. R ignores them.
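A short illustration, with a made-up calculation, of the two common places a comment appears: on its own line and at the end of a line of code.

```r
# This whole line is a comment and is not evaluated
total <- 2 + 3   # only the code before the # is run
print(total)     # 5
```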