A Index and Code Links
A.1 Survey links
- An In-class Survey: rmd | r | pdf | html
- create a tibble dataset
- draw 10 random students from 50 and build a survey
- r: factor() + ifelse()
- dplyr: group_by() + mutate() + summarise()
- tibble: add_row()
- readr: write_csv()
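The survey entry above can be sketched as follows. This is a minimal base-R illustration, not the code in the linked files: it uses `data.frame()` and `table()` in place of the tibble and dplyr calls, and the survey answer (`sleep_hours`), its cutoff and the seed are hypothetical values chosen only for the example.

```r
# Minimal base-R sketch: draw 10 of 50 students and recode one survey answer.
set.seed(123)                      # arbitrary seed, only for reproducibility

n_class  <- 50                     # class size from the index entry
n_survey <- 10                     # number of students drawn

# Draw student IDs without replacement, so no student is surveyed twice.
student_id <- sort(sample(n_class, size = n_survey, replace = FALSE))

# Hypothetical survey answer: hours of sleep, generated at random here.
sleep_hours <- sample(4:10, size = n_survey, replace = TRUE)

# ifelse() recodes the numeric answer; factor() fixes the category order.
sleep_group <- factor(
  ifelse(sleep_hours >= 7, "7+ hours", "under 7 hours"),
  levels = c("under 7 hours", "7+ hours")
)

survey <- data.frame(student_id, sleep_hours, sleep_group)

# Frequency table of the recoded variable.
print(table(survey$sleep_group))
```

Setting `levels` explicitly keeps both categories in the table even if one of them happens to be empty in a small draw.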
A.2 Dataset, Tables and Graphs links
- Opening a Dataset: rmd | r | pdf | html
- Opening a Dataset.
- r: setwd()
- readr: write_csv()
- One Variable Graphs and Tables: rmd | r | pdf | html
- Frequency table, bar chart and histogram.
- R function and lapply to generate graphs/tables for different variables.
- r: c('word1','word2') + function() + for (ctr in c(1,2)) {} + lapply()
- dplyr: group_by() + summarize() + n()
- ggplot: geom_bar() + geom_histogram() + labs(title = 'title', caption = 'caption')
- Multiple Variables Graphs and Tables: rmd | r | pdf | html
- Two-way frequency table, stacked bar chart and scatter-plot.
- r: interaction()
- dplyr: group_by(var) + summarize(freq = n()) + spread(gender, freq)
- ggplot: aes(x,y,fill) + geom_bar(stat='identity', fun.y='mean', position='dodge') + geom_point(size) + geom_text(size,hjust,vjust) + geom_smooth(method=lm) + labs(title,x,y,caption)
A.3 Summarizing Data links
- Mean and Standard Deviation: rmd | r | pdf | html
- Mean and standard deviation from a dataset with city-month temperatures.
- r: dim() + min() + ceiling() + lapply() + vector(mode="character",length) + substring(var, first, last) + func <- function(return(list))
- dplyr: mutate() + select() + filter()
- tidyr: gather(vara, val, -varb)
- rlang: !!sym(str_var_name)
- ggplot: aes(x, y, colour, linetype, shape) + facet_wrap(~var, scales='free_y') + geom_line() + geom_point() + geom_jitter(size, width) + scale_x_continuous(labels, breaks)
- Rescaling Standard Deviation and Covariance: rmd | r | pdf | html
- Scatter-plot of a dataset with state-level wage and education data.
- Coefficient of variation and standard deviation, correlation and covariance.
- r: mean() + sd() + var() + cov() + cor()
- ggplot: geom_point(size) + geom_text() + geom_smooth()
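The rescaling entry above can be illustrated with a short base-R sketch. The five wage and education values are hypothetical and are not the state-level dataset used in the linked file; `cv()` is a helper defined here for the example. The sketch assumes the point of the entry is that standard deviation and covariance change with the unit of measurement, while the coefficient of variation and the correlation do not.

```r
# Hypothetical values for five states (not the dataset from the linked file).
wage_usd  <- c(18.5, 22.0, 25.5, 30.0, 27.0)   # hourly wage in dollars
educ_year <- c(12.0, 13.0, 14.5, 16.0, 15.0)   # average years of schooling

# Rescale the wage from dollars to cents.
wage_cent <- wage_usd * 100

# Coefficient of variation: standard deviation relative to the mean.
cv <- function(x) sd(x) / mean(x)

# Standard deviation and covariance scale with the unit:
# both ratios equal 100 up to floating-point error.
sd_ratio  <- sd(wage_cent) / sd(wage_usd)
cov_ratio <- cov(wage_cent, educ_year) / cov(wage_usd, educ_year)

# The coefficient of variation and the correlation are unit-free.
cv_usd   <- cv(wage_usd)
cv_cent  <- cv(wage_cent)
cor_usd  <- cor(wage_usd, educ_year)
cor_cent <- cor(wage_cent, educ_year)

print(c(sd_ratio = sd_ratio, cov_ratio = cov_ratio))
print(c(cv_usd = cv_usd, cv_cent = cv_cent))
print(c(cor_usd = cor_usd, cor_cent = cor_cent))
```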
A.4 Basics of Probability links
- Sample Space, Experimental Outcomes, Events, Probabilities: rmd | r | pdf | html
- Sample Space, Experimental Outcomes, Events and Probability.
- Union, intersection and complements
- Conditional probability
- Examples of Sample Space and Probabilities: rmd | r | pdf | html
- Throwing a quarter, four candidates for election, six-sided unfair die, two basketball games
- r: sample(size, replace, prob)
- Law of Large Numbers: Unfair Die: rmd | r | pdf | html
- Throw an unfair die many times to illustrate the law of large numbers.
- r: head() + tail() + factor() + sample() + as.numeric() + paste0('dice=', var) + sprintf('%0.3f', 1.1234) + sprintf("P(S=1)=%0.3f, P(S=2)=%0.3f", 1.1, 1.2345)
- stringr: str_extract() + as.numeric(str_extract(variable, "[^.n]+$"))
- dplyr: mutate(!!str_mean_var := as.numeric(sprintf('%0.5f', freq / sum(freq))))
- ggplot: geom_line() + scale_x_continuous(trans='log10', labels=c('n=100', 'n=1000'), breaks=c(100, 1000))
- Multiple-Step Experiment: Playing the Lottery Three Times: rmd | r | pdf | html
- Paths after 1, 2 and 3 plays.
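The law of large numbers entry in this section can be sketched in base R with `sample(size, replace, prob)`. The side probabilities, the sample sizes and the seed below are hypothetical choices for illustration, not the values used in the linked file.

```r
# Minimal sketch: relative frequencies of an unfair die approach its
# true probabilities as the number of throws grows.
set.seed(456)                                   # arbitrary seed

die_sides <- 1:6
die_prob  <- c(0.1, 0.1, 0.1, 0.1, 0.1, 0.5)    # hypothetical unfair die
n_throws  <- c(100L, 1000L, 100000L)            # hypothetical sample sizes

# One column of relative frequencies per sample size.
freq_by_n <- sapply(n_throws, function(n) {
  throws <- sample(die_sides, size = n, replace = TRUE, prob = die_prob)
  # factor() with explicit levels keeps sides that never came up at zero.
  as.numeric(table(factor(throws, levels = die_sides))) / n
})
rownames(freq_by_n) <- paste0("dice=", die_sides)
colnames(freq_by_n) <- paste0("n=", n_throws)

print(round(freq_by_n, 3))

# Largest gap between observed frequency and true probability, per sample size.
max_gap <- apply(abs(freq_by_n - die_prob), 2, max)
print(round(max_gap, 4))
```

The gap need not shrink at every step for a particular seed, but it tends toward zero as the number of throws increases.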
A.5 Discrete Probability Distribution links
- Discrete Random Variable and Binomial Experiment: rmd | r | pdf | html
- Discrete Random Variable, expected value and variance.
- Binomial Properties, examples using USA larceny clearance rate, WWII German soldier survival rate
- r: dbinom() + pbinom() + sprintf(paste0('abc\n', 'efg = %s'), 'opq') + round(1.123, 2) + lapply()
- ggplot: df %>% ggplot(aes(x)) + geom_bar(aes(y=prob), stat='identity', alpha=0.5, width=0.5, fill) + geom_text(aes(y=prob, label=paste0(sprintf('%2.1f', p), '%')), vjust, size, color, fontface) + labs(title, x, y, caption) + scale_y_continuous(sec.axis, name) + scale_x_continuous(labels, breaks) + theme(axis.text.y, axis.text.y.right, axis.text.y.left)
- Poisson Probability Distribution: rmd | r | pdf | html
- Poisson Properties, Ladislaus Bortkiewicz and Prussian army horse-kick deaths.
- r: dpois() + ppois()
- ggplot: geom_bar() + geom_text() + geom_line() + geom_point() + geom_text() + labs() + scale_y_continuous() + scale_x_continuous() + theme()
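The binomial and Poisson entries in this section rely on `dbinom()`, `pbinom()`, `dpois()` and `ppois()`. A minimal base-R sketch of how they relate to expected value, variance and cumulative probabilities follows; the parameter values (`n_trials`, `p_success`, `lambda`) are hypothetical and are not taken from the larceny, soldier survival or horse-kick examples.

```r
# Binomial: hypothetical experiment with 10 trials and success probability 0.2.
n_trials  <- 10
p_success <- 0.2
x         <- 0:n_trials

# Probability mass function over all possible numbers of successes.
prob_x <- dbinom(x, size = n_trials, prob = p_success)

# Expected value and variance computed from the definition;
# they match n * p and n * p * (1 - p).
ev_x  <- sum(x * prob_x)
var_x <- sum((x - ev_x)^2 * prob_x)

# pbinom() is the cumulative sum of dbinom(): P(X <= 2).
p_at_most_2 <- pbinom(2, size = n_trials, prob = p_success)

# Poisson: hypothetical mean of 0.7 events per interval.
lambda      <- 0.7
p_zero      <- dpois(0, lambda)          # P(X = 0), equal to exp(-lambda)
p_at_most_1 <- ppois(1, lambda)          # P(X <= 1)

print(c(ev_x = ev_x, var_x = var_x, p_at_most_2 = p_at_most_2))
print(c(p_zero = p_zero, p_at_most_1 = p_at_most_1))
```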