
R Examples Data and Optimization

R Code Examples Multi-dimensional/Panel Data


This is a work-in-progress website consisting of R panel data and optimization examples for Statistics/Econometrics/Economic Analysis.

Also available as a bookdown site and a bookdown pdf.

Materials are gathered from various projects in which R code is used. Files are from the R4Econ repository. This is not an R package, but a list of examples in PDF/HTML/Rmd formats. REconTools is an installable package containing tools used in projects involving R.

Bullet points show which base R, tidyverse or other functions/commands are used to achieve various objectives. An effort is made to use only base R and tidyverse packages whenever possible to reduce dependencies. The goal of this repository is to make it easier to find and re-use code produced for various projects.

From other repositories: for dynamic borrowing and savings problems, see MEconTools and Dynamic Asset Repository; for code examples, see also Matlab Example Code, Stata Example Code, Python Example Code; for intro econ with Matlab, see Intro Mathematics for Economists; and for intro stat with R, see Intro Statistics for Undergraduates. See here for all of Fan’s public repositories.

Please contact FanWangEcon for issues or problems.

1 Array, Matrix, Dataframe

1.1 List

  1. Multi-dimensional Named Lists: rmd | r | pdf | html
    • Initiate Empty List. Named one and two dimensional lists. List of Dataframes.
    • Collapse named and unnamed list to string and print input code.
    • r: deparse(substitute()) + vector(mode = "list", length = it_N) + names(list) <- paste0('e',seq()) + dimnames(ls2d)[[1]] <- paste0('r',seq()) + dimnames(ls2d)[[2]] <- paste0('c',seq())
    • tidyr: unnest()
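A minimal base R sketch of the named one and two dimensional lists described above. The length `it_N` and the names `ls_named` and `ls2d` are placeholders chosen for illustration, not taken from the linked example.

```r
# Empty list of fixed length, then name the elements e1, e2, e3
it_N <- 3  # hypothetical length
ls_named <- vector(mode = "list", length = it_N)
names(ls_named) <- paste0("e", seq_len(it_N))

# A two-dimensional list: a list with a dim attribute plus row and column names
ls2d <- vector(mode = "list", length = 2 * 3)
dim(ls2d) <- c(2, 3)
dimnames(ls2d) <- list(paste0("r", seq_len(2)), paste0("c", seq_len(3)))

# Cells can hold arbitrary objects, including dataframes
ls2d[[1, 2]] <- data.frame(x = 1:2)
```

Setting all `dimnames` at once avoids relying on how R fills in missing dimension names when they are assigned one dimension at a time.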

1.2 Array

  1. Basic Arrays Operations in R: rmd | r | pdf | html
    • Generate N-dimensional array of NA values, label dimension elements.
    • Basic array operations in R, rep, head, tail, na, etc.
    • E notation.
    • Get N cuts from M points.
    • r: sum() + prod() + rep() + array(NA, dim=c(3, 3)) + array(NA, dim=c(3, 3, 3)) + dimnames(mn)[[3]] = paste0('k=', 0:4) + head() + tail() + na_if() + Re()
  2. Generate Special Arrays: rmd | r | pdf | html
    • Generate equi-distance, special log spaced array.
    • Generate probability mass function with non-unique and non-sorted value and probability arrays.
    • r: seq() + sort() + runif() + ceiling()
    • stats: aggregate()
  3. String Operations: rmd | r | pdf | html
    • Split, concatenate, subset, replace, substring strings.
    • Convert number to string without decimal and negative sign.
    • r: paste0() + sub() + gsub() + grepl() + sprintf()
  4. Meshgrid Matrices, Arrays and Scalars: rmd | r | pdf | html
    • Meshgrid Matrices, Arrays and Scalars to form all combination dataframe.
    • tidyr: expand_grid() + expand.grid()
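The array items above can be sketched in base R as follows. This is an illustration under assumed dimensions and grid bounds; `mn`, `ar_log` and `df_mesh` are hypothetical names.

```r
# 3 x 3 x 5 array of NA values, with labels along the third dimension
mn <- array(NA_real_, dim = c(3, 3, 5))
dimnames(mn) <- list(NULL, NULL, paste0("k=", 0:4))

# Log-spaced array between 1 and 100: equi-distant in logs, then exponentiate
ar_log <- exp(seq(log(1), log(100), length.out = 5))

# Meshgrid: all combinations of two arrays as a dataframe
df_mesh <- expand.grid(a = c(1, 2, 3), z = c(-1, 1))
```

`expand.grid()` is the base R function; `tidyr::expand_grid()` produces the same combinations as a tibble with a different row ordering.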

1.3 Matrix

  1. Matrix Basics: rmd | r | pdf | html
    • Generate and combine NA, fixed and random matrices. Name columns and rows.
    • r: rep() + rbind() + matrix(NA) + matrix(NA_real_) + matrix(NA_integer_) + colnames() + rownames()
  2. Linear Algebra Operations: rmd | r | pdf | html

1.4 Variables in Dataframes

  1. Tibble Basics: rmd | r | pdf | html
    • generate tibbles, rename tibble variables, tibble row and column names
    • rename numeric sequential columns with string prefix and suffix
    • dplyr: as_tibble(mt) + rename_all(~c(ar_names)) + rename_at(vars(starts_with("xx")), funs(str_replace(., "yy", "yyyy"))) + rename_at(vars(num_range('', ar_it)), funs(paste0(st,.))) + rowid_to_column() + colnames + rownames
  2. Label and Combine Factor Variables: rmd | r | pdf | html
    • Convert numeric variables to factor variables, generate interaction variables (joint factors), and label factors with descriptive words.
    • Graph MPG and 1/4 Miles Time (qsec) from the mtcars dataset over joint shift-type (am) and engine-type (vs) categories.
    • forcats: as_factor() + fct_recode() + fct_cross()
  3. Randomly Draw Subsets of Rows from Matrix: rmd | r | pdf | html
    • Given matrix, randomly sample rows, or select if random value is below threshold.
    • r: rnorm() + sample() + df[sample(dim(df)[1], it_M, replace=FALSE),]
    • dplyr: case_when() + mutate(var = case_when(rnorm(n(),mean=0,sd=1) < 0 ~ 1, TRUE ~ 0)) %>% filter(var == 1)
  4. Generate Variables Conditional on Other Variables: rmd | r | pdf | html
    • Use case_when to generate elseif conditional variables: NA, approximate difference, etc.
    • dplyr: case_when() + na_if() + mutate(var = na_if(case_when(rnorm(n())< 0 ~ -99, TRUE ~ mpg), -99))
    • r: e-notation + all.equal() + isTRUE(all.equal(a,b,tol)) + is.na() + NA_real_ + NA_character_ + NA_integer_
  5. R Tibble Dataframe String Manipulations: rmd | r | pdf | html
    • There are multiple CSV files, each containing the same file structure but simulated with different parameters. Gather a subset of columns from different files, and provide them with correct attributes based on CSV file names.
    • r: cbind(ls_st, ls_st) + as_tibble(mt_st)
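Two of the base R idioms listed in this subsection, drawing random rows and comparing floating-point values approximately, might look like this. The dataset (`mtcars`), the seed and the tolerance are assumptions made for the sketch.

```r
set.seed(123)
df <- mtcars
it_M <- 5

# Draw it_M rows without replacement
df_sub <- df[sample(dim(df)[1], it_M, replace = FALSE), ]

# Exact equality fails for 0.1 + 0.2 versus 0.3; a tolerance-based check passes
fl_a <- 0.1 + 0.2
fl_b <- 0.3
bl_exact <- (fl_a == fl_b)
bl_approx <- isTRUE(all.equal(fl_a, fl_b, tolerance = 1e-8))
```

Wrapping `all.equal()` in `isTRUE()` matters because `all.equal()` returns a character description, not `FALSE`, when the values differ.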

2 Summarize Data

2.1 Counting Observation

  1. Counting Basics: rmd | r | pdf | html
    • uncount to generate panel skeleton from years in survey
    • dplyr: uncount(yr_n) + group_by() + mutate(yr = row_number() + start_yr)

2.2 Sorting, Indexing, Slicing

  1. Sorted Index, Interval Index and Expand Value from One Row: rmd | r | pdf | html
    • Sort and generate index for rows
    • Generate negative and positive index based on deviations
    • Populate Values from one row to other rows
    • dplyr: arrange() + row_number() + mutate(lowest = min(Sepal.Length)) + case_when(row_number()==x ~ Sepal.Length) + mutate(Sepal.New = Sepal.Length[Sepal.Index == 1])
  2. Group and sort, and Slice and Summarize: rmd | r | pdf | html
    • Group a dataframe by a variable, sort within group by another variable, keep only highest rows.
    • dplyr: arrange() + group_by() + slice_head(n=1)

2.3 Group Statistics

  1. Cummean Test, Cumulative Mean within Group: rmd | r | pdf | html
    • There is a dataframe with a grouping variable and some statistics sorted by another within group variable, calculate the cumulative mean of that variable.
    • dplyr: cummean() + group_by(id, isna = is.na(val)) + mutate(val_cummean = ifelse(isna, NA, cummean(val)))
  2. Count Unique Groups and Mean within Groups: rmd | r | pdf | html
    • Unique groups defined by multiple values and count obs within group.
    • Mean, sd, observation count for non-NA within unique groups.
    • dplyr: group_by() + summarise(n()) + summarise_if(is.numeric, funs(mean = mean(., na.rm = TRUE), n = sum(is.na(.)==0)))
  3. By Groups, One Variable All Statistics: rmd | r | pdf | html
    • Pick stats, overall, and by multiple groups, stats as matrix or wide row with name=(ctsvar + catevar + catelabel).
    • tidyr: group_by() + summarize_at(, funs()) + rename(!!var := !!sym(var)) + mutate(!!var := paste0(var,'str',!!!syms(vars))) + gather() + unite() + spread(varcates, value)
  4. By within Individual Groups Variables, Averages: rmd | r | pdf | html
    • By Multiple within Individual Groups Variables.
    • Averages for all numeric variables within all groups of all group variables. Long to Wide to very Wide.
    • tidyr: gather() + group_by() + summarise_if(is.numeric, funs(mean(., na.rm = TRUE))) + mutate(all_m_cate = paste0(variable, '_c', value)) + unite() + spread()
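As a dependency-free illustration of the group statistics above, the sketch below computes a within-group cumulative mean and a group mean in base R. It swaps in `ave()` and `aggregate()` for the dplyr calls listed, and the toy data are invented for the example.

```r
# Toy data: two groups, already sorted within group
df <- data.frame(id = c(1, 1, 1, 2, 2), val = c(2, 4, 6, 10, 20))

# Cumulative mean within group: running sum divided by running count
df$val_cummean <- ave(df$val, df$id,
                      FUN = function(x) cumsum(x) / seq_along(x))

# Mean within group
df_mean <- aggregate(val ~ id, data = df, FUN = mean)
```

With dplyr available, `group_by(id) %>% mutate(val_cummean = cummean(val))` expresses the same cumulative mean.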

2.4 Distributional Statistics

  1. Tibble Basics: rmd | r | pdf | html
    • input multiple variables with comma separated text strings
    • quantitative/continuous and categorical/discrete variables
    • histogram and summary statistics
    • tibble: ar_one <- c(107.72,101.28) + ar_two <- c(101.72,101.28) + mt_data <- cbind(ar_one, ar_two) + as_tibble(mt_data)

2.5 Summarize Multiple Variables

  1. Apply the Same Function over Columns of Matrix: rmd | r | pdf | html
    • Replace NA values in selected columns by alternative values.
    • Cumulative sum over multiple variables.
    • Rename various variables with common prefix and suffix appended.
    • r: cumsum() + gsub() + mutate_at(vars(contains('V')), .funs = list(cumu = ~cumsum(.))) + rename_at(vars(contains("V") ), list(~gsub("M", "", .)))
    • dplyr: rename_at() + mutate_at() + rename_at(vars(starts_with("V")), funs(str_replace(., "V", "var"))) + mutate_at(vars(one_of(c('var1', 'var2'))), list(~replace_na(., 99)))

3 Functions

3.1 Dataframe Mutate

  1. Nonlinear Function of Scalars and Arrays over Rows: rmd | r | pdf | html
    • Five methods to evaluate scalar nonlinear function over matrix.
    • Evaluate non-linear function with scalar from rows and arrays as constants.
    • r: .$fl_A + fl_A = `$`(., 'fl_A') + .[[svr_fl_A]]
    • dplyr: rowwise() + mutate(out = funct(inputs))
  2. Evaluate Functions over Rows of Meshes Matrices: rmd | r | pdf | html
    • Mesh states and choices together and rowwise evaluate many matrices.
    • Cumulative sum over multiple variables.
    • Rename various variables with common prefix and suffix appended.
    • r: ffi <- function(fl_A, ar_B)
    • tidyr: expand_grid() + rowwise() + df %>% rowwise() %>% mutate(var = ffi(fl_A, ar_B))
    • ggplot2: geom_line() + facet_wrap() + geom_hline() + facet_wrap(. ~ var_id, scales = 'free') + geom_hline(yintercept=0, linetype="dashed", color="red", size=1)

3.2 Dataframe Do Anything

  1. Dataframe Row to Array (Mx1 by N) to (MxQ by N+1): rmd | r | pdf | html
    • Generate row value specific arrays of varying Length, and stack expanded dataframe.
    • Given row-specific information, generate row-specific arrays that expand matrix.
    • dplyr: do() + unnest() + left_join() + df %>% group_by(ID) %>% do(inc = rnorm(.$Q, mean=.$mean, sd=.$sd)) %>% unnest(c(inc))
  2. Dataframe Subset to Scalar (MxP by N) to (Mx1 by 1): rmd | r | pdf | html
    • MxQ rows to Mx1 Rows. Group dataframe by categories, compute category specific output scalar or arrays based on within category variable information.
    • dplyr: group_by(ID) + do(inc = rnorm(.$N, mean=.$mn, sd=.$sd)) + unnest(c(inc)) + left_join(df, by="ID")
  3. Dataframe Subset to Dataframe (MxP by N) to (MxQ by N+Z-1): rmd | r | pdf | html
    • Group by mini dataframes as inputs for function. Stack output dataframes with group id.
    • dplyr: group_by() + do() + unnest()

3.3 Apply and pmap

  1. Apply and Sapply function over arrays and rows: rmd | r | pdf | html
    • Evaluate function f(x_i,y_i,c), where c is a constant and x and y vary over each row of a matrix, with index i indicating rows.
    • Get same results using apply and sapply with defined and anonymous functions.
    • r: do.call() + apply(mt, 1, func) + sapply(ls_ar, func, ar1, ar2)
  2. Mutate rowwise, mutate pmap, and rowwise do unnest: rmd | r | pdf | html
    • Evaluate function f(x_i,y_i,c), where c is a constant and x and y vary over each row of a matrix, with index i indicating rows.
    • Get same results using various types of mutate rowwise, mutate pmap and rowwise do unnest.
    • dplyr: rowwise() + do() + unnest()
    • purrr: pmap(func)
    • r: unlist()
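The idea of evaluating f(x_i, y_i, c) over rows with a constant c can be sketched with `apply()` and `sapply()` as below. The function `ffi` and the input matrix are hypothetical stand-ins, not the functions used in the linked examples.

```r
# Hypothetical function of row-varying x and y, and a constant fl_c
ffi <- function(x, y, fl_c) x^2 + y + fl_c

mt <- cbind(x = 1:3, y = c(10, 20, 30))
fl_c <- 0.5

# apply over rows with an anonymous function
ar_apply <- apply(mt, 1, function(row) ffi(row[["x"]], row[["y"]], fl_c))

# sapply over row indexes gives the same values
ar_sapply <- sapply(seq_len(nrow(mt)),
                    function(i) ffi(mt[i, "x"], mt[i, "y"], fl_c))
```

Both calls return `11.5 24.5 39.5`. For this particular `ffi`, which is already vectorized, `ffi(mt[, "x"], mt[, "y"], fl_c)` would also work; the row-wise forms are useful when the function is not vectorized.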

4 Multi-dimensional Data Structures

4.1 Generate, Gather, Bind and Join

  1. R dplyr Group by Index and Generate Panel Data Structure: rmd | r | pdf | html
    • Build skeleton panel frame with N observations and T periods with gender and height.
    • Generate group Index based on a list of grouping variables.
    • r: runif() + rnorm() + rbinom(n(), 1, 0.5) + cumsum()
    • dplyr: group_by() + row_number() + ungroup() + one_of() + mutate(var = (row_number()==1)*1)
    • tidyr: uncount()
  2. R DPLYR Join Multiple Dataframes Together: rmd | r | pdf | html
    • Join dataframes together with one or multiple keys. Stack dataframes together.
    • dplyr: filter() + rename(!!sym(vsta) := !!sym(vstb)) + mutate(var = rnorm(n())) + left_join(df, by=(c('id'='id', 'vt'='vt'))) + left_join(df, by=setNames(c('id', 'vt'), c('id', 'vt'))) + bind_rows()
  3. R Gather Data Columns from Multiple CSV Files: rmd | r | pdf | html
    • There are multiple CSV files, each containing the same file structure but simulated with different parameters. Gather a subset of columns from different files, and provide them with correct attributes based on CSV file names.
    • Separate the numeric and string components of a string variable value.
    • r: file() + writeLines() + readLines() + close() + gsub() + read.csv() + do.call(bind_rows, ls_df) + apply()
    • tidyr: separate()
    • regex: (?<=[A-Za-z])(?=[-0-9])
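Separating the letter and number components of a string, such as a parameter label embedded in a file name, can be sketched with capture groups in base R. This uses `sub()` instead of the lookaround regex with `tidyr::separate()` listed above, and the input strings are invented.

```r
# Hypothetical labels: letters followed by a (possibly negative) number
ar_st <- c("beta0.95", "rho-1", "A25")
st_pattern <- "^([A-Za-z]+)([-0-9.]+)$"

# First capture group: the string component
ar_st_name <- sub(st_pattern, "\\1", ar_st)

# Second capture group: the numeric component, converted to numeric
ar_fl_val <- as.numeric(sub(st_pattern, "\\2", ar_st))
```

Here `ar_st_name` is `"beta" "rho" "A"` and `ar_fl_val` is `0.95 -1 25`.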

4.2 Wide and Long

  1. TIDYR Pivot Wider and Pivot Longer Examples: rmd | r | pdf | html
    • Long roster to wide roster and cumulative sum attendance by date.
    • dplyr: mutate(var = case_when(rnorm(n()) < 0 ~ 1, TRUE ~ 0)) + rename_at(vars(num_range('', ar_it)), list(~paste0(st_prefix, . , ''))) + mutate_at(vars(contains(str)), list(~replace_na(., 0))) + mutate_at(vars(contains(str)), list(~cumsum(.)))
  2. R Wide Data to Long Data Example (TIDYR Pivot Longer): rmd | r | pdf | html
    • A matrix of ev given states, rows are states and cols are shocks. Convert to Long table with shock and state values and ev.
    • dplyr: left_join() + pivot_longer(cols = starts_with('zi'), names_to = c('zi'), names_pattern = paste0("zi(.)"), values_to = "ev")

4.3 Join and Compare

  1. Find Closest Values Along Grids: rmd | r | pdf | html
    • There is an array (matrix) of values, find the index of the values closest to another value.
    • r: do.call(bind_rows, ls_df)
    • dplyr: left_join(tb, by=(c('vr_a'='vr_a', 'vr_b'='vr_b')))
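One simple way to find the index of the grid value closest to each target value is `which.min()` on absolute distances. This is a base R sketch with an invented grid, not necessarily the join-based approach used in the linked example.

```r
# Hypothetical grid and target values
ar_grid <- seq(0, 1, by = 0.25)
ar_vals <- c(0.1, 0.6, 0.9)

# For each target value, index of the nearest grid point
ar_idx <- sapply(ar_vals, function(v) which.min(abs(ar_grid - v)))

# The corresponding grid values
ar_closest <- ar_grid[ar_idx]
```

The indexes are `1 3 5` and the closest grid values are `0 0.5 1`. When a value is equidistant from two grid points, `which.min()` returns the first one.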

5 Linear Regression

5.1 Polynomial Fitting

  1. Fit a Time Series with Polynomial and Analytical Expressions for Coefficients: rmd | r | pdf | html
    • Given a time series of data points from a polynomial data generating process, solve for the polynomial coefficients.
    • Mth derivative of Mth order polynomial is time invariant, use functions of differences of differences of differences to identify polynomial coefficients analytically.
    • r: matrix multiplication
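The differences-of-differences idea can be illustrated for a second order polynomial observed at unit-spaced periods. With y_t = a0 + a1 t + a2 t^2, the Mth difference equals M! times the leading coefficient. The coefficients below are made up, and the sketch recovers lower order terms by subtracting the identified higher order term rather than with the matrix expressions in the linked example.

```r
# Data from a hypothetical quadratic: y = 1 + 2 t + 3 t^2
ar_t <- 1:6
ar_y <- 1 + 2 * ar_t + 3 * ar_t^2
it_M <- 2

# The Mth difference is constant and equals M! times the leading coefficient
fl_a2 <- diff(ar_y, differences = it_M)[1] / factorial(it_M)

# Remove the quadratic term, then the first difference gives a1
ar_resid <- ar_y - fl_a2 * ar_t^2
fl_a1 <- diff(ar_resid)[1]

# The intercept follows from any single observation
fl_a0 <- ar_resid[1] - fl_a1 * ar_t[1]
```

This recovers `fl_a0 = 1`, `fl_a1 = 2` and `fl_a2 = 3` exactly because the data contain no noise.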

5.2 OLS and IV

  1. IV/OLS Regression: rmd | r | pdf | html
    • R Instrumental Variables and Ordinary Least Squares Regression store all Coefficients and Diagnostics as Dataframe Row.
    • AER: library(AER) + ivreg(as.formula, diagnostics = TRUE)
  2. M Outcomes and N RHS Alternatives: rmd | r | pdf | html
    • There are M outcome variables and N alternative explanatory variables. Regress all M outcome variables on N endogenous/independent right hand side variables one by one, with controls and/or IVs, collect coefficients.
    • dplyr: bind_rows(lapply(listx, function(x)(bind_rows(lapply(listy, regf.iv))))) + starts_with() + ends_with() + reduce(full_join)

5.3 Decomposition

  1. Regression Decomposition: rmd | r | pdf | html
    • Post multiple regressions, fraction of outcome variables’ variances explained by multiple subsets of right hand side variables.
    • dplyr: gather() + group_by(var) + mutate_at(vars, funs(mean = mean(.))) + rowSums(matmat) + mutate_if(is.numeric, funs(frac = (./value_var)))

6 Nonlinear and Other Regressions

6.1 Logit Regression

  1. Logit Regression: rmd | r | pdf | html
    • Logit regression testing and prediction.
    • stats: glm(as.formula(), data, family='binomial') + predict(rs, newdata, type = "response")
  2. Estimate Logistic Choice Model with Aggregate Shares: rmd | r | pdf | html
    • Aggregate share logistic OLS with K worker types, T time periods and M occupations.
    • Estimate logistic choice model with aggregate shares, allowing for occupation-specific wages and occupation-specific intercepts.
    • Estimate allowing for K and M specific intercepts, K and M specific coefficients, and homogeneous coefficients.
    • Create input matrix data structures for logistic aggregate share estimation.
    • stats: lm(y ~ . -1)
  3. Fit Prices Given Quantities Logistic Choice with Aggregate Data: rmd | r | pdf | html
    • A multinomial logistic choice problem generates choice probabilities across alternatives, find the prices that explain aggregate shares.
    • stats: lm(y ~ . -1)
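A minimal logit estimation and prediction sketch using the `glm()` and `predict()` calls listed for the first item above. The simulated data, the seed and the coefficient values are assumptions for illustration only.

```r
set.seed(1)

# Simulate a binary outcome from a hypothetical logit model
df <- data.frame(x = rnorm(200))
df$y <- rbinom(200, 1, plogis(-0.5 + 1.5 * df$x))

# Estimate the logit regression
rs_logit <- glm(as.formula("y ~ x"), data = df, family = "binomial")

# Predicted probabilities at new values of x
df_new <- data.frame(x = c(-1, 0, 1))
ar_prob <- predict(rs_logit, newdata = df_new, type = "response")
```

`type = "response"` returns probabilities; the default `type = "link"` would return the linear index instead.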

6.2 Quantile Regression

  1. Quantile Regressions with Quantreg: rmd | r | pdf | html
    • Quantile regression with continuous outcomes. Estimates and tests quantile coefficients.
    • quantreg: rq(mpg ~ disp + hp + factor(am), tau = c(0.25, 0.50, 0.75), data = mtcars) + anova(rq(), test = "Wald", joint=TRUE) + anova(rq(), test = "Wald", joint=FALSE)

7 Optimization

7.1 Bisection

  1. Concurrent Bisection over Dataframe Rows: rmd | r | pdf | html
    • Post multiple regressions, fraction of outcome variables’ variances explained by multiple subsets of right hand side variables.
    • tidyr: pivot_longer(cols = starts_with('abc'), names_to = c('a', 'b'), names_pattern = paste0('prefix', "(.)_(.)"), values_to = val) + pivot_wider(names_from = !!sym(name), values_from = val) + mutate(!!sym(abc) := case_when(efg < 0 ~ !!sym(opq), TRUE ~ iso))
    • ggplot2: geom_line() + facet_wrap() + geom_hline()
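Concurrent bisection over dataframe rows means running one bisection loop in which every row updates its own bracket at each iteration. The base R sketch below assumes a hypothetical row-specific problem, exp(x) - fl_a = 0 on [-5, 5], and uses vectorized `ifelse()` updates rather than the pivot-based bookkeeping listed above.

```r
# Each row has its own parameter fl_a; the root of exp(x) - fl_a is log(fl_a)
df <- data.frame(fl_a = c(0.5, 1, 2, 10))
ffi <- function(x, fl_a) exp(x) - fl_a

# Row-specific brackets, all starting at [-5, 5]
ar_lo <- rep(-5, nrow(df))
ar_hi <- rep(5, nrow(df))

for (it_iter in 1:60) {
  ar_mid <- (ar_lo + ar_hi) / 2
  # Rows where the midpoint has the same sign as the lower bound move lo up;
  # all other rows move hi down
  bl_same <- sign(ffi(ar_mid, df$fl_a)) == sign(ffi(ar_lo, df$fl_a))
  ar_lo <- ifelse(bl_same, ar_mid, ar_lo)
  ar_hi <- ifelse(bl_same, ar_hi, ar_mid)
}
df$x_root <- (ar_lo + ar_hi) / 2
```

A fixed iteration count keeps the loop simple; a tolerance check on `max(ar_hi - ar_lo)` would be a natural alternative stopping rule.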

8 Mathematics

8.1 Basics

  1. Rescaling Bounded Parameter to be Unbounded and Positive and Negative Exponents with Different Bases: rmd | r | pdf | html
    • Log of alternative bases, bases that are not e, 10 or 2.
    • A parameter is constrained between 1 and negative infinity, use exponentials of different bases to scale the bounded parameter to an unbounded parameter.
    • Positive exponentials are strictly increasing. Negative exponentials are strictly decreasing.
    • A positive number below 1 raised to a negative exponent is above 1, and a positive number above 1 raised to a negative exponent is below 1.
    • graphics: plot(x, y) + title() + legend()
  2. Quadratic and other Rescaling of Parameters with Fixed Min and Max: rmd | r | pdf | html
    • Given a < x < b, use f(x) to rescale x, such that f(a)=a, f(b)=b, but f(z)=0.5*z for some z between a and b. Solve using the quadratic function with three equations and three unknowns uniquely.
  3. Find the Closest Point Along a Line to Another Point: rmd | r | pdf | html
    • A line crosses through the origin, what is the closest point along this line to another point.
    • Graph several functions jointly with points and axis.
    • graphics: par(mfrow = c(1, 1)) + curve(fc) + points(x, y) + abline(v=0, h=0)
  4. Linear Solve x with f(x) = 0: rmd | r | pdf | html
    • Evaluate and solve statistically relevant problems with one equation and one unknown that permit analytical solutions.
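For the closest point along a line through the origin, the analytical answer is the orthogonal projection of the point onto the line. A sketch with an assumed slope `fl_b` and point `ar_pt`, checked numerically with `optimize()`:

```r
# Hypothetical line y = fl_b * x through the origin, and another point
fl_b <- 2
ar_pt <- c(3, 1)

# Orthogonal projection of the point onto the line
fl_x_star <- (ar_pt[1] + fl_b * ar_pt[2]) / (1 + fl_b^2)
fl_y_star <- fl_b * fl_x_star

# Numerical check: minimize squared distance along the line
ffi_dist <- function(x) (x - ar_pt[1])^2 + (fl_b * x - ar_pt[2])^2
ls_opt <- optimize(ffi_dist, interval = c(-10, 10))
```

For these inputs the closest point is (1, 2): the vector from it to (3, 1) is (2, -1), which is orthogonal to the line's direction (1, 2).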

8.2 Production Function

  1. Nested Constant Elasticity of Substitution Production Function: rmd | r | pdf | html
    • A nested-CES production function with nest-specific elasticities.
    • Re-state the nested-CES problem as several sub-problems.
    • Marginal products and their relationship to prices in expenditure minimization.

8.3 Inequality Models

  1. GINI for Discrete Samples or Discrete Random Variable: rmd | r | pdf | html
    • Given sample of data points that are discrete, compute the approximate GINI coefficient.
    • Given a discrete random variable, compute the GINI coefficient.
    • r: sort() + cumsum() + sum()
  2. CES and Atkinson Inequality Index: rmd | r | pdf | html
    • Analyze how changing individual outcomes shift utility given inequality preference parameters.
    • Discretize a continuous normal random variable with a binomial discrete random variable.
    • Draw Cobb-Douglas, Utilitarian and Leontief indifference curves.
    • r: apply(mt, 1, function(x){}) + do.call(rbind, ls_mt)
    • tidyr: expand_grid()
    • ggplot2: geom_line() + facet_wrap()
    • econ: Atkinson (JET, 1970)
  3. Share of Environmental Exposure Burden Across Population Groups: rmd | r | pdf | html
    • Simulate pollution exposures by location.
    • Compute share of pollution burden for a population group relative to the share of overall population accounted for by this population group.
    • core: matrix()
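An approximate GINI coefficient for a discrete sample can be computed with `sort()`, `cumsum()` and `sum()`. The formula below is one common sample approximation and is an assumption of this sketch; it may differ in detail from the one in the linked example. `ffi_gini` is a hypothetical function name.

```r
# Approximate GINI for a sample of non-negative values
ffi_gini <- function(ar_data) {
  ar_sorted <- sort(ar_data)
  it_n <- length(ar_sorted)
  1 - (2 / (it_n + 1)) * sum(cumsum(ar_sorted)) / sum(ar_sorted)
}

fl_gini_equal <- ffi_gini(rep(1, 4))       # everyone has the same amount
fl_gini_one <- ffi_gini(c(0, 0, 0, 1))     # one observation holds everything
```

Under this approximation perfect equality gives 0, while full concentration in one of n observations gives (n - 1)/(n + 1), here 0.6, which approaches 1 as n grows.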

9 Statistics

9.1 Random Draws

  1. Randomly Perturb Some Parameter Value with Varying Magnitudes: rmd | r | pdf | html
    • Given some existing parameter value, with an intensity value between 0 and 1, decide how to perturb the value.
    • r: matrix
    • stats: qlnorm()
    • graphics: par() + hist() + abline()

9.2 Distributions

  1. Integrate Normal Shocks: rmd | r | pdf | html
    • Random Sampling (Monte Carlo) integrate shocks.
    • Trapezoidal rule (symmetric rectangles) integrate normal shock.
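Both integration approaches can be compared on a case with a known answer. The sketch assumes a standard normal shock and the integrand exp(eps), whose expectation is exp(sd^2 / 2); the integrand, grid bounds and draw count are choices made for illustration.

```r
set.seed(123)
fl_sd <- 1
ffi <- function(eps) exp(eps)
fl_true <- exp(fl_sd^2 / 2)

# Monte Carlo: average the function over random draws of the shock
fl_mc <- mean(ffi(rnorm(1e5, mean = 0, sd = fl_sd)))

# Trapezoidal rule: integrate f(eps) * density over a wide, fine grid
ar_eps <- seq(-6 * fl_sd, 6 * fl_sd, length.out = 1001)
ar_f <- ffi(ar_eps) * dnorm(ar_eps, mean = 0, sd = fl_sd)
fl_trapz <- sum(diff(ar_eps) * (head(ar_f, -1) + tail(ar_f, -1)) / 2)
```

The trapezoidal result is deterministic and very accurate for a smooth integrand like this one, while the Monte Carlo error shrinks only with the square root of the number of draws.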

9.3 Discrete Random Variable

  1. Binomial Approximation of Normal: rmd | r | pdf | html
    • Approximate a continuous normal random variable with a discrete binomial random variable.
    • r: hist() + plot()
    • stats: dbinom() + rnorm()
  2. Gestation (Binomial), Conception (Mixture), and Temperature (Sine wave and AR(1)): rmd | r | pdf | html
    • Simulate the distribution of gestational periods at birth following a binomial distribution.
    • Simulate the distribution of conception time following a potentially bimodal distribution.
    • Compute which births are pre-term given a simulated dataset of conception and birth dates.
    • Simulate temperature over days across years using a sine wave combined with a first order Markov process with normal shocks.
    • stats: dbinom() + pbinom() + rnorm() + runif() + lm(binary ~ continuous + factor(dates))
    • ggplot: geom_point() + geom_bar() + geom_line() + geom_density() + geom_vline()
  3. Obtaining Joint Distribution from Marginal with Rectilinear Restrictions: rmd | r | pdf | html
    • Solve for joint distributional mass given marginal distributional mass given rectilinear assumptions.
    • r: qr()
  4. Obtaining Joint Distribution from Conditional with Rectilinear Restrictions: rmd | r | pdf | html
    • Solve for joint distributional mass given conditional distributional mass given rectilinear assumptions.
    • r: qr() + solve() + matrix()
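The binomial approximation of a normal random variable can be sketched by rescaling the binomial support so that its mean and standard deviation match the target normal. The target moments and the number of trials below are assumed values.

```r
# Target normal moments and binomial settings (assumptions for the sketch)
fl_mu <- 10
fl_sd <- 2
it_n <- 20
fl_p <- 0.5

# Binomial support and probability mass
ar_k <- 0:it_n
ar_prob <- dbinom(ar_k, size = it_n, prob = fl_p)

# Rescale the support: standardize the binomial, then apply target mean and sd
ar_x <- fl_mu + fl_sd * (ar_k - it_n * fl_p) / sqrt(it_n * fl_p * (1 - fl_p))

# Moments of the discrete approximation
fl_mean <- sum(ar_prob * ar_x)
fl_var <- sum(ar_prob * (ar_x - fl_mean)^2)
```

By construction the discrete variable on `ar_x` with mass `ar_prob` has mean `fl_mu` and variance `fl_sd^2`; increasing `it_n` refines the approximation of the normal shape.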

10 Tables and Graphs

10.1 R Base Plots

  1. R Base Plot Line with Curves and Scatter: rmd | r | pdf | html
    • Plot scatter points, line plot and functional curve graphs together.
    • Set margins for legend to be outside of graph area, change line, point, label and legend sizes.
    • Generate additional lines for plots successively, record successively, and plot all steps, or initial steps results.
    • r: plot() + curve() + legend() + title() + axis() + par() + recordPlot()
  1. ggplot Line Plot Multiple Categorical Variables With Continuous Variable: rmd | r | pdf | html
    • One category is subplot, one category is line-color, one category is line-type.
    • One category is subplot, one category is differentiated by line-color, line-type and scatter-shapes.
    • One category is separate plots, two categories are subplot rows and columns, one category is differentiated by line-color, line-type and scatter-shapes.
    • ggplot: ggplot() + facet_wrap() + facet_grid() + geom_line() + geom_point() + geom_smooth() + geom_hline() + scale_colour_manual() + scale_shape_manual() + scale_shape_discrete() + scale_linetype_manual() + scale_x_continuous() + scale_y_continuous() + theme_bw() + theme() + guides() + theme() + ggsave()
    • dplyr: filter(vara %in% c(1, 2) & varb == "val") + mutate_if() + !any(is.na(suppressWarnings(as.numeric(na.omit(x))))) & is.character(x)
  1. ggplot Scatter Plot Grouped or Unique Patterns and Colors: rmd | r | pdf | html
    • Scatter Plot Three Continuous Variables and Multiple Categorical Variables
    • Two continuous variables for the x-axis and the y-axis, another continuous variable for size of scatter, other categorical variables for scatter shape and size.
    • Scatter plot with unique pattern and color for each scatter point.
    • Y and X label axis with two layers of text in levels and deviation from some mid-point values.
    • tibble: rownames_to_column()
    • ggplot: ggplot() + geom_jitter() + geom_smooth() + geom_point(size=1, stroke=1) + scale_colour_manual() + scale_shape_discrete() + scale_linetype_manual() + scale_x_continuous() + scale_y_continuous() + theme_bw() + theme()
  2. ggplot Multiple Scatter-Lines and Facet Wrap Over Categories: rmd | r | pdf | html
    • ggplot multiple lines with scatter as points and connecting lines.
    • Facet wrap to generate subfigures for sub-categories.
    • Generate separate plots from data saved separately.
    • r: apply
    • ggplot: facet_wrap() + geom_smooth() + geom_point() + facet_wrap() + scale_colour_manual() + scale_shape_manual() + scale_linetype_manual()

10.4 Write and Read Plots

  1. Base R Save Images At Different Sizes: rmd | r | pdf | html
    • Base R store image core, add legends/titles/labels/axis of different sizes to save figures of different sizes.
    • r: png() + setEPS() + postscript() + dev.off()

11 Get Data

11.1 Environmental Data

  1. CDS ECMWF Global Environmental Data Download: rmd | r | pdf | html
    • Use the Python API to get ECMWF ERA5 data.
    • Dynamically modify a Python API file, run Python inside a Conda virtual environment with R-reticulate.
    • r: file() + writeLines() + unzip() + list.files() + unlink()
    • r-reticulate: use_python() + Sys.setenv(RETICULATE_PYTHON = spth_conda_env)

12 Code and Development

12.1 Files In and Out

  1. Decompose File Paths to Get Folder and Files Names: rmd | r | pdf | html
    • Decompose file path and get file path folder names and file name.
    • r: .Platform$file.sep + tail() + strsplit() + basename() + dirname() + substring()
  2. Save Text to File, Read Text from File, Replace Text in File: rmd | r | pdf | html
    • Save data to file, read text from file, replace text in file.
    • r: kable() + file() + writeLines() + readLines() + close() + gsub()
  3. Convert R Markdown File to R, PDF and HTML: rmd | r | pdf | html
    • Find all files in a folder with a particular suffix, with exclusion.
    • Convert R Markdown File to R, PDF and HTML.
    • Modify markdown pounds hierarchy.
    • r: file() + writeLines() + readLines() + close() + gsub()
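The file operations listed in this subsection can be combined into a short round trip: write text, read it back, replace a string, and decompose the path. The folder and file names are hypothetical, and everything is written under `tempdir()`.

```r
# Hypothetical file inside a temporary folder
spt_file <- file.path(tempdir(), "r4econ_demo", "notes.txt")
dir.create(dirname(spt_file), showWarnings = FALSE, recursive = TRUE)

# Save text to file through a connection
fileConn <- file(spt_file)
writeLines(c("alpha = 1", "beta = 2"), fileConn)
close(fileConn)

# Read text, replace a string, and write the result back
ar_st_lines <- readLines(spt_file)
ar_st_new <- gsub("beta", "gamma", ar_st_lines)
writeLines(ar_st_new, spt_file)

# Decompose the path into the containing folder name and the file name
st_folder <- basename(dirname(spt_file))
st_name <- basename(spt_file)
```

`basename()` and `dirname()` avoid splitting on `.Platform$file.sep` by hand, which is convenient when paths may come from different operating systems.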

12.2 Python with R

  1. Python in R with Reticulate: rmd | r | pdf | html
    • Use Python in R with Reticulate
    • reticulate: py_config() + use_condaenv() + py_run_string() + Sys.which('python')

12.3 Command Line

  1. System and Shell Commands in R: rmd | r | pdf | html
    • Run system executable and shell commands.
    • Activate conda environment with shell script.
    • r: system() + shell()
