Compute Percentage Changes of School Resources and Population
Source:vignettes/ffv_gen_percent_changes.Rmd
library(tibble)
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
library(tidyr)
library(readr)
library(kableExtra)
#>
#> Attaching package: 'kableExtra'
#> The following object is masked from 'package:dplyr':
#>
#> group_rows
library(PrjCompPPTS)
# If resave outputs to data, only do this during development
bl_resave_to_data <- FALSE
In this file, we compute percentage changes in school resources and population. We interpolate when there are gaps in the data.
For all variables, for each country/location, we compute:
- percentage changes year by year
- percentage changes every 5, 10, 15, and 20 years.
We interpolate and generate interval changes for all countries. We also extrapolate for up to 5 years, to reach the closest decade break-off points.
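As a minimal sketch of the interval changes described above, the chunk below computes 1-, 5-, and 10-year percentage changes on a toy series. The toy data and the column names `pchg_yr5` and `pchg_yr10` are assumptions made for illustration, not the package's own output. The sketch also assumes one row per year with no missing rows, so that a lag of k rows equals a lag of k years.

```r
library(dplyr)

# Toy series growing 2 percent per year (illustrative data only)
toy <- tibble(year = 2000:2020, value = 100 * 1.02^(0:20))

toy_chg <- toy %>%
  arrange(year) %>%
  # lag(value, k) is the level k rows (here: k years) earlier
  mutate(pchg_yr1 = value / lag(value, 1) - 1,
         pchg_yr5 = value / lag(value, 5) - 1,
         pchg_yr10 = value / lag(value, 10) - 1)

print(tail(toy_chg, 3))
```

The first k rows of each k-year column are NA because no level exists k years earlier, which is the situation extrapolation is meant to ease at decade break-off points.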
To illustrate what we do, we print results from Afghanistan and Austria, where there are gaps in the data.
In the case of Afghanistan, we:
- Interpolate to fill in gap years such as 1983, 1987, and 1992.
- Extrapolate to fill in a value for 2020, given the change from 2018 to 2019.
In the case of Austria, data in earlier decades is not annual, so we interpolate to obtain annual estimates.
We also show illustrative results from Busan, Korea.
Transform data
# B. Wide to long ----
ppts_easia_weuro_long <- ppts_easia_weuro_sel %>%
pivot_longer(cols = starts_with('stats'),
names_to = c('variable'),
names_pattern = paste0("stats_(.*)"),
values_to = "value")
str(ppts_easia_weuro_long)
#> tibble [88,560 × 5] (S3: tbl_df/tbl/data.frame)
#> $ location_code : Factor w/ 286 levels "ABW","AFE","AFG",..: 1 1 1 1 1 1 1 1 1 1 ...
#> $ location_level: Factor w/ 4 levels "country","multicountry",..: 1 1 1 1 1 1 1 1 1 1 ...
#> $ year : num [1:88560] 1960 1960 1960 1960 1960 ...
#> $ variable : chr [1:88560] "youthpop" "student" "teacher" "school" ...
#> $ value : num [1:88560] 23769 NA NA NA NA ...
# View(ppts_easia_weuro_long)
# C sort and group ----
ppts_easia_weuro_long <- ppts_easia_weuro_long %>%
arrange(location_code, location_level, variable, year) %>%
group_by(location_code, location_level, variable)
# kable(ppts_easia_weuro_long %>%
# filter(location_code == 'AFG' & variable == 'student'))
Annual percentage changes, Interpolate and Extrapolate
Raw percentage changes
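We compute annual percentage changes below. Note that the raw change is NA whenever the current or the preceding year has no observed value, as in 1979 and 1980 for Afghanistan.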
# D. Annual percentage changes ----
# Compute these
# - annual: for all possible consecutive years
# - annual_interp1: annual based on consecutive year if possible,
# when that is not possible, find closest years of available data,
# and derive annual (considering compounding) growth rates
# D.1 annual
ppts_easia_weuro_long <- ppts_easia_weuro_long %>%
mutate(pchg_yr1 = (value - lag(value)) / lag(value))
# View(ppts_easia_weuro_long)
kable(ppts_easia_weuro_long %>%
filter(location_code == 'AFG' & variable == 'student'))
location_code | location_level | year | variable | value | pchg_yr1 |
---|---|---|---|---|---|
AFG | country | 1960 | student | NA | NA |
AFG | country | 1961 | student | NA | NA |
AFG | country | 1962 | student | NA | NA |
AFG | country | 1963 | student | NA | NA |
AFG | country | 1964 | student | NA | NA |
AFG | country | 1965 | student | NA | NA |
AFG | country | 1966 | student | NA | NA |
AFG | country | 1967 | student | NA | NA |
AFG | country | 1968 | student | NA | NA |
AFG | country | 1969 | student | NA | NA |
AFG | country | 1970 | student | 540685 | NA |
AFG | country | 1971 | student | 572933 | 0.0596429 |
AFG | country | 1972 | student | 597509 | 0.0428951 |
AFG | country | 1973 | student | 624374 | 0.0449617 |
AFG | country | 1974 | student | 654209 | 0.0477839 |
AFG | country | 1975 | student | 692342 | 0.0582887 |
AFG | country | 1976 | student | 729667 | 0.0539112 |
AFG | country | 1977 | student | 764175 | 0.0472928 |
AFG | country | 1978 | student | 810657 | 0.0608264 |
AFG | country | 1979 | student | NA | NA |
AFG | country | 1980 | student | 959583 | NA |
AFG | country | 1981 | student | 1024741 | 0.0679024 |
AFG | country | 1982 | student | 365458 | -0.6433655 |
AFG | country | 1983 | student | NA | NA |
AFG | country | 1984 | student | 447351 | NA |
AFG | country | 1985 | student | 479150 | 0.0710829 |
AFG | country | 1986 | student | 512809 | 0.0702473 |
AFG | country | 1987 | student | NA | NA |
AFG | country | 1988 | student | 651622 | NA |
AFG | country | 1989 | student | 622135 | -0.0452517 |
AFG | country | 1990 | student | 622513 | 0.0006076 |
AFG | country | 1991 | student | 627888 | 0.0086344 |
AFG | country | 1992 | student | NA | NA |
AFG | country | 1993 | student | 786532 | NA |
AFG | country | 1994 | student | 1161444 | 0.4766646 |
AFG | country | 1995 | student | 1312197 | 0.1297979 |
AFG | country | 1996 | student | NA | NA |
AFG | country | 1997 | student | NA | NA |
AFG | country | 1998 | student | 1046338 | NA |
AFG | country | 1999 | student | 875605 | -0.1631719 |
AFG | country | 2000 | student | 749360 | -0.1441803 |
AFG | country | 2001 | student | 773623 | 0.0323783 |
AFG | country | 2002 | student | 2667629 | 2.4482287 |
AFG | country | 2003 | student | 3781015 | 0.4173691 |
AFG | country | 2004 | student | 4430142 | 0.1716806 |
AFG | country | 2005 | student | 4318819 | -0.0251285 |
AFG | country | 2006 | student | 4669110 | 0.0811081 |
AFG | country | 2007 | student | 4718077 | 0.0104874 |
AFG | country | 2008 | student | 4974836 | 0.0544203 |
AFG | country | 2009 | student | 4945632 | -0.0058703 |
AFG | country | 2010 | student | 5279326 | 0.0674725 |
AFG | country | 2011 | student | 5291624 | 0.0023295 |
AFG | country | 2012 | student | 5767543 | 0.0899382 |
AFG | country | 2013 | student | 5986268 | 0.0379234 |
AFG | country | 2014 | student | 6217756 | 0.0386698 |
AFG | country | 2015 | student | 6199329 | -0.0029636 |
AFG | country | 2016 | student | 6265011 | 0.0105950 |
AFG | country | 2017 | student | 6350404 | 0.0136301 |
AFG | country | 2018 | student | 6544906 | 0.0306283 |
AFG | country | 2019 | student | 6777785 | 0.0355817 |
AFG | country | 2020 | student | NA | NA |
AFG | country | 2021 | student | NA | NA |
location_code | location_level | year | variable | value | pchg_yr1 |
---|---|---|---|---|---|
KOR_Busan | province | 1965 | school | 79 | NA |
KOR_Busan | province | 1966 | school | 86 | 0.0886076 |
KOR_Busan | province | 1967 | school | 93 | 0.0813953 |
KOR_Busan | province | 1968 | school | 96 | 0.0322581 |
KOR_Busan | province | 1969 | school | 98 | 0.0208333 |
KOR_Busan | province | 1970 | school | 99 | 0.0102041 |
KOR_Busan | province | 1971 | school | 100 | 0.0101010 |
KOR_Busan | province | 1972 | school | 103 | 0.0300000 |
KOR_Busan | province | 1973 | school | 106 | 0.0291262 |
KOR_Busan | province | 1974 | school | 107 | 0.0094340 |
KOR_Busan | province | 1975 | school | 112 | 0.0467290 |
KOR_Busan | province | 1976 | school | 114 | 0.0178571 |
KOR_Busan | province | 1977 | school | 114 | 0.0000000 |
KOR_Busan | province | 1978 | school | 128 | 0.1228070 |
KOR_Busan | province | 1979 | school | 131 | 0.0234375 |
KOR_Busan | province | 1980 | school | 137 | 0.0458015 |
KOR_Busan | province | 1981 | school | 149 | 0.0875912 |
KOR_Busan | province | 1982 | school | 158 | 0.0604027 |
KOR_Busan | province | 1983 | school | 169 | 0.0696203 |
KOR_Busan | province | 1984 | school | 184 | 0.0887574 |
KOR_Busan | province | 1985 | school | 193 | 0.0489130 |
KOR_Busan | province | 1986 | school | 196 | 0.0155440 |
KOR_Busan | province | 1987 | school | 200 | 0.0204082 |
KOR_Busan | province | 1988 | school | 202 | 0.0100000 |
KOR_Busan | province | 1989 | school | 217 | 0.0742574 |
KOR_Busan | province | 1990 | school | 221 | 0.0184332 |
KOR_Busan | province | 1991 | school | 222 | 0.0045249 |
KOR_Busan | province | 1992 | school | 227 | 0.0225225 |
KOR_Busan | province | 1993 | school | 230 | 0.0132159 |
KOR_Busan | province | 1994 | school | 228 | -0.0086957 |
KOR_Busan | province | 1995 | school | 245 | 0.0745614 |
KOR_Busan | province | 1996 | school | 250 | 0.0204082 |
KOR_Busan | province | 1997 | school | 257 | 0.0280000 |
KOR_Busan | province | 1998 | school | 259 | 0.0077821 |
KOR_Busan | province | 1999 | school | 265 | 0.0231660 |
KOR_Busan | province | 2000 | school | 267 | 0.0075472 |
KOR_Busan | province | 2001 | school | 269 | 0.0074906 |
KOR_Busan | province | 2002 | school | 273 | 0.0148699 |
KOR_Busan | province | 2003 | school | 279 | 0.0219780 |
KOR_Busan | province | 2004 | school | 283 | 0.0143369 |
KOR_Busan | province | 2005 | school | 285 | 0.0070671 |
KOR_Busan | province | 2006 | school | 292 | 0.0245614 |
KOR_Busan | province | 2007 | school | 293 | 0.0034247 |
KOR_Busan | province | 2008 | school | 293 | 0.0000000 |
KOR_Busan | province | 2009 | school | 297 | 0.0136519 |
KOR_Busan | province | 2010 | school | 298 | 0.0033670 |
KOR_Busan | province | 2011 | school | 297 | -0.0033557 |
KOR_Busan | province | 2012 | school | 299 | 0.0067340 |
KOR_Busan | province | 2013 | school | 302 | 0.0100334 |
KOR_Busan | province | 2014 | school | 305 | 0.0099338 |
KOR_Busan | province | 2015 | school | 306 | 0.0032787 |
KOR_Busan | province | 2016 | school | 308 | 0.0065359 |
KOR_Busan | province | 2017 | school | 308 | 0.0000000 |
KOR_Busan | province | 2018 | school | 305 | -0.0097403 |
KOR_Busan | province | 2019 | school | 304 | -0.0032787 |
KOR_Busan | province | 2020 | school | 304 | 0.0000000 |
KOR_Busan | province | 2021 | school | 304 | 0.0000000 |
Interpolating
We drop all NA values, keeping only the rows where we have observed levels. We then take the difference in levels between consecutive rows and divide by the prior level to get percentage changes. Where NA gaps separate two observations, these are percentage changes across multiple years.
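We assume a constant growth rate in the in-between years and compute annualized percentage changes. Let $V_t$ and $V_{t+s}$ be two consecutive observed levels that are $s \ge 1$ years apart. Our annual percentage change formula is:
$$g_{t,t+s} = \left(\frac{V_{t+s}}{V_t}\right)^{\frac{1}{s}} - 1$$
where the percentage change $g_{t,t+s}$ applies to all years $\tau \in \{t+1, \dots, t+s\}$.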
The annual percentage change is exact where we know the level of schools, teachers, or students in the current year and the year immediately after. But it is based on growth-trend “linear” interpolation when there are years of missing data in between.
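As a worked check of the constant-growth assumption, the sketch below annualizes the change in Afghan student counts between 1978 and 1980, the two observed years around the missing 1979 value. It takes the ratio of the two observed levels, raises it to one over the number of years between them, and subtracts one. The variable names are chosen for illustration only; the levels are copied from the raw table above.

```r
# Observed Afghanistan student counts around the 1979 gap
value_1978 <- 810657
value_1980 <- 959583
span_yr <- 1980 - 1978  # two-year span, 1979 is missing

# Annualized growth rate under the constant-growth assumption
pchg_annual <- (value_1980 / value_1978)^(1 / span_yr) - 1

# Interpolated 1979 level: discount the 1980 level by one year of growth
value_1979 <- value_1980 / (1 + pchg_annual)^(1980 - 1979)

print(round(pchg_annual, 7))
print(round(value_1979, 1))
```

These two numbers should line up with the 1979 row of the interpolated Afghanistan table, which reports 0.0879845 and 881982.2.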
We use the interpolated annual percentage changes to fill in gaps in levels. We interpolate with this function: ff_ppts_interp_linear. We test out the contents of the function via scripts below, and then confirm that the script and the function produce the same result. We first proceed with the script, and then the function.
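Before the full script, a toy sketch may help show how gap years are recreated. It assumes a made-up two-row series (levels 100 in 2000 and 133.1 in 2003) and uses `tidyr::uncount` to repeat the later row once per year in the span, then discounts the later level back to each in-between year. The data frame and its values are illustrative assumptions; the column names mirror those in the script.

```r
library(dplyr)
library(tidyr)

# Toy data: observed in 2000 and 2003 only (illustrative values)
toy <- tibble(year = c(2000, 2003), value = c(100, 133.1))

toy_interp <- toy %>%
  # span in years and annualized (compounded) growth over the span
  mutate(span_yr = year - lag(year),
         pchg_yr1_interp1 = (value / lag(value))^(1 / span_yr) - 1) %>%
  drop_na(span_yr) %>%
  mutate(span_yr_dup = span_yr) %>%
  # repeat the 2003 row three times, once per year in the span
  uncount(span_yr) %>%
  group_by(year) %>%
  mutate(gap_ctr = row_number(),
         year_adj = year - span_yr_dup + gap_ctr,
         # discount the end-of-span level back to each in-between year
         value_interp1 = value / (1 + pchg_yr1_interp1)^(span_yr_dup - gap_ctr)) %>%
  ungroup() %>%
  select(year_adj, pchg_yr1_interp1, value_interp1) %>%
  rename(year = year_adj)

print(toy_interp)
```

With a 10 percent annual rate implied by the two observations, the rows for 2001, 2002, and 2003 carry levels of about 110, 121, and 133.1.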
Construct script-based results:
# D.2 annual_interp1
# D.2.1 compute linearly interpolated annual change
# create new sorting var, to take difference across years even if not
# consecutive
ppts_easia_weuro_interp1 <- ppts_easia_weuro_long %>%
drop_na(value) %>%
# New sorting variable, +1 regardless of year gap
mutate(year_with_gap_ctr = row_number()) %>%
arrange(location_code, location_level, variable, year_with_gap_ctr) %>%
group_by(location_code, location_level, variable) %>%
# compute percentage change over span and years gap over span
mutate(pchg_span_interp1 = (value - lag(value)) / lag(value),
span_yr = (year - lag(year))) %>%
# linear-interpolated annualized (with compounding) percentage change
mutate(pchg_yr1_interp1 = abs(1 + pchg_span_interp1) ^ (1 / span_yr) - 1) %>%
# adjust values given percentages interpolated
select(location_code, location_level,
year, variable, span_yr, value, pchg_yr1_interp1)
# D.2.2 expand to all interpolating years
# the pchg_yr1_interp1 variable is only attached to the end year of each span,
# due to drop_na(value) earlier
# need to expand to all years
ppts_easia_weuro_interp1 <- ppts_easia_weuro_interp1 %>%
ungroup() %>% mutate(span_yr_dup = span_yr) %>%
drop_na(span_yr) %>%
tidyr::uncount(span_yr) %>%
group_by(location_code, location_level, variable, year) %>%
mutate(year_adj = row_number() + year - span_yr_dup,
gap_ctr = row_number()) %>%
ungroup() %>%
select(location_code, location_level,
year_adj, variable, value, pchg_yr1_interp1, gap_ctr, span_yr_dup) %>%
rename(year = year_adj)
# D.2.3, Fill in missing values
ppts_easia_weuro_interp1 <- ppts_easia_weuro_interp1 %>%
arrange(location_code, location_level, variable, year) %>%
group_by(location_code, location_level, variable) %>%
mutate(value_interp1 = case_when(
# Conditions A and B are both gap_ctr == span_yr_dup
# A. value is correct, originally not NA
gap_ctr == 1 & span_yr_dup == 1 ~ value,
# B. value is also correct, next available year with non NA
gap_ctr == span_yr_dup & span_yr_dup > 1 ~ value,
# C. in-between years, where values were NA
gap_ctr < span_yr_dup ~ (value/((1+pchg_yr1_interp1)^(span_yr_dup-gap_ctr)))
)) %>%
ungroup() %>%
select(-gap_ctr, -span_yr_dup, -value)
# D.3 merge together
ppts_easia_weuro_interp1_script <- ppts_easia_weuro_long %>%
left_join(ppts_easia_weuro_interp1,
by = (c('location_code' = 'location_code',
'location_level' = 'location_level',
'year' = 'year',
'variable' = 'variable'))) %>%
# add in data from the first year of raw data availability
mutate(value_interp1 = case_when(
!is.na(value) & is.na(value_interp1) ~ value,
TRUE ~ value_interp1
))
# Print interpolated results
str(ppts_easia_weuro_interp1)
#> tibble [52,118 × 6] (S3: tbl_df/tbl/data.frame)
#> $ location_code : Factor w/ 286 levels "ABW","AFE","AFG",..: 1 1 1 1 1 1 1 1 1 1 ...
#> $ location_level : Factor w/ 4 levels "country","multicountry",..: 1 1 1 1 1 1 1 1 1 1 ...
#> $ year : num [1:52118] 1987 1988 1989 1990 1991 ...
#> $ variable : chr [1:52118] "gdp" "gdp" "gdp" "gdp" ...
#> $ pchg_yr1_interp1: num [1:52118] 0.176 0.2014 0.122 0.0209 0.0384 ...
#> $ value_interp1 : num [1:52118] 20263 24343 27313 27884 28954 ...
kable(ppts_easia_weuro_interp1 %>% select(-location_level) %>%
filter(location_code == 'AFG' & variable == 'student'))
location_code | year | variable | pchg_yr1_interp1 | value_interp1 |
---|---|---|---|---|
AFG | 1971 | student | 0.0596429 | 572933.0 |
AFG | 1972 | student | 0.0428951 | 597509.0 |
AFG | 1973 | student | 0.0449617 | 624374.0 |
AFG | 1974 | student | 0.0477839 | 654209.0 |
AFG | 1975 | student | 0.0582887 | 692342.0 |
AFG | 1976 | student | 0.0539112 | 729667.0 |
AFG | 1977 | student | 0.0472928 | 764175.0 |
AFG | 1978 | student | 0.0608264 | 810657.0 |
AFG | 1979 | student | 0.0879845 | 881982.2 |
AFG | 1980 | student | 0.0879845 | 959583.0 |
AFG | 1981 | student | 0.0679024 | 1024741.0 |
AFG | 1982 | student | -0.6433655 | 365458.0 |
AFG | 1983 | student | 0.1063829 | 404336.5 |
AFG | 1984 | student | 0.1063829 | 447351.0 |
AFG | 1985 | student | 0.0710829 | 479150.0 |
AFG | 1986 | student | 0.0702473 | 512809.0 |
AFG | 1987 | student | 0.1272495 | 578063.7 |
AFG | 1988 | student | 0.1272495 | 651622.0 |
AFG | 1989 | student | -0.0452517 | 622135.0 |
AFG | 1990 | student | 0.0006076 | 622513.0 |
AFG | 1991 | student | 0.0086344 | 627888.0 |
AFG | 1992 | student | 0.1192242 | 702747.5 |
AFG | 1993 | student | 0.1192242 | 786532.0 |
AFG | 1994 | student | 0.4766646 | 1161444.0 |
AFG | 1995 | student | 0.1297979 | 1312197.0 |
AFG | 1996 | student | -0.0726913 | 1216811.7 |
AFG | 1997 | student | -0.0726913 | 1128360.0 |
AFG | 1998 | student | -0.0726913 | 1046338.0 |
AFG | 1999 | student | -0.1631719 | 875605.0 |
AFG | 2000 | student | -0.1441803 | 749360.0 |
AFG | 2001 | student | 0.0323783 | 773623.0 |
AFG | 2002 | student | 2.4482287 | 2667629.0 |
AFG | 2003 | student | 0.4173691 | 3781015.0 |
AFG | 2004 | student | 0.1716806 | 4430142.0 |
AFG | 2005 | student | -0.0251285 | 4318819.0 |
AFG | 2006 | student | 0.0811081 | 4669110.0 |
AFG | 2007 | student | 0.0104874 | 4718077.0 |
AFG | 2008 | student | 0.0544203 | 4974836.0 |
AFG | 2009 | student | -0.0058703 | 4945632.0 |
AFG | 2010 | student | 0.0674725 | 5279326.0 |
AFG | 2011 | student | 0.0023295 | 5291624.0 |
AFG | 2012 | student | 0.0899382 | 5767543.0 |
AFG | 2013 | student | 0.0379234 | 5986268.0 |
AFG | 2014 | student | 0.0386698 | 6217756.0 |
AFG | 2015 | student | -0.0029636 | 6199329.0 |
AFG | 2016 | student | 0.0105950 | 6265011.0 |
AFG | 2017 | student | 0.0136301 | 6350404.0 |
AFG | 2018 | student | 0.0306283 | 6544906.0 |
AFG | 2019 | student | 0.0355817 | 6777785.0 |
Functionalizing the script, we have the function output below:
# Function parameters
df_data <- ppts_easia_weuro_long
ar_svr_group <- c("location_code", "location_level", "variable")
svr_data <- c("value")
svr_date <- c("year")
svr_interp <- c("value_interp1_func")
# Run function
ppts_easia_weuro_interp1_func <- PrjCompPPTS::ff_ppts_interp_linear(
df_data,
ar_svr_group = ar_svr_group,
svr_data = svr_data,
svr_date = svr_date,
svr_interp = svr_interp,
verbose = FALSE
) %>% rename(value_interp1 = value_interp1_func)
# Test if identical
bl_func_script_consistency <- identical(
ppts_easia_weuro_interp1_func, ppts_easia_weuro_interp1_script)
print(glue::glue("bl_func_script_consistency = {bl_func_script_consistency}"))
#> bl_func_script_consistency = TRUE
Year selection:
Extrapolating
We extrapolate for several years before the start and after the end of each variable's data timeframe. Extrapolation goes at most 5 years forward and at most 5 years backward. It only reaches years in which at least one of the country's variables has a non-NA value. In cases where a country’s data is available only starting after 1980, we allow extrapolation back by up to 5 years, reaching no earlier than 1980.
Specifically, because we generally have population data from 1960 to 2020 for all countries, we do not extrapolate to years before 1960 or after 2020. Data for Korea starts in 1965, so we do not extrapolate to any year before 1965. If one of the Korean variables has data starting in 1970, we extrapolate it back from 1970 to 1965. If another variable only has data starting in 1980, we extrapolate it back at most five years, to 1975. For Germany, our data start in 1992, after reunification. Without a 1990 value we cannot compute the change from 1990 to 2000, so we extrapolate from 1992 back 5 years to 1987, which generates a value for 1990.
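The backward limit in the Korean example can be sketched as a one-line rule: go back at most five years from a variable's first observed year, but never before the first year in which the location has any data. The helper `earliest_extrapolated_year` below is hypothetical and written only for illustration; it is not a function from PrjCompPPTS.

```r
# Hypothetical helper (illustration only, not part of PrjCompPPTS)
# var_start: first year with data for one variable
# location_start: first year with data for any variable of the location
# max_back: maximum number of years to extrapolate backward
earliest_extrapolated_year <- function(var_start, location_start, max_back = 5) {
  pmax(location_start, var_start - max_back)
}

# Korean data starts in 1965
yr_a <- earliest_extrapolated_year(var_start = 1970, location_start = 1965)
yr_b <- earliest_extrapolated_year(var_start = 1980, location_start = 1965)
print(c(yr_a, yr_b))
```

A variable starting in 1970 is extended back to 1965, while a variable starting in 1980 is extended back only to 1975.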
Extrapolation is meant to help in situations where we have data up to 2019 but, for consistency of comparison, it is useful to extend the data to 2020 by extrapolating 1 year forward.
We compute the start and end year for each variable and each country. In the interpolated value column, values are NA before the start and after the end of available data. Going forward, value = lag(value) x lag(1 + change), filled in if the value is NA and the current year is at most 5 years past the terminal year of data availability. Going backward we do the same, but with value = lead(value, n=1) / lead(1 + change, n=2).
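As a small numerical sketch of this rule, the chunk below extrapolates Afghan student counts one year forward from 2019 and five years backward from 1970, using the closed-form compounding that repeated application of the recursion implies. The variable names are chosen for illustration; the four observed levels are copied from the raw table earlier in this file.

```r
# Forward: last two observed levels (2018 and 2019)
value_2018 <- 6544906
value_2019 <- 6777785
pchg_end <- value_2019 / value_2018 - 1
value_2020 <- value_2019 * (1 + pchg_end)^(2020 - 2019)

# Backward: first two observed levels (1970 and 1971)
value_1970 <- 540685
value_1971 <- 572933
pchg_start <- value_1971 / value_1970 - 1
years_back <- 1965:1969
value_back <- value_1970 / (1 + pchg_start)^(1970 - years_back)

print(round(value_2020, 1))
print(round(setNames(value_back, years_back), 1))
```

The results should agree with the extrapolated Afghanistan rows shown later in this file, for example 7018950.2 for 2020 and 404712.6 for 1965.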
First, we generate end and start years and interpolating percentages, etc.
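# Generate start and end times
# Note: this step assumes ppts_easia_weuro_long already carries the
# value_interp1 and pchg_yr1_interp1 columns from the interpolation step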
ppts_easia_weuro_extrapolate <- ppts_easia_weuro_long %>%
drop_na(value) %>%
arrange(location_code, location_level, variable, year) %>%
group_by(location_code, location_level, variable) %>%
mutate(year_start = first(year, na_rm=TRUE), year_end = last(year, na_rm=TRUE),
value_start = first(value, na_rm=TRUE), value_end = last(value, na_rm=TRUE),
pchg_yr1_interp1_start = first(pchg_yr1_interp1, na_rm=TRUE),
pchg_yr1_interp1_end = last(pchg_yr1_interp1, na_rm=TRUE)) %>%
slice(1) %>%
select(location_code, location_level, variable,
contains("_start"), contains("_end"))
Second, we extrapolate.
# How many years to extrapolate backward and forward
it_year_extrplt_backmost <- 5
it_year_extrplt_fordmost <- 5
# Merge with full skeleton for expansion
ppts_easia_weuro_extrapolate <- ppts_easia_weuro_long %>%
select(location_code, location_level, variable, year, value) %>%
left_join(ppts_easia_weuro_extrapolate,
by = (c('location_code' = 'location_code',
'location_level' = 'location_level',
'variable' = 'variable')))
# Extrapolate forward (first condition) and backward (second condition)
ppts_easia_weuro_extrapolate <- ppts_easia_weuro_extrapolate %>%
mutate(value_extrapolate =
case_when(
is.na(value) & (year_end + it_year_extrplt_fordmost >= year) & (year > year_end) ~
(value_end*((1+pchg_yr1_interp1_end)^(year-year_end))),
is.na(value) & (year_start - it_year_extrplt_backmost <= year) & (year < year_start) ~
(value_start/((1+pchg_yr1_interp1_start)^(year_start - year))),
TRUE ~ NA
)) %>%
mutate(pchg_yr1_interp1_extrapolate =
case_when(
(year_end + it_year_extrplt_fordmost >= year) & (year >= year_end) ~
pchg_yr1_interp1_end,
(year_start - it_year_extrplt_backmost <= year) & (year <= year_start) ~
pchg_yr1_interp1_start,
TRUE ~ NA
))
# Get extrapolated values
ppts_easia_weuro_extrapolate <- ppts_easia_weuro_extrapolate %>%
drop_na(pchg_yr1_interp1_extrapolate) %>%
select(location_code, location_level, variable, year,
value_extrapolate, pchg_yr1_interp1_extrapolate) %>%
rename(value_interp1 = value_extrapolate,
pchg_yr1_interp1 = pchg_yr1_interp1_extrapolate)
# Print
kable(ppts_easia_weuro_extrapolate %>% select(-location_level) %>%
filter(location_code == 'AFG' & variable == 'student'))
#> Adding missing grouping variables: `location_level`
location_level | location_code | variable | year | value_interp1 | pchg_yr1_interp1 |
---|---|---|---|---|---|
country | AFG | student | 1965 | 404712.6 | 0.0596429 |
country | AFG | student | 1966 | 428850.8 | 0.0596429 |
country | AFG | student | 1967 | 454428.7 | 0.0596429 |
country | AFG | student | 1968 | 481532.2 | 0.0596429 |
country | AFG | student | 1969 | 510252.1 | 0.0596429 |
country | AFG | student | 1970 | NA | 0.0596429 |
country | AFG | student | 2019 | NA | 0.0355817 |
country | AFG | student | 2020 | 7018950.2 | 0.0355817 |
# kable(ppts_easia_weuro_extrapolate %>% select(-location_level) %>%
# filter(location_code == 'KOR_Busan' & variable == 'school'))
Merge raw and interpolated results together with extrapolated results
Merge extrapolated results back into the main dataframe.
# Merge
ppts_easia_weuro_long <- ppts_easia_weuro_long %>%
left_join(ppts_easia_weuro_extrapolate,
by = (c('location_code' = 'location_code',
'location_level' = 'location_level',
'year' = 'year',
'variable' = 'variable')))
# Combine columns
ppts_easia_weuro_long <- ppts_easia_weuro_long %>%
mutate(value_interp1 = coalesce(value_interp1.x, value_interp1.y)) %>%
select(-value_interp1.x, -value_interp1.y) %>%
mutate(pchg_yr1_interp1 = coalesce(pchg_yr1_interp1.x, pchg_yr1_interp1.y)) %>%
select(-pchg_yr1_interp1.x, -pchg_yr1_interp1.y) %>%
drop_na(value_interp1)
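The join-then-coalesce pattern used in the merge can be seen on a toy example. Because both data frames carry a column with the same name, `left_join` adds `.x` and `.y` suffixes, and `coalesce` keeps the first non-NA value of the pair. The two small data frames below are assumptions made for illustration.

```r
library(dplyr)

# Toy main data: interpolated value missing in 2020
main <- tibble(year = 2018:2020, value_interp1 = c(10, 11, NA))
# Toy extrapolated data: supplies the 2020 value only
extra <- tibble(year = 2019:2020, value_interp1 = c(NA, 12))

merged <- main %>%
  left_join(extra, by = "year") %>%
  # prefer the main (.x) value, fall back on the extrapolated (.y) value
  mutate(value_interp1 = coalesce(value_interp1.x, value_interp1.y)) %>%
  select(-value_interp1.x, -value_interp1.y)

print(merged)
```

Observed and interpolated values are never overwritten, since the extrapolated column is consulted only where the main column is NA.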
Interpolation and Extrapolation Results Overview
The value_interp1 column below contains both interpolation and extrapolation results.
Afghanistan Interpolate and Extrapolate Results
We illustrate the interpolation results by showing outputs for Afghan student counts. Here, we extrapolate five years back to 1965 and one year forward to 2020, and we fill in several years of missing data.
By extrapolating to 2020, we are able to generate percentage changes between 2000 and 2020, 2010 and 2020, etc. We are also able to see levels in 2020. These percentages and levels are not available without interpolation and extrapolation, and the data for Afghanistan is much more complete afterwards.
# str(ppts_easia_weuro_long)
# print
kable(ppts_easia_weuro_long %>%
filter(location_code == 'AFG' & variable == 'student') %>%
ungroup() %>%
select(location_code, year, variable,
value, value_interp1, pchg_yr1, pchg_yr1_interp1),
caption="Raw and interpolated Afghanistan students results")
location_code | year | variable | value | value_interp1 | pchg_yr1 | pchg_yr1_interp1 |
---|---|---|---|---|---|---|
AFG | 1965 | student | NA | 404712.6 | NA | 0.0596429 |
AFG | 1966 | student | NA | 428850.8 | NA | 0.0596429 |
AFG | 1967 | student | NA | 454428.7 | NA | 0.0596429 |
AFG | 1968 | student | NA | 481532.2 | NA | 0.0596429 |
AFG | 1969 | student | NA | 510252.1 | NA | 0.0596429 |
AFG | 1970 | student | 540685 | 540685.0 | NA | 0.0596429 |
AFG | 1971 | student | 572933 | 572933.0 | 0.0596429 | 0.0596429 |
AFG | 1972 | student | 597509 | 597509.0 | 0.0428951 | 0.0428951 |
AFG | 1973 | student | 624374 | 624374.0 | 0.0449617 | 0.0449617 |
AFG | 1974 | student | 654209 | 654209.0 | 0.0477839 | 0.0477839 |
AFG | 1975 | student | 692342 | 692342.0 | 0.0582887 | 0.0582887 |
AFG | 1976 | student | 729667 | 729667.0 | 0.0539112 | 0.0539112 |
AFG | 1977 | student | 764175 | 764175.0 | 0.0472928 | 0.0472928 |
AFG | 1978 | student | 810657 | 810657.0 | 0.0608264 | 0.0608264 |
AFG | 1979 | student | NA | 881982.2 | NA | 0.0879845 |
AFG | 1980 | student | 959583 | 959583.0 | NA | 0.0879845 |
AFG | 1981 | student | 1024741 | 1024741.0 | 0.0679024 | 0.0679024 |
AFG | 1982 | student | 365458 | 365458.0 | -0.6433655 | -0.6433655 |
AFG | 1983 | student | NA | 404336.5 | NA | 0.1063829 |
AFG | 1984 | student | 447351 | 447351.0 | NA | 0.1063829 |
AFG | 1985 | student | 479150 | 479150.0 | 0.0710829 | 0.0710829 |
AFG | 1986 | student | 512809 | 512809.0 | 0.0702473 | 0.0702473 |
AFG | 1987 | student | NA | 578063.7 | NA | 0.1272495 |
AFG | 1988 | student | 651622 | 651622.0 | NA | 0.1272495 |
AFG | 1989 | student | 622135 | 622135.0 | -0.0452517 | -0.0452517 |
AFG | 1990 | student | 622513 | 622513.0 | 0.0006076 | 0.0006076 |
AFG | 1991 | student | 627888 | 627888.0 | 0.0086344 | 0.0086344 |
AFG | 1992 | student | NA | 702747.5 | NA | 0.1192242 |
AFG | 1993 | student | 786532 | 786532.0 | NA | 0.1192242 |
AFG | 1994 | student | 1161444 | 1161444.0 | 0.4766646 | 0.4766646 |
AFG | 1995 | student | 1312197 | 1312197.0 | 0.1297979 | 0.1297979 |
AFG | 1996 | student | NA | 1216811.7 | NA | -0.0726913 |
AFG | 1997 | student | NA | 1128360.0 | NA | -0.0726913 |
AFG | 1998 | student | 1046338 | 1046338.0 | NA | -0.0726913 |
AFG | 1999 | student | 875605 | 875605.0 | -0.1631719 | -0.1631719 |
AFG | 2000 | student | 749360 | 749360.0 | -0.1441803 | -0.1441803 |
AFG | 2001 | student | 773623 | 773623.0 | 0.0323783 | 0.0323783 |
AFG | 2002 | student | 2667629 | 2667629.0 | 2.4482287 | 2.4482287 |
AFG | 2003 | student | 3781015 | 3781015.0 | 0.4173691 | 0.4173691 |
AFG | 2004 | student | 4430142 | 4430142.0 | 0.1716806 | 0.1716806 |
AFG | 2005 | student | 4318819 | 4318819.0 | -0.0251285 | -0.0251285 |
AFG | 2006 | student | 4669110 | 4669110.0 | 0.0811081 | 0.0811081 |
AFG | 2007 | student | 4718077 | 4718077.0 | 0.0104874 | 0.0104874 |
AFG | 2008 | student | 4974836 | 4974836.0 | 0.0544203 | 0.0544203 |
AFG | 2009 | student | 4945632 | 4945632.0 | -0.0058703 | -0.0058703 |
AFG | 2010 | student | 5279326 | 5279326.0 | 0.0674725 | 0.0674725 |
AFG | 2011 | student | 5291624 | 5291624.0 | 0.0023295 | 0.0023295 |
AFG | 2012 | student | 5767543 | 5767543.0 | 0.0899382 | 0.0899382 |
AFG | 2013 | student | 5986268 | 5986268.0 | 0.0379234 | 0.0379234 |
AFG | 2014 | student | 6217756 | 6217756.0 | 0.0386698 | 0.0386698 |
AFG | 2015 | student | 6199329 | 6199329.0 | -0.0029636 | -0.0029636 |
AFG | 2016 | student | 6265011 | 6265011.0 | 0.0105950 | 0.0105950 |
AFG | 2017 | student | 6350404 | 6350404.0 | 0.0136301 | 0.0136301 |
AFG | 2018 | student | 6544906 | 6544906.0 | 0.0306283 | 0.0306283 |
AFG | 2019 | student | 6777785 | 6777785.0 | 0.0355817 | 0.0355817 |
AFG | 2020 | student | NA | 7018950.2 | NA | 0.0355817 |
Austria Interpolate and Extrapolate Results
The Austrian data starts in 1923, and the school count is available in that year, so we do not extrapolate to earlier years. If the data had started in 1923 (meaning that some measure is available in 1923) but the school count had only started in 1925, we would have extrapolated back 2 years to 1923. Austrian data for the initial decades is spotty, so we interpolate to fill in the gaps.
# str(ppts_easia_weuro_long)
# print
kable(ppts_easia_weuro_long %>%
filter(location_code == 'AUT' & variable == 'school') %>%
ungroup() %>%
select(location_code, year, variable,
value, value_interp1, pchg_yr1, pchg_yr1_interp1),
caption="Raw and interpolated Austria schools results")
location_code | year | variable | value | value_interp1 | pchg_yr1 | pchg_yr1_interp1 |
---|---|---|---|---|---|---|
AUT | 1923 | school | 4655 | 4655.000 | NA | 0.0045011 |
AUT | 1924 | school | NA | 4675.953 | NA | 0.0045011 |
AUT | 1925 | school | 4697 | 4697.000 | NA | 0.0045011 |
AUT | 1926 | school | NA | 4697.400 | NA | 0.0000851 |
AUT | 1927 | school | NA | 4697.800 | NA | 0.0000851 |
AUT | 1928 | school | NA | 4698.200 | NA | 0.0000851 |
AUT | 1929 | school | NA | 4698.600 | NA | 0.0000851 |
AUT | 1930 | school | 4699 | 4699.000 | NA | 0.0000851 |
AUT | 1931 | school | NA | 4693.989 | NA | -0.0010663 |
AUT | 1932 | school | NA | 4688.984 | NA | -0.0010663 |
AUT | 1933 | school | NA | 4683.984 | NA | -0.0010663 |
AUT | 1934 | school | NA | 4678.989 | NA | -0.0010663 |
AUT | 1935 | school | 4674 | 4674.000 | NA | -0.0010663 |
AUT | 1936 | school | NA | 4634.460 | NA | -0.0084595 |
AUT | 1937 | school | NA | 4595.255 | NA | -0.0084595 |
AUT | 1938 | school | NA | 4556.381 | NA | -0.0084595 |
AUT | 1939 | school | NA | 4517.836 | NA | -0.0084595 |
AUT | 1940 | school | NA | 4479.618 | NA | -0.0084595 |
AUT | 1941 | school | NA | 4441.722 | NA | -0.0084595 |
AUT | 1942 | school | NA | 4404.147 | NA | -0.0084595 |
AUT | 1943 | school | NA | 4366.891 | NA | -0.0084595 |
AUT | 1944 | school | NA | 4329.949 | NA | -0.0084595 |
AUT | 1945 | school | NA | 4293.319 | NA | -0.0084595 |
AUT | 1946 | school | 4257 | 4257.000 | NA | -0.0084595 |
AUT | 1947 | school | NA | 4292.308 | NA | 0.0082942 |
AUT | 1948 | school | NA | 4327.909 | NA | 0.0082942 |
AUT | 1949 | school | NA | 4363.806 | NA | 0.0082942 |
AUT | 1950 | school | 4400 | 4400.000 | NA | 0.0082942 |
AUT | 1951 | school | 4417 | 4417.000 | 0.0038636 | 0.0038636 |
AUT | 1952 | school | 4417 | 4417.000 | 0.0000000 | 0.0000000 |
AUT | 1953 | school | 4426 | 4426.000 | 0.0020376 | 0.0020376 |
AUT | 1954 | school | 4426 | 4426.000 | 0.0000000 | 0.0000000 |
AUT | 1955 | school | 4427 | 4427.000 | 0.0002259 | 0.0002259 |
AUT | 1956 | school | 4426 | 4426.000 | -0.0002259 | -0.0002259 |
AUT | 1957 | school | 4418 | 4418.000 | -0.0018075 | -0.0018075 |
AUT | 1958 | school | 4403 | 4403.000 | -0.0033952 | -0.0033952 |
AUT | 1959 | school | 4402 | 4402.000 | -0.0002271 | -0.0002271 |
AUT | 1960 | school | 4393 | 4393.000 | -0.0020445 | -0.0020445 |
AUT | 1961 | school | 4387 | 4387.000 | -0.0013658 | -0.0013658 |
AUT | 1962 | school | 4374 | 4374.000 | -0.0029633 | -0.0029633 |
AUT | 1963 | school | 4374 | 4374.000 | 0.0000000 | 0.0000000 |
AUT | 1964 | school | 4375 | 4375.000 | 0.0002286 | 0.0002286 |
AUT | 1965 | school | 4295 | 4295.000 | -0.0182857 | -0.0182857 |
AUT | 1966 | school | 4175 | 4175.000 | -0.0279395 | -0.0279395 |
AUT | 1967 | school | 4094 | 4094.000 | -0.0194012 | -0.0194012 |
AUT | 1968 | school | 4059 | 4059.000 | -0.0085491 | -0.0085491 |
AUT | 1969 | school | 4018 | 4018.000 | -0.0101010 | -0.0101010 |
AUT | 1970 | school | 3973 | 3973.000 | -0.0111996 | -0.0111996 |
AUT | 1971 | school | 3839 | 3839.000 | -0.0337277 | -0.0337277 |
AUT | 1972 | school | 3783 | 3783.000 | -0.0145871 | -0.0145871 |
AUT | 1973 | school | 3711 | 3711.000 | -0.0190325 | -0.0190325 |
AUT | 1974 | school | 3644 | 3644.000 | -0.0180544 | -0.0180544 |
AUT | 1975 | school | 3590 | 3590.000 | -0.0148189 | -0.0148189 |
AUT | 1976 | school | 3548 | 3548.000 | -0.0116992 | -0.0116992 |
AUT | 1977 | school | 3508 | 3508.000 | -0.0112740 | -0.0112740 |
AUT | 1978 | school | 3494 | 3494.000 | -0.0039909 | -0.0039909 |
AUT | 1979 | school | 3466 | 3466.000 | -0.0080137 | -0.0080137 |
AUT | 1980 | school | 3450 | 3450.000 | -0.0046163 | -0.0046163 |
AUT | 1981 | school | 3451 | 3451.000 | 0.0002899 | 0.0002899 |
AUT | 1982 | school | 3434 | 3434.000 | -0.0049261 | -0.0049261 |
AUT | 1983 | school | 3421 | 3421.000 | -0.0037857 | -0.0037857 |
AUT | 1984 | school | 3414 | 3414.000 | -0.0020462 | -0.0020462 |
AUT | 1985 | school | 3411 | 3411.000 | -0.0008787 | -0.0008787 |
AUT | 1986 | school | 3395 | 3395.000 | -0.0046907 | -0.0046907 |
AUT | 1987 | school | 3394 | 3394.000 | -0.0002946 | -0.0002946 |
AUT | 1988 | school | 3385 | 3385.000 | -0.0026517 | -0.0026517 |
AUT | 1989 | school | 3383 | 3383.000 | -0.0005908 | -0.0005908 |
AUT | 1990 | school | 3386 | 3386.000 | 0.0008868 | 0.0008868 |
AUT | 1991 | school | 3384 | 3384.000 | -0.0005907 | -0.0005907 |
AUT | 1992 | school | 3381 | 3381.000 | -0.0008865 | -0.0008865 |
AUT | 1993 | school | 3382 | 3382.000 | 0.0002958 | 0.0002958 |
AUT | 1994 | school | 3378 | 3378.000 | -0.0011827 | -0.0011827 |
AUT | 1995 | school | 3383 | 3383.000 | 0.0014802 | 0.0014802 |
AUT | 1996 | school | 3367 | 3367.000 | -0.0047295 | -0.0047295 |
AUT | 1997 | school | 3362 | 3362.000 | -0.0014850 | -0.0014850 |
AUT | 1998 | school | 3366 | 3366.000 | 0.0011898 | 0.0011898 |
AUT | 1999 | school | 3364 | 3364.000 | -0.0005942 | -0.0005942 |
AUT | 2000 | school | 3360 | 3360.000 | -0.0011891 | -0.0011891 |
AUT | 2001 | school | 3309 | 3309.000 | -0.0151786 | -0.0151786 |
AUT | 2002 | school | 3299 | 3299.000 | -0.0030221 | -0.0030221 |
AUT | 2003 | school | 3336 | 3336.000 | 0.0112155 | 0.0112155 |
AUT | 2004 | school | 3324 | 3324.000 | -0.0035971 | -0.0035971 |
AUT | 2005 | school | 3296 | 3296.000 | -0.0084236 | -0.0084236 |
AUT | 2006 | school | 3248 | 3248.000 | -0.0145631 | -0.0145631 |
AUT | 2007 | school | 3225 | 3225.000 | -0.0070813 | -0.0070813 |
AUT | 2008 | school | 3207 | 3207.000 | -0.0055814 | -0.0055814 |
AUT | 2009 | school | 3197 | 3197.000 | -0.0031182 | -0.0031182 |
AUT | 2010 | school | 3171 | 3171.000 | -0.0081326 | -0.0081326 |
AUT | 2011 | school | 3135 | 3135.000 | -0.0113529 | -0.0113529 |
AUT | 2012 | school | 3095 | 3095.000 | -0.0127592 | -0.0127592 |
AUT | 2013 | school | 3066 | 3066.000 | -0.0093700 | -0.0093700 |
AUT | 2014 | school | 3051 | 3051.000 | -0.0048924 | -0.0048924 |
AUT | 2015 | school | 3039 | 3039.000 | -0.0039331 | -0.0039331 |
AUT | 2016 | school | 3040 | 3040.000 | 0.0003291 | 0.0003291 |
AUT | 2017 | school | 3033 | 3033.000 | -0.0023026 | -0.0023026 |
AUT | 2018 | school | 3026 | 3026.000 | -0.0023079 | -0.0023079 |
AUT | 2019 | school | 3014 | 3014.000 | -0.0039656 | -0.0039656 |
AUT | 2020 | school | 3014 | 3014.000 | 0.0000000 | 0.0000000 |
Switzerland Example
For Switzerland, teacher counts are observed every few years from 1923 to 1961 and annually from 2010 onward. The years in between are interpolated.
# str(ppts_easia_weuro_long)
# print
kable(ppts_easia_weuro_long %>%
filter(location_code == 'CHE' & variable == 'teacher') %>%
ungroup() %>%
select(location_code, year, variable,
value, value_interp1, pchg_yr1, pchg_yr1_interp1),
caption="Raw and interpolated Switzerland teachers results")
location_code | year | variable | value | value_interp1 | pchg_yr1 | pchg_yr1_interp1 |
---|---|---|---|---|---|---|
CHE | 1920 | teacher | NA | 13497.39 | NA | -0.0023365 |
CHE | 1921 | teacher | NA | 13465.85 | NA | -0.0023365 |
CHE | 1922 | teacher | NA | 13434.39 | NA | -0.0023365 |
CHE | 1923 | teacher | 13403 | 13403.00 | NA | -0.0023365 |
CHE | 1924 | teacher | NA | 13345.89 | NA | -0.0042613 |
CHE | 1925 | teacher | NA | 13289.02 | NA | -0.0042613 |
CHE | 1926 | teacher | NA | 13232.39 | NA | -0.0042613 |
CHE | 1927 | teacher | 13176 | 13176.00 | NA | -0.0042613 |
CHE | 1928 | teacher | NA | 13207.88 | NA | 0.0024199 |
CHE | 1929 | teacher | NA | 13239.85 | NA | 0.0024199 |
CHE | 1930 | teacher | NA | 13271.88 | NA | 0.0024199 |
CHE | 1931 | teacher | 13304 | 13304.00 | NA | 0.0024199 |
CHE | 1932 | teacher | NA | 13371.73 | NA | 0.0050910 |
CHE | 1933 | teacher | NA | 13439.81 | NA | 0.0050910 |
CHE | 1934 | teacher | NA | 13508.23 | NA | 0.0050910 |
CHE | 1935 | teacher | 13577 | 13577.00 | NA | 0.0050910 |
CHE | 1936 | teacher | NA | 13572.50 | NA | -0.0003316 |
CHE | 1937 | teacher | NA | 13568.00 | NA | -0.0003316 |
CHE | 1938 | teacher | NA | 13563.50 | NA | -0.0003316 |
CHE | 1939 | teacher | 13559 | 13559.00 | NA | -0.0003316 |
CHE | 1940 | teacher | NA | 13533.93 | NA | -0.0018489 |
CHE | 1941 | teacher | NA | 13508.91 | NA | -0.0018489 |
CHE | 1942 | teacher | NA | 13483.93 | NA | -0.0018489 |
CHE | 1943 | teacher | 13459 | 13459.00 | NA | -0.0018489 |
CHE | 1944 | teacher | NA | 13575.72 | NA | 0.0086725 |
CHE | 1945 | teacher | NA | 13693.46 | NA | 0.0086725 |
CHE | 1946 | teacher | NA | 13812.21 | NA | 0.0086725 |
CHE | 1947 | teacher | 13932 | 13932.00 | NA | 0.0086725 |
CHE | 1948 | teacher | NA | 14066.05 | NA | 0.0096219 |
CHE | 1949 | teacher | NA | 14201.40 | NA | 0.0096219 |
CHE | 1950 | teacher | NA | 14338.04 | NA | 0.0096219 |
CHE | 1951 | teacher | 14476 | 14476.00 | NA | 0.0096219 |
CHE | 1952 | teacher | NA | 14847.08 | NA | 0.0256342 |
CHE | 1953 | teacher | NA | 15227.67 | NA | 0.0256342 |
CHE | 1954 | teacher | NA | 15618.02 | NA | 0.0256342 |
CHE | 1955 | teacher | NA | 16018.38 | NA | 0.0256342 |
CHE | 1956 | teacher | 16429 | 16429.00 | NA | 0.0256342 |
CHE | 1957 | teacher | NA | 16678.32 | NA | 0.0151754 |
CHE | 1958 | teacher | NA | 16931.42 | NA | 0.0151754 |
CHE | 1959 | teacher | NA | 17188.36 | NA | 0.0151754 |
CHE | 1960 | teacher | NA | 17449.20 | NA | 0.0151754 |
CHE | 1961 | teacher | 17714 | 17714.00 | NA | 0.0151754 |
CHE | 1962 | teacher | NA | 18046.84 | NA | 0.0187896 |
CHE | 1963 | teacher | NA | 18385.93 | NA | 0.0187896 |
CHE | 1964 | teacher | NA | 18731.40 | NA | 0.0187896 |
CHE | 1965 | teacher | NA | 19083.36 | NA | 0.0187896 |
CHE | 1966 | teacher | NA | 19441.92 | NA | 0.0187896 |
CHE | 1967 | teacher | NA | 19807.23 | NA | 0.0187896 |
CHE | 1968 | teacher | NA | 20179.40 | NA | 0.0187896 |
CHE | 1969 | teacher | NA | 20558.57 | NA | 0.0187896 |
CHE | 1970 | teacher | NA | 20944.85 | NA | 0.0187896 |
CHE | 1971 | teacher | NA | 21338.40 | NA | 0.0187896 |
CHE | 1972 | teacher | NA | 21739.34 | NA | 0.0187896 |
CHE | 1973 | teacher | NA | 22147.82 | NA | 0.0187896 |
CHE | 1974 | teacher | NA | 22563.97 | NA | 0.0187896 |
CHE | 1975 | teacher | NA | 22987.94 | NA | 0.0187896 |
CHE | 1976 | teacher | NA | 23419.87 | NA | 0.0187896 |
CHE | 1977 | teacher | NA | 23859.92 | NA | 0.0187896 |
CHE | 1978 | teacher | NA | 24308.24 | NA | 0.0187896 |
CHE | 1979 | teacher | NA | 24764.98 | NA | 0.0187896 |
CHE | 1980 | teacher | NA | 25230.31 | NA | 0.0187896 |
CHE | 1981 | teacher | NA | 25704.38 | NA | 0.0187896 |
CHE | 1982 | teacher | NA | 26187.35 | NA | 0.0187896 |
CHE | 1983 | teacher | NA | 26679.41 | NA | 0.0187896 |
CHE | 1984 | teacher | NA | 27180.70 | NA | 0.0187896 |
CHE | 1985 | teacher | NA | 27691.42 | NA | 0.0187896 |
CHE | 1986 | teacher | NA | 28211.73 | NA | 0.0187896 |
CHE | 1987 | teacher | NA | 28741.82 | NA | 0.0187896 |
CHE | 1988 | teacher | NA | 29281.87 | NA | 0.0187896 |
CHE | 1989 | teacher | NA | 29832.06 | NA | 0.0187896 |
CHE | 1990 | teacher | NA | 30392.60 | NA | 0.0187896 |
CHE | 1991 | teacher | NA | 30963.66 | NA | 0.0187896 |
CHE | 1992 | teacher | NA | 31545.46 | NA | 0.0187896 |
CHE | 1993 | teacher | NA | 32138.19 | NA | 0.0187896 |
CHE | 1994 | teacher | NA | 32742.05 | NA | 0.0187896 |
CHE | 1995 | teacher | NA | 33357.26 | NA | 0.0187896 |
CHE | 1996 | teacher | NA | 33984.04 | NA | 0.0187896 |
CHE | 1997 | teacher | NA | 34622.58 | NA | 0.0187896 |
CHE | 1998 | teacher | NA | 35273.13 | NA | 0.0187896 |
CHE | 1999 | teacher | NA | 35935.90 | NA | 0.0187896 |
CHE | 2000 | teacher | NA | 36611.12 | NA | 0.0187896 |
CHE | 2001 | teacher | NA | 37299.03 | NA | 0.0187896 |
CHE | 2002 | teacher | NA | 37999.87 | NA | 0.0187896 |
CHE | 2003 | teacher | NA | 38713.87 | NA | 0.0187896 |
CHE | 2004 | teacher | NA | 39441.29 | NA | 0.0187896 |
CHE | 2005 | teacher | NA | 40182.38 | NA | 0.0187896 |
CHE | 2006 | teacher | NA | 40937.39 | NA | 0.0187896 |
CHE | 2007 | teacher | NA | 41706.59 | NA | 0.0187896 |
CHE | 2008 | teacher | NA | 42490.25 | NA | 0.0187896 |
CHE | 2009 | teacher | NA | 43288.62 | NA | 0.0187896 |
CHE | 2010 | teacher | 44102 | 44102.00 | NA | 0.0187896 |
CHE | 2011 | teacher | 45666 | 45666.00 | 0.0354632 | 0.0354632 |
CHE | 2012 | teacher | 46104 | 46104.00 | 0.0095914 | 0.0095914 |
CHE | 2013 | teacher | 48345 | 48345.00 | 0.0486075 | 0.0486075 |
CHE | 2014 | teacher | 47223 | 47223.00 | -0.0232082 | -0.0232082 |
CHE | 2015 | teacher | 48595 | 48595.00 | 0.0290536 | 0.0290536 |
CHE | 2016 | teacher | 50672 | 50672.00 | 0.0427410 | 0.0427410 |
CHE | 2017 | teacher | 50879 | 50879.00 | 0.0040851 | 0.0040851 |
CHE | 2018 | teacher | 51844 | 51844.00 | 0.0189666 | 0.0189666 |
CHE | 2019 | teacher | 52738 | 52738.00 | 0.0172440 | 0.0172440 |
CHE | 2020 | teacher | 54777 | 54777.00 | 0.0386628 | 0.0386628 |
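The interpolated teacher counts above are consistent with a constant annual growth rate between two observed years. The following is a minimal base R sketch of that idea, assuming geometric interpolation; it is not the package's own routine, and `interp_geometric` is a hypothetical helper introduced here for illustration.

```r
# Hypothetical helper: fill the years between two observations
# assuming a constant annual growth rate (geometric interpolation).
interp_geometric <- function(year_start, val_start, year_end, val_end) {
  it_span <- year_end - year_start
  # Constant annual rate that carries val_start to val_end over it_span years
  fl_rate <- (val_end / val_start)^(1 / it_span) - 1
  ar_years <- seq(year_start, year_end)
  ar_vals <- val_start * (1 + fl_rate)^(ar_years - year_start)
  list(rate = fl_rate, years = ar_years, values = ar_vals)
}

# Switzerland teacher counts observed in 1923 and 1927 (from the table above)
ls_che <- interp_geometric(1923, 13403, 1927, 13176)
print(round(ls_che$rate, 7))
print(round(ls_che$values, 2))
```

Under this assumption, the rate and the 1924 to 1926 values should agree with the `pchg_yr1_interp1` and `value_interp1` columns up to rounding.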
Germany Example
Germany reunified in 1990, but the school counts here start in 1992. We extrapolate backward to obtain predictions for 1990, which facilitates computing percentage changes.
# str(ppts_easia_weuro_long)
# print
kable(ppts_easia_weuro_long %>%
filter(location_code == 'DEU' & variable == 'school') %>%
ungroup() %>%
select(location_code, year, variable,
value, value_interp1, pchg_yr1, pchg_yr1_interp1),
caption="Raw and interpolated Germany schools results")
location_code | year | variable | value | value_interp1 | pchg_yr1 | pchg_yr1_interp1 |
---|---|---|---|---|---|---|
DEU | 1987 | school | NA | 18091.76 | NA | -0.0016721 |
DEU | 1988 | school | NA | 18061.50 | NA | -0.0016721 |
DEU | 1989 | school | NA | 18031.30 | NA | -0.0016721 |
DEU | 1990 | school | NA | 18001.15 | NA | -0.0016721 |
DEU | 1991 | school | NA | 17971.05 | NA | -0.0016721 |
DEU | 1992 | school | 17941 | 17941.00 | NA | -0.0016721 |
DEU | 1993 | school | 17911 | 17911.00 | -0.0016721 | -0.0016721 |
DEU | 1994 | school | 17895 | 17895.00 | -0.0008933 | -0.0008933 |
DEU | 1995 | school | 17910 | 17910.00 | 0.0008382 | 0.0008382 |
DEU | 1996 | school | 17892 | 17892.00 | -0.0010050 | -0.0010050 |
DEU | 1997 | school | 17829 | 17829.00 | -0.0035211 | -0.0035211 |
DEU | 1998 | school | 17662 | 17662.00 | -0.0093668 | -0.0093668 |
DEU | 1999 | school | 17503 | 17503.00 | -0.0090024 | -0.0090024 |
DEU | 2000 | school | 17275 | 17275.00 | -0.0130263 | -0.0130263 |
DEU | 2001 | school | 17175 | 17175.00 | -0.0057887 | -0.0057887 |
DEU | 2002 | school | 17075 | 17075.00 | -0.0058224 | -0.0058224 |
DEU | 2003 | school | 16992 | 16992.00 | -0.0048609 | -0.0048609 |
DEU | 2004 | school | 16932 | 16932.00 | -0.0035311 | -0.0035311 |
DEU | 2005 | school | 16814 | 16814.00 | -0.0069691 | -0.0069691 |
DEU | 2006 | school | 16743 | 16743.00 | -0.0042227 | -0.0042227 |
DEU | 2007 | school | 16649 | 16649.00 | -0.0056143 | -0.0056143 |
DEU | 2008 | school | 16500 | 16500.00 | -0.0089495 | -0.0089495 |
DEU | 2009 | school | 16431 | 16431.00 | -0.0041818 | -0.0041818 |
DEU | 2010 | school | 16290 | 16290.00 | -0.0085813 | -0.0085813 |
DEU | 2011 | school | 16103 | 16103.00 | -0.0114794 | -0.0114794 |
DEU | 2012 | school | 15971 | 15971.00 | -0.0081972 | -0.0081972 |
DEU | 2013 | school | 15749 | 15749.00 | -0.0139002 | -0.0139002 |
DEU | 2014 | school | 15578 | 15578.00 | -0.0108578 | -0.0108578 |
DEU | 2015 | school | 15483 | 15483.00 | -0.0060983 | -0.0060983 |
DEU | 2016 | school | 15456 | 15456.00 | -0.0017438 | -0.0017438 |
DEU | 2017 | school | 15409 | 15409.00 | -0.0030409 | -0.0030409 |
DEU | 2018 | school | 15398 | 15398.00 | -0.0007139 | -0.0007139 |
DEU | 2019 | school | 15431 | 15431.00 | 0.0021431 | 0.0021431 |
DEU | 2020 | school | 15447 | 15447.00 | 0.0010369 | 0.0010369 |
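The extrapolated rows before 1992 appear to carry the first observed annual change (1992 to 1993) backward. A small base R sketch under that assumption follows; the variable names are introduced here for illustration and this is not the package's implementation.

```r
# Germany school counts observed in 1992 and 1993 (from the table above)
fl_val_1992 <- 17941
fl_val_1993 <- 17911
# First observed annual percentage change
fl_rate <- fl_val_1993 / fl_val_1992 - 1
# Assumption: earlier years follow the same annual rate,
# so each step backward divides by (1 + rate)
ar_years_back <- 1991:1987
ar_vals_back <- fl_val_1992 / (1 + fl_rate)^(1992 - ar_years_back)
df_back <- data.frame(year = ar_years_back,
                      value_extrap = round(ar_vals_back, 2))
print(df_back)
```

If the assumption holds, the printed values should match the `value_interp1` column for 1987 to 1991 up to rounding.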
Korean Busan Interpolate and Extrapolate Results
We do not interpolate or extrapolate values for Busan, Korea, because school counts are available for every year between 1965 and 2020, the minimum and maximum years of the Korean data. Only the 1965 percentage change, which has no prior year to compare against, is filled in.
# str(ppts_easia_weuro_long)
# print
kable(ppts_easia_weuro_long %>%
filter(location_code == 'KOR_Busan' & variable == 'school') %>%
ungroup() %>%
select(location_code, year, variable,
value, value_interp1, pchg_yr1, pchg_yr1_interp1),
caption="Raw and interpolated Korean Busan province schools results")
location_code | year | variable | value | value_interp1 | pchg_yr1 | pchg_yr1_interp1 |
---|---|---|---|---|---|---|
KOR_Busan | 1965 | school | 79 | 79 | NA | 0.0886076 |
KOR_Busan | 1966 | school | 86 | 86 | 0.0886076 | 0.0886076 |
KOR_Busan | 1967 | school | 93 | 93 | 0.0813953 | 0.0813953 |
KOR_Busan | 1968 | school | 96 | 96 | 0.0322581 | 0.0322581 |
KOR_Busan | 1969 | school | 98 | 98 | 0.0208333 | 0.0208333 |
KOR_Busan | 1970 | school | 99 | 99 | 0.0102041 | 0.0102041 |
KOR_Busan | 1971 | school | 100 | 100 | 0.0101010 | 0.0101010 |
KOR_Busan | 1972 | school | 103 | 103 | 0.0300000 | 0.0300000 |
KOR_Busan | 1973 | school | 106 | 106 | 0.0291262 | 0.0291262 |
KOR_Busan | 1974 | school | 107 | 107 | 0.0094340 | 0.0094340 |
KOR_Busan | 1975 | school | 112 | 112 | 0.0467290 | 0.0467290 |
KOR_Busan | 1976 | school | 114 | 114 | 0.0178571 | 0.0178571 |
KOR_Busan | 1977 | school | 114 | 114 | 0.0000000 | 0.0000000 |
KOR_Busan | 1978 | school | 128 | 128 | 0.1228070 | 0.1228070 |
KOR_Busan | 1979 | school | 131 | 131 | 0.0234375 | 0.0234375 |
KOR_Busan | 1980 | school | 137 | 137 | 0.0458015 | 0.0458015 |
KOR_Busan | 1981 | school | 149 | 149 | 0.0875912 | 0.0875912 |
KOR_Busan | 1982 | school | 158 | 158 | 0.0604027 | 0.0604027 |
KOR_Busan | 1983 | school | 169 | 169 | 0.0696203 | 0.0696203 |
KOR_Busan | 1984 | school | 184 | 184 | 0.0887574 | 0.0887574 |
KOR_Busan | 1985 | school | 193 | 193 | 0.0489130 | 0.0489130 |
KOR_Busan | 1986 | school | 196 | 196 | 0.0155440 | 0.0155440 |
KOR_Busan | 1987 | school | 200 | 200 | 0.0204082 | 0.0204082 |
KOR_Busan | 1988 | school | 202 | 202 | 0.0100000 | 0.0100000 |
KOR_Busan | 1989 | school | 217 | 217 | 0.0742574 | 0.0742574 |
KOR_Busan | 1990 | school | 221 | 221 | 0.0184332 | 0.0184332 |
KOR_Busan | 1991 | school | 222 | 222 | 0.0045249 | 0.0045249 |
KOR_Busan | 1992 | school | 227 | 227 | 0.0225225 | 0.0225225 |
KOR_Busan | 1993 | school | 230 | 230 | 0.0132159 | 0.0132159 |
KOR_Busan | 1994 | school | 228 | 228 | -0.0086957 | -0.0086957 |
KOR_Busan | 1995 | school | 245 | 245 | 0.0745614 | 0.0745614 |
KOR_Busan | 1996 | school | 250 | 250 | 0.0204082 | 0.0204082 |
KOR_Busan | 1997 | school | 257 | 257 | 0.0280000 | 0.0280000 |
KOR_Busan | 1998 | school | 259 | 259 | 0.0077821 | 0.0077821 |
KOR_Busan | 1999 | school | 265 | 265 | 0.0231660 | 0.0231660 |
KOR_Busan | 2000 | school | 267 | 267 | 0.0075472 | 0.0075472 |
KOR_Busan | 2001 | school | 269 | 269 | 0.0074906 | 0.0074906 |
KOR_Busan | 2002 | school | 273 | 273 | 0.0148699 | 0.0148699 |
KOR_Busan | 2003 | school | 279 | 279 | 0.0219780 | 0.0219780 |
KOR_Busan | 2004 | school | 283 | 283 | 0.0143369 | 0.0143369 |
KOR_Busan | 2005 | school | 285 | 285 | 0.0070671 | 0.0070671 |
KOR_Busan | 2006 | school | 292 | 292 | 0.0245614 | 0.0245614 |
KOR_Busan | 2007 | school | 293 | 293 | 0.0034247 | 0.0034247 |
KOR_Busan | 2008 | school | 293 | 293 | 0.0000000 | 0.0000000 |
KOR_Busan | 2009 | school | 297 | 297 | 0.0136519 | 0.0136519 |
KOR_Busan | 2010 | school | 298 | 298 | 0.0033670 | 0.0033670 |
KOR_Busan | 2011 | school | 297 | 297 | -0.0033557 | -0.0033557 |
KOR_Busan | 2012 | school | 299 | 299 | 0.0067340 | 0.0067340 |
KOR_Busan | 2013 | school | 302 | 302 | 0.0100334 | 0.0100334 |
KOR_Busan | 2014 | school | 305 | 305 | 0.0099338 | 0.0099338 |
KOR_Busan | 2015 | school | 306 | 306 | 0.0032787 | 0.0032787 |
KOR_Busan | 2016 | school | 308 | 308 | 0.0065359 | 0.0065359 |
KOR_Busan | 2017 | school | 308 | 308 | 0.0000000 | 0.0000000 |
KOR_Busan | 2018 | school | 305 | 305 | -0.0097403 | -0.0097403 |
KOR_Busan | 2019 | school | 304 | 304 | -0.0032787 | -0.0032787 |
KOR_Busan | 2020 | school | 304 | 304 | 0.0000000 | 0.0000000 |
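The annual percentage change in the table above is the ratio of each year's value to the previous year's value, minus one. A minimal base R sketch using the first four Busan rows follows. Filling the first year with the next year's change is an assumption made to mirror what the `pchg_yr1_interp1` column shows for 1965.

```r
# Busan school counts for 1965 to 1968 (from the table above)
ar_year <- 1965:1968
ar_value <- c(79, 86, 93, 96)
# Year-over-year change: value_t / value_{t-1} - 1, undefined for the first year
ar_pchg_yr1 <- c(NA, ar_value[-1] / ar_value[-length(ar_value)] - 1)
# Assumption for illustration: fill the first year with the next year's change
ar_pchg_yr1_filled <- ar_pchg_yr1
ar_pchg_yr1_filled[1] <- ar_pchg_yr1[2]
print(data.frame(year = ar_year, value = ar_value,
                 pchg_yr1 = round(ar_pchg_yr1, 7),
                 pchg_yr1_filled = round(ar_pchg_yr1_filled, 7)))
```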
Generate percentage changes every 5, 10, 15, 20 years
We now define several sets of cut points and compute percentage changes between the end points of each resulting interval.
# Every five years from 1920 until 2020
ar_it_cuts_1940t2020_i05 <- seq(1920, 2020, length.out = 21)
st_cuts_1940t2020_i05 <- "1920t2020i05"
# Every ten years from 1920 until 2020
ar_it_cuts_1940t2020_i10 <- seq(1920, 2020, length.out = 11)
st_cuts_1940t2020_i10 <- "1920t2020i10"
# Every 15 years from 1925 until 2015
ar_it_cuts_1940t2020_i15 <- seq(1925, 2015, length.out = 7)
st_cuts_1940t2020_i15 <- "1925t2015i15"
# Every 20 years from 1920 until 2020
ar_it_cuts_1940t2020_i20 <- seq(1920, 2020, length.out = 6)
st_cuts_1940t2020_i20 <- "1920t2020i20"
We put the cuts and the associated string names into two lists.
# List of cuts
ls_ar_it_cuts <- list(
ar_it_cuts_1940t2020_i05,
ar_it_cuts_1940t2020_i10,
ar_it_cuts_1940t2020_i15,
ar_it_cuts_1940t2020_i20
)
# Add names
names(ls_ar_it_cuts) <- c(
st_cuts_1940t2020_i05,
st_cuts_1940t2020_i10,
st_cuts_1940t2020_i15,
st_cuts_1940t2020_i20
)
# Display
for (st_cuts_name in names(ls_ar_it_cuts)) {
print(glue::glue(
"cutTypeName={st_cuts_name}:\n",
"bins={ls_ar_it_cuts[st_cuts_name]}"))
}
#> cutTypeName=1920t2020i05:
#> bins=c(1920, 1925, 1930, 1935, 1940, 1945, 1950, 1955, 1960, 1965, 1970, 1975, 1980, 1985, 1990, 1995, 2000, 2005, 2010, 2015, 2020)
#> cutTypeName=1920t2020i10:
#> bins=c(1920, 1930, 1940, 1950, 1960, 1970, 1980, 1990, 2000, 2010, 2020)
#> cutTypeName=1925t2015i15:
#> bins=c(1925, 1940, 1955, 1970, 1985, 2000, 2015)
#> cutTypeName=1920t2020i20:
#> bins=c(1920, 1940, 1960, 1980, 2000, 2020)
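The `length.out` argument fixes the number of break points, which leaves the interval width implicit. The same cut points can also be written with `by`, which states the width directly. A short sketch follows; the `_len` and `_by` names are introduced here for illustration.

```r
# Same five-year cut points written two ways
ar_it_cuts_len <- seq(1920, 2020, length.out = 21)
ar_it_cuts_by <- seq(1920, 2020, by = 5)
# Both give 21 break points, hence 20 five-year intervals
print(isTRUE(all.equal(ar_it_cuts_len, ar_it_cuts_by)))
print(length(ar_it_cuts_by) - 1)
# Interval width recovered as second break minus first break
it_gap <- ar_it_cuts_by[2] - ar_it_cuts_by[1]
print(it_gap)
# The 15-year cuts start in 1925 so that the breaks land on 2015
ar_it_cuts_i15 <- seq(1925, 2015, by = 15)
print(ar_it_cuts_i15)
```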
We generate percentage changes between the start and end points of each bin, for bins of varying length, using both the raw and the interpolated data.
# E. Generate cuts -----
it_avg_type <- 1
it_cut_type <- 1
for (it_avg_type in c(1, 2)) {
if (it_avg_type == 1) {
svr_chg_var <- "pchg_yr1"
# common var across all spans
svr_chg_var_new <- "pchg"
svr_var_val <- "value"
} else if (it_avg_type == 2) {
svr_chg_var <- "pchg_yr1_interp1"
# common var across all spans
svr_chg_var_new <- "pchg_interp1"
svr_var_val <- "value_interp1"
}
# Add to stack the annual results
ppts_easia_weuro_pchg <- ppts_easia_weuro_long %>%
select(location_code, location_level,
variable, year, one_of(svr_chg_var, svr_var_val)) %>%
filter(!is.na(!!sym(svr_chg_var)) | !is.na(!!sym(svr_var_val))) %>%
rename(!!sym(svr_chg_var_new) := !!sym(svr_chg_var),
year_bins = year) %>%
mutate(year_bins = as.factor(year_bins)) %>%
mutate(year_bins_type = "1940t2020i01") %>%
ungroup()
# Loop over cut types
for (st_cuts_name in names(ls_ar_it_cuts)) {
print(glue::glue(
"cutTypeName={st_cuts_name}:\n",
"bins={ls_ar_it_cuts[st_cuts_name]}"))
# temp dataframe
ppts_easia_weuro_long_cut <- ppts_easia_weuro_long %>%
select(location_code, location_level,
variable, year,
one_of(svr_chg_var, svr_var_val))
# E.1 Cut types
ar_it_cuts <- ls_ar_it_cuts[[st_cuts_name]]
it_gap <- ar_it_cuts[2] - ar_it_cuts[1]
ar_it_end_seg <- ar_it_cuts[2:length(ar_it_cuts)]
ar_it_start_seg <- ar_it_end_seg - it_gap + 1
ar_st_lab <- paste0(ar_it_start_seg, "-", ar_it_end_seg)
# E.2 Generate new year groupings, consider only full-segments
# consider only sub-segments with observations in all years
ppts_easia_weuro_long_cut <- ppts_easia_weuro_long_cut %>%
mutate(year_bins = cut(year,
breaks = ar_it_cuts,
labels = ar_st_lab,
right = TRUE)) %>%
group_by(location_code, location_level,
variable, year_bins) %>%
mutate(val_n_in_bin = sum(!is.na(!!sym(svr_chg_var))))
# filter(val_n_in_bin == it_gap)
# E.3 cumulative product
ppts_easia_weuro_long_cut <- ppts_easia_weuro_long_cut %>%
arrange(location_code, location_level, variable, year_bins, year) %>%
group_by(location_code, location_level, variable, year_bins) %>%
mutate(!!sym(svr_chg_var_new) := cumprod(1 + !!sym(svr_chg_var)) - 1) %>%
mutate(!!sym(svr_chg_var_new) :=
case_when(val_n_in_bin == it_gap ~ !!sym(svr_chg_var_new),
TRUE ~ NA_real_))
# View(ppts_easia_weuro_long_cut)
# E.4, slices last row
ppts_easia_weuro_long_cut <- ppts_easia_weuro_long_cut %>%
slice(n()) %>%
select(location_code, location_level,
variable, year_bins, one_of(svr_chg_var_new, svr_var_val)) %>%
ungroup() %>%
mutate(year_bins_type = st_cuts_name) %>%
filter(!is.na(!!sym(svr_chg_var_new)) | !is.na(!!sym(svr_var_val))) %>%
drop_na(year_bins)
# E.5 Stack
ppts_easia_weuro_pchg <- bind_rows(
ppts_easia_weuro_pchg, ppts_easia_weuro_long_cut)
}
# export
if (it_avg_type == 1) {
ppts_easia_weuro_pchg_raw <- ppts_easia_weuro_pchg
} else if (it_avg_type == 2) {
ppts_easia_weuro_pchg_interp1 <- ppts_easia_weuro_pchg
}
# # Print results
# print(kable(ppts_easia_weuro_long %>%
# filter(location_code == 'AFG' & variable == 'student'),
# caption= paste0("breaks=", st_cuts_name, ", variable=", svr_chg_var_new)))
}
#> cutTypeName=1920t2020i05:
#> bins=c(1920, 1925, 1930, 1935, 1940, 1945, 1950, 1955, 1960, 1965, 1970, 1975, 1980, 1985, 1990, 1995, 2000, 2005, 2010, 2015, 2020)
#> cutTypeName=1920t2020i10:
#> bins=c(1920, 1930, 1940, 1950, 1960, 1970, 1980, 1990, 2000, 2010, 2020)
#> cutTypeName=1925t2015i15:
#> bins=c(1925, 1940, 1955, 1970, 1985, 2000, 2015)
#> cutTypeName=1920t2020i20:
#> bins=c(1920, 1940, 1960, 1980, 2000, 2020)
#> cutTypeName=1920t2020i05:
#> bins=c(1920, 1925, 1930, 1935, 1940, 1945, 1950, 1955, 1960, 1965, 1970, 1975, 1980, 1985, 1990, 1995, 2000, 2005, 2010, 2015, 2020)
#> cutTypeName=1920t2020i10:
#> bins=c(1920, 1930, 1940, 1950, 1960, 1970, 1980, 1990, 2000, 2010, 2020)
#> cutTypeName=1925t2015i15:
#> bins=c(1925, 1940, 1955, 1970, 1985, 2000, 2015)
#> cutTypeName=1920t2020i20:
#> bins=c(1920, 1940, 1960, 1980, 2000, 2020)
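The loop above does two things for each bin: it assigns years to right-closed intervals and compounds the annual changes within each interval, keeping the result only when every year in the bin has an annual change. A self-contained base R sketch of both steps follows, using the Austria school counts for 1955 to 1960 shown earlier in this file. `compound_bin` is a hypothetical helper, not a function from the package.

```r
# Austria school counts, 1955 to 1960 (from the table earlier in this file)
df_aut <- data.frame(year = 1955:1960,
                     value = c(4427, 4426, 4418, 4403, 4402, 4393))
# Annual percentage change, NA for the first row
df_aut$pchg_yr1 <- c(NA, df_aut$value[-1] / df_aut$value[-nrow(df_aut)] - 1)

# Right-closed bins: (1950, 1955] is labeled "1951-1955", (1955, 1960] "1956-1960"
ar_it_cuts <- c(1950, 1955, 1960)
it_gap <- ar_it_cuts[2] - ar_it_cuts[1]
ar_it_end_seg <- ar_it_cuts[-1]
ar_st_lab <- paste0(ar_it_end_seg - it_gap + 1, "-", ar_it_end_seg)
df_aut$year_bins <- cut(df_aut$year, breaks = ar_it_cuts,
                        labels = ar_st_lab, right = TRUE)

# Hypothetical helper: compound annual changes within one bin,
# returning NA unless every year of the bin has an annual change
compound_bin <- function(ar_pchg, it_gap) {
  if (sum(!is.na(ar_pchg)) != it_gap) return(NA_real_)
  prod(1 + ar_pchg) - 1
}
fl_pchg_1956_1960 <- compound_bin(
  df_aut$pchg_yr1[df_aut$year_bins == "1956-1960"], it_gap)
fl_pchg_1951_1955 <- compound_bin(
  df_aut$pchg_yr1[df_aut$year_bins == "1951-1955"], it_gap)
print(fl_pchg_1956_1960)
print(fl_pchg_1951_1955)
```

The 1951-1955 result is `NA` here only because the sketch omits the 1950 to 1954 rows, which illustrates the full-coverage rule. The 1956-1960 result equals the ratio of the 1960 value to the 1955 value, minus one.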
Merge raw and interpolated results together, again
After generating percentage changes over different spans, merge again.
# D.3 merge together
# full_join is expected to give the same rows as left_join here,
# since the interpolated results should cover every raw row
ppts_easia_weuro_world_pchg <- ppts_easia_weuro_pchg_interp1 %>%
full_join(ppts_easia_weuro_pchg_raw,
by = (c('location_code' = 'location_code',
'location_level' = 'location_level',
'variable' = 'variable',
'year_bins_type' = 'year_bins_type',
'year_bins' = 'year_bins'
))) %>%
mutate(variable = as.factor(variable),
year_bins_type = as.factor(year_bins_type),
# year_bins as string to allow for correct sorting
year_bins = as.character(year_bins)) %>%
select(location_code, location_level,
variable,
year_bins_type, year_bins,
pchg, pchg_interp1,
value, value_interp1)
# %>%
# value has missing data at the start year of each percentage calculation
# value_interp1 is just raw data but complete
# select(-value) %>% rename(value = value_interp1)
str(ppts_easia_weuro_world_pchg)
#> tibble [82,685 × 9] (S3: tbl_df/tbl/data.frame)
#> $ location_code : Factor w/ 286 levels "ABW","AFE","AFG",..: 1 1 1 1 1 1 1 1 1 1 ...
#> $ location_level: Factor w/ 4 levels "country","multicountry",..: 1 1 1 1 1 1 1 1 1 1 ...
#> $ variable : Factor w/ 5 levels "gdp","school",..: 1 1 1 1 1 1 1 1 1 1 ...
#> $ year_bins_type: Factor w/ 5 levels "1920t2020i05",..: 5 5 5 5 5 5 5 5 5 5 ...
#> $ year_bins : chr [1:82685] "1981" "1982" "1983" "1984" ...
#> $ pchg : num [1:82685] NA NA NA NA NA ...
#> $ pchg_interp1 : num [1:82685] 0.176 0.176 0.176 0.176 0.176 ...
#> $ value : num [1:82685] NA NA NA NA NA ...
#> $ value_interp1 : num [1:82685] 7662 9010 10596 12460 14653 ...
# print(ppts_easia_weuro_world_pchg[1:50,])
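To see what the join does to rows that exist only in the interpolated results, here is a toy sketch using base R `merge()` with `all = TRUE`, which behaves like a full join. The two small data frames are made up for illustration, with values copied from the Germany five-year table in this file.

```r
# Toy interpolated and raw results for one location and variable
df_interp <- data.frame(
  location_code = "DEU", variable = "school",
  year_bins = c("1991-1995", "1996-2000", "2001-2005"),
  pchg_interp1 = c(-0.0050636, -0.0354551, -0.0266860))
df_raw <- data.frame(
  location_code = "DEU", variable = "school",
  year_bins = c("1996-2000", "2001-2005"),
  pchg = c(-0.0354551, -0.0266860))
# all = TRUE keeps rows found in either table, like a full join
df_merged <- merge(df_interp, df_raw,
                   by = c("location_code", "variable", "year_bins"),
                   all = TRUE)
print(df_merged)
```

The 1991-1995 row survives the merge with `pchg` set to `NA`, which is the pattern visible in the example tables.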
Sort and Display
We sort the rows so that the output file is more clearly organized.
# Arrange results
ppts_easia_weuro_world_pchg <- ppts_easia_weuro_world_pchg %>%
arrange(year_bins_type,
location_level, location_code,
variable, year_bins)
Afghanistan as Example
We print results from the year-group aggregation. The Afghanistan example demonstrates that we can compute more interval percentage changes after interpolation, because values at key years have been filled in. We use student counts as the example.
Note that the “value” column shows the value in the last year of the interval, if it was observed.
# print
for (st_cuts_name in names(ls_ar_it_cuts)) {
print(
kable(ppts_easia_weuro_world_pchg %>%
filter(location_code == 'AFG' &
variable == 'student' &
year_bins_type == st_cuts_name) %>%
select(-location_level, -year_bins_type),
caption = paste0("Afghanistan example, aggregate=", st_cuts_name)))
}
location_code | variable | year_bins | pchg | pchg_interp1 | value | value_interp1 |
---|---|---|---|---|---|---|
AFG | student | 1961-1965 | NA | NA | NA | 404712.6 |
AFG | student | 1966-1970 | NA | 0.3359727 | 540685 | 540685.0 |
AFG | student | 1971-1975 | 0.2804905 | 0.2804905 | 692342 | 692342.0 |
AFG | student | 1976-1980 | NA | 0.3859956 | 959583 | 959583.0 |
AFG | student | 1981-1985 | NA | -0.5006685 | 479150 | 479150.0 |
AFG | student | 1986-1990 | NA | 0.2992028 | 622513 | 622513.0 |
AFG | student | 1991-1995 | NA | 1.1079030 | 1312197 | 1312197.0 |
AFG | student | 1996-2000 | NA | -0.4289272 | 749360 | 749360.0 |
AFG | student | 2001-2005 | 4.7633434 | 4.7633434 | 4318819 | 4318819.0 |
AFG | student | 2006-2010 | 0.2224004 | 0.2224004 | 5279326 | 5279326.0 |
AFG | student | 2011-2015 | 0.1742652 | 0.1742652 | 6199329 | 6199329.0 |
AFG | student | 2016-2020 | NA | 0.1322113 | NA | 7018950.2 |
location_code | variable | year_bins | pchg | pchg_interp1 | value | value_interp1 |
---|---|---|---|---|---|---|
AFG | student | 1961-1970 | NA | NA | 540685 | 540685 |
AFG | student | 1971-1980 | NA | 0.7747542 | 959583 | 959583 |
AFG | student | 1981-1990 | NA | -0.3512672 | 622513 | 622513 |
AFG | student | 1991-2000 | NA | 0.2037660 | 749360 | 749360 |
AFG | student | 2001-2010 | 6.045113 | 6.0451132 | 5279326 | 5279326 |
AFG | student | 2011-2020 | NA | 0.3295164 | NA | 7018950 |
location_code | variable | year_bins | pchg | pchg_interp1 | value | value_interp1 |
---|---|---|---|---|---|---|
AFG | student | 1956-1970 | NA | NA | 540685 | 540685 |
AFG | student | 1971-1985 | NA | -0.1138093 | 479150 | 479150 |
AFG | student | 1986-2000 | NA | 0.5639361 | 749360 | 749360 |
AFG | student | 2001-2015 | 7.272831 | 7.2728315 | 6199329 | 6199329 |
location_code | variable | year_bins | pchg | pchg_interp1 | value | value_interp1 |
---|---|---|---|---|---|---|
AFG | student | 1961-1980 | NA | NA | 959583 | 959583 |
AFG | student | 1981-2000 | NA | -0.2190775 | 749360 | 749360 |
AFG | student | 2001-2020 | NA | 8.3665931 | NA | 7018950 |
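Because interval changes are compounded annual changes, two adjacent five-year changes should compound to the ten-year change that spans them. A quick consistency check in base R, using numbers copied from the Afghanistan tables above (the variable names are introduced here for illustration):

```r
# Interpolated five-year changes for Afghanistan students (from the tables above)
fl_pchg_1971_1975 <- 0.2804905
fl_pchg_1976_1980 <- 0.3859956
# Compounding the two five-year changes gives the ten-year change
fl_pchg_1971_1980 <- (1 + fl_pchg_1971_1975) * (1 + fl_pchg_1976_1980) - 1
print(round(fl_pchg_1971_1980, 7))
# Same quantity from the end-of-interval values for 1970 and 1980
fl_pchg_from_values <- 959583 / 540685 - 1
print(round(fl_pchg_from_values, 7))
```

Both numbers can be compared against the `pchg_interp1` entry for 1971-1980 in the ten-year table.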
Austria as Example
Below we illustrate with Austria, showing data for school counts. Interpolation lets us fill in interval changes for several of the early decades where the raw data are not annual.
# print
for (st_cuts_name in names(ls_ar_it_cuts)) {
print(
kable(ppts_easia_weuro_world_pchg %>%
filter(location_code == 'AUT' &
variable == 'school' &
year_bins_type == st_cuts_name) %>%
select(-location_level, -year_bins_type),
caption = paste0("Austria example, aggregate=", st_cuts_name)))
}
location_code | variable | year_bins | pchg | pchg_interp1 | value | value_interp1 |
---|---|---|---|---|---|---|
AUT | school | 1921-1925 | NA | NA | 4697 | 4697.000 |
AUT | school | 1926-1930 | NA | 0.0004258 | 4699 | 4699.000 |
AUT | school | 1931-1935 | NA | -0.0053203 | 4674 | 4674.000 |
AUT | school | 1936-1940 | NA | -0.0415880 | NA | 4479.618 |
AUT | school | 1941-1945 | NA | -0.0415880 | NA | 4293.319 |
AUT | school | 1946-1950 | NA | 0.0248480 | 4400 | 4400.000 |
AUT | school | 1951-1955 | 0.0061364 | 0.0061364 | 4427 | 4427.000 |
AUT | school | 1956-1960 | -0.0076801 | -0.0076801 | 4393 | 4393.000 |
AUT | school | 1961-1965 | -0.0223082 | -0.0223082 | 4295 | 4295.000 |
AUT | school | 1966-1970 | -0.0749709 | -0.0749709 | 3973 | 3973.000 |
AUT | school | 1971-1975 | -0.0964007 | -0.0964007 | 3590 | 3590.000 |
AUT | school | 1976-1980 | -0.0389972 | -0.0389972 | 3450 | 3450.000 |
AUT | school | 1981-1985 | -0.0113043 | -0.0113043 | 3411 | 3411.000 |
AUT | school | 1986-1990 | -0.0073292 | -0.0073292 | 3386 | 3386.000 |
AUT | school | 1991-1995 | -0.0008860 | -0.0008860 | 3383 | 3383.000 |
AUT | school | 1996-2000 | -0.0067987 | -0.0067987 | 3360 | 3360.000 |
AUT | school | 2001-2005 | -0.0190476 | -0.0190476 | 3296 | 3296.000 |
AUT | school | 2006-2010 | -0.0379248 | -0.0379248 | 3171 | 3171.000 |
AUT | school | 2011-2015 | -0.0416272 | -0.0416272 | 3039 | 3039.000 |
AUT | school | 2016-2020 | -0.0082264 | -0.0082264 | 3014 | 3014.000 |
location_code | variable | year_bins | pchg | pchg_interp1 | value | value_interp1 |
---|---|---|---|---|---|---|
AUT | school | 1921-1930 | NA | NA | 4699 | 4699.000 |
AUT | school | 1931-1940 | NA | -0.0466870 | NA | 4479.618 |
AUT | school | 1941-1950 | NA | -0.0177733 | 4400 | 4400.000 |
AUT | school | 1951-1960 | -0.0015909 | -0.0015909 | 4393 | 4393.000 |
AUT | school | 1961-1970 | -0.0956066 | -0.0956066 | 3973 | 3973.000 |
AUT | school | 1971-1980 | -0.1316386 | -0.1316386 | 3450 | 3450.000 |
AUT | school | 1981-1990 | -0.0185507 | -0.0185507 | 3386 | 3386.000 |
AUT | school | 1991-2000 | -0.0076787 | -0.0076787 | 3360 | 3360.000 |
AUT | school | 2001-2010 | -0.0562500 | -0.0562500 | 3171 | 3171.000 |
AUT | school | 2011-2020 | -0.0495112 | -0.0495112 | 3014 | 3014.000 |
location_code | variable | year_bins | pchg | pchg_interp1 | value | value_interp1 |
---|---|---|---|---|---|---|
AUT | school | 1926-1940 | NA | -0.0462811 | NA | 4479.618 |
AUT | school | 1941-1955 | NA | -0.0117460 | 4427 | 4427.000 |
AUT | school | 1956-1970 | -0.1025525 | -0.1025525 | 3973 | 3973.000 |
AUT | school | 1971-1985 | -0.1414548 | -0.1414548 | 3411 | 3411.000 |
AUT | school | 1986-2000 | -0.0149516 | -0.0149516 | 3360 | 3360.000 |
AUT | school | 2001-2015 | -0.0955357 | -0.0955357 | 3039 | 3039.000 |
location_code | variable | year_bins | pchg | pchg_interp1 | value | value_interp1 |
---|---|---|---|---|---|---|
AUT | school | 1921-1940 | NA | NA | NA | 4479.618 |
AUT | school | 1941-1960 | NA | -0.0193360 | 4393 | 4393.000 |
AUT | school | 1961-1980 | -0.2146597 | -0.2146597 | 3450 | 3450.000 |
AUT | school | 1981-2000 | -0.0260870 | -0.0260870 | 3360 | 3360.000 |
AUT | school | 2001-2020 | -0.1029762 | -0.1029762 | 3014 | 3014.000 |
Switzerland as Example
Below we illustrate with Switzerland, showing data for teacher counts, where we filled in values.
# print
for (st_cuts_name in names(ls_ar_it_cuts)) {
print(
kable(ppts_easia_weuro_world_pchg %>%
filter(location_code == 'CHE' &
variable == 'teacher' &
year_bins_type == st_cuts_name) %>%
select(-location_level, -year_bins_type),
caption = paste0("Switzerland example, aggregate=", st_cuts_name)))
}
location_code | variable | year_bins | pchg | pchg_interp1 | value | value_interp1 |
---|---|---|---|---|---|---|
CHE | teacher | 1921-1925 | NA | -0.0154381 | NA | 13289.02 |
CHE | teacher | 1926-1930 | NA | -0.0012891 | NA | 13271.88 |
CHE | teacher | 1931-1935 | NA | 0.0229897 | 13577 | 13577.00 |
CHE | teacher | 1936-1940 | NA | -0.0031722 | NA | 13533.93 |
CHE | teacher | 1941-1945 | NA | 0.0117872 | NA | 13693.46 |
CHE | teacher | 1946-1950 | NA | 0.0470723 | NA | 14338.04 |
CHE | teacher | 1951-1955 | NA | 0.1171946 | NA | 16018.38 |
CHE | teacher | 1956-1960 | NA | 0.0893236 | NA | 17449.20 |
CHE | teacher | 1961-1965 | NA | 0.0936521 | NA | 19083.36 |
CHE | teacher | 1966-1970 | NA | 0.0975457 | NA | 20944.85 |
CHE | teacher | 1971-1975 | NA | 0.0975457 | NA | 22987.94 |
CHE | teacher | 1976-1980 | NA | 0.0975457 | NA | 25230.31 |
CHE | teacher | 1981-1985 | NA | 0.0975457 | NA | 27691.42 |
CHE | teacher | 1986-1990 | NA | 0.0975457 | NA | 30392.60 |
CHE | teacher | 1991-1995 | NA | 0.0975457 | NA | 33357.26 |
CHE | teacher | 1996-2000 | NA | 0.0975457 | NA | 36611.12 |
CHE | teacher | 2001-2005 | NA | 0.0975457 | NA | 40182.38 |
CHE | teacher | 2006-2010 | NA | 0.0975457 | 44102 | 44102.00 |
CHE | teacher | 2011-2015 | 0.1018775 | 0.1018775 | 48595 | 48595.00 |
CHE | teacher | 2016-2020 | 0.1272147 | 0.1272147 | 54777 | 54777.00 |
location_code | variable | year_bins | pchg | pchg_interp1 | value | value_interp1 |
---|---|---|---|---|---|---|
CHE | teacher | 1921-1930 | NA | -0.0167073 | NA | 13271.88 |
CHE | teacher | 1931-1940 | NA | 0.0197445 | NA | 13533.93 |
CHE | teacher | 1941-1950 | NA | 0.0594144 | NA | 14338.04 |
CHE | teacher | 1951-1960 | NA | 0.2169864 | NA | 17449.20 |
CHE | teacher | 1961-1970 | NA | 0.2003332 | NA | 20944.85 |
CHE | teacher | 1971-1980 | NA | 0.2046066 | NA | 25230.31 |
CHE | teacher | 1981-1990 | NA | 0.2046066 | NA | 30392.60 |
CHE | teacher | 1991-2000 | NA | 0.2046066 | NA | 36611.12 |
CHE | teacher | 2001-2010 | NA | 0.2046066 | 44102 | 44102.00 |
CHE | teacher | 2011-2020 | 0.2420525 | 0.2420525 | 54777 | 54777.00 |
location_code | variable | year_bins | pchg | pchg_interp1 | value | value_interp1 |
---|---|---|---|---|---|---|
CHE | teacher | 1926-1940 | NA | 0.0184299 | NA | 13533.93 |
CHE | teacher | 1941-1955 | NA | 0.1835720 | NA | 16018.38 |
CHE | teacher | 1956-1970 | NA | 0.3075513 | NA | 20944.85 |
CHE | teacher | 1971-1985 | NA | 0.3221108 | NA | 27691.42 |
CHE | teacher | 1986-2000 | NA | 0.3221108 | NA | 36611.12 |
CHE | teacher | 2001-2015 | NA | 0.3273289 | 48595 | 48595.00 |
location_code | variable | year_bins | pchg | pchg_interp1 | value | value_interp1 |
---|---|---|---|---|---|---|
CHE | teacher | 1921-1940 | NA | 0.0027073 | NA | 13533.93 |
CHE | teacher | 1941-1960 | NA | 0.2892929 | NA | 17449.20 |
CHE | teacher | 1961-1980 | NA | 0.4459293 | NA | 25230.31 |
CHE | teacher | 1981-2000 | NA | 0.4510770 | NA | 36611.12 |
CHE | teacher | 2001-2020 | NA | 0.4961846 | 54777 | 54777.00 |
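As a consistency check on the tables above, a change over a longer interval should equal the compounded changes over the shorter intervals it spans. The sketch below copies the four 5-year `pchg_interp1` values for 2001-2020 from the first table and compounds them. It uses base R only, and the variable names are illustrative rather than taken from the package.

```r
# 5-year interpolated changes for CHE teachers, copied from the 5-year table
# above (year bins 2001-2005, 2006-2010, 2011-2015, 2016-2020).
ar_pchg_5yr <- c(0.0975457, 0.0975457, 0.1018775, 0.1272147)

# Compound the growth factors, then convert back to a proportional change.
fl_pchg_20yr <- prod(1 + ar_pchg_5yr) - 1

# Value reported in the 20-year table for the 2001-2020 bin.
fl_pchg_20yr_table <- 0.4961846

print(round(fl_pchg_20yr, 7))
# Should agree with the table entry up to rounding of the inputs.
print(abs(fl_pchg_20yr - fl_pchg_20yr_table) < 1e-6)
```

The same check applies to the 10-year bins: compounding the first two entries gives the 2001-2010 change.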
Germany as Example
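Below we use Germany (DEU) as an example, showing the number of schools. Where the raw series (value, pchg) is missing, the interpolated and extrapolated columns (value_interp1, pchg_interp1) fill in the gaps.
# Print one table per aggregation interval for Germany (DEU), variable = school
for (st_cuts_name in names(ls_ar_it_cuts)) {
  print(
    kable(ppts_easia_weuro_world_pchg %>%
            filter(location_code == 'DEU' &
                     variable == 'school' &
                     year_bins_type == st_cuts_name) %>%
            select(-location_level, -year_bins_type),
          caption = paste0("Germany example, aggregate=", st_cuts_name)))
}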
location_code | variable | year_bins | pchg | pchg_interp1 | value | value_interp1 |
---|---|---|---|---|---|---|
DEU | school | 1986-1990 | NA | NA | NA | 18001.15 |
DEU | school | 1991-1995 | NA | -0.0050636 | 17910 | 17910.00 |
DEU | school | 1996-2000 | -0.0354551 | -0.0354551 | 17275 | 17275.00 |
DEU | school | 2001-2005 | -0.0266860 | -0.0266860 | 16814 | 16814.00 |
DEU | school | 2006-2010 | -0.0311645 | -0.0311645 | 16290 | 16290.00 |
DEU | school | 2011-2015 | -0.0495396 | -0.0495396 | 15483 | 15483.00 |
DEU | school | 2016-2020 | -0.0023251 | -0.0023251 | 15447 | 15447.00 |
location_code | variable | year_bins | pchg | pchg_interp1 | value | value_interp1 |
---|---|---|---|---|---|---|
DEU | school | 1981-1990 | NA | NA | NA | 18001.15 |
DEU | school | 1991-2000 | NA | -0.0403391 | 17275 | 17275.00 |
DEU | school | 2001-2010 | -0.0570188 | -0.0570188 | 16290 | 16290.00 |
DEU | school | 2011-2020 | -0.0517495 | -0.0517495 | 15447 | 15447.00 |
location_code | variable | year_bins | pchg | pchg_interp1 | value | value_interp1 |
---|---|---|---|---|---|---|
DEU | school | 1986-2000 | NA | NA | 17275 | 17275 |
DEU | school | 2001-2015 | -0.1037337 | -0.1037337 | 15483 | 15483 |
location_code | variable | year_bins | pchg | pchg_interp1 | value | value_interp1 |
---|---|---|---|---|---|---|
DEU | school | 1981-2000 | NA | NA | 17275 | 17275 |
DEU | school | 2001-2020 | -0.1058177 | -0.1058177 | 15447 | 15447 |
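The `pchg` column in the 5-year Germany table above is consistent with a simple period-over-period change: the value in the current bin divided by the value in the previous bin, minus one. The following sketch assumes that definition and recomputes the changes from the `value` column. It uses base R with illustrative names and is not the package's own code.

```r
# School counts for DEU reported for each 5-year bin, copied from the
# 5-year table above (1991-1995 through 2016-2020).
ar_value <- c(17910, 17275, 16814, 16290, 15483, 15447)
names(ar_value) <- c("1991-1995", "1996-2000", "2001-2005",
                     "2006-2010", "2011-2015", "2016-2020")

# Change relative to the previous bin; the first bin has no observed
# predecessor, so its change is NA (as in the pchg column).
ar_pchg <- c(NA, ar_value[-1] / ar_value[-length(ar_value)] - 1)
names(ar_pchg) <- names(ar_value)

print(round(ar_pchg, 7))
```

The `pchg_interp1` column appears to follow the same rule applied to `value_interp1`, which is why it is non-missing for 1991-1995 once the 1986-1990 value has been filled in.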
Store to file
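Finally, we save the results to the data folder, both as a CSV file and as an .rda file. This step runs only when bl_resave_to_data is set to TRUE, which should be done during development only.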
# Write to CSV and write to rda
if (bl_resave_to_data) {
  write_csv(ppts_easia_weuro_world_pchg, "../data/ppts_easia_weuro_world_pchg.csv", na = "")
  usethis::use_data(ppts_easia_weuro_world_pchg, overwrite = TRUE)
}
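Because `write_csv()` is called with `na = ""`, missing values are written as empty fields. As a small sketch of what this implies when the file is read back, the example below round-trips a toy data frame through a temporary file. The toy rows are taken from the Germany 5-year table, the object names are illustrative, and `read_csv()` is assumed to be used with its default `na = c("", "NA")`.

```r
library(readr)

# Toy data: two rows from the Germany 5-year table, with one missing pchg.
df_toy <- data.frame(
  location_code = c("DEU", "DEU"),
  pchg = c(NA, -0.0354551),
  value = c(17910, 17275)
)

# Write NAs as empty fields, as in the chunk above.
st_file <- tempfile(fileext = ".csv")
write_csv(df_toy, st_file, na = "")

# Empty fields are read back as NA under read_csv()'s default `na` setting.
df_back <- read_csv(st_file, col_types = cols())
print(df_back)

# Remove the temporary file.
unlink(st_file)
```

If the CSV is read with a different tool, its missing-value setting may need to be adjusted so that empty fields are treated as missing.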