Consider a dataset with different types of individuals, for example grouped by household size; that type is the grouping variable. Within each group, we compute the incremental marginal propensity to consume (MPC) for each additional check. We also want to know the average propensity to consume up to each check, taking into account all checks allocated so far. We needed to calculate this for Nygaard, Sørensen and Wang (2021). This can be done with a within-group cumulative mean, using the cummean function from dplyr.
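As a minimal sketch of the idea, the toy example below computes the average propensity to consume up to each check as the within-group cumulative mean of the incremental MPCs. The data frame `df_mpc`, its column names, and all MPC values are hypothetical and made up for illustration; the sketch also assumes that every check has the same size, so that a simple (unweighted) cumulative mean is appropriate.

```r
library(dplyr)

# Hypothetical data: incremental MPC for each additional check, by household size
df_mpc <- data.frame(
  hhsize = c(1, 1, 1, 2, 2, 2),
  check  = c(1, 2, 3, 1, 2, 3),
  mpc    = c(0.50, 0.30, 0.10, 0.60, 0.40, 0.20)
)

# Sort by group and check number, then take the cumulative mean within group.
# Assumes equal-sized checks, so the average up to check k is the mean of the first k MPCs.
df_apc <- df_mpc %>%
  arrange(hhsize, check) %>%
  group_by(hhsize) %>%
  mutate(apc = cummean(mpc)) %>%
  ungroup()

print(df_apc)
```

If checks differed in size, a weighted cumulative mean would be needed instead; that case is not covered here.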
Use df_hgt_wgt as the testing dataset. In the protein example below, we group by individual id, sort by survey month, and compute the cumulative mean of the protein variable.
First select the testing dataset and variables.
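# Load packages: dplyr for the pipeline, REconTools for the dataset
library(dplyr)
library(REconTools)
# Load the REconTools dataset df_hgt_wgt
data("df_hgt_wgt")
# str(df_hgt_wgt)
# Keep the Cebu sample and select the id, survey month, and protein columns
df_hgt_wgt_sel <- df_hgt_wgt %>%
  filter(S.country == "Cebu") %>%
  select(indi.id, svymthRound, prot)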
Second, arrange, group by, and compute the cumulative mean. The protein variable records protein intake over successive survey months as babies grow. Because observed protein intake rises quickly with age, the cumulative mean is generally lower than the value observed in the current survey month.
# Sort by indi.id and survey month, group by indi.id, then take the cumulative mean
df_hgt_wgt_sel_cummean <- df_hgt_wgt_sel %>%
  arrange(indi.id, svymthRound) %>%
  group_by(indi.id) %>%
  mutate(prot_cummean = cummean(prot))
# summarize the distribution of the results
REconTools::ff_summ_percentiles(df_hgt_wgt_sel_cummean)
# display results
df_hgt_wgt_sel_cummean %>% filter(indi.id %in% c(17, 18)) %>%
kable() %>% kable_styling_fc()
indi.id | svymthRound | prot | prot_cummean |
---|---|---|---|
17 | 0 | 0.5 | 0.5000000 |
17 | 2 | 0.7 | 0.6000000 |
17 | 4 | 0.5 | 0.5666667 |
17 | 6 | 0.5 | 0.5500000 |
17 | 8 | 6.1 | 1.6600000 |
17 | 10 | 5.0 | 2.2166667 |
17 | 12 | 6.4 | 2.8142857 |
17 | 14 | 20.1 | 4.9750000 |
17 | 16 | 20.1 | 6.6555556 |
17 | 18 | 23.0 | 8.2900000 |
17 | 20 | 24.9 | 9.8000000 |
17 | 22 | 20.1 | 10.6583333 |
17 | 24 | 10.1 | 10.6153846 |
17 | 102 | NA | NA |
17 | 138 | NA | NA |
17 | 187 | NA | NA |
17 | 224 | NA | NA |
17 | 258 | NA | NA |
18 | 0 | 1.2 | 1.2000000 |
18 | 2 | 4.7 | 2.9500000 |
18 | 4 | 17.2 | 7.7000000 |
18 | 6 | 18.6 | 10.4250000 |
18 | 8 | NA | NA |
18 | 10 | 16.8 | NA |
18 | 12 | NA | NA |
18 | 14 | NA | NA |
18 | 16 | NA | NA |
18 | 18 | NA | NA |
18 | 20 | NA | NA |
18 | 22 | 15.7 | NA |
18 | 24 | 22.5 | NA |
18 | 102 | NA | NA |
18 | 138 | NA | NA |
18 | 187 | NA | NA |
18 | 224 | NA | NA |
18 | 258 | NA | NA |
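The cumulative means in the table above can be checked by hand. The short base R sketch below recomputes the first five values for individual 17, using the fact that a cumulative mean is the running sum divided by the number of observations so far. The vector name `prot_17` is introduced here only for illustration; the protein values are copied from the table.

```r
# Protein values for individual 17 at survey months 0, 2, 4, 6, and 8 (from the table above)
prot_17 <- c(0.5, 0.7, 0.5, 0.5, 6.1)

# Cumulative mean: running sum divided by the count of observations so far
prot_17_cummean <- cumsum(prot_17) / seq_along(prot_17)

print(round(prot_17_cummean, 7))
```

The jump in intake at month 8 (6.1) pulls the cumulative mean up only to 1.66, which illustrates why the cumulative mean lags behind a rapidly rising series.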
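Third, in the basic implementation above, if protein is NA in some month, the cumulative mean is NA at that month and in all later months. This is the case for individual 18 above. To ignore NA values, we follow the Stack Overflow answer linked in the code below: group additionally by whether prot is NA, so that the cumulative mean is computed over non-missing observations only. Note how the results for individual 18 change.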
# https://stackoverflow.com/a/49906718/8280804
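# Sort by indi.id and survey month, then group by indi.id and by whether prot is NA,
# so that cummean runs only over the non-missing observations of each individual
df_hgt_wgt_sel_cummean_noNA <- df_hgt_wgt_sel %>%
  arrange(indi.id, svymthRound) %>%
  group_by(indi.id, isna = is.na(prot)) %>%
  mutate(prot_cummean = ifelse(isna, NA, cummean(prot)))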
# display results
df_hgt_wgt_sel_cummean_noNA %>% filter(indi.id %in% c(17, 18)) %>%
kable() %>% kable_styling_fc()
indi.id | svymthRound | prot | isna | prot_cummean |
---|---|---|---|---|
17 | 0 | 0.5 | FALSE | 0.5000000 |
17 | 2 | 0.7 | FALSE | 0.6000000 |
17 | 4 | 0.5 | FALSE | 0.5666667 |
17 | 6 | 0.5 | FALSE | 0.5500000 |
17 | 8 | 6.1 | FALSE | 1.6600000 |
17 | 10 | 5.0 | FALSE | 2.2166667 |
17 | 12 | 6.4 | FALSE | 2.8142857 |
17 | 14 | 20.1 | FALSE | 4.9750000 |
17 | 16 | 20.1 | FALSE | 6.6555556 |
17 | 18 | 23.0 | FALSE | 8.2900000 |
17 | 20 | 24.9 | FALSE | 9.8000000 |
17 | 22 | 20.1 | FALSE | 10.6583333 |
17 | 24 | 10.1 | FALSE | 10.6153846 |
17 | 102 | NA | TRUE | NA |
17 | 138 | NA | TRUE | NA |
17 | 187 | NA | TRUE | NA |
17 | 224 | NA | TRUE | NA |
17 | 258 | NA | TRUE | NA |
18 | 0 | 1.2 | FALSE | 1.2000000 |
18 | 2 | 4.7 | FALSE | 2.9500000 |
18 | 4 | 17.2 | FALSE | 7.7000000 |
18 | 6 | 18.6 | FALSE | 10.4250000 |
18 | 8 | NA | TRUE | NA |
18 | 10 | 16.8 | FALSE | 11.7000000 |
18 | 12 | NA | TRUE | NA |
18 | 14 | NA | TRUE | NA |
18 | 16 | NA | TRUE | NA |
18 | 18 | NA | TRUE | NA |
18 | 20 | NA | TRUE | NA |
18 | 22 | 15.7 | FALSE | 12.3666667 |
18 | 24 | 22.5 | FALSE | 13.8142857 |
18 | 102 | NA | TRUE | NA |
18 | 138 | NA | TRUE | NA |
18 | 187 | NA | TRUE | NA |
18 | 224 | NA | TRUE | NA |
18 | 258 | NA | TRUE | NA |
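For comparison, the NA-skipping cumulative mean can also be sketched for a single vector in base R, without the extra grouping variable: keep a running sum that treats missing values as zero, and divide by a running count of non-missing observations. This is an alternative illustration, not the approach used above; the variable names are hypothetical, and the values are the first six protein observations for individual 18 from the table.

```r
# Protein values for individual 18 at survey months 0 to 10 (from the table above)
prot_18 <- c(1.2, 4.7, 17.2, 18.6, NA, 16.8)

# Flag observed (non-missing) values
is_obs <- !is.na(prot_18)

# Running sum with NA counted as 0, and running count of observed values
run_sum <- cumsum(ifelse(is_obs, prot_18, 0))
run_n <- cumsum(is_obs)

# Cumulative mean over observed values only; keep NA where the input is NA
prot_18_cummean <- ifelse(is_obs, run_sum / run_n, NA)

print(prot_18_cummean)
```

The value at month 10 is 11.7, matching the grouped dplyr result in the table, because the missing month 8 contributes neither to the sum nor to the count.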