Compute Distributional Statistics from a Discrete Random Variable
function [ds_stats_map] = fft_disc_rand_var_stats(varargin)
FFT_DISC_RAND_VAR_STATS computes the mean, sd, and percentiles of a discrete random variable
Model simulation generates discrete random variables; to analyze them, we need to calculate various statistics.
Statistics include:
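- percentiles: for each requested percentile $p$, the smallest outcome $x_p$ whose cumulative probability reaches it, $x_p = \min \{ x_i : F(x_i) \ge p/100 \}$
- fraction of outcome held by up to percentiles: $\sum_{x_i \le x_p} x_i f(x_i) / \sum_i x_i f(x_i)$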
@param st_var_name string name of the variable (choice/outcome) being analyzed
@param ar_choice_unique_sorted array 1 by N elements in the sample space of the discrete random variable, sorted in ascending order. For example: unique consumption values, or unique asset choices.
@param ar_choice_prob array 1 by N probability mass function associated with each element of ar_choice_unique_sorted
@param bl_display_drvstats boolean whether to print the summary statistics and tables. This is the fourth positional argument, so it must be supplied whenever ar_fl_percentiles is supplied.
@param ar_fl_percentiles array 1 by M vector of percentiles (0 to 100) to compute from the discrete random variable's probability mass function and x values.
@return ds_stats_map container map of the distributional statistics
@example
% run function (arguments are positional: the display flag comes before the percentiles)
st_var_name = 'binom';
bl_display_drvstats = true;
[ds_stats_map] = fft_disc_rand_var_stats(st_var_name, ar_choice_unique_sorted, ar_choice_prob, bl_display_drvstats, ar_fl_percentiles);
% retrieve scalar statistics
fl_choice_mean = ds_stats_map('fl_choice_mean');
fl_choice_sd = ds_stats_map('fl_choice_sd');
fl_choice_prob_zero = ds_stats_map('fl_choice_prob_zero');
fl_choice_prob_max = ds_stats_map('fl_choice_prob_max');
% retrieve distributional array stats
ar_choice_percentiles = ds_stats_map('ar_choice_percentiles');
ar_choice_perc_fracheld = ds_stats_map('ar_choice_perc_fracheld');
Default
use binomial as test case
fl_binom_n = 30;
fl_binom_p = 0.3;
ar_binom_x = 0:1:fl_binom_n;
% f(x)
ar_choice_prob = binopdf(ar_binom_x, fl_binom_n, fl_binom_p);
% x
ar_choice_unique_sorted = ar_binom_x - 10;
% percentiles of interest
ar_fl_percentiles = [0.1 1 5:5:25 35:15:65 75:5:95 99 99.9];
% variable name used in the display
st_var_name = 'binom';
% display
bl_display_drvstats = true;
% default
default_params = {st_var_name ar_choice_unique_sorted ar_choice_prob bl_display_drvstats ar_fl_percentiles};
Parse Parameters
[default_params{1:length(varargin)}] = varargin{:};
[st_var_name, ar_choice_unique_sorted, ar_choice_prob, bl_display_drvstats, ar_fl_percentiles] = default_params{:};
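The two lines above overwrite the leading entries of the default cell array with whatever the caller passed, then unpack the result. A minimal sketch of that idiom follows; the names cl_user_args, ar_x, ar_prob, bl_display, and ar_pct, and the values, are hypothetical and only illustrate the behavior.

```matlab
% Hypothetical defaults, in the same positional order as the function
default_params = {'binom', [0 1 2], [0.25 0.5 0.25], true, [25 50 75]};
% Stand-in for varargin: the caller supplies only the first two arguments
cl_user_args = {'assets', [-1 0 1]};
% Overwrite the first length(cl_user_args) defaults, keep the rest
[default_params{1:length(cl_user_args)}] = cl_user_args{:};
% Unpack every parameter into its own variable
[st_var_name, ar_x, ar_prob, bl_display, ar_pct] = default_params{:};
disp(st_var_name);
disp(ar_x);
```

Because the override is purely positional, a caller cannot skip an argument: supplying ar_fl_percentiles requires supplying the four arguments before it as well.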
f(y), f(c), f(a): Compute Scalar Statistics for outcomes
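Compute these statistics, where $x_i$ are the sorted outcomes and $f(x_i)$ their probability mass:
- mean: $\mu = \sum_i x_i f(x_i)$
- sd: $\sigma = \sqrt{\sum_i f(x_i) (x_i - \mu)^2}$
- prob(outcome=0): $\sum_{i: x_i = 0} f(x_i)$
- prob(outcome=max(outcome)): $\sum_{i: x_i = \max_j x_j} f(x_i)$
The code also stores the coefficient of variation $\sigma/\mu$, the minimum and maximum outcomes, and the probability mass at the minimum, below zero, and above zero.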
% Mean of discrete random variable
fl_choice_mean = ar_choice_prob*ar_choice_unique_sorted';
% SD of discrete random variable
fl_choice_sd = sqrt(ar_choice_prob*((ar_choice_unique_sorted'-fl_choice_mean).^2));
% Coefficient of variation of discrete random variable (negative when the mean is negative)
fl_choice_coefofvar = fl_choice_sd/fl_choice_mean;
% min of y from policy function, p(y) might be 0
fl_choice_min = min(ar_choice_unique_sorted);
% max of y from policy function, p(y) might be 0
fl_choice_max = max(ar_choice_unique_sorted);
% prob(outcome=min(outcome)), fraction of people at the lowest grid point
fl_choice_prob_min = sum(ar_choice_prob(ar_choice_unique_sorted == min(ar_choice_unique_sorted)));
% prob(outcome=0), fraction of people not saving for example
fl_choice_prob_zero = sum(ar_choice_prob(ar_choice_unique_sorted == 0));
% prob(outcome<0), fraction of people borrowing
fl_choice_prob_below_zero = sum(ar_choice_prob(ar_choice_unique_sorted < 0));
% prob(outcome>0), fraction of people saving
fl_choice_prob_above_zero = sum(ar_choice_prob(ar_choice_unique_sorted > 0));
% prob(outcome=max(outcome)), fraction of people saving up to max of grid,
% in principle if this is large, need to increase grid max value
fl_choice_prob_max = sum(ar_choice_prob(ar_choice_unique_sorted == max(ar_choice_unique_sorted)));
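As a sanity check on the mean and sd lines above, the default test case can be compared against the binomial closed forms: the shifted outcome is X - 10 with X distributed Binomial(30, 0.3), so the mean should be 30*0.3 - 10 = -1 and the sd should be sqrt(30*0.3*0.7). The sketch below builds the pmf with nchoosek rather than binopdf, an assumption made only so it runs without the Statistics and Machine Learning Toolbox; ar_binom_coef, fl_mean_closed, and fl_sd_closed are illustrative names.

```matlab
fl_binom_n = 30;
fl_binom_p = 0.3;
ar_binom_x = 0:1:fl_binom_n;
% Binomial pmf from first principles (stand-in for binopdf)
ar_binom_coef = arrayfun(@(k) nchoosek(fl_binom_n, k), ar_binom_x);
ar_choice_prob = ar_binom_coef .* fl_binom_p.^ar_binom_x ...
    .* (1 - fl_binom_p).^(fl_binom_n - ar_binom_x);
% Shifted support, as in the default test case
ar_choice_unique_sorted = ar_binom_x - 10;
% Same mean and sd formulas as in the function body
fl_choice_mean = ar_choice_prob*ar_choice_unique_sorted';
fl_choice_sd = sqrt(ar_choice_prob*((ar_choice_unique_sorted'-fl_choice_mean).^2));
% Closed-form binomial moments, shifted by -10
fl_mean_closed = fl_binom_n*fl_binom_p - 10;
fl_sd_closed = sqrt(fl_binom_n*fl_binom_p*(1 - fl_binom_p));
fprintf('mean: %.4f (closed form %.4f)\n', fl_choice_mean, fl_mean_closed);
fprintf('sd:   %.4f (closed form %.4f)\n', fl_choice_sd, fl_sd_closed);
```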
f(y), f(c), f(a): Compute Distributional Statistics for outcomes
Compute these statistics:
- percentiles: for each requested percentile $p$, the smallest outcome $x_p$ with $F(x_p) \ge p/100$
- share of the outcome (consumption/assets) held by households at or below this percentile: $\sum_{x_i \le x_p} x_i f(x_i) / \mu$. Note that this statistic can exceed 1. Suppose the mean $\mu$ is negative, but the support includes both negative and positive values. The statistic first measures the fraction of overall debt held up to this percentile. It then exceeds 100 percent as we approach the last negative values. Finally, as we move through the positive values, it falls back to 100 percent.
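The way the share statistic can rise above 1 and then return to 1 is easiest to see on a small case. The sketch below uses a hypothetical four-point distribution (values chosen so the arithmetic is exact in floating point); ar_x, ar_prob, fl_mean, and ar_cumufrac are illustrative names, and the share formula is the same cumsum-over-mean expression the function uses.

```matlab
% Hypothetical distribution with a negative mean and mixed-sign support
ar_x = [-2 -1 0 1];
ar_prob = [0.125 0.5 0.125 0.25];
% Mean: -0.25 - 0.5 + 0 + 0.25 = -0.5
fl_mean = ar_prob*ar_x';
% Cumulative share of the total held at or below each outcome
ar_cumufrac = cumsum(ar_prob.*ar_x)/fl_mean;
% Shares: 0.5 at x=-2, 1.5 at x=-1 and x=0, back to 1 at x=1
disp(ar_cumufrac);
```

The negative outcomes sum to -0.75, which is 150 percent of the net total of -0.5; the positive outcome then offsets the excess. The binomial test case shows the same pattern, peaking at 1.5797 before declining to 1.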
% cumulative share of the total outcome held at or below each level, e.g.
% the fraction of assets held by the lowest x percent of households
ar_choice_unique_cumufrac = cumsum(ar_choice_prob.*ar_choice_unique_sorted)/fl_choice_mean;
% Key Percentile Statistics: CDF in percent
ar_choice_prob_cumsum = cumsum(ar_choice_prob)*100;
% ar_choice_percentiles: percentiles for the outcome variable
ar_choice_percentiles = zeros(size(ar_fl_percentiles));
% fraction of aggregate outcome variable held up to this percentile
ar_choice_perc_fracheld = zeros(size(ar_fl_percentiles));
for it_percentile = 1:length(ar_fl_percentiles)
    % get percentile of interest
    fl_cur_percentile = ar_fl_percentiles(it_percentile);
    % logical index of the first element of the CDF array that is higher
    % than or equal to the current percentile
    it_first_higher_idx = (cumsum(ar_choice_prob_cumsum >= fl_cur_percentile) == 1);
    % assign percentile
    fl_percentile = ar_choice_unique_sorted(it_first_higher_idx);
    fl_cumfrac = ar_choice_unique_cumufrac(it_first_higher_idx);
    if (length(fl_percentile) > 1)
        fl_percentile = fl_percentile(1);
        fl_cumfrac = fl_cumfrac(1);
    end
    ar_choice_percentiles(it_percentile) = fl_percentile;
    % asset held by up to this percentile
    ar_choice_perc_fracheld(it_percentile) = fl_cumfrac;
end
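The percentile lookup in the loop above selects the first outcome whose CDF value reaches the requested percentile. The following sketch expresses the same lookup with find, which is an alternative formulation and not the function's own code. It also adds a fallback for the case where rounding leaves the last CDF value marginally below a requested percentile, in which case no element would match; that guard is an assumption about a possible edge case. All variable names and values are hypothetical.

```matlab
% Hypothetical distribution with an exactly representable CDF
ar_x = [-2 -1 0 1];
ar_prob = [0.125 0.5 0.125 0.25];
% CDF in percent: [12.5 62.5 75 100]
ar_cdf = cumsum(ar_prob)*100;
ar_pct = [5 12.5 50 75 99];
ar_val = zeros(size(ar_pct));
for it = 1:length(ar_pct)
    % index of the first CDF value at or above the requested percentile
    it_idx = find(ar_cdf >= ar_pct(it), 1, 'first');
    % fallback if rounding keeps the CDF just under the percentile
    if isempty(it_idx)
        it_idx = length(ar_x);
    end
    ar_val(it) = ar_x(it_idx);
end
% Percentile values: -2 -2 -1 0 1
disp(ar_val);
```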
Collect Statistics
ds_stats_map = containers.Map('KeyType','char', 'ValueType','any');
% scalar statistics
ds_stats_map('fl_choice_mean') = fl_choice_mean;
ds_stats_map('fl_choice_sd') = fl_choice_sd;
ds_stats_map('fl_choice_coefofvar') = fl_choice_coefofvar;
ds_stats_map('fl_choice_min') = fl_choice_min;
ds_stats_map('fl_choice_max') = fl_choice_max;
ds_stats_map('fl_choice_prob_zero') = fl_choice_prob_zero;
ds_stats_map('fl_choice_prob_below_zero') = fl_choice_prob_below_zero;
ds_stats_map('fl_choice_prob_above_zero') = fl_choice_prob_above_zero;
ds_stats_map('fl_choice_prob_min') = fl_choice_prob_min;
ds_stats_map('fl_choice_prob_max') = fl_choice_prob_max;
% distributional array stats
ds_stats_map('ar_fl_percentiles') = ar_fl_percentiles;
ds_stats_map('ar_choice_percentiles') = ar_choice_percentiles;
ds_stats_map('ar_choice_perc_fracheld') = ar_choice_perc_fracheld;
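Since the results come back in a containers.Map, a caller may want to list the available keys or guard against a missing one before indexing. A brief sketch of those operations follows, using a hypothetical two-entry map (ds_demo_map) in place of the real ds_stats_map; the stored values are illustrative only.

```matlab
% Hypothetical two-entry map standing in for ds_stats_map
ds_demo_map = containers.Map('KeyType','char', 'ValueType','any');
ds_demo_map('fl_choice_mean') = -1;
ds_demo_map('ar_choice_percentiles') = [-8 -6 -5];
% List every stored key (char keys come back in sorted order)
cl_keys = keys(ds_demo_map);
% Guard a lookup: indexing a key that is not in the map raises an error
if isKey(ds_demo_map, 'fl_choice_sd')
    fl_sd = ds_demo_map('fl_choice_sd');
else
    fl_sd = NaN;
end
% Retrieve several values at once as a cell array
cl_vals = values(ds_demo_map, {'fl_choice_mean', 'ar_choice_percentiles'});
disp(cl_keys);
```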
Display
if (bl_display_drvstats)

    disp('----------------------------------------');
    disp('xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx');
    disp(['Summary Statistics for: ' char(st_var_name)])
    disp('xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx');
    disp('----------------------------------------');

    % scalar statistics
    disp('fl_choice_mean');
    disp(fl_choice_mean);
    disp('fl_choice_sd');
    disp(fl_choice_sd);
    disp('fl_choice_coefofvar');
    disp(fl_choice_coefofvar);
    disp('fl_choice_prob_zero');
    disp(fl_choice_prob_zero);
    disp('fl_choice_prob_below_zero');
    disp(fl_choice_prob_below_zero);
    disp('fl_choice_prob_above_zero');
    disp(fl_choice_prob_above_zero);
    disp('fl_choice_prob_max');
    disp(fl_choice_prob_max);

    % full distribution: value, probability mass, CDF, cumulative share
    disp('tb_disc_cumu');
    tb_disc_cumu = table(ar_choice_unique_sorted', ar_choice_prob', ...
        ar_choice_prob_cumsum', ar_choice_unique_cumufrac');
    st_var_name = [char(st_var_name) ' discrete val'];
    st_var_name_p = [char(st_var_name) ' prob mass'];
    tb_disc_cumu.Properties.VariableNames = ...
        matlab.lang.makeValidName([st_var_name, st_var_name_p, "CDF", "cumsum frac"]);
    disp(head(tb_disc_cumu,10));
    disp(tail(tb_disc_cumu,10));

    % percentile table
    disp('tb_prob_drv');
    tb_prob_drv = table(ar_fl_percentiles', ar_choice_percentiles', ar_choice_perc_fracheld');
    st_var_name = [char(st_var_name) ' percentile values'];
    tb_prob_drv.Properties.VariableNames = matlab.lang.makeValidName(["percentiles", st_var_name, "frac of sum held below this percentile"]);
    disp(tb_prob_drv);

    % fft_container_map_display(ds_stats_map)
end
----------------------------------------
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Summary Statistics for: binom
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
----------------------------------------
fl_choice_mean
   -1.0000

fl_choice_sd
    2.5100

fl_choice_coefofvar
   -2.5100

fl_choice_prob_zero
    0.1416

fl_choice_prob_below_zero
    0.5888

fl_choice_prob_above_zero
    0.2696

fl_choice_prob_max
   2.0589e-16

tb_disc_cumu
    binomDiscreteVal    binomDiscreteValProbMass       CDF       cumsumFrac
    ________________    ________________________    _________    __________

          -10                  2.2539e-05           0.0022539    0.00022539
           -9                  0.00028979            0.031233     0.0028335
           -8                   0.0018008             0.21132       0.01724
           -7                   0.0072034             0.93166      0.067664
           -6                    0.020838              3.0155       0.19269
           -5                     0.04644              7.6595       0.42489
           -4                    0.082928              15.952       0.75661
           -3                     0.12185              28.138        1.1222
           -2                     0.15014              43.152        1.4224
           -1                     0.15729              58.881        1.5797

    binomDiscreteVal    binomDiscreteValProbMass    CDF    cumsumFrac
    ________________    ________________________    ___    __________

           11                  6.0392e-06           100        1
           12                  1.0588e-06           100        1
           13                  1.5784e-07           100        1
           14                   1.973e-08           100        1
           15                  2.0293e-09           100        1
           16                  1.6725e-10           100        1
           17                  1.0619e-11           100        1
           18                  4.8762e-13           100        1
           19                  1.4412e-14           100        1
           20                  2.0589e-16           100        1

tb_prob_drv
    percentiles    binomDiscreteValPercentileValues    fracOfSumHeldBelowThisPercentile
    ___________    ________________________________    ________________________________

        0.1                       -8                               0.01724
          1                       -6                               0.19269
          5                       -5                               0.42489
         10                       -4                               0.75661
         15                       -4                               0.75661
         20                       -3                                1.1222
         25                       -3                                1.1222
         35                       -2                                1.4224
         50                       -1                                1.5797
         65                        0                                1.5797
         75                        1                                1.4694
         80                        1                                1.4694
         85                        2                                1.3197
         90                        2                                1.3197
         95                        3                                1.1865
         99                        5                                1.0412
       99.9                        7                                1.0052
end
ans =

  Map with properties:

        Count: 13
      KeyType: char
    ValueType: any