Compute Distributional Statistics from a Discrete Random Variable

back to Fan's Dynamic Assets Repository Table of Content.

Contents

function [ds_stats_map] = fft_disc_rand_var_stats(varargin)

FFT_DISCT_RAND_VAR_STATS compute mean, sd, percentiles

Model simulation generates discrete random variables, to analyze, need to calculate various statistics

Statistics include:

@param st_var_name string name of the variable (choice/outcome) been analyzed

@param ar_choice_unique_sorted array 1 by N elements in the sample space of the discrete random variable ordered. Unique consumption values ordered. Unique asset choices ordered. etc.

@param ar_choice_prob array 1 by N probability mass function associated with each element ar_choice_unique_sorted

@param ar_fl_percentiles array 1 by M some vector of percentiles (0 to 100) that we would like to compute based on the discrete random variable's probability mass function and x values.

@return ds_stats_map container various distributional statistics

@example

% run function
st_var_name = 'binom';
[ds_stats_map] = fft_disc_rand_var_stats(st_var_name, ar_choice_unique_sorted, ar_choice_prob, ar_fl_percentiles);
% retrieve scalar statistics
% fl_choice_mean = ds_stats_map('fl_choice_mean');
fl_choice_sd = ds_stats_map('fl_choice_sd');
fl_choice_prob_zero = ds_stats_map('fl_choice_prob_zero');
fl_choice_prob_max = ds_stats_map('fl_choice_prob_max');
% retrieve distributional array stats
ar_choice_percentiles = ds_stats_map('ar_choice_percentiles');
ar_choice_perc_fracheld = ds_stats_map('ar_choice_perc_fracheld');

Default

use binomial as test case

fl_binom_n = 30;
fl_binom_p = 0.3;
ar_binom_x = 0:1:fl_binom_n;

% f(x)
ar_choice_prob = binopdf(ar_binom_x, fl_binom_n, fl_binom_p);
% x
ar_choice_unique_sorted = ar_binom_x - 10;
% percentiles of interest
ar_fl_percentiles = [0.1 1 5:5:25 35:15:65 75:5:95 99 99.9];

% display
st_var_name = 'binom';

% display
bl_display_drvstats = true;

% default
default_params = {st_var_name ar_choice_unique_sorted ar_choice_prob bl_display_drvstats ar_fl_percentiles};

Parse Parameters

[default_params{1:length(varargin)}] = varargin{:};
[st_var_name, ar_choice_unique_sorted, ar_choice_prob, bl_display_drvstats, ar_fl_percentiles] = default_params{:};

f(y), f(c), f(a): Compute Scalar Statistics for outcomes

Compute these outcomes:

% Mean of discrete random variable
fl_choice_mean = ar_choice_prob*ar_choice_unique_sorted';
% SD of discrete random variable
fl_choice_sd = sqrt(ar_choice_prob*((ar_choice_unique_sorted'-fl_choice_mean).^2));
% Coef of Variation of discrete random variable
fl_choice_coefofvar = fl_choice_sd/fl_choice_mean;
% min of y from policy function, p(y) might be 0
fl_choice_min = min(ar_choice_unique_sorted);
% max of y from policy function, p(y) might be 0
fl_choice_max = max(ar_choice_unique_sorted);
% prob(outcome=min(outcome)), fraction of people not saving for example
fl_choice_prob_min = sum(ar_choice_prob(ar_choice_unique_sorted == min(ar_choice_unique_sorted)));
% prob(outcome=0), fraction of people not saving for example
fl_choice_prob_zero = sum(ar_choice_prob(ar_choice_unique_sorted == 0));
% prob(outcome<0), fraction of people borrowing
fl_choice_prob_below_zero = sum(ar_choice_prob(ar_choice_unique_sorted < 0));
% prob(outcome>0), fraction of people borrowing
fl_choice_prob_above_zero = sum(ar_choice_prob(ar_choice_unique_sorted > 0));
% prob(outcome=max(outcome)), fraction of people saving up to max of grid,
% in principle if this is large, need to increase grid max value
fl_choice_prob_max = sum(ar_choice_prob(ar_choice_unique_sorted == max(ar_choice_unique_sorted)));

f(y), f(c), f(a): Compute Distributional Statistics for outcomes

Compute these outcomes:

% cumulative share of total outcome held by up to this level for outcomes
% like fraction of asset held by lowest highest fractions: E(X<x)
ar_choice_unique_cumufrac = cumsum(ar_choice_prob.*ar_choice_unique_sorted)/fl_choice_mean;

% Key Percentile Statistics
ar_choice_prob_cumsum = cumsum(ar_choice_prob)*100;

% ar_choice_percentiles: percentiles for the outcome variable
ar_choice_percentiles = zeros(size(ar_fl_percentiles));

% fraction of aggregate outcome variable held up to this percentile
ar_choice_perc_fracheld = zeros(size(ar_fl_percentiles));

for it_percentile = 1:length(ar_fl_percentiles)
    % get percentile of interest
    fl_cur_percentile = ar_fl_percentiles(it_percentile);
    % in the cumu prob array, first element higher or equal to current
    % percentile
    it_first_higher_idx = (cumsum(ar_choice_prob_cumsum >= fl_cur_percentile) == 1);
    % assign percentile
    fl_percentile = ar_choice_unique_sorted(it_first_higher_idx);
    fl_cumfrac = ar_choice_unique_cumufrac(it_first_higher_idx);
    if (length(fl_percentile) > 1)
        fl_percentile = fl_percentile(1);
        fl_cumfrac = fl_cumfrac(1);
    end
    ar_choice_percentiles(it_percentile) = fl_percentile;
    % asset held by up to this percentile
    ar_choice_perc_fracheld(it_percentile) = fl_cumfrac;
end

Collect Statistics

ds_stats_map = containers.Map('KeyType','char', 'ValueType','any');

% scalar statistics
ds_stats_map('fl_choice_mean') = fl_choice_mean;
ds_stats_map('fl_choice_sd') = fl_choice_sd;
ds_stats_map('fl_choice_coefofvar') = fl_choice_coefofvar;
ds_stats_map('fl_choice_min') = fl_choice_min;
ds_stats_map('fl_choice_max') = fl_choice_max;
ds_stats_map('fl_choice_prob_zero') = fl_choice_prob_zero;
ds_stats_map('fl_choice_prob_below_zero') = fl_choice_prob_below_zero;
ds_stats_map('fl_choice_prob_above_zero') = fl_choice_prob_above_zero;
ds_stats_map('fl_choice_prob_min') = fl_choice_prob_min;
ds_stats_map('fl_choice_prob_max') = fl_choice_prob_max;

% distributional array stats
ds_stats_map('ar_fl_percentiles') = ar_fl_percentiles;
ds_stats_map('ar_choice_percentiles') = ar_choice_percentiles;
ds_stats_map('ar_choice_perc_fracheld') = ar_choice_perc_fracheld;

Display

if (bl_display_drvstats)

    disp('----------------------------------------');
    disp('xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx');
    disp(['Summary Statistics for: ' char(st_var_name)])
    disp('xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx');
    disp('----------------------------------------');

    disp('fl_choice_mean');
    disp(fl_choice_mean);
    disp('fl_choice_sd');
    disp(fl_choice_sd);
    disp('fl_choice_coefofvar');
    disp(fl_choice_coefofvar);

    disp('fl_choice_prob_zero');
    disp(fl_choice_prob_zero);
    disp('fl_choice_prob_below_zero');
    disp(fl_choice_prob_below_zero);
    disp('fl_choice_prob_above_zero');
    disp(fl_choice_prob_above_zero);
    disp('fl_choice_prob_max');
    disp(fl_choice_prob_max);


    disp('tb_disc_cumu');
    tb_disc_cumu = table(ar_choice_unique_sorted', ar_choice_prob', ...
                         ar_choice_prob_cumsum', ar_choice_unique_cumufrac');
    st_var_name = [char(st_var_name) ' discrete val'];
    st_var_name_p = [char(st_var_name) ' prob mass'];
    tb_disc_cumu.Properties.VariableNames = ...
        matlab.lang.makeValidName([st_var_name, st_var_name_p, "CDF", "cumsum frac"]);
    disp(head(tb_disc_cumu,10));
    disp(tail(tb_disc_cumu,10));

    disp('tb_prob_drv');
    tb_prob_drv = table(ar_fl_percentiles', ar_choice_percentiles', ar_choice_perc_fracheld');
    st_var_name = [char(st_var_name) ' percentile values'];
    tb_prob_drv.Properties.VariableNames = matlab.lang.makeValidName(["percentiles", st_var_name, "frac of sum held below this percentile"]);
    disp(tb_prob_drv);

%     fft_container_map_display(ds_stats_map)
end
----------------------------------------
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Summary Statistics for: binom
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
----------------------------------------
fl_choice_mean
   -1.0000

fl_choice_sd
    2.5100

fl_choice_coefofvar
   -2.5100

fl_choice_prob_zero
    0.1416

fl_choice_prob_below_zero
    0.5888

fl_choice_prob_above_zero
    0.2696

fl_choice_prob_max
   2.0589e-16

tb_disc_cumu
    binomDiscreteVal    binomDiscreteValProbMass       CDF       cumsumFrac
    ________________    ________________________    _________    __________

          -10                  2.2539e-05           0.0022539    0.00022539
           -9                  0.00028979            0.031233     0.0028335
           -8                   0.0018008             0.21132       0.01724
           -7                   0.0072034             0.93166      0.067664
           -6                    0.020838              3.0155       0.19269
           -5                     0.04644              7.6595       0.42489
           -4                    0.082928              15.952       0.75661
           -3                     0.12185              28.138        1.1222
           -2                     0.15014              43.152        1.4224
           -1                     0.15729              58.881        1.5797

    binomDiscreteVal    binomDiscreteValProbMass    CDF    cumsumFrac
    ________________    ________________________    ___    __________

           11                  6.0392e-06           100        1     
           12                  1.0588e-06           100        1     
           13                  1.5784e-07           100        1     
           14                   1.973e-08           100        1     
           15                  2.0293e-09           100        1     
           16                  1.6725e-10           100        1     
           17                  1.0619e-11           100        1     
           18                  4.8762e-13           100        1     
           19                  1.4412e-14           100        1     
           20                  2.0589e-16           100        1     

tb_prob_drv
    percentiles    binomDiscreteValPercentileValues    fracOfSumHeldBelowThisPercentile
    ___________    ________________________________    ________________________________

        0.1                       -8                               0.01724             
          1                       -6                               0.19269             
          5                       -5                               0.42489             
         10                       -4                               0.75661             
         15                       -4                               0.75661             
         20                       -3                                1.1222             
         25                       -3                                1.1222             
         35                       -2                                1.4224             
         50                       -1                                1.5797             
         65                        0                                1.5797             
         75                        1                                1.4694             
         80                        1                                1.4694             
         85                        2                                1.3197             
         90                        2                                1.3197             
         95                        3                                1.1865             
         99                        5                                1.0412             
       99.9                        7                                1.0052             

end
ans = 

  Map with properties:

        Count: 13
      KeyType: char
    ValueType: any