Chapter 7 Tables
7.1 Basic Table Generation
7.1.1 Generate a Table and Fill with Data Row by Row or Random Data
Go back to fan’s MEconTools Package, Matlab Code Examples Repository (bookdown site), or Math for Econ with Matlab Repository (bookdown site).
7.1.1.1 An Empty Table, Filled with Loop Row by Row
First, generate an empty table.
% Make N by 2 matrix of fieldname + value type
mt_st_variable_names_types = [["category", "string"]; ...
["wage", "double"]; ...
["skilled", "logical"]; ...
["labdemand", "double"]; ...
["labsupply", "double"]; ...
["labsupplyprob", "string"]; ...
["rho_manual", "double"]; ...
["rho_routine", "double"]; ...
["rho_analytical", "double"]; ...
];
% Make table using fieldnames & value types from above
tb_equilibrium = table('Size',[0,size(mt_st_variable_names_types,1)],...
'VariableNames', mt_st_variable_names_types(:,1),...
'VariableTypes', mt_st_variable_names_types(:,2));
% display table
disp(size(tb_equilibrium));
0 9
Second, over a loop, fill the table with values row by row.
for it_rho_set=[1,2,3]
if (it_rho_set == 1)
cl_data_row = {'C001',1, true,1,1,0.5,0.5,0.5,0.5};
elseif (it_rho_set == 2)
cl_data_row = {'C002',1, 0 ,1.2,0.6,0.5,0.45,0.5,0.5};
elseif (it_rho_set == 3)
cl_data_row = {'C011',1.1,false,1.2,0.6,0.5,0.45,0.45,0.45};
end
tb_equilibrium = [tb_equilibrium; cl_data_row];
end
disp(tb_equilibrium);
category wage skilled labdemand labsupply labsupplyprob rho_manual rho_routine rho_analytical
________ ____ _______ _________ _________ _____________ __________ ___________ ______________
"C001" 1 1 1 1 "0.5" 0.5 0.5 0.5
"C002" 1 0 1.2 0.6 "0.5" 0.45 0.5 0.5
"123" 1.1 0 1.2 0.6 "0.5" 0.45 0.45 0.45
7.1.1.2 Generate a Table with M Variables of Random Data
Generate a numeric table with random varlues and a string column
% Numeric Matrix
it_num_cols = 4;
it_num_rows = 5;
mt_data = rand([it_num_rows, it_num_cols]);
% Generate Table
tb_test = array2table(mt_data);
% Generate Row and Column Names
cl_col_names = strcat('col_', string((1:it_num_cols)));
cl_row_names = strcat('row_', string((1:it_num_rows)));
tb_test.Properties.VariableNames = matlab.lang.makeValidName(cl_col_names);
tb_test.Properties.RowNames = matlab.lang.makeValidName(cl_row_names);
% Generate two string variable
rng(456);
cl_st_var1 = strcat('data=', string(rand([it_num_rows,1])));
cl_st_var2 = strcat('data=', string(rand([it_num_rows,1])));
tb_test = addvars(tb_test, cl_st_var1, cl_st_var2);
% Display Table
disp(tb_test);
col_1 col_2 col_3 col_4 cl_st_var1 cl_st_var2
_______ _______ _________ _______ ______________ ______________
row_1 0.43568 0.4688 0.18092 0.14604 "data=0.24876" "data=0.60411"
row_2 0.38527 0.57 0.11816 0.54272 "data=0.16307" "data=0.8857"
row_3 0.57571 0.6457 0.24273 0.8571 "data=0.78364" "data=0.75912"
row_4 0.14609 0.72334 0.0081834 0.20021 "data=0.80852" "data=0.18111"
row_5 0.68659 0.68067 0.36007 0.13463 "data=0.62563" "data=0.15017"
7.1.2 Tables Order, Sort, Add, Rename and Drop Columns
Go back to fan’s MEconTools Package, Matlab Code Examples Repository (bookdown site), or Math for Econ with Matlab Repository (bookdown site).
7.1.2.1 Given Table, Show Some Columns First
% Generate Table
it_num_cols = 4;
it_num_rows = 5;
mt_data = rand([it_num_rows, it_num_cols]);
tb_test = array2table(mt_data);
cl_col_names = strcat('col_', string((1:it_num_cols)));
cl_row_names = strcat('row_', string((1:it_num_rows)));
tb_test.Properties.VariableNames = matlab.lang.makeValidName(cl_col_names);
tb_test.Properties.RowNames = matlab.lang.makeValidName(cl_row_names);
rng(123);
mean = strcat('data=', string(rand([it_num_rows,1])));
sd = strcat('data=', string(rand([it_num_rows,1])));
tb_test_ori = addvars(tb_test, mean, sd);
% Move Variable
tb_test_varmove = movevars(tb_test_ori, {'mean', 'sd'}, 'Before', 'col_1');
% Display
disp(tb_test_ori);
col_1 col_2 col_3 col_4 mean sd
________ _______ _______ _______ ______________ ______________
row_1 0.34318 0.738 0.6344 0.32296 "data=0.69647" "data=0.42311"
row_2 0.72905 0.18249 0.84943 0.36179 "data=0.28614" "data=0.98076"
row_3 0.43857 0.17545 0.72446 0.22826 "data=0.22685" "data=0.68483"
row_4 0.059678 0.53155 0.61102 0.29371 "data=0.55131" "data=0.48093"
row_5 0.39804 0.53183 0.72244 0.63098 "data=0.71947" "data=0.39212"
disp(tb_test_varmove);
mean sd col_1 col_2 col_3 col_4
______________ ______________ ________ _______ _______ _______
row_1 "data=0.69647" "data=0.42311" 0.34318 0.738 0.6344 0.32296
row_2 "data=0.28614" "data=0.98076" 0.72905 0.18249 0.84943 0.36179
row_3 "data=0.22685" "data=0.68483" 0.43857 0.17545 0.72446 0.22826
row_4 "data=0.55131" "data=0.48093" 0.059678 0.53155 0.61102 0.29371
row_5 "data=0.71947" "data=0.39212" 0.39804 0.53183 0.72244 0.63098
7.1.2.2 Rename Table Columns
Rename the first Column, rename the ‘sd’ column, then rename the 3rd and 4th Columns. Note for multiple column renaming, use parenthesis, but for single column renaming, use bracket.
tb_test_varmove.Properties.VariableNames{1} = 'RenameMean';
tb_test_varmove.Properties.VariableNames{'sd'} = 'RenameSDCol';
tb_test_varmove.Properties.VariableNames([3 4]) = {'3rd' '4th'};
disp(tb_test_varmove);
RenameMean RenameSDCol 3rd 4th col_3 col_4
______________ ______________ ________ _______ _______ _______
row_1 "data=0.69647" "data=0.42311" 0.34318 0.738 0.6344 0.32296
row_2 "data=0.28614" "data=0.98076" 0.72905 0.18249 0.84943 0.36179
row_3 "data=0.22685" "data=0.68483" 0.43857 0.17545 0.72446 0.22826
row_4 "data=0.55131" "data=0.48093" 0.059678 0.53155 0.61102 0.29371
row_5 "data=0.71947" "data=0.39212" 0.39804 0.53183 0.72244 0.63098
7.1.2.3 Remove Table Column
Remove columns from the Table
tb_test_varmove_drop = removevars(tb_test_varmove, {'3rd', 'col_3'});
disp(tb_test_varmove_drop);
RenameMean RenameSDCol 4th col_4
______________ ______________ _______ _______
row_1 "data=0.69647" "data=0.42311" 0.738 0.32296
row_2 "data=0.28614" "data=0.98076" 0.18249 0.36179
row_3 "data=0.22685" "data=0.68483" 0.17545 0.22826
row_4 "data=0.55131" "data=0.48093" 0.53155 0.29371
row_5 "data=0.71947" "data=0.39212" 0.53183 0.63098
7.1.3 Row and Column Names for Table based on Arrays
Go back to fan’s MEconTools Package, Matlab Code Examples Repository (bookdown site), or Math for Econ with Matlab Repository (bookdown site).
7.1.3.1 Generate Table with Row and Column Names based on Multiple Numeric Array
Two numeric arrays describe the column names, combine numeric arrays together to form string array which becomes table variable/column names.
close all;
% Generate Table 1
ar_fl_abc1 = [0.4 0.1 0.25 0.3 0.4];
ar_fl_abc2 = [0.4 0.1 0.2 0.3 0.4];
number1 = '123';
number2 = '456';
mt_data_a = [ar_fl_abc1' ar_fl_abc2'];
tb_test_a = array2table(mt_data_a);
cl_col_names_a = {['col' num2str(number1)], ['col' num2str(number2)]};
cl_row_names_a = strcat('rowA=', string((1:size(mt_data_a,1))));
tb_test_a.Properties.VariableNames = cl_col_names_a;
tb_test_a.Properties.RowNames = cl_row_names_a;
disp(tb_test_a);
col123 col456
______ ______
rowA=1 0.4 0.4
rowA=2 0.1 0.1
rowA=3 0.25 0.2
rowA=4 0.3 0.3
rowA=5 0.4 0.4
7.1.3.2 Include Row Names as a a String Cell Variable
% a and b must have the same row names
cl_st_varrownames = tb_test_a.Properties.RowNames;
tb_test_a = addvars(tb_test_a, cl_st_varrownames, 'Before', 1);
disp(tb_test_a);
cl_st_varrownames col123 col456
_________________ ______ ______
rowA=1 {'rowA=1'} 0.4 0.4
rowA=2 {'rowA=2'} 0.1 0.1
rowA=3 {'rowA=3'} 0.25 0.2
rowA=4 {'rowA=4'} 0.3 0.3
rowA=5 {'rowA=5'} 0.4 0.4
7.1.3.3 Include Row Names as a String Variable
% a and b must have the same row names
st_varrownames = string(cl_st_varrownames);
tb_test_a = addvars(tb_test_a, st_varrownames, 'Before', 1);
disp(tb_test_a);
st_varrownames cl_st_varrownames col123 col456
______________ _________________ ______ ______
rowA=1 "rowA=1" {'rowA=1'} 0.4 0.4
rowA=2 "rowA=2" {'rowA=2'} 0.1 0.1
rowA=3 "rowA=3" {'rowA=3'} 0.25 0.2
rowA=4 "rowA=4" {'rowA=4'} 0.3 0.3
rowA=5 "rowA=5" {'rowA=5'} 0.4 0.4
7.1.3.4 Remove Row Names
Remove row names
tb_test_a.Properties.RowNames = {};
disp(tb_test_a);
st_varrownames cl_st_varrownames col123 col456
______________ _________________ ______ ______
"rowA=1" {'rowA=1'} 0.4 0.4
"rowA=2" {'rowA=2'} 0.1 0.1
"rowA=3" {'rowA=3'} 0.25 0.2
"rowA=4" {'rowA=4'} 0.3 0.3
"rowA=5" {'rowA=5'} 0.4 0.4
7.1.3.5 Generate String Based on Row Values and Column Names
Suppose we are looping over meshed grid of parameter values, want to generate a KEY that is based on three of the parameters, but not the remaining parameter. One strategy is to use the current values of the three parameters, combine them with the string column names, and concatenate together. This generate a string key.
cl_ar_identifier = cell([size(tb_test_a,1), 1]);
for esti_row_idx=1:size(tb_test_a,1)
% Get the current row, 3rd and 4th columns
ar_fl_colvals = tb_test_a{esti_row_idx,[3,4]};
ar_st_colnames = tb_test_a.Properties.VariableNames([3,4]);
ar_st_colvals = cellfun(@(x) strtrim(x), cellstr(num2str(ar_fl_colvals')), 'UniformOutput', false);
ar_st_identifier = strcat(ar_st_colnames', '=', ar_st_colvals);
esti_identifier = strjoin(ar_st_identifier, "#");
% add to cell
cl_ar_identifier{esti_row_idx} = esti_identifier;
end
% this is a group identifier
ar_st_identifier = string(cl_ar_identifier);
tb_test_a = addvars(tb_test_a, cl_ar_identifier, 'Before', 1);
tb_test_a = addvars(tb_test_a, ar_st_identifier, 'Before', 1);
disp(tb_test_a);
ar_st_identifier cl_ar_identifier st_varrownames cl_st_varrownames col123 col456
________________________ __________________________ ______________ _________________ ______ ______
"col123=0.4#col456=0.4" {'col123=0.4#col456=0.4' } "rowA=1" {'rowA=1'} 0.4 0.4
"col123=0.1#col456=0.1" {'col123=0.1#col456=0.1' } "rowA=2" {'rowA=2'} 0.1 0.1
"col123=0.25#col456=0.2" {'col123=0.25#col456=0.2'} "rowA=3" {'rowA=3'} 0.25 0.2
"col123=0.3#col456=0.3" {'col123=0.3#col456=0.3' } "rowA=4" {'rowA=4'} 0.3 0.3
"col123=0.4#col456=0.4" {'col123=0.4#col456=0.4' } "rowA=5" {'rowA=5'} 0.4 0.4
7.1.4 Select Subset of Rows and Columns
Go back to fan’s MEconTools Package, Matlab Code Examples Repository (bookdown site), or Math for Econ with Matlab Repository (bookdown site).
7.1.4.1 Generate a Table
close all;
% Generate Table 1
ar_fl_abc1 = [0.4 0.1 0.25 0.3 0.4 1 1.1];
ar_fl_abc2 = [0.4 0.1 0.2 0.3 0.4 2 2.2];
number1 = '123';
number2 = '456';
mt_data_a = [ar_fl_abc1' ar_fl_abc2'];
tb_test_a = array2table(mt_data_a);
cl_col_names_a = {['col' num2str(number1)], ['col' num2str(number2)]};
cl_row_names_a = strcat('rowA=', string((1:size(mt_data_a,1))));
tb_test_a.Properties.VariableNames = cl_col_names_a;
tb_test_a.Properties.RowNames = cl_row_names_a;
% a and b must have the same row names
cl_st_varrownames = tb_test_a.Properties.RowNames;
tb_test_a = addvars(tb_test_a, cl_st_varrownames, 'Before', 1);
% a and b must have the same row names
st_varrownames = string(cl_st_varrownames);
tb_test_a = addvars(tb_test_a, st_varrownames, 'Before', 1);
tb_test_a = addvars(tb_test_a, ["a", "b", "cc", "aa", "b", "z", "zz"]', 'Before', 1);
disp(tb_test_a);
Var1 st_varrownames cl_st_varrownames col123 col456
____ ______________ _________________ ______ ______
rowA=1 "a" "rowA=1" {'rowA=1'} 0.4 0.4
rowA=2 "b" "rowA=2" {'rowA=2'} 0.1 0.1
rowA=3 "cc" "rowA=3" {'rowA=3'} 0.25 0.2
rowA=4 "aa" "rowA=4" {'rowA=4'} 0.3 0.3
rowA=5 "b" "rowA=5" {'rowA=5'} 0.4 0.4
rowA=6 "z" "rowA=6" {'rowA=6'} 1 2
rowA=7 "zz" "rowA=7" {'rowA=7'} 1.1 2.2
7.1.4.2 Select Rows if ColX is Equal to Something
Select a subset of rows based on the variable value in one column
% select the rows where Var1="b"
disp(tb_test_a(strcmp(tb_test_a.Var1, "b"),:));
Var1 st_varrownames cl_st_varrownames col123 col456
____ ______________ _________________ ______ ______
rowA=2 "b" "rowA=2" {'rowA=2'} 0.1 0.1
rowA=5 "b" "rowA=5" {'rowA=5'} 0.4 0.4
% select the rows where col123=0.4
disp(tb_test_a(tb_test_a.col123==0.4,:));
Var1 st_varrownames cl_st_varrownames col123 col456
____ ______________ _________________ ______ ______
rowA=1 "a" "rowA=1" {'rowA=1'} 0.4 0.4
rowA=5 "b" "rowA=5" {'rowA=5'} 0.4 0.4
7.1.4.3 Select Rows if ColX is Equal to Something or Something else
Select if the value in Var1 is either the string a or the string b, below, specify these explicitly
% select the rows where Var1="b" or Var1="a"
disp(tb_test_a(strcmp(tb_test_a.Var1, "b") | strcmp(tb_test_a.Var1, "a"),:));
Var1 st_varrownames cl_st_varrownames col123 col456
____ ______________ _________________ ______ ______
rowA=1 "a" "rowA=1" {'rowA=1'} 0.4 0.4
rowA=2 "b" "rowA=2" {'rowA=2'} 0.1 0.1
rowA=5 "b" "rowA=5" {'rowA=5'} 0.4 0.4
Alternatively, use matches, to find if the variable is equal to either a or b, the list of potential match is a string array.
% Using matches
ar_st_potential_matches = ["a", "b"];
disp(tb_test_a(matches(tb_test_a.Var1, ar_st_potential_matches),:));
Var1 st_varrownames cl_st_varrownames col123 col456
____ ______________ _________________ ______ ______
rowA=1 "a" "rowA=1" {'rowA=1'} 0.4 0.4
rowA=2 "b" "rowA=2" {'rowA=2'} 0.1 0.1
rowA=5 "b" "rowA=5" {'rowA=5'} 0.4 0.4
Now match over any a to z letters, picking up any letters a to z if they appear in column Var1.
% Using matches
ar_st_match_atz = string(('a':'z')')';
disp(tb_test_a(matches(tb_test_a.Var1, ar_st_match_atz),:));
Var1 st_varrownames cl_st_varrownames col123 col456
____ ______________ _________________ ______ ______
rowA=1 "a" "rowA=1" {'rowA=1'} 0.4 0.4
rowA=2 "b" "rowA=2" {'rowA=2'} 0.1 0.1
rowA=5 "b" "rowA=5" {'rowA=5'} 0.4 0.4
rowA=6 "z" "rowA=6" {'rowA=6'} 1 2
7.1.4.4 Read in a Table from an Excel File
There are estimates stored in a table. Each row is a different estimation result, with a different set of estimates, for each row some fixed (not-estimated) parameter might vary. Each column represents a different parameter, or the parameter’s state (initial value, estimated value, standard error, etc).
The estimatetion results file is stored in: M4Econ\table\_exa\excel_exa.xlsx. We want to load in this file. Directory is one root up and one root down. The file has multiple sheets, automatically loads in the first sheet. And print table variables names, column names.
srn_excel_exa = 'C:\Users\fan\M4Econ\table\_exa\excel_exa.xlsx';
tb_read = readtable(srn_excel_exa);
disp((tb_read.Properties.VariableNames)');
{'estimodelctr' }
{'Var1' }
{'FVAL' }
{'EXITFLAG' }
{'esti_iterations' }
{'esti_funccount' }
{'mean_h_sd' }
{'NPquad_esti' }
{'NPquad_se' }
{'NPquad_actl' }
{'gamma_esti' }
{'gamma_se' }
{'gamma_actl' }
{'lambda_esti' }
{'lambda_se' }
{'lambda_actl' }
{'msrErrProtSD_esti' }
{'msrErrProtSD_se' }
{'msrErrProtSD_actl' }
{'freePriceFrac_esti' }
{'freePriceFrac_se' }
{'freePriceFrac_actl' }
{'h_exoshk_sd_esti' }
{'h_exoshk_sd_se' }
{'h_exoshk_sd_actl' }
{'h_endoshk_sd_esti' }
{'h_endoshk_sd_se' }
{'h_endoshk_sd_actl' }
{'parm_sk_mean_init' }
{'parm_sk_sd_init' }
{'NPquad_init' }
{'gamma_init' }
{'HAquad_init' }
{'theta_init' }
{'lambda_init' }
{'msrErrProtSD_init' }
{'logProt_init' }
{'freePriceFrac_init' }
{'h_exoshk_sd_init' }
{'h_endoshk_sd_init' }
{'prod_hgt0_coef_init'}
{'prod_prot_coef_init'}
{'prod_cons_coef_init'}
{'prod_male_coef_init'}
{'prod_wgt0_coef_init'}
{'endoshkCount' }
{'guasshermite' }
{'len_curEstiParam' }
{'fixedVarIndex' }
{'esti_method' }
{'esti_option_type' }
{'subset_iter_rounds' }
{'lambda_frac_disc' }
7.1.4.5 Select Table Columns based on Column Name Strings
Given the table that we loaded in above, select only the columns that start with some string like "gamma", or columns that end with certain strings, like "_esti".
The startsWith, contains, and endsWith are string functions that generate logical arrays based on which elements of the tring array satisfies the criteria. So this is not a table function, it is a string function.
ar_st_col_names = tb_read.Properties.VariableNames;
ar_st_col_names_prod = ar_st_col_names(startsWith(ar_st_col_names, 'prod_'));
ar_st_col_names_esti = ar_st_col_names(endsWith(ar_st_col_names, '_esti'));
ar_st_col_names_sd = ar_st_col_names(contains(ar_st_col_names, '_sd_'));
disp(ar_st_col_names_prod');
{'prod_hgt0_coef_init'}
{'prod_prot_coef_init'}
{'prod_cons_coef_init'}
{'prod_male_coef_init'}
{'prod_wgt0_coef_init'}
disp(ar_st_col_names_esti');
{'NPquad_esti' }
{'gamma_esti' }
{'lambda_esti' }
{'msrErrProtSD_esti' }
{'freePriceFrac_esti'}
{'h_exoshk_sd_esti' }
{'h_endoshk_sd_esti' }
disp(ar_st_col_names_sd');
{'h_exoshk_sd_esti' }
{'h_exoshk_sd_se' }
{'h_exoshk_sd_actl' }
{'h_endoshk_sd_esti'}
{'h_endoshk_sd_se' }
{'h_endoshk_sd_actl'}
{'parm_sk_sd_init' }
{'h_exoshk_sd_init' }
{'h_endoshk_sd_init'}
We can select columns that contain the string sd as well as actl in them, by considering joint conditions.
ar_it_select = contains(ar_st_col_names, '_sd_').*endsWith(ar_st_col_names, '_actl');
ar_st_col_names_selected = ar_st_col_names(ar_it_select==1);
disp(ar_st_col_names_selected');
{'h_exoshk_sd_actl' }
{'h_endoshk_sd_actl'}
% show values from selected columns
disp(tb_read(1:10, ar_st_col_names_selected));
h_exoshk_sd_actl h_endoshk_sd_actl
________________ _________________
0.042154 0.012103
0.042514 0.011849
0.042571 0.011352
0.04263 0.010598
0.042662 0.0089398
0.042664 0.0088495
0.042638 0.0078152
0.042689 0.0027549
0.042692 0.0024652
0.042625 0.002309
7.2 Table Joining
7.2.1 Matlab Join Tables by Keys
Go back to fan’s MEconTools Package, Matlab Code Examples Repository (bookdown site), or Math for Econ with Matlab Repository (bookdown site).
7.2.1.1 Left Join Two Tables Together
There is a table with simulation results, and a table with data information. Join together with common keys.
First generate the simulation table.
% Make N by 2 matrix of fieldname + value type
mt_st_variable_names_types = [...
["year", "double"];["category", "string"];["rho", "double"]; ...
["numberWorkersSimu", "logical"]; ["meanWageSimu", "double"]];
% Make table using fieldnames & value types from above
tb_solution = table('Size',[0,size(mt_st_variable_names_types,1)],...
'VariableNames', mt_st_variable_names_types(:,1),...
'VariableTypes', mt_st_variable_names_types(:,2));
% Table with data to be merged
tb_solution = [tb_solution;...
{1, 'C001', 0.50, 5.5, 6.6}; {2, 'C002', 0.50, 3.3, 4.4}; ...
{1, 'C001', 0.25, 2.5, 3.6}; {2, 'C002', 0.25, 0.3, 1.4}];
disp(tb_solution);
year category rho numberWorkersSimu meanWageSimu
____ ________ ____ _________________ ____________
1 "C001" 0.5 5.5 6.6
2 "C002" 0.5 3.3 4.4
1 "C001" 0.25 2.5 3.6
2 "C002" 0.25 0.3 1.4
Second generate the data table.
% Make N by 2 matrix of fieldname + value type
mt_st_variable_names_types = [...
["year", "double"];["category", "string"]; ...
["numberWorkersData", "logical"]; ["meanWageData", "double"]];
% Make table using fieldnames & value types from above
tb_data = table('Size',[0,size(mt_st_variable_names_types,1)],...
'VariableNames', mt_st_variable_names_types(:,1),...
'VariableTypes', mt_st_variable_names_types(:,2));
% Table with data to be merged
tb_data = [tb_data; {1, 'C001', 1, 2}; {2, 'C002', 2, 3}];
disp(tb_data);
year category numberWorkersData meanWageData
____ ________ _________________ ____________
1 "C001" 1 2
2 "C002" 2 3
Third merge the tables together with left-join, that will match by all variables with the same name.
% left-join tables together
tb_solu_joined_data = join(tb_solution, tb_data);
% Rescale a variable
tb_solu_joined_data{:, "numberWorkersData"} = tb_solu_joined_data{:, "numberWorkersData"}*10;
% Display
disp(tb_solu_joined_data);
year category rho numberWorkersSimu meanWageSimu numberWorkersData meanWageData
____ ________ ____ _________________ ____________ _________________ ____________
1 "C001" 0.5 5.5 6.6 10 2
2 "C002" 0.5 3.3 4.4 20 3
1 "C001" 0.25 2.5 3.6 10 2
2 "C002" 0.25 0.3 1.4 20 3
7.2.2 Row and Column Combine Stack Tables and Matrices
Go back to fan’s MEconTools Package, Matlab Code Examples Repository (bookdown site), or Math for Econ with Matlab Repository (bookdown site).
7.2.2.1 Generate Some Tables and Matrixes for Combination
close all;
% Generate Table 1
ar_fl_abc1 = [0.4 0.1 0.25 0.3 0.4];
ar_fl_abc2 = [0.4 0.1 0.2 0.3 0.4];
number1 = '123';
number2 = '456';
mt_data_a = [ar_fl_abc1' ar_fl_abc2'];
tb_test_a = array2table(mt_data_a);
cl_col_names_a = {['col' num2str(number1)], ['col' num2str(number2)]};
cl_row_names_a = strcat('rowA=', string((1:size(mt_data_a,1))));
tb_test_a.Properties.VariableNames = cl_col_names_a;
tb_test_a.Properties.RowNames = cl_row_names_a;
disp(tb_test_a);
col123 col456
______ ______
rowA=1 0.4 0.4
rowA=2 0.1 0.1
rowA=3 0.25 0.2
rowA=4 0.3 0.3
rowA=5 0.4 0.4
% Generate Table 2
rng(123);
ar_fl_abc3 = rand(size(ar_fl_abc1));
ar_fl_abc4 = rand(size(ar_fl_abc1));
ar_fl_abc5 = rand(size(ar_fl_abc1));
mt_data_b = [ar_fl_abc3' ar_fl_abc4' ar_fl_abc5'];
tb_test_b = array2table(mt_data_b);
cl_col_names_b = {['col' num2str(33)], ['col' num2str(44)], ['col' num2str(55)]};
cl_row_names_b = strcat('rowB=', string((1:size(mt_data_a,1))));
tb_test_b.Properties.VariableNames = cl_col_names_b;
tb_test_b.Properties.RowNames = cl_row_names_b;
disp(tb_test_b);
col33 col44 col55
_______ _______ ________
rowB=1 0.69647 0.42311 0.34318
rowB=2 0.28614 0.98076 0.72905
rowB=3 0.22685 0.68483 0.43857
rowB=4 0.55131 0.48093 0.059678
rowB=5 0.71947 0.39212 0.39804
7.2.2.2 Combine Tables Together Stack Columns
Tables with the same number of rows, add more columns with named variables
% a and b must have the same row names
tb_test_b_withArownames = tb_test_b;
tb_test_b_withArownames.Properties.RowNames = tb_test_a.Properties.RowNames;
tb_ab_col_stacked = [tb_test_a tb_test_b_withArownames];
disp(tb_ab_col_stacked);
col123 col456 col33 col44 col55
______ ______ _______ _______ ________
rowA=1 0.4 0.4 0.69647 0.42311 0.34318
rowA=2 0.1 0.1 0.28614 0.98076 0.72905
rowA=3 0.25 0.2 0.22685 0.68483 0.43857
rowA=4 0.3 0.3 0.55131 0.48093 0.059678
rowA=5 0.4 0.4 0.71947 0.39212 0.39804
7.2.2.3 Combine Tables Together Stack Rows
Tables with the same number of columns, dd more rows variables
% Select only 2 columns to match table a column count
tb_test_b_subset = tb_test_b(:,1:2);
% Make Column Names consistent
tb_test_b_subset.Properties.VariableNames = cl_col_names_a;
% Reset Row Names, can not have identical row names
tb_test_a.Properties.RowNames = strcat('row=', string((1:size(mt_data_a,1))));
tb_test_b_subset.Properties.RowNames = ...
strcat('row=', string(((size(mt_data_a,1)+1):(size(mt_data_a,1)+size(tb_test_b_subset,1)))));
% tb_test_b_subset.Properties.RowNames =
% Stack Rows
tb_ab_row_stacked = [tb_test_a; tb_test_b_subset];
disp(tb_ab_row_stacked);
col123 col456
_______ _______
row=1 0.4 0.4
row=2 0.1 0.1
row=3 0.25 0.2
row=4 0.3 0.3
row=5 0.4 0.4
row=6 0.69647 0.42311
row=7 0.28614 0.98076
row=8 0.22685 0.68483
row=9 0.55131 0.48093
row=10 0.71947 0.39212
7.2.3 Matlab Table Stack and Join Estimation and Simulation Results
Go back to fan’s MEconTools Package, Matlab Code Examples Repository (bookdown site), or Math for Econ with Matlab Repository (bookdown site).
7.2.3.1 Combine Tables Together Stack Rows Loop Template Common Columns
There is an estimation routine, each time the routine outputs a table with a single row, the single row contains estimation outputs including estiamtes, standard erros, initial parameters etc. We loop over different estimation routines, with different starting values etc, and rather than saving many tables, we want to save a joint table with all rows stacked together.
This simply means that we have a loop, during each iteration, generating a table, we want to stack things together. For this assume that the column names are the same.
rng("default");
tb_saveCoef_stack = [];
for row_idx=1:5
% a row of coefficent estimates
rng(123+row_idx);
it_num_cols = 4;
it_num_rows = 1;
mt_saveCoef = rand([it_num_rows, it_num_cols]);
% row to table
ar_st_col_names = ["FVAL", "EXITFLAG", "esti_iterations", "esti_funccount"];
tb_saveCoef = array2table(mt_saveCoef);
tb_saveCoef.Properties.VariableNames = ar_st_col_names;
% Stack all results
tb_saveCoef_stack = [tb_saveCoef_stack; tb_saveCoef];
end
% Add esti Counter as column
estimodelctr = (1:size(tb_saveCoef_stack,1))';
tb_saveCoef_stack = addvars(tb_saveCoef_stack, estimodelctr, 'Before', 1);
% Add a row name as a variable
cl_row_names_a = strcat('esti', string((1:size(tb_saveCoef_stack,1))));
tb_saveCoef_stack.Properties.RowNames = cl_row_names_a;
% display results
disp(tb_saveCoef_stack);
estimodelctr FVAL EXITFLAG esti_iterations esti_funccount
____________ _______ ________ _______________ ______________
esti1 1 0.10606 0.74547 0.57231 0.45824
esti2 2 0.50673 0.057531 0.62758 0.13255
esti3 3 0.10517 0.12814 0.087406 0.11548
esti4 4 0.52383 0.039963 0.18597 0.77279
esti5 5 0.86664 0.26314 0.13141 0.041593
7.2.3.2 Combine Tables Together Stack Rows Loop Template Outterjoin
Similar to the previous estimation problem, however, now during different iterations, the column names, i.e. the parameters been estiamted are different. For example, there are 10 parameters, sometimes we estimate 5 of the 10, sometimes 10 or the 10. Want to stack all results together similar to above.
This is accomplished in the following example with the outerjoin function.
for row_idx=1:5
% a row of coefficent estimates
rng(123+row_idx);
it_num_rows = 1;
if (row_idx <= 2)
it_num_cols = 4;
mt_saveCoef = rand([it_num_rows, it_num_cols]);
% row to table
ar_st_col_names = ["FVAL", "EXITFLAG", "esti_iterations", "esti_funccount"];
elseif (row_idx <= 4)
it_num_cols = 2;
mt_saveCoef = rand([it_num_rows, it_num_cols]);
% row to table
ar_st_col_names = ["FVAL", "EXITFLAG"];
else
it_num_cols = 3;
mt_saveCoef = rand([it_num_rows, it_num_cols]);
% row to table
ar_st_col_names = ["FVAL", "esti_iterations", "esti_funccount"];
end
tb_saveCoef = array2table(mt_saveCoef);
tb_saveCoef.Properties.VariableNames = ar_st_col_names;
tb_saveCoef = addvars(tb_saveCoef, row_idx, 'Before', 1);
% Stack all results
if(row_idx == 1)
tb_saveCoef_stack = tb_saveCoef;
else
tb_saveCoef_stack = outerjoin(tb_saveCoef_stack, tb_saveCoef, 'MergeKeys', true);
end
end
% Add esti Counter as column
estimodelctr = (1:size(tb_saveCoef_stack,1))';
tb_saveCoef_stack = addvars(tb_saveCoef_stack, estimodelctr, 'Before', 1);
% Add a row name as a variable
cl_row_names_a = strcat('esti', string((1:size(tb_saveCoef_stack,1))));
tb_saveCoef_stack.Properties.RowNames = cl_row_names_a;
% display results
disp(tb_saveCoef_stack);
estimodelctr row_idx FVAL EXITFLAG esti_iterations esti_funccount
____________ _______ _______ ________ _______________ ______________
esti1 1 1 0.10606 0.74547 0.57231 0.45824
esti2 2 2 0.50673 0.057531 0.62758 0.13255
esti3 3 3 0.10517 0.12814 NaN NaN
esti4 4 4 0.52383 0.039963 NaN NaN
esti5 5 5 0.86664 NaN 0.26314 0.13141
7.2.3.3 Combine Tables Outterjoin with Parfor
Same as above, but iterate/solve over loops with parfor
% Start cluster
delete(gcp('nocreate'));
myCluster = parcluster('local');
it_workers = 4;
myCluster.NumWorkers = it_workers;
parpool(it_workers);
Starting parallel pool (parpool) using the 'local' profile ...
Connected to the parallel pool (number of workers: 4).
% prepare storage
cl_stores = cell([4,1]);
% Loop over
parfor row_idx=1:4
% a row of coefficent estimates
rng(123+row_idx);
ar_st_col_names = ["FVAL", "EXITFLAG", "esti_iterations", "esti_funccount"];
it_num_rows = 1;
it_num_cols = 4;
mt_saveCoef = rand([it_num_rows, it_num_cols]);
tb_saveCoef = array2table(mt_saveCoef);
tb_saveCoef.Properties.VariableNames = ar_st_col_names;
tb_saveCoef = addvars(tb_saveCoef, row_idx, 'Before', 1);
% Stack all results
cl_stores{row_idx} = tb_saveCoef;
end
% delete cluster
delete(gcp('nocreate'));
Stack tables stored in cells together:
% Combine
tb_saveCoef_stack = [];
for row_idx=1:4
tb_saveCoef_stack = [tb_saveCoef_stack; cl_stores{row_idx}];
end
% display results
disp(tb_saveCoef_stack);
Var1 FVAL EXITFLAG esti_iterations esti_funccount
____ _______ ________ _______________ ______________
1 0.48946 0.58215 0.2796 0.23945
2 0.28746 0.3907 0.12685 0.694
3 0.76682 0.48312 0.50655 0.24223
4 0.71522 0.15045 0.43025 0.4845
7.2.3.4 ND Dimensional Parameter Arrays, Simulate Model and Stack Output Tables
Now we will first column combine matrixes, model parameters and model outcomes, and then row combine matrixes from different simulations.
A model takes a N parameters, solve the model over M sets of parameters. Each time when the model is solved, a P by Q table of results is generated. Each column is a different statistics (mean, std, etc.), and each row is a different outcome variable (consumption, asset choices, etc.). Stack these P by Q Tables together, and add in information about the N parameters, each of the tables been stacked initially had the same column and row names.
The resulting table should have P times M rows, for M sets of model simulations each with P rows of results. And there should be N + Q columns, storing the N parameters as well as the Q columns of different outcomes.
rng(123);
% Generate A P by Q matrix of random parameter Values
it_param_groups_m = 5;
it_params_n = 2;
it_outcomes_p = 3;
it_stats_q = 3;
% Parameter Matrix and Names
ar_param_names = strcat('param_', string(1:it_params_n));
mt_param_m_by_n = round(rand([it_param_groups_m, it_params_n])*5, 2);
% Loop over the parameters
for it_cur_param_group=1:1:it_param_groups_m
% Current Parameters
ar_param = mt_param_m_by_n(it_cur_param_group,:);
% Some Model is simulated
mt_model_simu = normrnd(mean(ar_param), std(ar_param), [it_outcomes_p, it_stats_q]);
% Model Results are Saved As Table With Column and Row Information
tb_model_simu = array2table(mt_model_simu);
cl_col_names = strcat('stats_', string((1:size(mt_model_simu,2))));
cl_row_names = strcat('outvar_', string((1:size(mt_model_simu,1))));
tb_model_simu.Properties.VariableNames = cl_col_names;
tb_model_simu.Properties.RowNames = cl_row_names;
% Convert Row Variable Names to a Column String
outvar = string(tb_model_simu.Properties.RowNames);
tb_model_simu = addvars(tb_model_simu, outvar, 'Before', 1);
% Parameter Information Table that Shares Row Names as Simu Results
mt_param_info = zeros([it_outcomes_p,it_params_n]) + ar_param;
tb_param_info = array2table(mt_param_info);
tb_param_info.Properties.VariableNames = ar_param_names;
tb_param_info.Properties.RowNames = cl_row_names;
% Combine Parameter Information and Simulation Contents
tb_model_simu_w_info = [tb_param_info tb_model_simu];
% Update Row Names based on total row available
ar_rows_allsimu = (1:it_stats_q)' + (it_cur_param_group-1)*it_stats_q;
tb_model_simu_w_info.Properties.RowNames = strcat('row=', string(ar_rows_allsimu));
% Show One Example Table before Stacking
if (it_cur_param_group == round(it_param_groups_m/2))
disp(tb_model_simu);
disp(tb_param_info);
disp(tb_model_simu_w_info);
end
% Stack all results
if(it_cur_param_group == 1)
tb_model_allsimu_w_info = tb_model_simu_w_info;
else
tb_model_allsimu_w_info = [tb_model_allsimu_w_info; tb_model_simu_w_info];
end
end
outvar stats_1 stats_2 stats_3
__________ ________ _______ _______
outvar_1 "outvar_1" 0.056853 2.1703 2.1098
outvar_2 "outvar_2" 3.1545 2.0634 0.7798
outvar_3 "outvar_3" -0.49033 2.2566 1.7896
param_1 param_2
_______ _______
outvar_1 1.13 3.42
outvar_2 1.13 3.42
outvar_3 1.13 3.42
param_1 param_2 outvar stats_1 stats_2 stats_3
_______ _______ __________ ________ _______ _______
row=7 1.13 3.42 "outvar_1" 0.056853 2.1703 2.1098
row=8 1.13 3.42 "outvar_2" 3.1545 2.0634 0.7798
row=9 1.13 3.42 "outvar_3" -0.49033 2.2566 1.7896
Show all Simulation Joint Table Outputs:
disp(tb_model_allsimu_w_info);
param_1 param_2 outvar stats_1 stats_2 stats_3
_______ _______ __________ ________ _______ _______
row=1 3.48 2.12 "outvar_1" 2.2665 1.1885 1.924
row=2 3.48 2.12 "outvar_2" 3.3427 2.4647 2.3548
row=3 3.48 2.12 "outvar_3" 2.6714 3.6132 2.918
row=4 1.43 4.9 "outvar_1" 3.3859 5.3759 1.5816
row=5 1.43 4.9 "outvar_2" 3.9499 3.8698 2.2693
row=6 1.43 4.9 "outvar_3" 5.7745 4.6871 1.7334
row=7 1.13 3.42 "outvar_1" 0.056853 2.1703 2.1098
row=8 1.13 3.42 "outvar_2" 3.1545 2.0634 0.7798
row=9 1.13 3.42 "outvar_3" -0.49033 2.2566 1.7896
row=10 2.76 2.4 "outvar_1" 2.9611 2.6847 2.4986
row=11 2.76 2.4 "outvar_2" 2.9333 2.3457 3.0629
row=12 2.76 2.4 "outvar_3" 2.5814 2.4372 2.4806
row=13 3.6 1.96 "outvar_1" 2.7199 3.3129 3.0577
row=14 3.6 1.96 "outvar_2" 3.9804 1.4529 2.9285
row=15 3.6 1.96 "outvar_3" 2.8445 4.4117 2.6576
7.3 Summarize
7.3.1 Matlab Table Summarize and Aggregate by Groups
Go back to fan’s MEconTools Package, Matlab Code Examples Repository (bookdown site), or Math for Econ with Matlab Repository (bookdown site).
7.3.1.1 Group Table Rows and Sum within Group
There is a table where subsets of rows belong to different simulations, with exogenous fixed \(\rho\) parameters. For each \(\rho\) parameter combination, there are, stored as different rows, a number of model predictions and data moments, and corresponding difference. Find the total difference between model and data for subsets of rows based for each \(\rho\) parameter set.
First, create a table where each \(\rho\) group is identified jointly by \(\rho_a\) and \(\rho_b\), stored in the 3rd and 4th rows.
% Make N by 2 matrix of fieldname + value type
mt_st_variable_names_types = [...
["year", "double"];["category", "string"];...
["rhoa", "double"];["rhob", "double"]; ...
["numberWorkersSimu", "logical"]; ["numberWorkersData", "double"]];
% Make table using fieldnames & value types from above
tb_agg_exa = table('Size',[0,size(mt_st_variable_names_types,1)],...
'VariableNames', mt_st_variable_names_types(:,1),...
'VariableTypes', mt_st_variable_names_types(:,2));
% Table with data inputs
tb_agg_exa = [tb_agg_exa;...
{1, 'C001', 0.50, 0.50, 5.5, 6.05}; {2, 'C002', 0.50, 0.50, 3.7, 4.4}; ...
{1, 'C001', 0.25, 0.30, 2.5, 3.65}; {2, 'C002', 0.25, 0.30, 0.1, 1.6}; ...
{3, 'C001', 0.25, 0.50, 0.01, 1.66}];
% Generate model and data difference
tb_agg_exa{:, "diff"} = tb_agg_exa{:, "numberWorkersSimu"} - tb_agg_exa{:, "numberWorkersData"};
% Display
disp(tb_agg_exa);
year category rhoa rhob numberWorkersSimu numberWorkersData diff
____ ________ ____ ____ _________________ _________________ _____
1 "C001" 0.5 0.5 5.5 6.05 -0.55
2 "C002" 0.5 0.5 3.7 4.4 -0.7
1 "C001" 0.25 0.3 2.5 3.65 -1.15
2 "C002" 0.25 0.3 0.1 1.6 -1.5
3 "C001" 0.25 0.5 0.01 1.66 -1.65
Second, select the subset of columns that are relevant for aggregation.
% Select
tb_agg_exa = tb_agg_exa(:, ["rhoa", "rhob", "diff"]);
% Display
disp(tb_agg_exa);
rhoa rhob diff
____ ____ _____
0.5 0.5 -0.55
0.5 0.5 -0.7
0.25 0.3 -1.15
0.25 0.3 -1.5
0.25 0.5 -1.65
Third, group by unique combinations of rhoa, rhob, and aggregate. Then generate group ID.
% Sum within groupo
tb_groupby_agg_sum = groupsummary(tb_agg_exa, ["rhoa", "rhob"], "sum");
% Generate grouping ID
tb_groupby_agg_sum{:, "ID"} = (1:1:size(tb_groupby_agg_sum, 1))';
tb_groupby_agg_sum = movevars(tb_groupby_agg_sum, "ID", "Before", 1);
disp(tb_groupby_agg_sum);
ID rhoa rhob GroupCount sum_diff
__ ____ ____ __________ ________
1 0.25 0.3 2 -2.65
2 0.25 0.5 1 -1.65
3 0.5 0.5 2 -1.25