Chapter 7 Tables

7.1 Basic Table Generation

7.1.1 Generate a Table and Fill with Data Row by Row or Random Data

Go back to fan’s MEconTools Package, Matlab Code Examples Repository (bookdown site), or Math for Econ with Matlab Repository (bookdown site).

7.1.1.1 An Empty Table, Filled with Loop Row by Row

First, generate an empty table.

% Make N by 2 matrix of fieldname + value type
mt_st_variable_names_types = [["category", "string"]; ...
    ["wage", "double"]; ...
    ["skilled", "logical"]; ...
    ["labdemand", "double"]; ...
    ["labsupply", "double"]; ...
    ["labsupplyprob", "string"]; ...
    ["rho_manual", "double"]; ...
    ["rho_routine", "double"]; ...
    ["rho_analytical", "double"]; ...
    ];
% Make table using fieldnames & value types from above
tb_equilibrium = table('Size',[0,size(mt_st_variable_names_types,1)],... 
    'VariableNames', mt_st_variable_names_types(:,1),...
    'VariableTypes', mt_st_variable_names_types(:,2));
% display table
disp(size(tb_equilibrium));

     0     9

Second, over a loop, fill the table with values row by row.

for it_rho_set=[1,2,3]
    if (it_rho_set == 1)
        cl_data_row = {'C001',1,  true,1,1,0.5,0.5,0.5,0.5};
    elseif (it_rho_set == 2)
        cl_data_row = {'C002',1,  0   ,1.2,0.6,0.5,0.45,0.5,0.5};
    elseif (it_rho_set == 3)
        cl_data_row = {'C011',1.1,false,1.2,0.6,0.5,0.45,0.45,0.45};
    end
    tb_equilibrium = [tb_equilibrium; cl_data_row];
end
disp(tb_equilibrium);

    category    wage    skilled    labdemand    labsupply    labsupplyprob    rho_manual    rho_routine    rho_analytical
    ________    ____    _______    _________    _________    _____________    __________    ___________    ______________

     "C001"       1        1            1            1           "0.5"            0.5           0.5              0.5     
     "C002"       1        0          1.2          0.6           "0.5"           0.45           0.5              0.5     
     "123"      1.1        0          1.2          0.6           "0.5"           0.45          0.45             0.45     

7.1.1.2 Generate a Table with M Variables of Random Data

Generate a numeric table with random varlues and a string column

% Numeric Matrix
it_num_cols = 4;
it_num_rows = 5;
mt_data = rand([it_num_rows, it_num_cols]);

% Generate Table
tb_test = array2table(mt_data);

% Generate Row and Column Names
cl_col_names = strcat('col_', string((1:it_num_cols)));
cl_row_names = strcat('row_', string((1:it_num_rows)));

tb_test.Properties.VariableNames = matlab.lang.makeValidName(cl_col_names);
tb_test.Properties.RowNames = matlab.lang.makeValidName(cl_row_names);

% Generate two string variable
rng(456);
cl_st_var1 = strcat('data=', string(rand([it_num_rows,1])));
cl_st_var2 = strcat('data=', string(rand([it_num_rows,1])));
tb_test = addvars(tb_test, cl_st_var1, cl_st_var2);

% Display Table
disp(tb_test);

              col_1      col_2       col_3       col_4       cl_st_var1        cl_st_var2  
             _______    _______    _________    _______    ______________    ______________

    row_1    0.43568     0.4688      0.18092    0.14604    "data=0.24876"    "data=0.60411"
    row_2    0.38527       0.57      0.11816    0.54272    "data=0.16307"    "data=0.8857" 
    row_3    0.57571     0.6457      0.24273     0.8571    "data=0.78364"    "data=0.75912"
    row_4    0.14609    0.72334    0.0081834    0.20021    "data=0.80852"    "data=0.18111"
    row_5    0.68659    0.68067      0.36007    0.13463    "data=0.62563"    "data=0.15017"

7.1.2 Tables Order, Sort, Add, Rename and Drop Columns

Go back to fan’s MEconTools Package, Matlab Code Examples Repository (bookdown site), or Math for Econ with Matlab Repository (bookdown site).

7.1.2.1 Given Table, Show Some Columns First

% Generate Table
it_num_cols = 4;
it_num_rows = 5;
mt_data = rand([it_num_rows, it_num_cols]);
tb_test = array2table(mt_data);
cl_col_names = strcat('col_', string((1:it_num_cols)));
cl_row_names = strcat('row_', string((1:it_num_rows)));
tb_test.Properties.VariableNames = matlab.lang.makeValidName(cl_col_names);
tb_test.Properties.RowNames = matlab.lang.makeValidName(cl_row_names);

rng(123);
mean = strcat('data=', string(rand([it_num_rows,1])));
sd = strcat('data=', string(rand([it_num_rows,1])));
tb_test_ori = addvars(tb_test, mean, sd);

% Move Variable
tb_test_varmove = movevars(tb_test_ori, {'mean', 'sd'}, 'Before', 'col_1');

% Display
disp(tb_test_ori);

              col_1       col_2      col_3      col_4          mean               sd      
             ________    _______    _______    _______    ______________    ______________

    row_1     0.34318      0.738     0.6344    0.32296    "data=0.69647"    "data=0.42311"
    row_2     0.72905    0.18249    0.84943    0.36179    "data=0.28614"    "data=0.98076"
    row_3     0.43857    0.17545    0.72446    0.22826    "data=0.22685"    "data=0.68483"
    row_4    0.059678    0.53155    0.61102    0.29371    "data=0.55131"    "data=0.48093"
    row_5     0.39804    0.53183    0.72244    0.63098    "data=0.71947"    "data=0.39212"

disp(tb_test_varmove);

                  mean               sd           col_1       col_2      col_3      col_4 
             ______________    ______________    ________    _______    _______    _______

    row_1    "data=0.69647"    "data=0.42311"     0.34318      0.738     0.6344    0.32296
    row_2    "data=0.28614"    "data=0.98076"     0.72905    0.18249    0.84943    0.36179
    row_3    "data=0.22685"    "data=0.68483"     0.43857    0.17545    0.72446    0.22826
    row_4    "data=0.55131"    "data=0.48093"    0.059678    0.53155    0.61102    0.29371
    row_5    "data=0.71947"    "data=0.39212"     0.39804    0.53183    0.72244    0.63098

7.1.2.2 Rename Table Columns

Rename the first Column, rename the ‘sd’ column, then rename the 3rd and 4th Columns. Note for multiple column renaming, use parenthesis, but for single column renaming, use bracket.

tb_test_varmove.Properties.VariableNames{1} = 'RenameMean';
tb_test_varmove.Properties.VariableNames{'sd'} = 'RenameSDCol';
tb_test_varmove.Properties.VariableNames([3 4]) = {'3rd' '4th'};
disp(tb_test_varmove);

               RenameMean       RenameSDCol        3rd         4th       col_3      col_4 
             ______________    ______________    ________    _______    _______    _______

    row_1    "data=0.69647"    "data=0.42311"     0.34318      0.738     0.6344    0.32296
    row_2    "data=0.28614"    "data=0.98076"     0.72905    0.18249    0.84943    0.36179
    row_3    "data=0.22685"    "data=0.68483"     0.43857    0.17545    0.72446    0.22826
    row_4    "data=0.55131"    "data=0.48093"    0.059678    0.53155    0.61102    0.29371
    row_5    "data=0.71947"    "data=0.39212"     0.39804    0.53183    0.72244    0.63098

7.1.2.3 Remove Table Column

Remove columns from the Table

tb_test_varmove_drop = removevars(tb_test_varmove, {'3rd', 'col_3'});
disp(tb_test_varmove_drop);

               RenameMean       RenameSDCol        4th       col_4 
             ______________    ______________    _______    _______

    row_1    "data=0.69647"    "data=0.42311"      0.738    0.32296
    row_2    "data=0.28614"    "data=0.98076"    0.18249    0.36179
    row_3    "data=0.22685"    "data=0.68483"    0.17545    0.22826
    row_4    "data=0.55131"    "data=0.48093"    0.53155    0.29371
    row_5    "data=0.71947"    "data=0.39212"    0.53183    0.63098

7.1.3 Row and Column Names for Table based on Arrays

Go back to fan’s MEconTools Package, Matlab Code Examples Repository (bookdown site), or Math for Econ with Matlab Repository (bookdown site).

7.1.3.1 Generate Table with Row and Column Names based on Multiple Numeric Array

Two numeric arrays describe the column names, combine numeric arrays together to form string array which becomes table variable/column names.

close all;

% Generate Table 1
ar_fl_abc1 = [0.4 0.1 0.25 0.3 0.4];
ar_fl_abc2 = [0.4 0.1 0.2 0.3 0.4];
number1 = '123';
number2 = '456';
mt_data_a = [ar_fl_abc1' ar_fl_abc2'];

tb_test_a = array2table(mt_data_a);
cl_col_names_a = {['col' num2str(number1)], ['col' num2str(number2)]};
cl_row_names_a = strcat('rowA=', string((1:size(mt_data_a,1))));

tb_test_a.Properties.VariableNames = cl_col_names_a;
tb_test_a.Properties.RowNames = cl_row_names_a;
disp(tb_test_a);

              col123    col456
              ______    ______

    rowA=1      0.4      0.4  
    rowA=2      0.1      0.1  
    rowA=3     0.25      0.2  
    rowA=4      0.3      0.3  
    rowA=5      0.4      0.4  

7.1.3.2 Include Row Names as a a String Cell Variable

% a and b must have the same row names
cl_st_varrownames = tb_test_a.Properties.RowNames;
tb_test_a = addvars(tb_test_a, cl_st_varrownames, 'Before', 1);

disp(tb_test_a);

              cl_st_varrownames    col123    col456
              _________________    ______    ______

    rowA=1       {'rowA=1'}          0.4      0.4  
    rowA=2       {'rowA=2'}          0.1      0.1  
    rowA=3       {'rowA=3'}         0.25      0.2  
    rowA=4       {'rowA=4'}          0.3      0.3  
    rowA=5       {'rowA=5'}          0.4      0.4  

7.1.3.3 Include Row Names as a String Variable

% a and b must have the same row names
st_varrownames = string(cl_st_varrownames);
tb_test_a = addvars(tb_test_a, st_varrownames, 'Before', 1);
disp(tb_test_a);

              st_varrownames    cl_st_varrownames    col123    col456
              ______________    _________________    ______    ______

    rowA=1       "rowA=1"          {'rowA=1'}          0.4      0.4  
    rowA=2       "rowA=2"          {'rowA=2'}          0.1      0.1  
    rowA=3       "rowA=3"          {'rowA=3'}         0.25      0.2  
    rowA=4       "rowA=4"          {'rowA=4'}          0.3      0.3  
    rowA=5       "rowA=5"          {'rowA=5'}          0.4      0.4  

7.1.3.4 Remove Row Names

Remove row names

tb_test_a.Properties.RowNames = {};
disp(tb_test_a);

    st_varrownames    cl_st_varrownames    col123    col456
    ______________    _________________    ______    ______

       "rowA=1"          {'rowA=1'}          0.4      0.4  
       "rowA=2"          {'rowA=2'}          0.1      0.1  
       "rowA=3"          {'rowA=3'}         0.25      0.2  
       "rowA=4"          {'rowA=4'}          0.3      0.3  
       "rowA=5"          {'rowA=5'}          0.4      0.4  

7.1.3.5 Generate String Based on Row Values and Column Names

Suppose we are looping over meshed grid of parameter values, want to generate a KEY that is based on three of the parameters, but not the remaining parameter. One strategy is to use the current values of the three parameters, combine them with the string column names, and concatenate together. This generate a string key.

cl_ar_identifier = cell([size(tb_test_a,1), 1]);
for esti_row_idx=1:size(tb_test_a,1)
    
    % Get the current row, 3rd and 4th columns
    ar_fl_colvals = tb_test_a{esti_row_idx,[3,4]};
    ar_st_colnames = tb_test_a.Properties.VariableNames([3,4]);
    ar_st_colvals = cellfun(@(x) strtrim(x), cellstr(num2str(ar_fl_colvals')), 'UniformOutput', false);
    ar_st_identifier = strcat(ar_st_colnames', '=', ar_st_colvals);
    esti_identifier = strjoin(ar_st_identifier, "#");
    
    % add to cell
    cl_ar_identifier{esti_row_idx} = esti_identifier;
end
% this is a group identifier
ar_st_identifier = string(cl_ar_identifier);
tb_test_a = addvars(tb_test_a, cl_ar_identifier, 'Before', 1);
tb_test_a = addvars(tb_test_a, ar_st_identifier, 'Before', 1);
disp(tb_test_a);

        ar_st_identifier             cl_ar_identifier         st_varrownames    cl_st_varrownames    col123    col456
    ________________________    __________________________    ______________    _________________    ______    ______

    "col123=0.4#col456=0.4"     {'col123=0.4#col456=0.4' }       "rowA=1"          {'rowA=1'}          0.4      0.4  
    "col123=0.1#col456=0.1"     {'col123=0.1#col456=0.1' }       "rowA=2"          {'rowA=2'}          0.1      0.1  
    "col123=0.25#col456=0.2"    {'col123=0.25#col456=0.2'}       "rowA=3"          {'rowA=3'}         0.25      0.2  
    "col123=0.3#col456=0.3"     {'col123=0.3#col456=0.3' }       "rowA=4"          {'rowA=4'}          0.3      0.3  
    "col123=0.4#col456=0.4"     {'col123=0.4#col456=0.4' }       "rowA=5"          {'rowA=5'}          0.4      0.4  

7.1.4 Select Subset of Rows and Columns

Go back to fan’s MEconTools Package, Matlab Code Examples Repository (bookdown site), or Math for Econ with Matlab Repository (bookdown site).

7.1.4.1 Generate a Table

close all;
% Generate Table 1
ar_fl_abc1 = [0.4 0.1 0.25 0.3 0.4 1 1.1];
ar_fl_abc2 = [0.4 0.1 0.2 0.3 0.4  2 2.2];
number1 = '123';
number2 = '456';
mt_data_a = [ar_fl_abc1' ar_fl_abc2'];
tb_test_a = array2table(mt_data_a);
cl_col_names_a = {['col' num2str(number1)], ['col' num2str(number2)]};
cl_row_names_a = strcat('rowA=', string((1:size(mt_data_a,1))));
tb_test_a.Properties.VariableNames = cl_col_names_a;
tb_test_a.Properties.RowNames = cl_row_names_a;
% a and b must have the same row names
cl_st_varrownames = tb_test_a.Properties.RowNames;
tb_test_a = addvars(tb_test_a, cl_st_varrownames, 'Before', 1);
% a and b must have the same row names
st_varrownames = string(cl_st_varrownames);
tb_test_a = addvars(tb_test_a, st_varrownames, 'Before', 1);
tb_test_a = addvars(tb_test_a, ["a", "b", "cc", "aa", "b", "z", "zz"]', 'Before', 1);
disp(tb_test_a);

              Var1    st_varrownames    cl_st_varrownames    col123    col456
              ____    ______________    _________________    ______    ______

    rowA=1    "a"        "rowA=1"          {'rowA=1'}          0.4      0.4  
    rowA=2    "b"        "rowA=2"          {'rowA=2'}          0.1      0.1  
    rowA=3    "cc"       "rowA=3"          {'rowA=3'}         0.25      0.2  
    rowA=4    "aa"       "rowA=4"          {'rowA=4'}          0.3      0.3  
    rowA=5    "b"        "rowA=5"          {'rowA=5'}          0.4      0.4  
    rowA=6    "z"        "rowA=6"          {'rowA=6'}            1        2  
    rowA=7    "zz"       "rowA=7"          {'rowA=7'}          1.1      2.2  

7.1.4.2 Select Rows if ColX is Equal to Something

Select a subset of rows based on the variable value in one column

% select the rows where Var1="b"
disp(tb_test_a(strcmp(tb_test_a.Var1, "b"),:));

              Var1    st_varrownames    cl_st_varrownames    col123    col456
              ____    ______________    _________________    ______    ______

    rowA=2    "b"        "rowA=2"          {'rowA=2'}         0.1       0.1  
    rowA=5    "b"        "rowA=5"          {'rowA=5'}         0.4       0.4  

% select the rows where col123=0.4
disp(tb_test_a(tb_test_a.col123==0.4,:));

              Var1    st_varrownames    cl_st_varrownames    col123    col456
              ____    ______________    _________________    ______    ______

    rowA=1    "a"        "rowA=1"          {'rowA=1'}         0.4       0.4  
    rowA=5    "b"        "rowA=5"          {'rowA=5'}         0.4       0.4  

7.1.4.3 Select Rows if ColX is Equal to Something or Something else

Select if the value in Var1 is either the string a or the string b, below, specify these explicitly

% select the rows where Var1="b" or Var1="a"
disp(tb_test_a(strcmp(tb_test_a.Var1, "b") | strcmp(tb_test_a.Var1, "a"),:));

              Var1    st_varrownames    cl_st_varrownames    col123    col456
              ____    ______________    _________________    ______    ______

    rowA=1    "a"        "rowA=1"          {'rowA=1'}         0.4       0.4  
    rowA=2    "b"        "rowA=2"          {'rowA=2'}         0.1       0.1  
    rowA=5    "b"        "rowA=5"          {'rowA=5'}         0.4       0.4  

Alternatively, use matches, to find if the variable is equal to either a or b, the list of potential match is a string array.

% Using matches
ar_st_potential_matches = ["a", "b"];
disp(tb_test_a(matches(tb_test_a.Var1, ar_st_potential_matches),:));

              Var1    st_varrownames    cl_st_varrownames    col123    col456
              ____    ______________    _________________    ______    ______

    rowA=1    "a"        "rowA=1"          {'rowA=1'}         0.4       0.4  
    rowA=2    "b"        "rowA=2"          {'rowA=2'}         0.1       0.1  
    rowA=5    "b"        "rowA=5"          {'rowA=5'}         0.4       0.4  

Now match over any a to z letters, picking up any letters a to z if they appear in column Var1.

% Using matches
ar_st_match_atz = string(('a':'z')')';
disp(tb_test_a(matches(tb_test_a.Var1, ar_st_match_atz),:));

              Var1    st_varrownames    cl_st_varrownames    col123    col456
              ____    ______________    _________________    ______    ______

    rowA=1    "a"        "rowA=1"          {'rowA=1'}         0.4       0.4  
    rowA=2    "b"        "rowA=2"          {'rowA=2'}         0.1       0.1  
    rowA=5    "b"        "rowA=5"          {'rowA=5'}         0.4       0.4  
    rowA=6    "z"        "rowA=6"          {'rowA=6'}           1         2  

7.1.4.4 Read in a Table from an Excel File

There are estimates stored in a table. Each row is a different estimation result, with a different set of estimates, for each row some fixed (not-estimated) parameter might vary. Each column represents a different parameter, or the parameter’s state (initial value, estimated value, standard error, etc).

The estimatetion results file is stored in: M4Econ\table\_exa\excel_exa.xlsx. We want to load in this file. Directory is one root up and one root down. The file has multiple sheets, automatically loads in the first sheet. And print table variables names, column names.

srn_excel_exa = 'C:\Users\fan\M4Econ\table\_exa\excel_exa.xlsx';
tb_read = readtable(srn_excel_exa);
disp((tb_read.Properties.VariableNames)');

    {'estimodelctr'       }
    {'Var1'               }
    {'FVAL'               }
    {'EXITFLAG'           }
    {'esti_iterations'    }
    {'esti_funccount'     }
    {'mean_h_sd'          }
    {'NPquad_esti'        }
    {'NPquad_se'          }
    {'NPquad_actl'        }
    {'gamma_esti'         }
    {'gamma_se'           }
    {'gamma_actl'         }
    {'lambda_esti'        }
    {'lambda_se'          }
    {'lambda_actl'        }
    {'msrErrProtSD_esti'  }
    {'msrErrProtSD_se'    }
    {'msrErrProtSD_actl'  }
    {'freePriceFrac_esti' }
    {'freePriceFrac_se'   }
    {'freePriceFrac_actl' }
    {'h_exoshk_sd_esti'   }
    {'h_exoshk_sd_se'     }
    {'h_exoshk_sd_actl'   }
    {'h_endoshk_sd_esti'  }
    {'h_endoshk_sd_se'    }
    {'h_endoshk_sd_actl'  }
    {'parm_sk_mean_init'  }
    {'parm_sk_sd_init'    }
    {'NPquad_init'        }
    {'gamma_init'         }
    {'HAquad_init'        }
    {'theta_init'         }
    {'lambda_init'        }
    {'msrErrProtSD_init'  }
    {'logProt_init'       }
    {'freePriceFrac_init' }
    {'h_exoshk_sd_init'   }
    {'h_endoshk_sd_init'  }
    {'prod_hgt0_coef_init'}
    {'prod_prot_coef_init'}
    {'prod_cons_coef_init'}
    {'prod_male_coef_init'}
    {'prod_wgt0_coef_init'}
    {'endoshkCount'       }
    {'guasshermite'       }
    {'len_curEstiParam'   }
    {'fixedVarIndex'      }
    {'esti_method'        }
    {'esti_option_type'   }
    {'subset_iter_rounds' }
    {'lambda_frac_disc'   }

7.1.4.5 Select Table Columns based on Column Name Strings

Given the table that we loaded in above, select only the columns that start with some string like "gamma", or columns that end with certain strings, like "_esti".

The startsWith, contains, and endsWith are string functions that generate logical arrays based on which elements of the tring array satisfies the criteria. So this is not a table function, it is a string function.

ar_st_col_names = tb_read.Properties.VariableNames;
ar_st_col_names_prod = ar_st_col_names(startsWith(ar_st_col_names, 'prod_'));
ar_st_col_names_esti = ar_st_col_names(endsWith(ar_st_col_names, '_esti'));
ar_st_col_names_sd = ar_st_col_names(contains(ar_st_col_names, '_sd_'));
disp(ar_st_col_names_prod');

    {'prod_hgt0_coef_init'}
    {'prod_prot_coef_init'}
    {'prod_cons_coef_init'}
    {'prod_male_coef_init'}
    {'prod_wgt0_coef_init'}

disp(ar_st_col_names_esti');

    {'NPquad_esti'       }
    {'gamma_esti'        }
    {'lambda_esti'       }
    {'msrErrProtSD_esti' }
    {'freePriceFrac_esti'}
    {'h_exoshk_sd_esti'  }
    {'h_endoshk_sd_esti' }

disp(ar_st_col_names_sd');

    {'h_exoshk_sd_esti' }
    {'h_exoshk_sd_se'   }
    {'h_exoshk_sd_actl' }
    {'h_endoshk_sd_esti'}
    {'h_endoshk_sd_se'  }
    {'h_endoshk_sd_actl'}
    {'parm_sk_sd_init'  }
    {'h_exoshk_sd_init' }
    {'h_endoshk_sd_init'}

We can select columns that contain the string sd as well as actl in them, by considering joint conditions.

ar_it_select = contains(ar_st_col_names, '_sd_').*endsWith(ar_st_col_names, '_actl');
ar_st_col_names_selected = ar_st_col_names(ar_it_select==1);
disp(ar_st_col_names_selected');

    {'h_exoshk_sd_actl' }
    {'h_endoshk_sd_actl'}

% show values from selected columns
disp(tb_read(1:10, ar_st_col_names_selected));

    h_exoshk_sd_actl    h_endoshk_sd_actl
    ________________    _________________

        0.042154             0.012103    
        0.042514             0.011849    
        0.042571             0.011352    
         0.04263             0.010598    
        0.042662            0.0089398    
        0.042664            0.0088495    
        0.042638            0.0078152    
        0.042689            0.0027549    
        0.042692            0.0024652    
        0.042625             0.002309    

7.2 Table Joining

7.2.1 Matlab Join Tables by Keys

Go back to fan’s MEconTools Package, Matlab Code Examples Repository (bookdown site), or Math for Econ with Matlab Repository (bookdown site).

7.2.1.1 Left Join Two Tables Together

There is a table with simulation results, and a table with data information. Join together with common keys.

First generate the simulation table.

% Make N by 2 matrix of fieldname + value type
mt_st_variable_names_types = [...
    ["year", "double"];["category", "string"];["rho", "double"]; ...
    ["numberWorkersSimu", "logical"]; ["meanWageSimu", "double"]];
% Make table using fieldnames & value types from above
tb_solution = table('Size',[0,size(mt_st_variable_names_types,1)],... 
    'VariableNames', mt_st_variable_names_types(:,1),...
    'VariableTypes', mt_st_variable_names_types(:,2));
% Table with data to be merged
tb_solution = [tb_solution;...
    {1, 'C001', 0.50, 5.5, 6.6}; {2, 'C002', 0.50, 3.3, 4.4}; ...
    {1, 'C001', 0.25, 2.5, 3.6}; {2, 'C002', 0.25, 0.3, 1.4}];
disp(tb_solution);

    year    category    rho     numberWorkersSimu    meanWageSimu
    ____    ________    ____    _________________    ____________

     1       "C001"      0.5           5.5               6.6     
     2       "C002"      0.5           3.3               4.4     
     1       "C001"     0.25           2.5               3.6     
     2       "C002"     0.25           0.3               1.4     

Second generate the data table.

% Make N by 2 matrix of fieldname + value type
mt_st_variable_names_types = [...
    ["year", "double"];["category", "string"]; ...
    ["numberWorkersData", "logical"]; ["meanWageData", "double"]];
% Make table using fieldnames & value types from above
tb_data = table('Size',[0,size(mt_st_variable_names_types,1)],... 
    'VariableNames', mt_st_variable_names_types(:,1),...
    'VariableTypes', mt_st_variable_names_types(:,2));
% Table with data to be merged
tb_data = [tb_data; {1, 'C001', 1, 2}; {2, 'C002', 2, 3}];
disp(tb_data);

    year    category    numberWorkersData    meanWageData
    ____    ________    _________________    ____________

     1       "C001"             1                 2      
     2       "C002"             2                 3      

Third merge the tables together with left-join, that will match by all variables with the same name.

% left-join tables together
tb_solu_joined_data = join(tb_solution, tb_data);
% Rescale a variable
tb_solu_joined_data{:, "numberWorkersData"} = tb_solu_joined_data{:, "numberWorkersData"}*10;
% Display
disp(tb_solu_joined_data);

    year    category    rho     numberWorkersSimu    meanWageSimu    numberWorkersData    meanWageData
    ____    ________    ____    _________________    ____________    _________________    ____________

     1       "C001"      0.5           5.5               6.6                10                 2      
     2       "C002"      0.5           3.3               4.4                20                 3      
     1       "C001"     0.25           2.5               3.6                10                 2      
     2       "C002"     0.25           0.3               1.4                20                 3      

7.2.2 Row and Column Combine Stack Tables and Matrices

Go back to fan’s MEconTools Package, Matlab Code Examples Repository (bookdown site), or Math for Econ with Matlab Repository (bookdown site).

7.2.2.1 Generate Some Tables and Matrixes for Combination

close all;

% Generate Table 1
ar_fl_abc1 = [0.4 0.1 0.25 0.3 0.4];
ar_fl_abc2 = [0.4 0.1 0.2 0.3 0.4];
number1 = '123';
number2 = '456';
mt_data_a = [ar_fl_abc1' ar_fl_abc2'];

tb_test_a = array2table(mt_data_a);
cl_col_names_a = {['col' num2str(number1)], ['col' num2str(number2)]};
cl_row_names_a = strcat('rowA=', string((1:size(mt_data_a,1))));

tb_test_a.Properties.VariableNames = cl_col_names_a;
tb_test_a.Properties.RowNames = cl_row_names_a;
disp(tb_test_a);

              col123    col456
              ______    ______

    rowA=1      0.4      0.4  
    rowA=2      0.1      0.1  
    rowA=3     0.25      0.2  
    rowA=4      0.3      0.3  
    rowA=5      0.4      0.4  

% Generate Table 2
rng(123);
ar_fl_abc3 = rand(size(ar_fl_abc1));
ar_fl_abc4 = rand(size(ar_fl_abc1));
ar_fl_abc5 = rand(size(ar_fl_abc1));

mt_data_b = [ar_fl_abc3' ar_fl_abc4' ar_fl_abc5'];

tb_test_b = array2table(mt_data_b);
cl_col_names_b = {['col' num2str(33)], ['col' num2str(44)], ['col' num2str(55)]};
cl_row_names_b = strcat('rowB=', string((1:size(mt_data_a,1))));

tb_test_b.Properties.VariableNames = cl_col_names_b;
tb_test_b.Properties.RowNames = cl_row_names_b;
disp(tb_test_b);

               col33      col44      col55  
              _______    _______    ________

    rowB=1    0.69647    0.42311     0.34318
    rowB=2    0.28614    0.98076     0.72905
    rowB=3    0.22685    0.68483     0.43857
    rowB=4    0.55131    0.48093    0.059678
    rowB=5    0.71947    0.39212     0.39804

7.2.2.2 Combine Tables Together Stack Columns

Tables with the same number of rows, add more columns with named variables

% a and b must have the same row names
tb_test_b_withArownames = tb_test_b;
tb_test_b_withArownames.Properties.RowNames = tb_test_a.Properties.RowNames;
tb_ab_col_stacked = [tb_test_a tb_test_b_withArownames];
disp(tb_ab_col_stacked);

              col123    col456     col33      col44      col55  
              ______    ______    _______    _______    ________

    rowA=1      0.4      0.4      0.69647    0.42311     0.34318
    rowA=2      0.1      0.1      0.28614    0.98076     0.72905
    rowA=3     0.25      0.2      0.22685    0.68483     0.43857
    rowA=4      0.3      0.3      0.55131    0.48093    0.059678
    rowA=5      0.4      0.4      0.71947    0.39212     0.39804

7.2.2.3 Combine Tables Together Stack Rows

Tables with the same number of columns, dd more rows variables

% Select only 2 columns to match table a column count
tb_test_b_subset = tb_test_b(:,1:2);

% Make Column Names consistent
tb_test_b_subset.Properties.VariableNames = cl_col_names_a;

% Reset Row Names, can not have identical row names
tb_test_a.Properties.RowNames = strcat('row=', string((1:size(mt_data_a,1))));
tb_test_b_subset.Properties.RowNames = ...
    strcat('row=', string(((size(mt_data_a,1)+1):(size(mt_data_a,1)+size(tb_test_b_subset,1)))));
% tb_test_b_subset.Properties.RowNames =

% Stack Rows
tb_ab_row_stacked = [tb_test_a; tb_test_b_subset];
disp(tb_ab_row_stacked);

              col123     col456 
              _______    _______

    row=1         0.4        0.4
    row=2         0.1        0.1
    row=3        0.25        0.2
    row=4         0.3        0.3
    row=5         0.4        0.4
    row=6     0.69647    0.42311
    row=7     0.28614    0.98076
    row=8     0.22685    0.68483
    row=9     0.55131    0.48093
    row=10    0.71947    0.39212

7.2.3 Matlab Table Stack and Join Estimation and Simulation Results

Go back to fan’s MEconTools Package, Matlab Code Examples Repository (bookdown site), or Math for Econ with Matlab Repository (bookdown site).

7.2.3.1 Combine Tables Together Stack Rows Loop Template Common Columns

There is an estimation routine, each time the routine outputs a table with a single row, the single row contains estimation outputs including estiamtes, standard erros, initial parameters etc. We loop over different estimation routines, with different starting values etc, and rather than saving many tables, we want to save a joint table with all rows stacked together.

This simply means that we have a loop, during each iteration, generating a table, we want to stack things together. For this assume that the column names are the same.

rng("default");
tb_saveCoef_stack = [];
for row_idx=1:5
    % a row of coefficent estimates    
    rng(123+row_idx);
    it_num_cols = 4;
    it_num_rows = 1;
    mt_saveCoef = rand([it_num_rows, it_num_cols]);
    
    % row to table
    ar_st_col_names = ["FVAL", "EXITFLAG", "esti_iterations", "esti_funccount"];        
    tb_saveCoef = array2table(mt_saveCoef);
    tb_saveCoef.Properties.VariableNames = ar_st_col_names;
    
    % Stack all results
    tb_saveCoef_stack = [tb_saveCoef_stack; tb_saveCoef];
end
% Add esti Counter as column
estimodelctr = (1:size(tb_saveCoef_stack,1))';
tb_saveCoef_stack = addvars(tb_saveCoef_stack, estimodelctr, 'Before', 1);
% Add a row name as a variable
cl_row_names_a = strcat('esti', string((1:size(tb_saveCoef_stack,1))));
tb_saveCoef_stack.Properties.RowNames = cl_row_names_a;
% display results
disp(tb_saveCoef_stack);

             estimodelctr     FVAL      EXITFLAG    esti_iterations    esti_funccount
             ____________    _______    ________    _______________    ______________

    esti1         1          0.10606     0.74547        0.57231            0.45824   
    esti2         2          0.50673    0.057531        0.62758            0.13255   
    esti3         3          0.10517     0.12814       0.087406            0.11548   
    esti4         4          0.52383    0.039963        0.18597            0.77279   
    esti5         5          0.86664     0.26314        0.13141           0.041593   

7.2.3.2 Combine Tables Together Stack Rows Loop Template Outterjoin

Similar to the previous estimation problem, however, now during different iterations, the column names, i.e. the parameters been estiamted are different. For example, there are 10 parameters, sometimes we estimate 5 of the 10, sometimes 10 or the 10. Want to stack all results together similar to above.

This is accomplished in the following example with the outerjoin function.

for row_idx=1:5
    % a row of coefficent estimates
    rng(123+row_idx);
    
    it_num_rows = 1;
    if (row_idx <= 2) 
        it_num_cols = 4;
        mt_saveCoef = rand([it_num_rows, it_num_cols]);    
        % row to table    
        ar_st_col_names = ["FVAL", "EXITFLAG", "esti_iterations", "esti_funccount"];
    elseif (row_idx <= 4) 
        it_num_cols = 2;
        mt_saveCoef = rand([it_num_rows, it_num_cols]);
        % row to table    
        ar_st_col_names = ["FVAL", "EXITFLAG"];
    else
        it_num_cols = 3;
        mt_saveCoef = rand([it_num_rows, it_num_cols]);
        % row to table    
        ar_st_col_names = ["FVAL", "esti_iterations", "esti_funccount"];
    end
    
    tb_saveCoef = array2table(mt_saveCoef);
    tb_saveCoef.Properties.VariableNames = ar_st_col_names;
    tb_saveCoef = addvars(tb_saveCoef, row_idx, 'Before', 1);
    
    % Stack all results
    if(row_idx == 1)
        tb_saveCoef_stack = tb_saveCoef;
    else
        tb_saveCoef_stack = outerjoin(tb_saveCoef_stack, tb_saveCoef, 'MergeKeys', true);
    end
end
% Add esti Counter as column
estimodelctr = (1:size(tb_saveCoef_stack,1))';
tb_saveCoef_stack = addvars(tb_saveCoef_stack, estimodelctr, 'Before', 1);
% Add a row name as a variable
cl_row_names_a = strcat('esti', string((1:size(tb_saveCoef_stack,1))));
tb_saveCoef_stack.Properties.RowNames = cl_row_names_a;
% display results
disp(tb_saveCoef_stack);

             estimodelctr    row_idx     FVAL      EXITFLAG    esti_iterations    esti_funccount
             ____________    _______    _______    ________    _______________    ______________

    esti1         1             1       0.10606     0.74547        0.57231           0.45824    
    esti2         2             2       0.50673    0.057531        0.62758           0.13255    
    esti3         3             3       0.10517     0.12814            NaN               NaN    
    esti4         4             4       0.52383    0.039963            NaN               NaN    
    esti5         5             5       0.86664         NaN        0.26314           0.13141    

7.2.3.3 Combine Tables Outterjoin with Parfor

Same as above, but iterate/solve over loops with parfor

% Start cluster
delete(gcp('nocreate'));
myCluster = parcluster('local');
it_workers = 4;
myCluster.NumWorkers = it_workers;
parpool(it_workers);

Starting parallel pool (parpool) using the 'local' profile ...
Connected to the parallel pool (number of workers: 4).

% prepare storage
cl_stores = cell([4,1]);
% Loop over
parfor row_idx=1:4
    % a row of coefficent estimates
    rng(123+row_idx);    
    ar_st_col_names = ["FVAL", "EXITFLAG", "esti_iterations", "esti_funccount"];
    it_num_rows = 1;
    it_num_cols = 4;
    mt_saveCoef = rand([it_num_rows, it_num_cols]);        
    tb_saveCoef = array2table(mt_saveCoef);
    tb_saveCoef.Properties.VariableNames = ar_st_col_names;
    tb_saveCoef = addvars(tb_saveCoef, row_idx, 'Before', 1);    
    % Stack all results
    cl_stores{row_idx} = tb_saveCoef;
end
% delete cluster
delete(gcp('nocreate'));

Stack tables stored in cells together:

% Combine
tb_saveCoef_stack = [];
for row_idx=1:4
    tb_saveCoef_stack = [tb_saveCoef_stack; cl_stores{row_idx}];
end
% display results
disp(tb_saveCoef_stack);

    Var1     FVAL      EXITFLAG    esti_iterations    esti_funccount
    ____    _______    ________    _______________    ______________

     1      0.48946    0.58215          0.2796           0.23945    
     2      0.28746     0.3907         0.12685             0.694    
     3      0.76682    0.48312         0.50655           0.24223    
     4      0.71522    0.15045         0.43025            0.4845    

7.2.3.4 ND Dimensional Parameter Arrays, Simulate Model and Stack Output Tables

Now we will first column combine matrixes, model parameters and model outcomes, and then row combine matrixes from different simulations.

A model takes a N parameters, solve the model over M sets of parameters. Each time when the model is solved, a P by Q table of results is generated. Each column is a different statistics (mean, std, etc.), and each row is a different outcome variable (consumption, asset choices, etc.). Stack these P by Q Tables together, and add in information about the N parameters, each of the tables been stacked initially had the same column and row names.

The resulting table should have P times M rows, for M sets of model simulations each with P rows of results. And there should be N + Q columns, storing the N parameters as well as the Q columns of different outcomes.

rng(123);
% Generate A P by Q matrix of random parameter Values
it_param_groups_m = 5; 
it_params_n = 2;
it_outcomes_p = 3;
it_stats_q = 3;

% Parameter Matrix and Names
ar_param_names = strcat('param_', string(1:it_params_n));
mt_param_m_by_n = round(rand([it_param_groups_m, it_params_n])*5, 2);

% Loop over the parameters
for it_cur_param_group=1:1:it_param_groups_m
    
    % Current Parameters
    ar_param = mt_param_m_by_n(it_cur_param_group,:);
    
    % Some Model is simulated
    mt_model_simu = normrnd(mean(ar_param), std(ar_param), [it_outcomes_p, it_stats_q]);
    
    % Model Results are Saved As Table With Column and Row Information
    tb_model_simu = array2table(mt_model_simu);
    cl_col_names = strcat('stats_', string((1:size(mt_model_simu,2))));
    cl_row_names = strcat('outvar_', string((1:size(mt_model_simu,1))));
    tb_model_simu.Properties.VariableNames = cl_col_names;
    tb_model_simu.Properties.RowNames = cl_row_names;    
        
    % Convert Row Variable Names to a Column String
    outvar = string(tb_model_simu.Properties.RowNames);
    tb_model_simu = addvars(tb_model_simu, outvar, 'Before', 1);
    
    % Parameter Information Table that Shares Row Names as Simu Results
    mt_param_info = zeros([it_outcomes_p,it_params_n]) + ar_param;
    tb_param_info = array2table(mt_param_info);
    tb_param_info.Properties.VariableNames = ar_param_names;
    tb_param_info.Properties.RowNames = cl_row_names;
    
    % Combine Parameter Information and Simulation Contents
    tb_model_simu_w_info = [tb_param_info tb_model_simu];
    % Update Row Names based on total row available
    ar_rows_allsimu = (1:it_stats_q)' + (it_cur_param_group-1)*it_stats_q;
    tb_model_simu_w_info.Properties.RowNames = strcat('row=', string(ar_rows_allsimu));
    
    % Show One Example Table before Stacking
    if (it_cur_param_group == round(it_param_groups_m/2))
        disp(tb_model_simu);
        disp(tb_param_info);
        disp(tb_model_simu_w_info);
    end
    
    % Stack all results
    if(it_cur_param_group == 1)
        tb_model_allsimu_w_info = tb_model_simu_w_info;
    else
        tb_model_allsimu_w_info = [tb_model_allsimu_w_info; tb_model_simu_w_info];
    end
    
end

                  outvar      stats_1     stats_2    stats_3
                __________    ________    _______    _______

    outvar_1    "outvar_1"    0.056853    2.1703     2.1098 
    outvar_2    "outvar_2"      3.1545    2.0634     0.7798 
    outvar_3    "outvar_3"    -0.49033    2.2566     1.7896 
                param_1    param_2
                _______    _______

    outvar_1     1.13       3.42  
    outvar_2     1.13       3.42  
    outvar_3     1.13       3.42  
             param_1    param_2      outvar      stats_1     stats_2    stats_3
             _______    _______    __________    ________    _______    _______

    row=7     1.13       3.42      "outvar_1"    0.056853    2.1703     2.1098 
    row=8     1.13       3.42      "outvar_2"      3.1545    2.0634     0.7798 
    row=9     1.13       3.42      "outvar_3"    -0.49033    2.2566     1.7896 

Show all Simulation Joint Table Outputs:

disp(tb_model_allsimu_w_info);

              param_1    param_2      outvar      stats_1     stats_2    stats_3
              _______    _______    __________    ________    _______    _______

    row=1      3.48       2.12      "outvar_1"      2.2665    1.1885      1.924 
    row=2      3.48       2.12      "outvar_2"      3.3427    2.4647     2.3548 
    row=3      3.48       2.12      "outvar_3"      2.6714    3.6132      2.918 
    row=4      1.43        4.9      "outvar_1"      3.3859    5.3759     1.5816 
    row=5      1.43        4.9      "outvar_2"      3.9499    3.8698     2.2693 
    row=6      1.43        4.9      "outvar_3"      5.7745    4.6871     1.7334 
    row=7      1.13       3.42      "outvar_1"    0.056853    2.1703     2.1098 
    row=8      1.13       3.42      "outvar_2"      3.1545    2.0634     0.7798 
    row=9      1.13       3.42      "outvar_3"    -0.49033    2.2566     1.7896 
    row=10     2.76        2.4      "outvar_1"      2.9611    2.6847     2.4986 
    row=11     2.76        2.4      "outvar_2"      2.9333    2.3457     3.0629 
    row=12     2.76        2.4      "outvar_3"      2.5814    2.4372     2.4806 
    row=13      3.6       1.96      "outvar_1"      2.7199    3.3129     3.0577 
    row=14      3.6       1.96      "outvar_2"      3.9804    1.4529     2.9285 
    row=15      3.6       1.96      "outvar_3"      2.8445    4.4117     2.6576 

7.3 Summarize

7.3.1 Matlab Table Summarize and Aggregate by Groups

Go back to fan’s MEconTools Package, Matlab Code Examples Repository (bookdown site), or Math for Econ with Matlab Repository (bookdown site).

7.3.1.1 Group Table Rows and Sum within Group

There is a table where subsets of rows belong to different simulations, with exogenous fixed \(\rho\) parameters. For each \(\rho\) parameter combination, there are, stored as different rows, a number of model predictions and data moments, and corresponding difference. Find the total difference between model and data for subsets of rows based for each \(\rho\) parameter set.

First, create a table where each \(\rho\) group is identified jointly by \(\rho_a\) and \(\rho_b\), stored in the 3rd and 4th rows.

% Make N by 2 matrix of fieldname + value type
mt_st_variable_names_types = [...
    ["year", "double"];["category", "string"];...
    ["rhoa", "double"];["rhob", "double"]; ...
    ["numberWorkersSimu", "logical"]; ["numberWorkersData", "double"]];
% Make table using fieldnames & value types from above
tb_agg_exa = table('Size',[0,size(mt_st_variable_names_types,1)],... 
    'VariableNames', mt_st_variable_names_types(:,1),...
    'VariableTypes', mt_st_variable_names_types(:,2));
% Table with data inputs
tb_agg_exa = [tb_agg_exa;...
    {1, 'C001', 0.50, 0.50, 5.5, 6.05}; {2, 'C002', 0.50, 0.50, 3.7, 4.4}; ...
    {1, 'C001', 0.25, 0.30, 2.5, 3.65}; {2, 'C002', 0.25, 0.30, 0.1, 1.6}; ...
    {3, 'C001', 0.25, 0.50, 0.01, 1.66}];
% Generate model and data difference
tb_agg_exa{:, "diff"} = tb_agg_exa{:, "numberWorkersSimu"} - tb_agg_exa{:, "numberWorkersData"};
% Display
disp(tb_agg_exa);

    year    category    rhoa    rhob    numberWorkersSimu    numberWorkersData    diff 
    ____    ________    ____    ____    _________________    _________________    _____

     1       "C001"      0.5    0.5            5.5                 6.05           -0.55
     2       "C002"      0.5    0.5            3.7                  4.4            -0.7
     1       "C001"     0.25    0.3            2.5                 3.65           -1.15
     2       "C002"     0.25    0.3            0.1                  1.6            -1.5
     3       "C001"     0.25    0.5           0.01                 1.66           -1.65

Second, select the subset of columns that are relevant for aggregation.

% Select
tb_agg_exa = tb_agg_exa(:, ["rhoa", "rhob", "diff"]);
% Display
disp(tb_agg_exa);

    rhoa    rhob    diff 
    ____    ____    _____

     0.5    0.5     -0.55
     0.5    0.5      -0.7
    0.25    0.3     -1.15
    0.25    0.3      -1.5
    0.25    0.5     -1.65

Third, group by unique combinations of rhoa, rhob, and aggregate. Then generate group ID.

% Sum within groupo
tb_groupby_agg_sum = groupsummary(tb_agg_exa, ["rhoa", "rhob"], "sum");
% Generate grouping ID
tb_groupby_agg_sum{:, "ID"} = (1:1:size(tb_groupby_agg_sum, 1))';
tb_groupby_agg_sum = movevars(tb_groupby_agg_sum, "ID", "Before", 1);
disp(tb_groupby_agg_sum);

    ID    rhoa    rhob    GroupCount    sum_diff
    __    ____    ____    __________    ________

    1     0.25    0.3         2          -2.65  
    2     0.25    0.5         1          -1.65  
    3      0.5    0.5         2          -1.25