Back to Fan's Matlab Examples Table of Content
During Iteration Solution Procedure, sometimes only a subset of rows/columns need to be updated for some core matrix after each iteration.
Potentially, there could be significant speed gains if one does not need to fully recompute based on some N by M matrix, but can compute based on some N-n by M-m matrix, and update values in the N by M matrix with new values. See this file for examples when we fully reuse matrixes.
As stated here, we will store existing calculations in cell arrays.
The set of operations below is common in iterative solution methods for value/policy functions.
clear all;
it_z = 15;
it_rown = 300; % 4GB if 1000
it_rown_update = floor(it_rown/7); % 1/10 of the rows are updated
ar_it_rows_replace = sort(datasample(1:it_rown, it_rown_update, 'Replace', false));
it_coln = round(((it_rown-1)*it_rown)/2 + it_rown);
it_coln = 3000;
it_iter = 50;
it_coll = 0;
% define Arrays
ar_coh = rand([1,it_coln]);
ar_kp = rand([1,it_rown]);
ar_bp = rand([1,it_rown]);
% ar_it_rows_replace
f_c = @(coh, kp, bp) coh - (kp + bp/(1+0.02))
f_u = @(c) log(c)
Matrix evaluation, all rows, or some rows. Evaluating only subset of the rows takes less time.
mt_u = f_u(f_c(ar_coh, ar_kp', ar_bp'));
mt_u_partial = f_u(f_c(ar_coh, ar_kp(ar_it_rows_replace)', ar_bp(ar_it_rows_replace)'));
size(mt_u)
size(mt_u_partial)
mt_u = f_u(f_c(ar_coh, ar_kp', ar_bp'));
mt_u_partial = f_u(f_c(ar_coh, ar_kp(ar_it_rows_replace)', ar_bp(ar_it_rows_replace)'));
f_full_mt_eval = @() f_u(f_c(ar_coh, ar_kp', ar_bp'));
f_part_mt_eval = @() f_u(f_c(ar_coh, ar_kp(ar_it_rows_replace)', ar_bp(ar_it_rows_replace)'));
fl_exe_time = timeit(f_full_mt_eval)
it_coll = it_coll + 1;
ar_fl_exe_times(it_coll) = fl_exe_time;
ar_fl_exe_desc{it_coll} = 'f_u(f_c(ar_coh, ar_kp, ar_bp))';
fl_exe_time = timeit(f_part_mt_eval)
it_coll = it_coll + 1;
ar_fl_exe_times(it_coll) = fl_exe_time;
ar_fl_exe_desc{it_coll} = 'f_u(f_c(ar_coh, ar_kp(ar_it_rows_replace), ar_bp(ar_it_rows_replace)))';
Compute consumption matrix every time.
ts_c_store = zeros([it_rown, it_coln, it_z]);
tic;
for it_iter_n=1:1:it_iter
for it_z_m=1:1:it_z
mt_u = f_u(f_c(ar_coh, ar_kp', ar_bp'));
end
end
toc;
% timing the function above store in file in the same folder
f_ff_subset_update_func = @() ff_u_c_partrepeat_func(1, ar_it_rows_replace, ar_coh, ar_kp', ar_bp', f_u, f_c, it_iter, it_rown, it_coln, it_z);
fl_exe_time = timeit(f_ff_subset_update_func)
it_coll = it_coll + 1;
ar_fl_exe_times(it_coll) = fl_exe_time;
ar_fl_exe_desc{it_coll} = 'ff_u_c_partrepeat_func(1): mt_u = f_u(f_c())';
Is this faster? Actually not. The consumption evaluation step was not time consuming. So there seems to be no gain from saving the consumption results.
In fact, this is substantiallly slowly, perhaps it is time consuming to retrieve stored matrix.
Note that this set up here works with 1 state variable, where our consumption grid is actually constant across periods. It does not work with 2 choice model with consumption grid is changing for the cash-on-hand single state version of 2 states and 2 choices model.
ts_c_store = zeros([it_rown, it_coln, it_z]);
tic;
for it_iter_n=1:1:it_iter
for it_z_m=1:1:it_z
if (it_iter_n == 1)
mt_c = f_c(ar_coh, ar_kp', ar_bp');
ts_c_store(:,:,it_z_m) = mt_c;
mt_u = f_u(mt_c);
else
ts_c_store(ar_it_rows_replace,:,it_z_m) = f_c(ar_coh, ar_kp(ar_it_rows_replace)', ar_bp(ar_it_rows_replace)');
mt_u = f_u(ts_c_store(:,:,it_z_m));
end
end
end
toc;
% timing the function above store in file in the same folder
f_ff_subset_update_func = @() ff_u_c_partrepeat_func(2, ar_it_rows_replace, ar_coh, ar_kp', ar_bp', f_u, f_c, it_iter, it_rown, it_coln, it_z);
fl_exe_time = timeit(f_ff_subset_update_func)
it_coll = it_coll + 1;
ar_fl_exe_times(it_coll) = fl_exe_time;
ar_fl_exe_desc{it_coll} = 'ff_u_c_partrepeat_func(2): f_u(ts_c_store(:,:,it_z_m))';
Let's not update utility, compute that only once. Now we have significant time saving. There should be slightly higher memory burden, although not substantial.
There are no speed gains at all, this is because in the following command structure, every time when the tensor ts_u_store is sliced to retrieve mt_u, a new mt_u array is copyied.
The time it takes to generate the new mt_u matrix is as much as it takes to calculate f_u(f_c())). Just because we stored the data in ts_u_store, does not mean there is no computational burden when we slice ts_u_store.
ts_u_store = zeros([it_rown, it_coln, it_z]);
tic;
for it_iter_n=1:1:it_iter
for it_z_m=1:1:it_z
if (it_iter_n == 1)
mt_c = f_c(ar_coh, ar_kp', ar_bp');
mt_u = f_u(mt_c);
ts_u_store(:,:,it_z_m) = mt_u;
else
ts_u_store(ar_it_rows_replace,:,it_z_m) = f_u(f_c(ar_coh, ar_kp(ar_it_rows_replace)', ar_bp(ar_it_rows_replace)'));
mt_u = ts_u_store(:,:,it_z_m);
end
end
end
toc;
% timing the function above store in file in the same folder
f_ff_subset_update_func = @() ff_u_c_partrepeat_func(3, ar_it_rows_replace, ar_coh, ar_kp', ar_bp', f_u, f_c, it_iter, it_rown, it_coln, it_z);
fl_exe_time = timeit(f_ff_subset_update_func)
it_coll = it_coll + 1;
ar_fl_exe_times(it_coll) = fl_exe_time;
ar_fl_exe_desc{it_coll} = 'ff_u_c_partrepeat_func(3): mt_u = ts_u_store(:,:,it_z_m);';
Rather than using a tensor, store values one by one, as a demonstration case, we separately name three mt_u matrixes. This method is dramatically faster.
But we do not want to have to name each of the matrix, that would involve hard-coding etc, not good.
tic;
for it_iter_n=1:1:it_iter
for it_z_m=1:1:it_z
if (it_iter_n == 1)
mt_c = f_c(ar_coh, ar_kp', ar_bp');
mt_u = f_u(mt_c);
if (rem(it_z_m,3) == 0)
mt_u_1 = mt_u;
end
if (rem(it_z_m,3) == 1)
mt_u_2 = mt_u;
end
if (rem(it_z_m,3) == 2)
mt_u_3 = mt_u;
end
else
mt_updates = f_u(f_c(ar_coh, ar_kp(ar_it_rows_replace)', ar_bp(ar_it_rows_replace)'));
if (rem(it_z_m,3) == 0)
mt_u_1(ar_it_rows_replace,:) = mt_updates;
mt_u = mt_u_1;
end
if (rem(it_z_m,3) == 1)
mt_u_2(ar_it_rows_replace,:) = mt_updates;
mt_u = mt_u_2;
end
if (rem(it_z_m,3) == 2)
mt_u_3(ar_it_rows_replace,:) = mt_updates;
mt_u = mt_u_3;
end
end
end
end
toc;
% timing the function above store in file in the same folder
f_ff_subset_update_func = @() ff_u_c_partrepeat_func(4, ar_it_rows_replace, ar_coh, ar_kp', ar_bp', f_u, f_c, it_iter, it_rown, it_coln, it_z);
fl_exe_time = timeit(f_ff_subset_update_func)
it_coll = it_coll + 1;
ar_fl_exe_times(it_coll) = fl_exe_time;
ar_fl_exe_desc{it_coll} = 'ff_u_c_partrepeat_func(4): mt_u_1, mt_u_2, mt_u_3;';
This was fast when no matrix updating was required, now this is a little slower, but still pretty good.
The less changes we have to make during each iteration, and also the more iterations there are, the better this method becomes.
Note the key replacement command is : cl_u_store{it_z_m}(ar_it_rows_replace,:)
tic;
cl_u_store = cell([it_z, 1]);
tic;
for it_iter_n=1:1:it_iter
for it_z_m=1:1:it_z
if (it_iter_n == 1)
mt_c = f_c(ar_coh, ar_kp', ar_bp');
mt_u = f_u(mt_c);
cl_u_store{it_z_m} = mt_u;
else
cl_u_store{it_z_m}(ar_it_rows_replace,:) = f_u(f_c(ar_coh, ar_kp(ar_it_rows_replace)', ar_bp(ar_it_rows_replace)'));
mt_u = cl_u_store{it_z_m};
end
end
end
toc;
% timing the function above store in file in the same folder
f_ff_subset_update_func = @() ff_u_c_partrepeat_func(5, ar_it_rows_replace, ar_coh, ar_kp', ar_bp', f_u, f_c, it_iter, it_rown, it_coln, it_z);
fl_exe_time = timeit(f_ff_subset_update_func)
it_coll = it_coll + 1;
ar_fl_exe_times(it_coll) = fl_exe_time;
ar_fl_exe_desc{it_coll} = 'ff_u_c_partrepeat_func(5): mt_u = cl_u_store{it_z_m}';
tb_exe_times = array2table([ar_fl_exe_times', ar_fl_exe_times'*it_z*it_iter]);
tb_exe_times.Properties.RowNames = ar_fl_exe_desc;
tb_exe_times.Properties.VariableNames = {'speedmat', 'speedfull'};
disp(tb_exe_times);