Introduction
This tutorial demonstrates the benefits of writing vectorized MATLAB code, which can run many times faster than non-vectorized code. A loop is usually used when you want to perform the same calculation on selected elements of an array. Vectorization is a much faster alternative that does the same in a single statement (or at least far fewer statements) without a loop. It is important to note that MATLAB 6.5 and later include a Just-in-Time (JIT) compiler, which optimizes MATLAB code before running it. In most cases loops run much faster after compilation, without any manual optimization by the user. However, the JIT compiler has limitations and cannot always optimize every loop, so it is better practice to vectorize manually than to rely on the JIT.
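To make the idea of operating on selected elements concrete, here is a minimal sketch that squares only the negative entries of a vector, first with a loop and then with a logical mask. The vector x and the variable names are made up for illustration and are not part of the examples that follow.

```matlab
% Hypothetical data: a small row vector chosen only for illustration
x = [-2 5 -1 3 0 -4];

% Loop version: square only the negative elements
y_loop = x;
for k = 1:numel(x)
    if x(k) < 0
        y_loop(k) = x(k)^2;
    end
end

% Vectorized version: a logical mask selects the negative elements
y_vec = x;
mask = x < 0;                 % logical array, true where x is negative
y_vec(mask) = x(mask).^2;     % element-wise square of the selected elements

same_result = isequal(y_loop, y_vec)   % true when both versions agree
```

The mask replaces both the loop and the if statement, which is the pattern the rest of this tutorial relies on.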
For this tutorial we will use the tic and toc functions to measure the time a piece of code takes to execute. tic is called before the code being analyzed and toc after it.
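As a quick illustration of that pattern, the sketch below times a single statement. The summed vector is an arbitrary workload picked for this illustration, and the variable names are hypothetical.

```matlab
% Minimal timing pattern; the workload below is only a placeholder
tic                          % start the stopwatch
s = sum((1:1000000).^2);     % code being timed
elapsed = toc;               % seconds elapsed since the call to tic

fprintf('Elapsed time: %.4f seconds\n', elapsed);
```

Assigning the result of toc to a variable, as done here, keeps the measurement available for later comparisons.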
Example 1
Let's write MATLAB code that generates 100000 random numbers from a normal distribution and calculates their squares. The following script shows three different ways of doing this.
clc
clear variables
%% Code 1 - Generate 100000 random numbers from a normal distribution
% and calculate their squares using a loop (without pre-allocation)
n = 100000; % number of random values (named n so the built-in length function is not shadowed)
tic
for i = 1:n
    number_1(i) = randn;            % array grows by one element per iteration
    square_1(i) = number_1(i)^2;
end
time1 = toc
%% Code 2 - Generate 100000 random numbers from a normal distribution
% and calculate their squares using a loop (with pre-allocation)
tic
number_2 = zeros(1,n);
square_2 = zeros(1,n);
for i = 1:n
    number_2(i) = randn;            % array size stays fixed, only values change
    square_2(i) = number_2(i)^2;
end
time2 = toc
%% Code 3 - Generate 100000 random numbers from a normal distribution
% and calculate their squares using vectorization
tic
number_3 = randn(1,n);
square_3 = number_3.^2;
time3 = toc
- Code 1 uses a “for” loop (with 100000 iterations) to generate the random numbers one by one and calculate the square of each in the same iteration. The arrays that store the numbers and their squares are not pre-allocated before the loop starts, so they change size at every iteration.
- Code 2 uses the same loop but first pre-allocates the arrays with the built-in “zeros” function. The arrays change values but not size at each iteration.
- Code 3 is the vectorized code, which doesn’t require a “for” loop. Built-in MATLAB functions like randn can return whole vectors or arrays. In the above code, we have asked MATLAB to generate a 1×100000 array of random values. Because of the way MATLAB works, generating an array like this is faster than generating it element by element in a loop. MATLAB also supports operations on whole vectors and arrays. The element-wise power operator ( .^ ) calculates the square of each element in the vector or matrix.
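The difference between the element-wise power operator and the matrix power operator is easy to miss, so here is a small sketch on a 2×2 matrix. The matrix A is a made-up example, not something used elsewhere in this tutorial.

```matlab
% Hypothetical 2x2 matrix used only to contrast the two operators
A = [1 2; 3 4];

elementwise = A.^2;   % squares each element: [1 4; 9 16]
matrixpower = A^2;    % matrix product A*A:   [7 10; 15 22]

disp(elementwise)
disp(matrixpower)
```

For a non-square array such as the 1×100000 vector above, only the element-wise form ( .^ ) is defined, which is why Code 3 uses it.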
Let's run the whole script to see how much time each code takes to execute.
Note: You will get slightly different results each time you run the script.
time1 =
0.0383
time2 =
0.0051
time3 =
0.0026
>> time1/time3
ans =
14.6490
>> time2/time3
ans =
1.9474
Code 1, Code 2 and Code 3 take 0.0383 sec, 0.0051 sec and 0.0026 sec to execute, respectively. Code 3 runs about 14.6 times faster than Code 1 and about 1.9 times faster than Code 2. We can observe that:
- Pre-allocating arrays before using them in loops results in much faster code.
- Vectorized code runs faster than the loop (even when the arrays are pre-allocated).
We can conclude that writing vectorized MATLAB code can make your MATLAB programs much faster.
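Since a single tic/toc measurement varies from run to run, one way to get steadier numbers is to repeat each measurement and compare medians. The sketch below does this for the pre-allocated loop and the vectorized code from Example 1. The repetition count reps and the variable names are assumptions made for this illustration.

```matlab
n = 100000;               % number of random values, as in Example 1
reps = 20;                % hypothetical number of repetitions
t_loop = zeros(1,reps);   % timings of the pre-allocated loop
t_vec = zeros(1,reps);    % timings of the vectorized code

for r = 1:reps
    % Pre-allocated loop
    tic
    numbers = zeros(1,n);
    squares = zeros(1,n);
    for k = 1:n
        numbers(k) = randn;
        squares(k) = numbers(k)^2;
    end
    t_loop(r) = toc;

    % Vectorized code
    tic
    numbers = randn(1,n);
    squares = numbers.^2;
    t_vec(r) = toc;
end

fprintf('Median loop time:       %.4f s\n', median(t_loop));
fprintf('Median vectorized time: %.4f s\n', median(t_vec));
```

The median is less sensitive than the mean to an occasional slow run, for example the first pass through a loop. MATLAB also provides a timeit function that repeats a measurement automatically for code wrapped in a function handle.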
Example 2
Consider the case where we have to generate two vectors of 1000000 random numbers and calculate their dot product.
clc
clear variables
n = 1000000; % vector length (named n so the built-in length function is not shadowed)
%% Code 1 - Generate 2 sets of 1000000 random numbers from a normal distribution
% and calculate their dot product (using a loop)
tic
arr1_1 = zeros(1,n);
arr2_1 = zeros(1,n);
dotproduct_1 = 0;
for i = 1:n
    arr1_1(i) = randn;
    arr2_1(i) = randn;
    dotproduct_1 = dotproduct_1 + arr1_1(i)*arr2_1(i);   % running sum of products
end
time1 = toc
%% Code 2 - Generate 2 sets of 1000000 random numbers from a normal distribution
% and calculate their dot product using vectorization
tic
arr1_2 = randn(1,n);
arr2_2 = randn(1,n);
dotproduct_2 = arr1_2*arr2_2';   % row vector times column vector
time2 = toc
time1/time2
- Code 1 uses a loop to generate elements of the two vectors and calculate their dot product.
- Code 2 uses vectorization (matrix multiplication) to multiply a row vector of random numbers by a column vector (the transpose of the second row vector).
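To check that the loop and the matrix multiplication really compute the same quantity, the sketch below evaluates the dot product of two short vectors in four ways. The vectors a and b are hypothetical values small enough to verify by hand. The sum(a.*b) and dot(a,b) forms are alternatives that the code above does not use.

```matlab
% Small hypothetical row vectors so the result can be checked by hand
a = [1 2 3];
b = [4 5 6];

d_loop = 0;
for k = 1:numel(a)
    d_loop = d_loop + a(k)*b(k);   % accumulate the products one at a time
end

d_matmul = a*b';         % row vector times column vector (1x3 times 3x1)
d_sum = sum(a.*b);       % element-wise product, then sum
d_builtin = dot(a,b);    % built-in dot product function

disp([d_loop d_matmul d_sum d_builtin])   % all four values are 32
```

For real-valued vectors all four forms agree. Note that the ' operator is the complex conjugate transpose, so for complex data a*b' conjugates the second vector.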
Let's run the code to compare the execution times of the two versions.
time1 =
0.1322
time2 =
0.0524
ans =
2.5257
Code 2 is 2.5 times faster than Code 1.
We have seen two examples of vectorized MATLAB code and its benefits. If you do a lot of MATLAB programming, make it a habit to write vectorized code. It results in faster, more compact and less error-prone code.