Quantcast

Parallel Computing Toolbox

Introduction to PARFOR Reduction Variables - Monte Carlo Method for Computing Pi

We demonstrate three concepts in this demo:

  1. Converting a for-loop to a parallel for-loop (parfor),
  2. Using parfor reduction variables,
  3. Executing code in parallel using a set of MATLAB® workers called a MATLAB Pool.

To demonstrate these concepts we use the example of computing $\pi$ using the Monte Carlo method.

Contents

Computing Pi Using a Monte Carlo Method

Given an area A of a square, and an area C of a circle inscribed in the square, we can compute $\pi$ from A and C via $\pi = 4 \times \frac{C}{A}$. The Monte Carlo method for computing $\pi$ is based on this relationship and assumes that when we throw a sufficiently large number of darts into the square, and count the number of darts that fall within a circle inscribed in this square; the ratio of darts that fall inside of the circle to the total number of darts thrown at the square approximates the ratio of C over A or $\frac{\pi}{4}$. Thus the value of $\pi$ is computed as the ratio of darts which fall inside of the circle to the total number of darts thrown at the square times 4.

In this demo, for purposes of simplicity and symmetry, we focus only on the first quadrant of a unit circle inscribed in a square with side equal to 2. We assume that the X and Y coordinates of the darts are uniformly distributed between 0 and 1.

The figure above is a visual demonstration of how the Monte Carlo method works. To generate this figure, we throw 50 darts into the upper right quadrant of a square with side equal to 2. The o marks represent the darts which fell inside of the circle of radius 1, and the x marks represent the darts that fell outside of the circle. With 50 darts thrown we only get a crude approximation of $\pi$. To approximate $\pi$ more accurately we need to throw more darts. Note that this method for approximating $\pi$ is not particularly effective. However, it serves to demonstrate how to convert for loop iterations into parfor and how to use parallel reduction variables.

function paralleldemo_parfor_pi()

Computing Pi Using Sequential-Loop Monte Carlo Method

In this section, we present an implementation of a sequential-loop Monte Carlo method for estimating $\pi$. The method is implemented as a nested function SequentialMonteCarloPI. First, we set the number of darts that is thrown at the square and initialize the radius R to 1.

darts = 1e8;
R = 1;
tic

Call the sequential Monte Carlo method for computing $\pi$.

myPI = sequentialMonteCarloPI(darts, R);

Measure the sequential elapsed time and display the results.

seqElapsedTime = toc;
fprintf('The computed value of pi is %8.7f.\n',myPI);
The computed value of pi is 3.1419218.

Implementing the Sequential Monte Carlo Function

This function uses a for-loop in place of a more optimized "vectorized" implementation to serve as a demonstration on how to use parfor and parfor reduction variables.

    function myPI = sequentialMonteCarloPI(darts, R)
        count = 0; % Number of darts that fell inside of the circle.
        for i = 1:darts
            % Compute the X and Y coordinates of where the dart hit the
            % square using uniform distribution.
            x = R*rand(1);
            y = R*rand(1);
            if x^2 + y^2 <= R^2
                % Increment the count of darts that fell inside of the
                % circle.
                count = count + 1;
            end
        end
        % Compute PI using the ratio of darts that fell inside of
        % the circle to the total number of darts thrown.
        myPI = 4*count/darts;
    end

Transitioning to Using Parallel For-Loops (PARFOR)

In this section, we modify our sequential implementation of the Monte Carlo method for computing $\pi$ to use parallel for-loops (parfor), and reduction variables.

We start the process by analyzing the sequential-loop Monte Carlo method. The increment operations on the count variable appear to introduce data dependencies between iterations of the for-loop. However, the addition operation is associative, so the order of operations does not matter. Using this property, parfor can compute local copies of the count variable in parallel on each worker in the MATLAB pool, and then accumulate the partial results together at the end of the parfor loop. This accumulation of a result is called a reduction operation, and the count variable is called a reduction variable. To indicate that the count variable is a reduction variable, it is declared before the parfor code block, and is used after the parfor code block.

A MATLAB pool is a collection of MATLAB workers that can be used to offload work from the MATLAB client interactively and in parallel. To take full advantage of parfor to speed up computations a MATLAB pool has to be open. If the MATLAB pool is not open the parfor executes sequentially in the client MATLAB session.

Refer to the parfor documentation for more details. See the Advanced Topics section of the parfor documentation for more information on the classification of variables and reduction variables.

Computing Pi Using Parallel-Loop Monte Carlo Method

First, we verify that the MATLAB pool is open.

poolSize = matlabpool('size');
if poolSize == 0
    error('parallel:demo:poolClosed', ...
        'This demo needs an open MATLAB pool to run.');
end
fprintf('This demo is running on %d MATLABPOOL workers.\n', ...
    matlabpool('size'));
This demo is running on 8 MATLABPOOL workers.

We scale the number of darts by the size of the MATLAB pool to demonstrate the parfor weak scaling capability.

parforDarts = darts * poolSize;
R = 1;
tic

Call the parallel Monte Carlo method for computing $\pi$.

myPI = parallelMonteCarloPI(parforDarts,R);

Measure the parallel elapsed time and display the results.

parElapsedTime = toc;
fprintf('The computed value of pi is %8.7f.\n',myPI);
The computed value of pi is 3.1417299.

Implementing the Parallel Monte Carlo Function

    function myPI = parallelMonteCarloPI(darts, R)

Each iteration is independent, and effectively only involves the use of a reduction variable count. Therefore we only have to change the for to parfor to execute loop iterations in parallel.

        count = 0;
        parfor i = 1:darts
            % Compute the X and Y coordinates of where the dart hit the
            % square using Uniform distribution.
            x = R*rand(1);
            y = R*rand(1);
            if x^2 + y^2 <= R^2
                % Increment the count of darts that fell inside of the
                % circle.
                count = count + 1; % Count is a reduction variable.
            end
        end

parfor automatically accumulates all the partial results of the reduction variable count.

        % Compute pi
        myPI = 4*count/darts;
    end

Comparing the Sequential and Parallel Loop Results

We compare the execution times of the sequential and parallel loop Monte Carlo method implementations. We observe that even though the execution times are comparable for both implementations, the parfor implementation executes a poolSize times larger number of loop iterations, and computes a more accurate value for $\pi$. In addition, the code change required to accomplish this significant increase in performance involves just changing the for keyword to parfor in the main loop of the program.

fprintf('The sequential-loop Monte Carlo method executed in %8.2f',...
    seqElapsedTime);
fprintf(' seconds\n and computed %4.2e darts per second.\n',...
    darts/seqElapsedTime);
fprintf('The parallel-loop Monte Carlo method executed in %8.2f',...
    parElapsedTime);
fprintf(' seconds\n and computed %4.2e darts per second.\n',...
    parforDarts/parElapsedTime);
The sequential-loop Monte Carlo method executed in    12.38 seconds
 and computed 8.07e+06 darts per second.
The parallel-loop Monte Carlo method executed in    13.29 seconds
 and computed 6.02e+07 darts per second.
end