Slow Execution of Parfor Loops due to Communication Overhead: Load static data into worker workspace memory?
For my research, I require near-real-time execution of a large number (>1000) of matrix-vector multiplications of the form A*x, where A is a medium-scale matrix (e.g. 150x150). These matrices are constructed in an extremely expensive operation (it takes hours to complete) and saved in a static data structure (MatSet in the example below). This static data structure is used by all workers and is not modified after creation.
When I run my code, which is equivalent to the code below, I find that the PARFOR loop is more than 10 times slower than the FOR loop in MATLAB R2010b. This is caused by a constant transfer of data (MatSet in this case) between workers. In my case, however, this data transfer is completely unnecessary, as MatSet is a read-only dataset!
My question is whether there is some way of loading a STATIC dataset into the workspace of the workers so as to prevent unnecessary communication overhead between workers? Is it possible to do this without having to load data from disk?
Here is the demo code:
matlabpool(2); % start a pool of 2 workers
Msize = 150;
Nloop = 1000;
c1 = zeros(Msize, Nloop);
c2 = zeros(Msize, Nloop);

% parallel initialization loop
MatSet = cell(Nloop, 1);
parfor i = 1:Nloop
    MatSet{i} = rand(Msize); % simulates expensive code operation
end

% real-time parallel loop (SLOW!)
tic;
parfor i = 1:Nloop
    c1(:,i) = MatSet{i} * rand(Msize, 1);
end
time1 = toc;

% real-time serial loop (for comparison)
tic;
for i = 1:Nloop
    c2(:,i) = MatSet{i} * rand(Msize, 1);
end
time2 = toc;

fprintf('Parallel time: %2.4f ms, Serial time: %2.4f ms\n', 1000*time1, 1000*time2);
matlabpool close;
Any comments are appreciated,
Coen
Accepted Answer
Edric Ellis
on 17 Nov 2011
You might be able to take advantage of my Worker Object Wrapper, which is designed to help set up this sort of static data for use on workers.
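For readers on newer releases: from R2015b onwards, Parallel Computing Toolbox ships parallel.pool.Constant, which serves a similar purpose to the wrapper. The following is only a minimal sketch under that assumption (it will not run in R2010b). The fixed input matrix X is something I have added so the result is reproducible; it is not part of the original demo. Whether this actually beats the serial loop depends on the workload.

```matlab
% Sketch only: assumes R2015b or later with Parallel Computing Toolbox.
Msize = 150;
Nloop = 1000;

% Stand-in for the expensive construction step from the question.
MatSet = cell(Nloop, 1);
for i = 1:Nloop
    MatSet{i} = rand(Msize);
end

% Fixed inputs (an addition for this sketch, so results can be checked).
X = rand(Msize, Nloop);

% Copy MatSet to each worker once; later parfor loops reuse that copy.
C = parallel.pool.Constant(MatSet);

c1 = zeros(Msize, Nloop);
parfor i = 1:Nloop
    M = C.Value;                 % worker-local copy of MatSet
    c1(:, i) = M{i} * X(:, i);
end
```

Note that the Constant holds a full copy of MatSet on every worker, so memory use grows with the pool size.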
2 Comments
Edric Ellis
on 21 Nov 2011
It's hard to say exactly why it's still slower. Your syntax is fine. Some points to note:
1. generateMatrix is evaluated once per worker.
2. The result "w.Value" is stored separately on each worker.
Either of those two factors could be important. Also, it's worth bearing in mind that some PARFOR loops see no speedup at all, because of the overhead of entering a PARFOR loop and the data transfer involved.
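To put rough numbers on the data transfer point, here is a back-of-the-envelope estimate using the sizes from the demo code. It assumes double-precision matrices and counts roughly two floating-point operations per matrix element for each A*x; it is only an order-of-magnitude sketch, not a measurement.

```matlab
% Rough data-size and work estimate for the demo (sizes from the question).
Msize = 150;
Nloop = 1000;

bytesPerMatrix = Msize^2 * 8;             % doubles take 8 bytes per element
totalBytes     = Nloop * bytesPerMatrix;  % size of the whole MatSet payload

flopsPerMatVec = 2 * Msize^2;             % about one multiply and one add per element
totalFlops     = Nloop * flopsPerMatVec;

fprintf('Data: %.0f MB, work: %.0f Mflop\n', totalBytes/1e6, totalFlops/1e6);
```

That comes to about 180 MB of matrix data against about 45 Mflop of arithmetic, so the loop does very little computation per byte of data it touches.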