Sunday, September 27, 2009

On OPL

The paper on OPL offers a great survey of patterns for describing the design of parallel software; it is concerned with software architecture and with ways to design and implement parallel algorithms. Its target audience is the application programmer, not the developers of compilers, operating systems, or parallel libraries. OPL is also not tied to any particular application domain.

OPL is structured as a layered system that defines five categories of patterns: architectural patterns, which describe the overall organization of a parallel system and how its computing elements interact; computational patterns, which describe the core classes of computations that make up the application; parallel algorithm strategy patterns, which cover the methods for exploiting concurrency in a parallel application; implementation strategy patterns, which cover parallel program organization and the common data structures specific to parallel programming; and concurrent execution patterns.

Although I was familiar with many of the patterns, I also found some that I did not know well. I therefore appreciate the authors' initiative in defining the OPL layers and listing the patterns, and I look forward to their next steps, where they promise to follow up with pattern descriptions and careful review.

Among the implementation strategy patterns, Master-worker/Task-queue is an interesting one. Structurally, the pattern consists of a master that maintains a task queue and controls a group of processing elements, or workers. Usually there is one master and several identical workers, all active at the same time during execution.

In this pattern, the same operation is applied simultaneously to different pieces of data. The operations in each worker are independent of the operations in the other workers. The solution is built around a central master that distributes data to the workers on request. Parallelism comes from processing multiple pieces of data at the same time.
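The structure described above can be sketched in a few lines of Python. This is only an illustration under my own assumptions, not code from the OPL paper: the function name `run_master_worker`, its parameters, and the use of threads with a shared `queue.Queue` are all choices I made for the example.

```python
import queue
import threading


def run_master_worker(data_pieces, operation, num_workers=4):
    """Apply `operation` to every data piece using a shared task queue.

    Hypothetical helper for illustration; names and defaults are assumptions.
    """
    tasks = queue.Queue()
    results = {}
    lock = threading.Lock()

    # Master: fill the task queue with (index, piece) pairs.
    for index, piece in enumerate(data_pieces):
        tasks.put((index, piece))

    def worker():
        # Each worker requests the next task until the queue is empty.
        while True:
            try:
                index, piece = tasks.get_nowait()
            except queue.Empty:
                return
            value = operation(piece)  # independent of all other workers
            with lock:
                results[index] = value

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    # Master: collect the results in the original order.
    return [results[i] for i in range(len(data_pieces))]


squares = run_master_worker([1, 2, 3, 4, 5], lambda x: x * x, num_workers=3)
print(squares)
```

Threads keep the sketch short and show the shape of the pattern. In standard CPython they will not speed up CPU-bound work, so a real implementation would more likely use processes or separate machines as the workers.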

The tasks, or the data pieces, may have different sizes. This means that each independent computation should adapt to the size of the data it processes, so that load balancing happens automatically. Coordinating the independent computations must also take only a limited amount of time, so that it does not slow down the processing elements. The solution has to scale with the number of workers: a change in the number of workers should be reflected in the execution time, and performance improves when the execution time decreases.
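To make the scaling requirement concrete, here is a small simulation, again only a sketch under my own assumptions. It supposes that each idle worker pulls the next task from the queue, that coordination costs nothing, and that task costs are known numbers in arbitrary time units. The function `simulated_makespan` and the list `costs` are hypothetical.

```python
import heapq


def simulated_makespan(task_costs, num_workers):
    """Time at which the last worker finishes when each idle worker
    pulls the next task from the queue (costs in arbitrary time units)."""
    # Min-heap of the times at which each worker becomes free.
    free_at = [0] * num_workers
    heapq.heapify(free_at)
    for cost in task_costs:
        start = heapq.heappop(free_at)  # earliest idle worker takes the task
        heapq.heappush(free_at, start + cost)
    return max(free_at)


costs = [5, 1, 1, 1, 4, 2, 2]  # hypothetical, uneven task sizes
for n in (1, 2, 4):
    print(n, simulated_makespan(costs, n))
```

With these made-up costs the simulated execution time drops from 16 with one worker to 9 with two and 5 with four. Four workers cannot go below 5 because the largest single task costs 5, which is why uneven task sizes limit how far the pattern scales.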
