Thursday, November 5, 2009

On PipelineProcessing, GeometricDecomposition, DataParallelism

Instruction pipeline in modern CPUs was one of the main topics I studied in the Computer Architecture class and therefore I could assume some familiarity with this pattern. Vector processing (loop level pipelining) was also a subject covered in same class. In loop level pipelining, specialized hardware in super computers allows operations on vectors to be performed in a pipelined fashion. Pipeline processing pattern is very similar to the Pipes and Filter patter, with the key difference being that this pattern explicitly discusses concurrency. It is also similar with Discrete Event pattern in the way that both apply to problems where the computation is naturally decomposed into a collection of semi independent tasks. However, the Discrete Event pattern is irregular and asynchronous where the Pipeline pattern is regular and synchronous: In the Pipeline pattern, the semi-independent tasks represent the stages of the pipeline, the structure of the pipeline is static, and the interaction between successive stages is regular and loosely synchronous. In the Discrete Event pattern, however, the tasks can interact in very irregular and asynchronous ways, and there is no requirement for a static structure.

The Geometric decomposition pattern was a first time read for me, but the explanation was clear and very intuitive. The authors discuss four key areas of geometric decomposition: data decomposition, data exchange, update operation, data distribution and task schedule. The decomposition of data into chunks implies a decomposition of the update operations into tasks. Each task represents the update of one chunk, and tasks execute concurrently. If each task can be executed with information entirely local to its chunk, then the concurrency is embarrassingly parallel and the Task Parallelism pattern should be used. However, when this is not the case Geometric decomposition should be used. A key factor in Geometric decomposition is ensuring that nonlocal data required for the update operation is obtained before it is needed. This is specified by the Data exchange. We must also ensure the data required for each chunk update is present when needed. This is analogous to managing data dependencies in the Task Parallelism pattern. The overall program structure for applications of this pattern will normally use either the Loop Parallelism pattern or the SPMD pattern, with the choice determined largely by the target platform.

Data parallelism is very generic. The operations are often uniformly applied to the data elements. This is when in a way data parallelism relates to the composite pattern. This is mainly because the pattern allows the programmer to represent a complex hierarchical structure so that individual objects and compositions of those objects are handled in a uniform way. In parallel computing domain decomposition is one way of utilizing the power of a parallel architecture. For data parallelism and domain decomposition into sub-domains, the composite pattern may provide a useful paradigm.

No comments:

Post a Comment