Wednesday, October 21, 2009

On Armstrong Thesis Ch 4

A side effect is any computation or operation performed by a procedure that is not the primary purpose of that procedure. Sharing data between concurrent processes clearly favors side effects, because each process's view of a shared resource can easily be compromised by another process acting on the same resource. The problem is not simply that preconditions and invariants can be broken, since this can happen in sequential code as well; it is that, in concurrent code, they can be broken in the middle of performing a function, after the code has already verified them. Consequently, a concurrent process can easily be affected by the side effects of other processes. Sequential code, by contrast, is guaranteed to run the function to the end undisturbed once it has checked its preconditions and invariants. Of course, sequential code is also concerned with the manipulation of hardware, which involves side effects too. In the sequential subset of Erlang there are a few operations with side effects, but they are virtually never needed.
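To make the "broken in the middle of a function" point concrete, here is a minimal sketch in Python rather than Erlang. The names (Account, withdraw_unsafe, withdraw_locked, run_interleaved) are hypothetical and not taken from the thesis, and the interleaving is forced with an event so the outcome does not depend on scheduler timing. Both withdrawals check the precondition "balance >= amount" before acting; in the unprotected version the second withdrawal slips in between the check and the update.

```python
import threading


class Account:
    """Hypothetical shared resource; the invariant is balance >= 0."""

    def __init__(self, balance):
        self.balance = balance
        self.lock = threading.Lock()


def withdraw_unsafe(account, amount, pause=None):
    """Check-then-act with no protection around the shared balance."""
    if account.balance >= amount:      # precondition verified here
        if pause is not None:
            pause()                    # another thread may run at this point
        account.balance -= amount      # the precondition may no longer hold
        return True
    return False


def withdraw_locked(account, amount, pause=None):
    """Same logic, but the check and the update happen under one lock."""
    with account.lock:
        if account.balance >= amount:
            if pause is not None:
                pause()
            account.balance -= amount
            return True
        return False


def run_interleaved(withdraw):
    """Force a second withdrawal to start right after the first one's check."""
    account = Account(100)
    first_checked = threading.Event()

    def second_withdrawal():
        first_checked.wait()           # start only after the first check passed
        withdraw(account, 80)

    second = threading.Thread(target=second_withdrawal)

    def pause():
        first_checked.set()
        second.join(timeout=0.2)       # give the second withdrawal time to run

    second.start()
    withdraw(account, 80, pause)
    second.join()
    return account.balance


if __name__ == "__main__":
    print("unsafe:", run_interleaved(withdraw_unsafe))
    print("locked:", run_interleaved(withdraw_locked))
```

In the unprotected run both checks pass against a balance of 100, so the invariant is violated even though each withdrawal verified its precondition. The lock is only the usual shared-state remedy; Erlang sidesteps the issue by not sharing data between processes in the first place.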

Making processes and concurrency part of the Erlang programming language, as opposed to relying on the host operating system, has a number of advantages. Concurrent programs run identically on different operating systems: all issues of synchronization and interaction between processes are the same irrespective of the properties of the host. Also, the separation from the OS allows Erlang processes to be implemented as lightweight processes, without the overhead associated with OS-level context switching, which improves performance. Besides performance, needing very little from an OS offers a lot of flexibility: an Erlang system can easily be ported to specialized environments such as embedded systems.

I mainly agree with Armstrong's statement that “abstracting out concurrency is one of the most powerful means available for structuring large software systems”, an idea that clearly supports the suggested dirty/clean paradigm for structuring code in highly reliable systems. However, I wonder whether Armstrong’s idea should be generalized to all types of software systems or applied only to systems that aim for high reliability.

I recently read a very interesting article in which the authors tried to foresee the future of programming languages. “How high level can we get?” was the question they posed. Current trends favor high-level abstractions in everything: higher-level languages and tools, from the compiler to the programming environment. However, it is not really clear whether languages should take Erlang’s path and change fundamentally to handle distributed systems and concurrency. In fact, the authors even give a good example of a system architecture where concurrency should not be abstracted away altogether: distributed systems. In these systems, local resources (memory, objects, etc.) are generally fast to access, whereas accessing remote resources may be hundreds of times slower or, even worse, the resources may not be available at all. These differences should not be abstracted away and hidden from the programmer. An abstraction that allows us to handle different resources using the same techniques is fine; one that prevents us from knowing how our resources are allocated or created is not.
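The distinction in the last sentence can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the class and function names (LocalResource, RemoteResource, ResourceUnavailable, read_or_default) are hypothetical and do not come from the article. Both kinds of resource are read through the same method, but the remote one still tells the caller that it is remote and may fail.

```python
class ResourceUnavailable(Exception):
    """Raised when a remote resource cannot be reached."""


class LocalResource:
    """Hypothetical in-memory resource: fast, always available."""

    is_remote = False

    def __init__(self, value):
        self.value = value

    def read(self):
        return self.value


class RemoteResource:
    """Hypothetical remote resource: same read() call, but it may fail."""

    is_remote = True

    def __init__(self, value, reachable=True):
        self.value = value
        self.reachable = reachable     # stands in for a real network condition

    def read(self):
        if not self.reachable:
            raise ResourceUnavailable("remote resource is not reachable")
        return self.value


def read_or_default(resource, default):
    """Uniform access that still lets the caller handle unavailability."""
    try:
        return resource.read()
    except ResourceUnavailable:
        return default


if __name__ == "__main__":
    print(read_or_default(LocalResource("cached"), "fallback"))
    print(read_or_default(RemoteResource("fetched", reachable=False), "fallback"))
```

The same technique handles both resources, yet the is_remote flag and the explicit exception keep the difference visible to the programmer instead of hiding it.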

Probably a reasonable idea would be for a language to stay away from concurrency and other such restrictive abstractions, or to support them only through add-on libraries that facilitate what is, after all, a new programming paradigm. Obviously Erlang took the opposite path and built concurrency support into the core language. Could this be among the reasons Erlang is not mainstream?
