Monday, January 14, 2008

Teraflop processors is not far, Are we ready to use it?

The prototype 80-core Polaris processor on a single chip delivered the super computer like performance of a trillion floating-point operations per second (one teraflop) while consuming less than 62 watts – Intel.

It will take less than 10 years from now for a common man to have PC running on a teraflop processor. A prototype of teraflop processor has 80 cores which can be executed in parallel. So to get the most out of 80 cores we need to run 80 threads in parallel. Hence it’s clear that the program must be heavily threaded to make use of all cores.

The question that comes is, “It can deliver up to a Teraflop, but how we are going to get most out of it?”

These are the days when programmers are trying hard to get most out of a quad core or a dual core processor. These processors can give a lot but it is up to the programmer to make use of it.

Threading for parallelism

Most of the programmers used thread only for separating User Interface from the time consuming operations that happens according to the user operation. But those days are gone. Now the thread is not just to do things without blocking the other one. It is all about performance. Threading for performance is the key now.

The hard

Hands full of tools are available which helps to analyze, debug and optimize threads. It’s not hard to detect a synchronization problem or a thread over run. It’s easy these days to debug a chunk of code in different threads. But why all algorithms are not yet threaded? What is the big deal in it? It is discovering parallelism!!! Yea, the hardest ever thing in optimization is finding a parallel way to optimize the most time taking part of the algorithm. Mostly every time if we look the code of an algorithm the most time taking part will be entirely sequential. It will look like something which can never be parallelized. That is where it gets quite tricky. More and more innovation can only do something to get things parallelized. It’s not about parallelizing the code of the algorithm; it’s all about changing the algorithm in a parallel way!

No comments: