Friday, January 18, 2008

Limiting the possible data types given for a template class during instantiation

How can we limit the data types that can be used to instantiate a template class?

In C++ there is no standard mechanism to limit the data types with which a template can be instantiated. For example,

template <class T>
class ATemplate
{
};

The above class (ATemplate) can be instantiated with any data type, so ‘T’ can be any type (e.g. basic data types, structures and classes). But what if you want to accept only selected data types? For example, a template which only supports char, int and double. C++ doesn’t have a standard method to do this. Microsoft has, however, put some effort into .NET generics (something like a C++ template) to limit the data types that can be supported. Even though C++ doesn’t support it directly, it is possible to limit the data types in C++ with a small trick. Let us see an example.

#include <iostream>

template <class T>
class MyTemplate
{
    T m_x;

    // Private overloads: one for each type we want to allow.
    void AllowThisType( int& obj ){}
    void AllowThisType( char& obj ){}
    void AllowThisType( double& obj ){}

public:
    MyTemplate()
    {
        // If T is not int, char or double, no AllowThisType overload
        // matches and the compilation fails right here.
        T tmp;
        AllowThisType( tmp );
    }

    void SetX( T val )
    {
        m_x = val;
    }

    void print()
    {
        std::cout << m_x << std::endl;
    }
};

int main()
{
    MyTemplate<int> objint;
    MyTemplate<char> objchar;
    MyTemplate<double> objdouble;
    MyTemplate<float> objfloat; // << Error when you create this
}

How does the above program limit the data types that can be used to create a template class object?
The constructor of class MyTemplate calls an overloaded function, AllowThisType. This function has three different overloads, taking an integer reference, a character reference and a double reference. So when you make an object with int, char or double, there is an appropriate AllowThisType overload that the compiler can find and match. But when you create an object with any other data type, the compiler cannot find a matching AllowThisType overload and the compilation fails. Let us see an example.

MyTemplate<float> objfloat;

When you make an object of MyTemplate with T as float, the compiler will look for AllowThisType( float& ). Since we did not write an overload like AllowThisType( float& ), the compilation fails. So it is clear that the data types which can be given are limited here. Whenever an additional type needs to be supported, a new AllowThisType overload must be added for the newly supported data type.
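For instance, here is a sketch of how MyTemplate could be extended to also accept float; the only change needed is one more private overload:

template <class T>
class MyTemplate
{
    T m_x;

    void AllowThisType( int& obj ){}
    void AllowThisType( char& obj ){}
    void AllowThisType( double& obj ){}
    void AllowThisType( float& obj ){} // newly added overload

public:
    MyTemplate()
    {
        T tmp;
        AllowThisType( tmp ); // now also matches when T is float
    }
    // ... SetX and print as before ...
};

With this overload in place, MyTemplate<float> objfloat; compiles without error.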

Monday, January 14, 2008

Teraflop processors are not far away. Are we ready to use them?

The prototype 80-core Polaris processor delivered, on a single chip, the supercomputer-like performance of a trillion floating-point operations per second (one teraflop) while consuming less than 62 watts – Intel.

It will take less than 10 years from now for a common man to have a PC running on a teraflop processor. The prototype teraflop processor has 80 cores which can execute in parallel. So to get the most out of the 80 cores we need to run 80 threads in parallel. Hence it’s clear that a program must be heavily threaded to make use of all the cores.
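As a minimal sketch of what “one thread per core” could look like (using C++11’s std::thread, which postdates this post; the Worker function is a hypothetical placeholder for a share of the real work):

#include <thread>
#include <vector>

// Hypothetical per-core worker; real code would process the id-th share of the job.
void Worker( unsigned id )
{
    // ... do this core's portion of the work ...
}

int main()
{
    // Ask the runtime how many hardware threads are available (80 on the Polaris prototype).
    unsigned cores = std::thread::hardware_concurrency();

    std::vector<std::thread> threads;
    for ( unsigned i = 0; i < cores; ++i )
        threads.emplace_back( Worker, i );

    for ( std::thread& t : threads )
        t.join();
}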

The question that comes up is, “It can deliver up to a teraflop, but how are we going to get the most out of it?”

These are the days when programmers are trying hard to get the most out of a quad-core or a dual-core processor. These processors can give a lot, but it is up to the programmer to make use of it.

Threading for parallelism

Most programmers used threads only to separate the user interface from the time-consuming operations triggered by user actions. But those days are gone. Now a thread is not just a way to do something without blocking something else. It is all about performance. Threading for performance is the key now.

The hard part

Handfuls of tools are available to help analyze, debug and optimize threads. It’s not hard to detect a synchronization problem or a thread overrun. It’s easy these days to debug a chunk of code running in different threads. But why are all algorithms not yet threaded? What is the big deal? It is discovering parallelism! Yes, the hardest thing in optimization is finding a parallel way to speed up the most time-consuming part of the algorithm. Almost every time we look at the code of an algorithm, the most time-consuming part is entirely sequential. It looks like something which can never be parallelized. That is where it gets quite tricky. Only real innovation can get such things parallelized. It’s not about parallelizing the code of the algorithm; it’s all about redesigning the algorithm in a parallel way!
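As a concrete illustration of “redesigning the algorithm in a parallel way” (my own sketch, again assuming C++11 threads): a straight loop summing an array looks purely sequential, but the same result can be computed as independent partial sums, one per thread, combined at the end:

#include <cstddef>
#include <functional>
#include <iostream>
#include <numeric>
#include <thread>
#include <vector>

// Each thread sums only its own chunk, so no synchronization is needed.
void PartialSum( const std::vector<double>& data,
                 std::size_t begin, std::size_t end, double& result )
{
    result = std::accumulate( data.begin() + begin, data.begin() + end, 0.0 );
}

int main()
{
    std::vector<double> data( 1000000, 1.0 );

    const std::size_t numThreads = 4;
    const std::size_t chunk = data.size() / numThreads;

    std::vector<double> partial( numThreads, 0.0 );
    std::vector<std::thread> threads;

    for ( std::size_t i = 0; i < numThreads; ++i )
    {
        std::size_t begin = i * chunk;
        std::size_t end = ( i == numThreads - 1 ) ? data.size() : begin + chunk;
        threads.emplace_back( PartialSum, std::cref( data ),
                              begin, end, std::ref( partial[i] ) );
    }
    for ( std::thread& t : threads )
        t.join();

    // Combine the independent partial results into the final answer.
    double total = std::accumulate( partial.begin(), partial.end(), 0.0 );
    std::cout << total << std::endl;
}

The algorithm itself was changed (partial sums instead of one running total); that restructuring, not the thread plumbing, is the hard part.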