Monday, 26 October 2009

Parallel Extensions in .NET Part 2

In the last post I covered the Coordination Data Structures in .NET 4.0, so now I will look at the Task Parallel Library and Parallel LINQ (PLINQ).

As I mentioned in the last post, Mike Taulty explained that the thinking behind the Parallel Extensions is to let you describe work that can run across multiple processors without having to manage threads directly. The Task Parallel Library contains the objects that allow you to achieve this.

At the heart of this is the Task object, which represents a unit of work to be carried out. You can create and start a Task in one step through the factory (as shown below), or construct a Task object yourself and call Start() on it.


Task myTask = Task.Factory.StartNew(() =>
{
    //code to be carried out
});


myTask.Wait(); //wait for task to complete

Whichever method you use to construct the object, it is passed a delegate (typically a lambda expression) which contains the work to be carried out. Tasks are independent units of work that can be scheduled onto any of the processors in the PC. You only need to call Wait() if you have to block until the Task has completed.
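For comparison with the factory version, here is a rough sketch of the constructor version. The class and method names (TaskBasics, SumOnTask) are my own and purely illustrative. It also uses Task<int>, the variant of Task that hands back a result.

```csharp
using System;
using System.Threading.Tasks;

public static class TaskBasics
{
    // Hypothetical helper: adds up 1..n on a Task and returns the total.
    public static int SumOnTask(int n)
    {
        // Constructor version: the Task is created but not yet running.
        Task<int> sumTask = new Task<int>(() =>
        {
            int total = 0;
            for (int i = 1; i <= n; i++)
            {
                total += i;
            }
            return total;
        });

        sumTask.Start(); // schedule the work

        // Reading Result blocks until the Task has finished,
        // so no separate Wait() call is needed here.
        return sumTask.Result;
    }

    public static void Main()
    {
        Console.WriteLine(SumOnTask(10)); // prints 55
    }
}
```

If you do not need the return value, a plain Task with Wait() behaves the same way.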

An interesting feature of Tasks is that a Task can start off other Tasks from within itself.

Task parentTask = Task.Factory.StartNew(() =>
{
    Task childTask = Task.Factory.StartNew(() =>
    {
       //code to be carried out
    }, TaskCreationOptions.AttachedToParent); //link the child to the parent
    //other code to be carried out
});

These child Tasks can be linked to the parent Task, so that if you call Wait() on the parent then this will also wait for any attached child Tasks to finish. Note that in the released version of .NET 4.0 a nested Task is detached by default, so the link has to be requested by passing TaskCreationOptions.AttachedToParent when the child is created, as above.
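To see the parent/child link in action, here is a small sketch. It assumes the released .NET 4.0 behaviour, where a child only counts as attached if it is created with TaskCreationOptions.AttachedToParent. The names AttachedChildDemo and ParentWaitsForChild are made up for illustration.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class AttachedChildDemo
{
    // Returns true if the child had finished by the time parent.Wait() returned.
    public static bool ParentWaitsForChild()
    {
        bool childFinished = false;

        Task parent = Task.Factory.StartNew(() =>
        {
            Task.Factory.StartNew(() =>
            {
                Thread.Sleep(200); // simulate some slow work
                childFinished = true;
            }, TaskCreationOptions.AttachedToParent);
            // The parent's own code ends here, well before the child is done.
        });

        // Because the child is attached, this also waits for the child.
        parent.Wait();
        return childFinished;
    }

    public static void Main()
    {
        Console.WriteLine(ParentWaitsForChild()); // prints True
    }
}
```

Without the AttachedToParent option, Wait() could return while the child is still sleeping.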

Another intriguing feature of these Tasks is that when an exception occurs within the main Task or one of its attached children, the exception you see when you call Wait() is an AggregateException. This exception has a property called InnerExceptions that contains the exceptions that were thrown. So, if three child tasks are spawned and all of them throw an exception, you can view all of these through this AggregateException. Nice.
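Here is a hedged sketch of the three-failing-children case. The class, method and helper names (AggregateDemo, CountChildFailures, Fail) are hypothetical, and the children are attached explicitly so that their exceptions travel up to the parent. The children's exceptions arrive nested inside further AggregateExceptions, so the sketch calls Flatten() to get one flat list.

```csharp
using System;
using System.Threading.Tasks;

public static class AggregateDemo
{
    // Hypothetical helper that always fails.
    private static void Fail(int id)
    {
        throw new InvalidOperationException("child " + id + " failed");
    }

    // Spawns three failing children and returns how many exceptions were collected.
    public static int CountChildFailures()
    {
        Task parent = Task.Factory.StartNew(() =>
        {
            for (int i = 0; i < 3; i++)
            {
                int id = i; // copy the loop variable for the lambda
                Task.Factory.StartNew(() => Fail(id),
                    TaskCreationOptions.AttachedToParent);
            }
        });

        try
        {
            parent.Wait();
            return 0; // not reached: the children all throw
        }
        catch (AggregateException ex)
        {
            // Flatten() unwraps the nested AggregateExceptions into one list.
            AggregateException flat = ex.Flatten();
            foreach (Exception inner in flat.InnerExceptions)
            {
                Console.WriteLine(inner.Message);
            }
            return flat.InnerExceptions.Count;
        }
    }

    public static void Main()
    {
        Console.WriteLine(CountChildFailures()); // prints 3 after the messages
    }
}
```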

Mike also introduced the Parallel static class, which provides an alternative way of running multiple Actions in parallel. It has handy methods such as Parallel.ForEach, which runs an action once for each item in a collection and can spread those iterations across processors. I won't go into too much detail about this one, but if you look at the recently published MSDN documentation for .NET 4.0 there are plenty of examples of its use.
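As a minimal illustration of Parallel.ForEach (my own sketch, not one of the MSDN samples), the hypothetical TotalLength method below adds up the lengths of some strings. The iterations may run on different threads at once, so the shared total is updated with Interlocked.Add rather than a plain +=.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class ParallelLoopDemo
{
    // Hypothetical helper: sums the lengths of all the words.
    public static int TotalLength(IEnumerable<string> words)
    {
        int total = 0;

        Parallel.ForEach(words, word =>
        {
            // Several iterations may run at the same time,
            // so the shared counter is updated atomically.
            Interlocked.Add(ref total, word.Length);
        });

        // ForEach only returns once every iteration has completed.
        return total;
    }

    public static void Main()
    {
        string[] words = { "a", "bb", "ccc" };
        Console.WriteLine(TotalLength(words)); // prints 6
    }
}
```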

Finally, Mike introduced us to Parallel LINQ, otherwise known as PLINQ! This is a set of extension methods that can be used on LINQ queries to make them execute across multiple processors. Its purpose is to make LINQ queries on any large data sources more efficient.

PLINQ is very easy to use: all you need to do is call AsParallel() at the appropriate place in the LINQ query. For example:

var results = myData.AsParallel().Where(x => x.Name == "World").Count();

It's that easy!
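For anyone who wants to run it, here is the same idea as a complete sketch. The data is just an array of strings that I have made up, and the PlinqDemo and CountMatches names are illustrative only.

```csharp
using System;
using System.Linq;

public static class PlinqDemo
{
    // Hypothetical helper: counts how many names equal the target.
    public static int CountMatches(string[] names, string target)
    {
        return names
            .AsParallel()                 // switch the query over to PLINQ
            .Where(name => name == target)
            .Count();
    }

    public static void Main()
    {
        string[] names = { "Hello", "World", "World" };
        Console.WriteLine(CountMatches(names, "World")); // prints 2
    }
}
```

On a tiny array like this the parallel version will not be any faster; the benefit only shows up on large data sources or expensive filters.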

That concludes my whirlwind tour of the Parallel Extensions in .NET 4.0. There are plenty of other resources on MSDN and various Microsoft blogs on Parallel Extensions. Here are a few:

Introduction to PLINQ
Microsoft Parallel Computing Center
Microsoft Parallel Programming Team Blog
Mike Taulty's Blog
