Sunday 17 January 2010

I have moved.......

Finally got around to setting up a "proper" blog on my own hosting.  The main reason was to be able to utilise some of the code highlighting tools.

My blog can now be found at www.iainjmitchell.com/blog

Monday 26 October 2009

Parallel Extensions in .NET Part 2

In the last post I covered the Co-ordination Data Structures in .NET 4.00, so now I will examine the Task Parallel Library and Parallel LINQ (PLINQ).

As I mentioned in the last post, Mike Taulty explained that the thought behind the Parallel Extensions is to abstract the multi-processor jobs away from the concept of threads.  The Task Parallel Library contains objects that allow you to achieve this.

At the heart of this is the Task object which represents a unit of work to be carried out.  There is a factory version (as shown below) and a standard object version of this class.


Task myTask = Task.Factory.StartNew(() =>
{
    //code to be carried out
});


myTask.Wait(); //wait for task to complete

Which ever method you use to construct the object it is passed a predicate which contains the work to be carried out.  Tasks are atomic operations that can be carried out on any of the processors in the PC.  The Wait() command is only applicable if you need to wait for the result of the Task

An interesting feature of Tasks is that a Task can start off other Tasks within them.

Task parentTask = Task.Factory.StartNew(() =>
{
    Task childTask = Task.Factory.StartNew(() =>

    {
       //code to be carried out
    });
    //other code to be carried out

});

These child classes are automatically linked to the Parent class, so that if you call Wait() on the parent class then this will wait for any child Tasks to finish.  This default behavior can be turned off (if required) in an extra parameter passed into the constructor.

Another intriguing feature about these Tasks is that when an exception occurs within the main Task or one of it's children, the exception thrown is a AggregateException.  This exception contains a new property called InnerExceptions that contains any exceptions that are thrown by the children.  So, if three child task are spawned and all of these throw an exception you can view all of these in this AggregateException.  Nice.

Mike also introduced the Parallel static class which provides an alternative way of spawning multiple Actions.  It has handy methods such as Parallel.ForEach that allows a number of actions to be thrown a number of times.  I won't go into too much detail about this one, but if you look on the recently published MSDN for .NET 4.00 there are plenty of examples of it's use.

Finally, Mike introduced use to Parallel LINQ, otherwise known as PLINQ!  This is a command that can be used on LINQ queries to make them execute across multiple processors.  It's purpose is to make LINQ queries on any large data sources more efficient.

PLINQ is very easy to use, all you need to do is utilise the AsParallel() at the appropriate place in the LINQ query.  For example:

var results = myData.AsParallel().Where(x => x.Name == "World").Count();

It's that easy!

That concludes my whirlwind tour of the Parallel Extensions in .NET 4.00.  There are plenty of other resources on the MSDN and various microsoft blogs on Parallel Extensions.  Here are a few:

Introduction to PLINQ
Microsoft Parallel Computing Center
Microsoft Parallel Programming Team Blog
Mike Taulty's Blog

Thursday 22 October 2009

Parallel Extensions in .NET 4.00 Part 1

Last night I attended an NxtGen User Group in Manchester at which Microsoft guru Mike Taulty was giving an introduction to the new parallel programming components of the .NET framework.

Whether we like it or not we, as developers, are going to have to adapt to using techniques that will run code on separate processors. The new chips from Intel and AMD are no longer continuing to increasing in Ghz, they are just containing more processors. So, we need to be able to utilise this power.

The good news is that the .NET CLR and certain components, such as WCF and WF, already spread their workload across multiple processors. But the bad news is that our own programs do not.

So, how would you currently handle this? The answer is by using a Threads or a ThreadPool. But as Mike demonstrated this is quite long winded and makes the code quite difficult to read.

The idea behind the Parallel Extensions in .NET 4.00 is to abstract the concept of the work that can be carried out on multiple processors away from the underlying concept of threads. They have split the components up into the following three categories:
  1. Co-ordination Data Structures.
  2. Task Parallel Library
  3. Parallel LINQ (PLINQ)
Lets start with the Co-ordination Data Structures. This category contains all the replacement data structures that are required to support parallel programming. If you have ever performed any thread safe work that involves a collection, you will know that none of the collections in .NET are thread safe. The only solution is to lock the entire collection when carrying out any work. Which frankly is a bit rubbish.

The co-ordination data structures contain a selection of Concurrent Collections to get around this issue. These include ConcurrentLinkedList and ConcurrentQueue. It also includes some advanced Co-ordination helpers that include a Barrier (blocks work until a x number of threads reach) and a CountdownEvent (a concurrent countdown that can be signalled from multiple threads).

The last of the co-ordination data structures is the Data Synchronization classes. The main one of interest is the Lazy class. Great name, but what does it do? It allows you to create a thread safe Singleton class. For example, say we have a singleton class called MyClass which has a method called MyMethod(). Initialisation of the singleton can be complex as we may have several threads entering the initialisation code. The solution is to use the Lazy class to wrap the class like so:

Lazy myInstance = new Lazy<MyClass>();
myInstance.Value.MyMethod();

This ensures that ONLY one instance of this class will exist.

In the next part I will cover the Task Parallel Library.

Tuesday 20 October 2009

Creating comma separated strings with LINQ

Here is a really quick LINQ based solution to the conversion of a collection of strings into a comma separated string:

//Prepare list
List myList = new List();
myList.Add("aString");
myList.Add("anotherstring");

//Get comma separated string

var stringBuilder = myList.Aggregate(new StringBuilder(), (builder, stringValue) =>
builder.AppendFormat((builder.Length == 0) ? "{0}" : ", {0}", stringValue));
string commaString = stringBuilder.ToString();

Monday 28 September 2009

WCF Tracing with Activities

One of the nice features of Windows Communication Foundation (WCF) is that it can generate an excess of trace information that can help us identify issues with our services. All you need to do is turn the tracing on through the app.config like so....

name="System.ServiceModel"
switchValue="Information">
name="traceListener"
type="System.Diagnostics.XmlWriterTraceListener"
initializeData= "c:\log\Traces.svclog" />
Microsoft even provide a handy tool called Service Trace Viewer Tool (SvcTraceViewer.exe) that can be used to view the trace information that WCF produces.

However, if you have ever looked at one of these traces you will have noticed that it can be difficult to sieve through and find the service call(s) that you are interested in. It becomes especially difficult if the service calls other services. Wading through all this tracing information is time consuming and it can be very frustrating.

Fortunately WCF provides a mechanism called Activities that allow us to correlate related trace information.

Where these Activities start and stop are defined by the developer in code. They can also be propagated across service boundaries and can also be linked together with other related activities.

So, what do you need to do in order to use Activities in your tracing? They are quite simple to use, all you need to do is assign an activity id to the correlation manager. This is a GUID value and is assigned as follows:

Trace.CorrelationManager.ActivityId = Guid.NewGuid();

Now, all the trace messages that are fired after this will be included in this Activity. However, in order for this to appear in the trace log we need to add ActivityTracing to the switchValue of the diagnostics section of the app.config.

name="System.ServiceModel"
switchValue="Information ActivityTracing"
propagateActivity="true">
name="traceListener"
type="System.Diagnostics.XmlWriterTraceListener"
initializeData= "c:\log\Traces.svclog" />
Also, if we want the ActivityId to be passed from one service to another, we can set the propagateActivity attribute to True. This is useful if we are starting the activity in one service operations that is then calling other services operations to achieve it's goals. With propagateActivity set to true, all of the trace information from these services for this particular call would be grouped together.

Wednesday 16 September 2009

UML Aggregation vs Composition

A colleague asked me a few weeks ago for some advice about their UML class diagram. One of the issues they were having was deciding on whether the link between two of the classes should have a "filled-in" or "non-filled-in" diamond.

You have probably already guessed that he was really asking about whether the two classes were linked by Aggregation ("non-filled-in") or Composition ("filled-in"). From my experience, this is a very confusing area of UML for some developers. I have reviewed may class diagrams where the developer seems to have just liked the look of one of the diamonds and just used it on all of their associations!

So, where should these notations be used? If you search on the internet you'll find a plethora of explanations, but I thought it might be useful to try and add a simplified explanation.

First off, this is the difficult part for developers, Aggregation and Composition are nothing to do with code. Both Aggregation and Composition are effectively talking about is one class contains another class. So, many explanations suggest it's Aggregation if it's public property and composition if it's a private field. But unfortunately it doesn't work like that.

The decision on whether a relationship is a Aggregate or Composite is based on how the classes are related in the real world. Take a classic UML example of a Car class. Your car class may have a engine class, a collection of wheel classes and a fluffy dice class. So, which of these are aggregates and which are composites?

Lets start with the engine class. This is quite a vital part of the car, you couldn't imagine a car without an engine. Likewise, the wheels are quite important too. In fact you could say that the car is composed of (among other things) an engine and wheels. So, these relationships are examples of Composition.

The remaining fluffy dice class is a bit of a looser relationship. A cars function does not depend on the tacky fluffy dice attached to the mirror. The car may not even have fluffy dice. So, this demonstrates quite nicely a Aggregation relationship.

So, our resulting class diagram looks something like this...

Monday 11 May 2009

Oracle vs .NET int

One of the problems of using .NET with Oracle is the type conversion of integers through the ODAC component.  Basically, the problem comes about because the ODAC component has no idea which .NET type to use when returning a value in a NUMBER field.... so it panics and just tends to return everything as a double (remember that the NUMBER field can hold decimal values!).

Oracle have also provided the INTEGER data type, which is really just a short-cut to get a NUMBER(32) field.  The best solution I can find to this is to define any columns that are being mapped to a .NET int as NUMBER(9, 0).   This will always ensure that the field will hold data that is compatible with being held in a .NET int.