In the article 'Programming for Multi-core Processors' carried in this issue,
we talked about how you can utilize the capabilities of your multi-core
machine. Here, we'll talk about another way of doing the same, using the Task
Parallel Library (TPL). TPL is designed to automatically utilize all the
processors in a machine, which results in enhanced performance, as the sample
implementation in this article shows. TPL contains algorithms that
automatically adapt to a particular machine. For example, if you run a parallel
'for' loop on a single-core machine, it performs like a normal 'for' loop. But
if the same loop is run on a multi-core machine, it is parallelized
automatically. If an exception is thrown in any of the iterations, all
iterations are canceled and the first thrown exception is re-thrown in the
calling thread, ensuring that exceptions are properly propagated and never
lost. One point to keep in mind while using TPL is that it does not synchronize
access to shared data for you. It is therefore up to the programmer to make
sure that code running in parallel is thread safe.
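To make the synchronization point concrete, here is a minimal sketch of two common ways to protect a shared total that parallel iterations update. It assumes .NET Framework 4 or later, where 'Parallel' lives in 'System.Threading.Tasks'. The class and method names (SharedCounterDemo, SumWithInterlocked, SumWithLock) are made up for illustration and are not part of the article's sample.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

static class SharedCounterDemo
{
    // Adds the integers in [0, count) into one shared total from parallel
    // iterations. Interlocked.Add makes each update atomic; a plain
    // "total += i" here would be a data race and could lose updates.
    public static long SumWithInterlocked(int count)
    {
        long total = 0;
        Parallel.For(0, count, i =>
        {
            Interlocked.Add(ref total, i);
        });
        return total;
    }

    // Same computation guarded by a lock, which suits updates that
    // touch more than a single variable.
    public static long SumWithLock(int count)
    {
        long total = 0;
        object gate = new object();
        Parallel.For(0, count, i =>
        {
            lock (gate)
            {
                total += i;
            }
        });
        return total;
    }

    static void Main()
    {
        Console.WriteLine(SumWithInterlocked(1000)); // 0 + 1 + ... + 999 = 499500
        Console.WriteLine(SumWithLock(1000));        // 499500
    }
}
```

Both versions serialize every update, so they are meant to show correctness rather than speed. In real code you would keep the shared section as small as possible.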
Direct Hit!
Applies To: .NET developers
USP: Parallel programming
Primary Link: www.msdn.com
Search Engine Keywords: TPL, parallel programming
Though one can use the thread pool to parallelize code, doing so is far more
complicated than using TPL. When using the thread pool instead of TPL, one
often divides work statically, which leads to uneven work distribution. TPL, on
the other hand, uses work-stealing techniques to dynamically adapt and
distribute work items over the worker threads. TPL has a task manager that, by
default, assigns one worker thread per processor. This keeps thread switching
by the OS to a minimum. Each thread has its own queue of tasks, and when this
queue becomes empty, the thread tries to 'steal' work from the queues of other
worker threads.
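The contrast between the two approaches can be sketched as follows. This is an illustration, not the article's code: it assumes .NET Framework 4 or later, and the workload (Work) and method names (StaticChunks, WithParallelFor) are hypothetical. The manual version splits the range into fixed chunks and queues one chunk per thread pool work item. The TPL version leaves the distribution of iterations to the library.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

static class PartitioningDemo
{
    // Hypothetical workload: the cost of item i grows with i, so
    // equal-sized chunks do not contain equal amounts of work.
    static long Work(int i)
    {
        long acc = 0;
        for (int k = 0; k <= i; k++) acc += k % 7;
        return acc;
    }

    // Manual approach: split [0, n) into one fixed chunk per work item.
    public static long StaticChunks(int n, int workers)
    {
        long[] partial = new long[workers];
        int chunk = (n + workers - 1) / workers;
        using (var done = new CountdownEvent(workers))
        {
            for (int w = 0; w < workers; w++)
            {
                int id = w; // copy so each work item captures its own index
                ThreadPool.QueueUserWorkItem(_ =>
                {
                    int end = Math.Min(n, (id + 1) * chunk);
                    for (int i = id * chunk; i < end; i++) partial[id] += Work(i);
                    done.Signal();
                });
            }
            done.Wait(); // block until every chunk has finished
        }
        long total = 0;
        foreach (long p in partial) total += p;
        return total;
    }

    // TPL approach: the library decides how iterations are handed out.
    public static long WithParallelFor(int n)
    {
        long total = 0;
        Parallel.For(0, n, i => Interlocked.Add(ref total, Work(i)));
        return total;
    }

    static void Main()
    {
        int n = 2000;
        long a = StaticChunks(n, Environment.ProcessorCount);
        long b = WithParallelFor(n);
        Console.WriteLine(a == b); // both strategies compute the same total
    }
}
```

With the static split, the work item holding the last chunk gets the most expensive items and finishes last while the others sit idle. That is the uneven distribution described above.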
Implementation
To show how parallel processing can boost your application's performance, we
created sample code that measures processing time. We ran this code in three
different scenarios. First, we used a simple 'for' loop and measured its
execution time on a single core. Then we used a 'Parallel.For' loop to
parallelize the conventional 'for' loop, and executed it on two different
machines, one with two cores and the other with four. To start with, create a
console application in Visual Studio. We have used C# as the programming
language. Now add a reference to 'System.Threading' to the program and add the
following code:
The comparative graph shows the execution of the 'for' loop and the 'Parallel.For' loop. The 'Parallel.For' loop was executed on quad-core and dual-core machines.
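    using System;
    using System.Linq;
    using System.Text;
    using System.Threading;
    using System.Threading.Tasks; // namespace of Parallel in .NET Framework 4 and later

    namespace ParallelTPL
    {
        class Program
        {
            static void Main(string[] args)
            {
                // Record and display the time before the loop runs.
                DateTime startTime = DateTime.Now;
                Console.WriteLine(startTime);

                // Conventional sequential loop; uncomment to time it instead.
                /*for (int i = 0; i < 1000; i++)
                {
                    Thread.Sleep(100);
                    Console.WriteLine(i);
                }*/

                // Parallel version of the same loop.
                Parallel.For(0, 1000, delegate(int i)
                {
                    Thread.Sleep(100);
                    Console.WriteLine(i);
                });

                // Record the time after the loop and print the difference.
                DateTime stopTime = DateTime.Now;
                Console.WriteLine(stopTime);
                TimeSpan duration = stopTime - startTime;
                Console.WriteLine(duration);
                Console.ReadLine();
            }
        }
    }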
Output of the conventional 'for' loop. Results are displayed in order, as execution happens in sequence.
Output of the 'Parallel.For' loop on the quad-core machine. Results are out of order, as execution happens in parallel.
There are three important parts to this code. The first is the code that
calculates and displays the time of execution. This again has three steps.
First, it gets the current time (before execution) via the following code:
    DateTime T = DateTime.Now;
    Console.WriteLine(T);
After these lines, add the code whose processing time you want to measure, then
get the current time again (after execution), and finally subtract the two to
get the time elapsed in execution. The second important part of the sample code
is the conventional 'for' loop. In this loop, every iteration is executed in
sequence. The third part of the code is the 'Parallel.For' loop, where the
'for' loop is executed in parallel using multiple cores of the machine. As one
can see, the 'Parallel.For' loop takes three arguments: the first two specify
the iteration limits, and the last is a delegate expression holding the loop
body.
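The measure-run-subtract pattern described above can also be wrapped in a small helper. The sketch below is a variation on the article's approach, not a copy of it: it uses 'System.Diagnostics.Stopwatch' rather than 'DateTime.Now', passes the loop body as a lambda instead of an anonymous delegate, and assumes .NET Framework 4 or later. The names LoopTimer and Measure, and the shortened iteration count and delay, are illustrative choices.

```csharp
using System;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;

static class LoopTimer
{
    // Runs the given action once and returns the elapsed wall-clock time.
    public static TimeSpan Measure(Action action)
    {
        Stopwatch watch = Stopwatch.StartNew();
        action();
        watch.Stop();
        return watch.Elapsed;
    }

    static void Main()
    {
        const int iterations = 20; // fewer than the article's 1000, to keep the demo short
        const int delayMs = 10;    // stands in for the article's Thread.Sleep(100)

        // Conventional loop: iterations run one after another.
        TimeSpan sequential = Measure(() =>
        {
            for (int i = 0; i < iterations; i++) Thread.Sleep(delayMs);
        });

        // Parallel loop: lower bound (inclusive), upper bound (exclusive), body.
        TimeSpan parallel = Measure(() =>
        {
            Parallel.For(0, iterations, i => Thread.Sleep(delayMs));
        });

        Console.WriteLine("for:          " + sequential);
        Console.WriteLine("Parallel.For: " + parallel);
    }
}
```

The printed times depend on the machine, so no particular values should be expected.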
Results
When we compared the execution times of the same code, we found that the
conventional 'for' loop took 100 seconds to execute. When we executed the
'Parallel.For' loop on the dual-core machine, the execution time was 25
seconds, while on the quad-core machine it was just 11 seconds. These results
show how one can boost application performance using the Task Parallel Library
present in the .NET Framework.
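The figures above translate into speedup factors by dividing the sequential time by the parallel time. The short sketch below does that arithmetic with the reported numbers. The class and method names (Speedup, Of) are illustrative.

```csharp
using System;
using System.Globalization;

static class Speedup
{
    // Speedup = sequential time / parallel time.
    public static double Of(double sequentialSeconds, double parallelSeconds)
    {
        return sequentialSeconds / parallelSeconds;
    }

    static void Main()
    {
        // Times reported in the article: 100 s sequential, 25 s dual core, 11 s quad core.
        Console.WriteLine(Of(100, 25).ToString("F2", CultureInfo.InvariantCulture)); // 4.00
        Console.WriteLine(Of(100, 11).ToString("F2", CultureInfo.InvariantCulture)); // 9.09
    }
}
```

Both factors are larger than the number of cores involved. One plausible reason, which the measurements here do not confirm, is that the loop body only sleeps: a sleeping iteration leaves its core idle, so more iterations can be in flight at once than there are cores. A CPU-bound loop body would not be expected to scale beyond the core count.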