Wednesday, November 28, 2012

Unit testing asynchronous operations with the Task Parallel Library (TPL)

Unit testing asynchronous operations has never been easy in C#. The most common approaches (or at least the ones I usually end up with) are either:
  1. Write a synchronous version of the method to test, unit test that one, and then call the synchronous method from another method that runs it asynchronously in the production code.
  2. Raise an event in the production code when the asynchronous operation has finished, subscribe to the event in the unit test, and use a ManualResetEvent to wait for the event before making any assertions.
Neither is a good solution.
Writing a synchronous version and letting the production code call it is probably the easiest one, but it breaks down once you need to do more than just call the synchronous method in production (e.g. orchestrating several dependent asynchronous operations, or having some logic run when the asynchronous operation(s) complete). And the worst part of it: a vital part of the production code will be untested.
The ManualResetEvent is better, but it takes a lot more code, makes the unit tests harder to read, and you need to fire events in the prod code that possibly only the unit tests are interested in. And unit tests that depend on a ManualResetEvent tend to be fragile when run in parallel.
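To make the second approach concrete, here is a minimal sketch of what it tends to look like. EventedCalculator, AddCompleted and WaitForSum are hypothetical names made up for this illustration, and the five-second timeout is an arbitrary assumption.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical production class: raises an event when the background work is done.
public class EventedCalculator
{
    public event Action<int> AddCompleted;

    public void AddAsync(int augend, int addend)
    {
        Task.Factory.StartNew(() =>
        {
            var sum = augend + addend;
            var handler = AddCompleted;
            if (handler != null) handler(sum);
        });
    }
}

public static class ManualResetEventDemo
{
    // What the body of a unit test would have to do: subscribe, start, block, then assert.
    public static int WaitForSum(int augend, int addend)
    {
        var calc = new EventedCalculator();
        using (var done = new ManualResetEvent(false))
        {
            var result = 0;
            calc.AddCompleted += sum => { result = sum; done.Set(); };

            calc.AddAsync(augend, addend);

            // Block the test thread until the event fires (assumed timeout: 5 seconds).
            if (!done.WaitOne(TimeSpan.FromSeconds(5)))
                throw new TimeoutException("The operation never signalled completion.");

            return result;
        }
    }
}
```

Note how much of this is plumbing: the event exists mostly for the benefit of the test, and the test still depends on a timeout.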
But with the Task Parallel Library (TPL) the tables have turned; TPL makes unit testing asynchronous code a lot easier. That is, it’s easy if you know how to do it.
Running some code asynchronously without any concern for testability is pretty straightforward with TPL:
Task.Factory.StartNew(MyLongRunningJob);
And in fact, it’s not much harder to make it test-friendly. You only need a bit of insight into what’s going on in the Task Factory. And to have it straight from the horse’s mouth, here’s what MSDN says about it:
Behind the scenes, tasks are queued to the ThreadPool, which has been enhanced with algorithms (like hill-climbing) that determine and adjust to the number of threads that maximizes throughput. This makes tasks relatively lightweight, and you can create many of them to enable fine-grained parallelism. To complement this, widely-known work-stealing algorithms are employed to provide load-balancing.

The Task Factory will use a Task Scheduler to queue the tasks and the default scheduler is the ThreadPoolTaskScheduler, which will run the tasks on available threads in the thread pool.

The trick when unit testing TPL code is to not have those tasks running on threads that we have no control over, but to run them on the same thread as the unit test itself. The way we do that is to replace the default scheduler with a scheduler that runs the code synchronously. Enter the CurrentThreadTaskScheduler;

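using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public class CurrentThreadTaskScheduler : TaskScheduler
{
    // Called when a task is started on this scheduler;
    // run it right away on the calling thread instead of queuing it.
    protected override void QueueTask(Task task)
    {
        TryExecuteTask(task);
    }

    // Called when the TPL asks to run a task inline (e.g. from Wait()).
    protected override bool TryExecuteTaskInline(
       Task task, 
       bool taskWasPreviouslyQueued)
    {
        return TryExecuteTask(task);
    }

    // Nothing is ever left waiting in a queue.
    protected override IEnumerable<Task> GetScheduledTasks()
    {
        return Enumerable.Empty<Task>();
    }

    public override int MaximumConcurrencyLevel { get { return 1; } }
}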

TaskScheduler is an abstract class that all schedulers must inherit from, and it only contains 3 abstract methods that need to be implemented:
  1. void QueueTask(Task)
  2. bool TryExecuteTaskInline(Task, bool)
  3. IEnumerable<Task> GetScheduledTasks()
In the more advanced schedulers like the ThreadPoolTaskScheduler, this is where the heavy lifting of getting tasks to run on different threads in a thread-safe manner happens. But for running tasks synchronously, we really don’t need that. In fact, that’s exactly what we don’t need. So instead of scheduling tasks to run on different threads, both QueueTask and TryExecuteTaskInline will just execute them immediately on the current thread. A task started through this scheduler has therefore already finished by the time StartNew returns.
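A quick way to convince yourself of this is to compare thread IDs. The sketch below repeats the scheduler so that it compiles on its own; SchedulerDemo and RunsInline are hypothetical names used only for this illustration.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Same scheduler as above, repeated so the sketch is self-contained.
public class CurrentThreadTaskScheduler : TaskScheduler
{
    protected override void QueueTask(Task task) { TryExecuteTask(task); }

    protected override bool TryExecuteTaskInline(Task task, bool taskWasPreviouslyQueued)
    {
        return TryExecuteTask(task);
    }

    protected override IEnumerable<Task> GetScheduledTasks()
    {
        return Enumerable.Empty<Task>();
    }

    public override int MaximumConcurrencyLevel { get { return 1; } }
}

public static class SchedulerDemo
{
    // True when the task body ran on the calling thread
    // and the task was already complete when StartNew returned.
    public static bool RunsInline()
    {
        var callerThreadId = Thread.CurrentThread.ManagedThreadId;
        var taskThreadId = -1;

        var task = new TaskFactory(new CurrentThreadTaskScheduler())
            .StartNew(() => { taskThreadId = Thread.CurrentThread.ManagedThreadId; });

        return task.IsCompleted && taskThreadId == callerThreadId;
    }
}
```

With the default scheduler, neither of the two conditions is guaranteed to hold, which is exactly why assertions placed right after StartNew are unreliable there.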

Now it’s time to actually use it in the production code;

public TaskScheduler TaskScheduler
{
    get
    {
        return _taskScheduler
            ?? (_taskScheduler = TaskScheduler.Default);
    }
    set { _taskScheduler = value; }
}
private TaskScheduler _taskScheduler;

public Task AddAsync(int augend, int addend)
{
    return new TaskFactory(this.TaskScheduler)
        .StartNew(() => Add(augend, addend));
}

To be able to inject a different TaskScheduler from unit tests, I’ve made the dependency settable through a public property on the class I’ll be testing. If no TaskScheduler has been explicitly set (which it won’t be when executed ‘in the wild’), the default TaskScheduler will be used.

The method Task AddAsync(int, int) is the method we would like to unit test. As you can see it’s a highly CPU intensive computation that will add 2 numbers together. Just the kind of work you’d want to surround with all the ceremony and overhead of running asynchronously.

The important part here is the instantiation of the TaskFactory that will take the TaskScheduler as a constructor parameter.

With that in place we can set the TaskScheduler from the unit tests:

[Test]
public void It_should_add_numbers_async()
{
    var calc = new Calculator
    {
        TaskScheduler = new CurrentThreadTaskScheduler()
    };

    calc.AddAsync(1, 1);

    calc.GetLastSum().Should().Be(2);
}

The System Under Test (SUT) is the Calculator class, which has the AddAsync method we’d like to unit test. Before calling AddAsync, we assign the CurrentThreadTaskScheduler that the TaskFactory in the Calculator should use.

Since AddAsync doesn’t return the result of the calculation, I’ve added a method to get the last sum. Not exactly production-polished code, but it’ll do for the purpose of this example.
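For reference, here is a rough sketch of how the pieces could fit together in one class. The Add method, the _lastSum field and the CalculatorDemo helper are assumptions made for this illustration (the post only shows the property and AddAsync), and the scheduler is repeated so the sketch compiles on its own.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public class Calculator
{
    private TaskScheduler _taskScheduler;
    private int _lastSum; // assumed storage for the last result

    public TaskScheduler TaskScheduler
    {
        get { return _taskScheduler ?? (_taskScheduler = TaskScheduler.Default); }
        set { _taskScheduler = value; }
    }

    public Task AddAsync(int augend, int addend)
    {
        return new TaskFactory(this.TaskScheduler)
            .StartNew(() => Add(augend, addend));
    }

    // Assumed implementation: do the work and remember the result.
    private void Add(int augend, int addend)
    {
        _lastSum = augend + addend;
    }

    public int GetLastSum()
    {
        return _lastSum;
    }
}

// Same scheduler as earlier in the post.
public class CurrentThreadTaskScheduler : TaskScheduler
{
    protected override void QueueTask(Task task) { TryExecuteTask(task); }

    protected override bool TryExecuteTaskInline(Task task, bool taskWasPreviouslyQueued)
    {
        return TryExecuteTask(task);
    }

    protected override IEnumerable<Task> GetScheduledTasks()
    {
        return Enumerable.Empty<Task>();
    }

    public override int MaximumConcurrencyLevel { get { return 1; } }
}

public static class CalculatorDemo
{
    // Mirrors the unit test, minus the test framework.
    public static int AddOnCurrentThread(int augend, int addend)
    {
        var calc = new Calculator { TaskScheduler = new CurrentThreadTaskScheduler() };
        calc.AddAsync(augend, addend);
        return calc.GetLastSum();
    }
}
```

Because the injected scheduler runs the task inline, GetLastSum can be read immediately after AddAsync without any waiting.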

Anyway, the end result is that the test passes. And if I don’t assign the CurrentThreadTaskScheduler to Calculator.TaskScheduler (that is, if it runs with the default ThreadPoolTaskScheduler), it will most likely fail, because the addition will typically not have finished before the assertion runs.

But don’t trust me on this. I’ve uploaded the complete (absurd) example to GitHub, so you can run the tests and see for yourself; https://github.com/bulldetektor/TplSpike.

References


You can read the MSDN article that I quoted from here: http://msdn.microsoft.com/en-us/library/dd537609.aspx

I found the code for the CurrentThreadTaskScheduler in the TPL samples here: http://code.msdn.microsoft.com/windowsdesktop/Samples-for-Parallel-b4b76364. The samples contain a dozen or so TaskSchedulers, for instance:


  • QueuedTaskScheduler - provides control over priorities, fairness, and the underlying threads utilized
  • OrderedTaskScheduler - ensures only one task is executing at a time, and that tasks execute in the order that they were queued.
  • ReprioritizableTaskScheduler - supports reprioritizing previously queued tasks
  • RoundRobinTaskSchedulerQueue - participates in scheduling that supports round-robin scheduling for fairness
  • IOCompletionPortTaskScheduler - uses an I/O completion port for concurrency control
  • IOTaskScheduler - targets the I/O ThreadPool
  • LimitedConcurrencyLevelTaskScheduler - ensures a maximum concurrency level while running on top of the ThreadPool
  • StaTaskScheduler - uses STA threads
  • ThreadPerTaskScheduler - dedicates a thread per task
  • WorkStealingTaskScheduler - a work-stealing scheduler, not much more to say about that