Tuesday, February 15, 2011

Yield

Yield - does not play well with recursive algorithms

In c# they introduced a thing called yield. If you have not come across it before, then you can do such things as this:



public override IEnumerator<T> GetEnumerator()
{
    int count = Count;
    for (int i = 0; i < count; ++i) {
        yield return this[i];
    }
}

And now, c# automagically creates an enumerator object and manages the state of this enumerator for you. There's a small cost, but usually it does not matter.


You can even write beautifully simple code to traverse a tree


public IEnumerable<T> EnumerateInOrder(Node node)
{
    int startStamp = changeStamp;

    if (node != null) {
        foreach (T item in EnumerateInOrder(node.left)) {
             yield return item;
        }

        yield return node.item;

        foreach (T item in EnumerateInOrder(node.right)) {
           yield return item;
        }
    }
}

It's simple, elegant, recursive code. It traverses a recursive data structure. What could be better?

It's also painfully slow.

When you traverse the list, you make one call to the GetEnumerator function. You pay one set up cost. And the cost is amortized over all the calls to GetNext.


However, when you traverse the tree, you make a 2 calls to EnumerateInOrder for each node. C# creates a new enumerator for you each time you make a call to EnumerateInOrder.So now instead of amortizing your cost over each node, you are increasing your cost with each node.

Some simple tests I ran showed this Naive Yield based code to be 8 times slower than a simple stack based in-order traversal.  

Let's be clear. If you have a small list that you traverse once when a user presses a button, this is interesting, but unimportant.

But if you have some graphics or maths library, running in-order traversals at the core of what it does, then Yield to the Stack Based implementation. (Sorry, I couldn't resist).
    

No comments:

Post a Comment