Friday, July 17, 2015

Pouring concrete

If there were 10 commandments of automated testing, number 1 must surely be "Thou shalt test what, not how".

Imagine if I wrote a sort routine, and further imagine I wrote a bubble sort in particular.
In a misguided attempt to test it, I decide I will make sure it does the correct swaps. So I come up with test data, figure out which items get swapped, and then modify my sort to take a "swap" delegate, so that I can intercept the calls and test that the correct swaps occur.
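
Something along these lines, say - a minimal sketch with hypothetical names, assuming an NUnit-style test framework:

using System;
using System.Collections.Generic;
using NUnit.Framework;

public class SortHowTests
{
    // The misguided version: the sort takes a "swap" delegate purely so a test can spy on it.
    public static void BubbleSort(int[] data, Action<int, int> swap)
    {
        for (int i = 0; i < data.Length - 1; i++)
            for (int j = 0; j < data.Length - 1 - i; j++)
                if (data[j] > data[j + 1])
                    swap(j, j + 1);   // the delegate does the actual swap
    }

    [Test]
    public void BubbleSort_PerformsTheExpectedSwaps()   // testing "how", not "what"
    {
        var swaps = new List<(int, int)>();
        var data = new[] { 3, 1, 2 };

        BubbleSort(data, (i, j) =>
        {
            swaps.Add((i, j));                          // record every swap...
            (data[i], data[j]) = (data[j], data[i]);    // ...and still perform it
        });

        // The test knows the exact sequence of swaps this implementation makes.
        CollectionAssert.AreEqual(new[] { (0, 1), (1, 2) }, swaps);
    }
}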

I run my tests, and I see that it is good. Each call to swap happens just as I expect, and I know exactly what my code is doing.

Only it is not good.

As my data sets increase, bubble sort struggles, and then I read about Quicksort, or Radix sort, and I decide to replace my bubble sort.

All my tests fail. They all need to be fixed by figuring out what swaps will now occur.
My tests have poured concrete on the implementation, and made it impossible to change without dynamiting the unit tests.

Let's rewind and put in the rule that tests must test results, not actions.
If I sort data, it must be sorted. I don't care how it sorts it. I just want it sorted.

My tests have less detail, and for sure there could be some hidden bug that still leaves my data sorted but does not operate in the manner I expected. But my data is sorted. Now I don't need that swap delegate. I pass in my data, I get back the sorted result, and I confirm that the sorted result is indeed sorted.
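
Continuing the sketch above (same NUnit usings, with a hypothetical Sorter under test), the result-based version looks like this:

[Test]
public void Sort_ReturnsTheSameItemsInOrder()
{
    var input = new[] { 5, 3, 8, 1, 9, 2 };

    var result = Sorter.Sort(input);            // whatever sort routine is under test

    // Assert on the result only: same items, now in order. No delegate, no spying.
    CollectionAssert.AreEquivalent(input, result);
    CollectionAssert.IsOrdered(result);
}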

When I replace my bubble sort with a New and Better Sort, lo and behold, all my tests still run, and hopefully pass.

Test that the data is sorted, not how the data gets to be sorted.

Sadly, many test tools facilitate this "how" testing. The ability to write A.CallTo(() => swap(a, b)).MustHaveHappened() or some such thing is nothing other than pouring concrete, deep and heavy, all over the implementation. Avoid the temptation: the road to hell is paved with concrete.

Tuesday, June 9, 2015

The devil is in the details...

Dijkstra: "The purpose of abstraction is not to be vague, but to create a new semantic level in which one can be absolutely precise". It's a favourite quote of a good friend. I'm growing to like it more and more by the day.

Perhaps I'm getting old, or perhaps it's because I'm working more and more on larger teams, with code that I did not write, but I find that I cannot deal with 50+ details at the same time any more.

If the details bleed in from another part of the code, and I am forced to do a depth-first reading of the code at every function call, I might be somewhat uncharitable towards the original author.

Imagine a circular buffer - in my world, it should have an IsEmpty() function. Few would argue with this. If I had to wander into the internals of CBuffer and start comparing first, and last, and remembering if it's empty when they are the same, or when they are 1 away from each other, don't forget to MOD by the size of the buffer, where was I going with this, hold on, let me go back over that again...

The abstraction IsEmpty() is very precise. In a sense it's far more precise than any comparison of first and last pointers, since it expresses exactly what I want to know, not how to go about finding out. And I can simply trust IsEmpty(), because it will (should) have unit tests. But if I am repeatedly adding this comparison code to my class, then every time I add it, I can goof it.
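
For illustration, a minimal sketch of what that abstraction might hide - the names and the one-slot-free convention are my assumptions, not the original CBuffer:

// A circular buffer that keeps one slot free, so that first == last means empty.
public class CBuffer
{
    private readonly int[] items;
    private int first;   // index of the oldest element
    private int last;    // index one past the newest element

    public CBuffer(int capacity)
    {
        items = new int[capacity + 1];   // the spare slot distinguishes full from empty
    }

    // The precise question callers actually want answered.
    public bool IsEmpty()
    {
        return first == last;
    }

    public bool IsFull()
    {
        return (last + 1) % items.Length == first;
    }
}

One place to get the MOD arithmetic right, with unit tests over it, instead of the same comparison scattered through every caller.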

You would think that this is software 101, and nobody needs reminding about this. But again and again it creeps into code, usually more complex code than a simple buffer, but it does creep in.

It creeps in in little ways, like the train wreck anti-pattern. It creeps in when there are comparisons that imply something else. The little "// only set if" comment alongside is a giveaway that we have been far too intimate with another class, when we should be maintaining a professional distance.
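
For the record, the train wreck looks something like this (hypothetical names):

// Reaching through two other objects' internals to answer one question.
if (order.GetCustomer().GetAccount().IsOverdrawn())
{
    // ... what we really wanted was something like order.CanShip()
}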




Thursday, June 4, 2015

Do one thing... sort of

This was always confusing advice. Do one thing.

So how do I open a file, read it, close it, parse it, and then process the input?
That's a lot more than one thing. Something has got to "do several things".

Like most soundbites, Do One Thing is correct, but incomplete. You have to take into account levels of abstraction. So do one thing at a given level of abstraction.

A call to ProcessConfigFile can reasonably kick off calls to open, read, close, parse and process.
But it should not actually _do_ the Reading, Parsing and Processing, for example. You should call someone else to _do_ these.
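
Something like this, sketched with hypothetical collaborator names:

// One thing at this level: orchestrate. The details live in the collaborators.
public void ProcessConfigFile(string path)
{
    string text = reader.ReadAllText(path);    // open, read, close
    Config config = parser.Parse(text);        // parse
    processor.Process(config);                 // process
}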

If it does the Reading, Parsing and Processing itself, then you have them bound together. So now, if you want to process already-parsed data from a block of memory or from a network connection, your Process code needs to be "fished out" of the Read/Parse code.

So at one level ProcessConfigFile does indeed do one thing. At a more detailed level, each function or class it calls or uses will also do one thing.

If you aggregate the functionality of other classes or functions, then you don't want to mix in any functionality of your own. Mixing low-level details and high-level structure tends to hide what is important. If you have a 30-line function which contains 20 lines of string formatting and 4 lines of structurally important code, it becomes easy to miss the big stuff for the detail.



Friday, April 24, 2015

Let it Throw

Oh the code that calls is frightful
But the tests are so delightful
And since we've no place to go
Let It Throw! Let It Throw! Let It Throw!

It doesn't show signs of crashing
And the log files aren't thrashing
'Cause the logging's turned way down low
Let It Throw! Let It Throw! Let It Throw!

The argument's getting old
And I want to get home tonight
But if you'll review the code
We can push this to the client site

The RC's slowly building
And, guys, the tests are passing
But we'll catch and log it so,

Let It Throw! Let It Throw! Let It Throw!

Friday, August 2, 2013

The Great Software Lie

/* Temporary Fix 
 * 
 * No time to write a proper blog entry, I'll get back to it later
 * Dave T 
 * 2nd Aug 2013
 */

Monday, June 3, 2013

The Splinterface

It starts off well: you have a class which has grown too complex, so you pull out some functionality into a separate class. You've pulled it out because the behaviour is no longer fixed: it can be cyclic or non-cyclic, perhaps.

All is good.

There are some parts that are shared by cyclic and non-cyclic, so you pull them out and create an interface. But over time, cyclic and non-cyclic diverge. You add a method to the interface that's only used by cyclic, and another which is specific to non-cyclic.

And eventually you can add comments to your interface that look like this

// cyclic uses these
...
// non-cyclic uses these
...
// this one is shared by both
...

Now you have a Splinterface.
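
In code, a Splinterface ends up looking something like this - the names are hypothetical, sketching the cyclic/non-cyclic split above:

public interface IBufferStrategy
{
    // cyclic uses these
    void Wrap();
    int OverwriteCount { get; }

    // non-cyclic uses these
    void Grow(int newCapacity);

    // shared by both
    void Add(int item);
    bool IsFull();
}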

It provides the illusion of shared functionality and shared code, but that's all. It hinders change as you have lost clarity of what is and is not used. It promotes code duplication.

Its cause seems to be allied to designing interfaces based on what something is, not what services it provides.

Chances are you have 2 implementations of your Splinterface, and Liskov substitution has gone out the window.

Really it's time to stop, document what's going on in your code, take a long hard look, understand what should be done and fix it.

Or you could just add more code and hope for the best. If you take that approach, let me know how it all works out.

PS - Thanks to JackH, who came up with the name.


Monday, May 6, 2013

How long will that take?

OK, that's got scope for a book. A long book.

But I remember a long time ago I did a small fixed price contract for a company. They wanted me to add some "simple" functionality to an existing app. They had the source, but no-one to do the work.

I'd never done a fixed price contract before, but I knew I'd never got any estimates right. I didn't like the idea of a fixed price contract much.

I started with - If you want a fixed price, pay me for one day to do the spec & estimates. This was (very) grudgingly accepted.

I sat down and broke down what needed to be done. With each part, I continued to break it down to the level where I knew exactly what was needed, and I knew that part would take less than 1/2 a day.

Anything longer than a day was broken down more.

I went back with the estimate. I think it was about 3 weeks' work. It was a long time ago. The manager thought I was winding him up; he'd expected about 3 days, not 3 weeks.

I had brought my list.

We went through the list piece by piece, and about 1/3 of the way through he said "Fine".

It took 12 days, not 15.

This was a task which had no unknowns for me. Each individual part was something I'd done already somewhere else, or at some other time.

Consider the amount of work that would be required to estimate a 3 month project for 6 people, working with 3 new pieces of technology that no-one on the team has used before. If you have not done something before with a particular tool, it is possible that there's a snag. Do a simple test. Can I serialise the data as XML? Oops, the files are HOOOGE. We have to use something else.

Everyone wants accurate estimates, but no-one wants to put in the leg work to come up with the estimates.

You can of course still be wrong, but hopefully, not off by a factor of 5.