LivingWithSourceCode

Tuesday, September 6, 2011

Singletons

Singletons are useful, but cause all sorts of problems, the are basically Globals with better PR.

We can however change Singletons a little to make them more testable and more maintainable.

Forget the MyDatabaseSingleTon.GetInstance() Function. Instead, use a Factory, IMyDatabaseFactory, which can return a database instance.

If you want to unit test some code, and pass in a Mock Database Instance Factory, which returns a Mock Database Instance to the client, happy days. Now your unit test does not need a real database.

If you want to later on update the code to use multiple database instances, you have the beginnings of the factory, and all your code knows about it. It becomes a smaller re factoring job.

It's the same old trade off, you introduce indirection == complexity and you get flexibility == maintainability.

Too much of that indirection can however introduce confusion != flexibility or maintainability.

The very structure of the code.

If I have a simple socket interface

public interface ISocket
{
public bool Connect(IPAddress address);
public bool Read(Buffer b, int size);
public bool Write(Buffer b, int size);
}

It can already be abused, The calling code could Read or Write without doing a Connect.

What if I change the very structure of the code to reflect the constraints I want to impose

public interface ISocket
{
public bool Read(Buffer b, int size);
public bool Write(Buffer b, int size);
}

public interface ISocketFactory
{
public IScoket Connect(IPAddress address);

}

Now, once you have an ISocket, you KNOW it's was at one point connected, It has an associated IPAddress.
The other side may have disconnected since, but that's a whole other matter.

We have now made it more or less impossible to read or write to a socket which is uninitialised.

Of course you could put the IPAddress in the constructor of the Socket, but for our nice generalised code base, that is a problem, since we now have to know which type of socket we want to create. Once we start doing "new IPSocket(serverAddress)" in our class, we can change to using some other type of connection, eg TestSocket() or QueueSocket() which we might want to use if the other side is in process.

Friday, September 2, 2011

Pull on a Thread

Multi-threaded programming has gotten a bad rep for being difficult. It's not easy, but we are often our own worst enemies.

You can't lock what you don't own
You can't lock something unless you own it completely. If someone passes you a list as a parameter, locking it is nothing more than a decoration. Someone in some other code somewhere will see that list, and won't even know your lock exists.

If you allow someone to get that list in any way, you can be sure they will either update it or iterate through it without bothering to lock it.

Avoid thread primitives in open code.
Don't create a dictionary and then lock the dictionary. Someday, someone will come in and update the dictionary without locking it. Instead Create, or use a Thread Safe Dictionary. This will encapsulate the locking and make it downright hard from someone to mess up your code.

Name lock objects with care. Lets say that you do write your own Thread Safe data structure. If there's a List in it that looks like this : List myObjects; then call the lock object myObjectsLock.
A member variable called lockObject is less than helpful when the new guys joins in 3 months and finds code accessing some of the data without locking "lockObject". So... did lockObject cover this list, or that one. Was that list created and used in one thread only, or only modified at startup.

Write it once, test it well.

Likewise, don't use monitor, pulse and wait in open code. Write or Use a Thread Safe Blocking Queue. Do it once, test it lots. It's remarkably easy to mess up that code. And the semantics of threading primitives are badly misunderstood. Even if YOU understand it perfectly, want to bet the new guy will? Or even the old guy who only has 1 day to understand the code, add a feature and push it to production?

Use Immutable Objects where possible

If your object is immutable, and it's not visible to other threads until it is fully constructed, then it does not need locking. Happy days. You still need to lock the collections of objects, or the references to them, but no the object itself.

Understand what happens when objects "go away"
A big problem in MT apps is unsubscribing from a publisher. Think carefully through what happens when you unsubscribe. Does the publisher still have a reference to your callback function? Can a callback be in progress as you unsubscribe? If you come up with a very elaborate mechanism to deal with this, it's probably wrong. Set up your listener so that you can tell it to stop listening before you unsubscribe it. Stop Listening, Unsubscribe, then you don't care about messages in flight.

Use Watchdogs
When threads die, they often leave a mess. This sometimes helps you find the problem. But it's nice to know if they hang too. Have a simple watchdog thread which your worker threads can signal each time around their loop to say "I'm Ok". If they stop signalling for a given amount of time you have a problem, but at least now you KNOW you have a problem.

Use Blocking Waits for Data
Avoid busy waiting loops. Use Blocking Waits for data with long timeouts, wait times in the order of seconds. If there's no data, fire off your "I'm still alive" message to whatever thread watchdog you have set up and then back to the blocking wait.
There's no need to do this 1000 times per second, unless you need to know if the thread has died within a millisecond. (If that's part of your requirements, then I hope all the above is old hat to you!)

A thought for the day:
If you can find nothing wrong with the code you wrote two years ago..... then you have not learned much in the last two years.

Thursday, July 28, 2011

TLA, FLA, FLA, SLA, SLA, ELA

A long time ago in a Company that wrote telecoms software I remember having a discussion that ran along the lines of

"So the SMSC sends an SRI to the HLR, which returns the MSC, Then we Send the SM to the MSC which queries the VLR....."

The conversation went on in that vein for some ten minutes as we discussed where exactly the problem was, what was not talking to what and why not.

To anyone without very current domain knowledge, it was Dutch, or Latin, or perhaps Cherokee, but either way it was hard to follow.

Our code would have been a nightmare for someone to try to follow, it was littered with variable names which were TLAs, FLAs and *FLAs, and some times they were context dependent.

In this day and age, most editors have variable name completion, and **memory is not an issue, so for the love of peace and harmony, and for the sanity of the new guy who starts on Thursday and has to try to figure out the code and fix bugs in it, with partial domain knowledge, and incomplete documentation, at least give the variables and functions sensible names.

(*FLA = Four Letter Abbreviation, unless it has been preceded by FLA, in which case the latter FLA means FIVE Letter Abbreviation)

(**I did work on a program ones, where I had to change all the variable names to shorter names to save memory. That was a long, long, long time ago, when 8k was a whole world of ram, and the program was stored as text, and continually interpreted, and goto was 2nd the most common keyword after if.)

Friday, July 22, 2011

Unit testing is hard....

The main reason that unit testing is hard it that it requires many of the things that good design requires.

Built in dependencies

It’s much easier to hard code things, strings, external dependencies, statics and so on. It’s more effort to coordinate gathering up the various external requirements for a class in a sensible place, and then pass the external requirements on to that class.

In c# for example it is easy to have a class directly call:

featureEnabledString = ConfigurationManager.AppSettings["EnableFeatureX"];

Of course, now in order to run unit tests, you need an App.config.

And if you want to write a test where that feature is disabled, you need to get creative.

Perhaps you set up a derived test class to set the value of base.featureEnabledString so that you can test the thing that needs testing.

That’s all well and good for adding tests to existing code, but now, you are testing your derived test class, as well as your production class. And they are more or less the same but not exactly the same. (That’s a pretty good definition of a bug, it’s more or less the same as I what I want, but not exactly.)

Complexity

As you class starts to do more and more things, it gets harder and harder to manage the testing.

As the complexity of your class goes up, and internally a, b and c all interact together, it gets harder and harder to manage testing.

But the underlying problem here is not testing the class. The fact that the class is becoming harder and harder to test is a really good indication that the class is too big and too complex.

If you can’t test it, it’s probably too complex to work in the real world too.

If you need to set up 20 different things to get a test to work, it’s probably too complex to work in the real world too.

Don’t shoot the Messenger

The unit testing is not the problem. It’s just bringing the bad news. Your code needs some reworking, refactoring, rethinking.

Zapping the unit tests, checking it all in leaving the mess for someone else is more befitting of an elected representative than a Software Engineer.

Wednesday, July 20, 2011

The class within

If you are ever working on a class, and it seems like a good idea to group a few functions and member variables together, stop.

Right about now, you have just reached the point where you should seriously consider breaking out the functionality into a separate class.

Of all the times I've looked through code (including my own code), I've usually had unkind thoughts about the author of classes that were too big. I don't often remember thinking "why did he bother creating a class for that?"

Wednesday, July 13, 2011

return bool

someFunction(true, false, false, true);

or

someFunction(Sensor.Enabled, Filtering.Disabled, Normalising.Disabled, Smoothing.Exponential);

I know which one is

easier to read
easier to debug
harder to get wrong when you call it
slightly more effort initially.