Thursday, September 15, 2011

I Could Just


You know the one, Someone comes over with a question, perhaps you've even done it yourself (many years ago).

The question usually starts out with a justification.... Some prevarication.... and then an explanation of why it's ok to do X in this case, even though normally it would not be acceptable.

"You see I need this data here, and I'd have to change all this code, so I could just.."

The phrase "I could just" is really the dead giveaway.

And everyone already knows the answer, the guy asking it knows, you know even as you hear the "You see" part.

The answer is inevitably "No, it's not OK".

Then there's "We'll go back and fix it later" - as though next week things won't be as busy as right now. If someone can explain to me why they think next week will be different than last week or the week before, fire away.

"I Could Just" is a wrong thing.

What he should be saying is "This is going to take longer than we planned, I just I'd let you know now".

edit - I've just realised that this is a repost of the same idea. It's still true though.

Tuesday, September 6, 2011

Singletons


Singletons are useful, but cause all sorts of problems, the are basically Globals with better PR.

We can however change Singletons a little to make them more testable and more maintainable.

Forget the MyDatabaseSingleTon.GetInstance() Function. Instead, use a Factory, IMyDatabaseFactory, which can return a database instance.

If you want to unit test some code, and pass in a Mock Database Instance Factory, which returns a Mock Database Instance to the client, happy days. Now your unit test does not need a real database.

If you want to later on update the code to use multiple database instances, you have the beginnings of the factory, and all your code knows about it. It becomes a smaller re factoring job.

It's the same old trade off, you introduce indirection == complexity and you get flexibility == maintainability.

Too much of that indirection can however introduce confusion != flexibility or maintainability.

The very structure of the code.


If I have a simple socket interface

public interface ISocket
        {
            public bool Connect(IPAddress address);
            public bool Read(Buffer b, int size);
            public bool Write(Buffer b, int size);
        }
       
It can already be abused, The calling code could Read or Write without doing a Connect.

What if I change the very structure of the code to reflect the constraints I want to impose

        public interface ISocket
        {
            public bool Read(Buffer b, int size);
            public bool Write(Buffer b, int size);
        }

        public interface ISocketFactory
        {
            public IScoket Connect(IPAddress address);

        }


Now, once you have an ISocket, you KNOW it's was at one point connected, It has an associated IPAddress.
The other side may have disconnected since, but that's a whole other matter.

We have now made it more or less impossible to read or write to a socket which is uninitialised.

Of course you could put the IPAddress in the constructor of the Socket, but for our nice generalised code base, that is a problem, since we now have to know which type of socket we want to create. Once we start doing "new IPSocket(serverAddress)" in our class, we can change to using some other type of connection, eg TestSocket() or QueueSocket() which we might want to use if the other side is in process.


Friday, September 2, 2011

Pull on a Thread

Multi-threaded programming has gotten a bad rep for being difficult. It's not easy, but we are often our own worst enemies.


You can't lock what you don't own
You can't lock something unless you own it completely. If someone passes you a list as a parameter, locking it is nothing more than a decoration. Someone in some other code somewhere will see that list, and won't even know your lock exists.

If you allow someone to get that list in any way, you can be sure they will either update it or iterate through it without bothering to lock it.


Avoid thread primitives in open code.
Don't create a dictionary and then lock the dictionary. Someday, someone will come in and update the dictionary without locking it. Instead Create, or use a Thread Safe Dictionary. This will encapsulate the locking and make it downright hard from someone to mess up your code.

Name lock objects with care. Lets say that you do write your own Thread Safe data structure. If there's a List in it that looks like this : List myObjects; then call the lock object myObjectsLock.
A member variable called lockObject is less than helpful when the new guys joins in 3 months and finds code  accessing some of the data without locking "lockObject". So... did lockObject cover this list, or that one. Was that list created and used in one thread only, or only modified at startup.


Write it once, test it well.

Likewise, don't use monitor, pulse and wait in open code. Write or Use a Thread Safe Blocking Queue. Do it once, test it lots. It's remarkably easy to mess up that code. And the semantics of threading primitives are badly misunderstood. Even if YOU understand it perfectly, want to bet the new guy will? Or even the old guy who only has 1 day to understand the code, add a feature and push it to production?

Use Immutable Objects where possible

If your object is immutable, and it's not visible to other threads until it is fully constructed, then it does not need locking. Happy days. You still need to lock the collections of objects, or the references to them, but no the object itself.

Understand what happens when objects "go away"
A big problem in MT apps is unsubscribing from a publisher. Think carefully through what happens when you unsubscribe. Does the publisher still have a reference to your callback function? Can a callback be in progress as you unsubscribe? If you come up with a very elaborate mechanism to deal with this, it's probably wrong. Set up your listener so that you can tell it to stop listening before you unsubscribe it. Stop Listening, Unsubscribe, then you don't care about messages in flight.

Use Watchdogs
When threads die, they often leave a mess. This sometimes helps you find the problem. But it's nice to know if they hang too. Have a simple watchdog thread which your worker threads can signal each time around their loop to say "I'm Ok". If they stop signalling for a given amount of time you have a problem, but at least now you KNOW you have a problem.

Use Blocking Waits for Data
Avoid busy waiting loops. Use Blocking Waits for data with long timeouts, wait times in the order of seconds. If there's no data, fire off your "I'm still alive" message to whatever thread watchdog you have set up and then back to the blocking wait.
There's no need to do this 1000 times per second, unless you need to know if the thread has died within a millisecond. (If that's part of your requirements, then I hope all the above is old hat to you!)


A thought for the day:
If you can find nothing wrong with the code you wrote two years ago..... then you have not learned much in the last two years.