LivingWithSourceCode

Friday, November 11, 2011

Concurrency Profiling

On a daily basis we are working on machines with 4, 8, 12 or even more cores. Not so long ago, 12 cores was considered exotic hardware.

Making apps run well on a multi-threaded environment is a whole study in itself.

One thing that can help along the way is to understand where threads in your application are "stuck" waiting on other threads.

Rather than re-iterate what has already been said quite well - look here http://msdn.microsoft.com/en-us/magazine/ff714587.aspx

Be aware that if you have worker threads blocking on queues, or waiting for some form of I/O they will show up as "concurrency problems" - ie threads that are blocked.

For your first pass, pretty much ignore these points in your code, the threads are supposed to be blocked there. Focus on the points where you do not expect threads to be blocked.

For you second pass, you may want to consider if you need to think about how many threads you have "waiting" for things to happen. Do you end up loosing too much of you CPU to swapping threads in and out?
Once you get to that level, a short blog post is not going to help you much though.

Wednesday, November 2, 2011

Is your VS2010 running slowly?

VS2010 is nice.

It keeps files open for you.

Even when you shut down and then start up again.

Unless you periodically close all windows, they build up.

Then things start to slow down.

Solution, [Alt-W] L

Ie Close all windows.

Note - I have Resharper running, this may or may not be related. The solution is so trivial, and life without Resharper would be unthinkable. As such, I'm not planning on un-installing Resharper and testing to see if it's part of the problem or not, if you want to run this test, please do, and post a comment with the result.

Friday, October 28, 2011

Code should look Trivial.

Good code is deceptive. It looks like it's not doing much. It looks like it was written as an example for a new guy. It looks like it was written for a child.

It does not read like Ulysses.
Ulysses was a book written by James Joyce.
Ulysses is difficult to read.
Good Code has short sentences.
The sentences in good code are simple.
Each sentence in Good Code says one thing.

It does not have long rambling sentences with several clauses which expound at length in complex form, and in confounding nature. It is neither overly verbose, nor decorative nor does it repeat itself and repeat itself and repeat itself. It tends not to say the same thing in different ways. It display a sense of consistency of expression.

Some would argue that you need a degree in literature, and a local knowledge of Dublin Culture to read Joyce.

Try to ensure that you code can be read without such preconditions.

Thursday, October 6, 2011

Consistency Of Abstraction

I'm reading through Robert C Martin's "Clean Code" at the moment. It's a good book that gives examples of what not to do, explains why and show how to do it better.

Many of the things he says are, much like many great ideas, utterly obvious, after they have been pointed out. Yet clearly, they are not universally adhered to.

Consistency of Abstraction
A simple point he makes is that we should have Consistency of Abstraction in a function.

Simply put, if you have high level calls like "GetAvailableGridHosts()" then in the same function there is no place for low level code, such as iterating through the HostObjects and adding the HostNames of the hosts to a formatted string buffer.

Do the right thing and take the loop out into a separate function. Give it a GOOD NAME. Spend a while thinking about the name. Call if FormatHostNamesAsPrettyString() or FormatHostNamesAsCSV(), you may have naming conventions, you may think these names are rotten, but if you at least start to think about the name, and why you like these, or don't, then you are heading in the right direction.

Now your function is dealing with ideas at a consistent level of abstraction. The reader is not going from thinking about the high ideas to low level details and back.

Readability
Part of the advantage of small, clear, and concise functions is reuse, but a huge part of the advantage is the improved readability of the code.

Scope
This loop that does "stuff" in a larger function also has a scope problem. It's not clear exactly what data it operates on. It could be using or modifying any local variables in that function. It's scope is too large.

Testing
It's harder to test. How do you write a unit test for a loop in the middle of a bigger function?

The Downhill Slope
Add another loop or two, and suddenly the function becomes very hard to follow. Now the function does not fit on the screen. Now you can't see where a variable is set at the same time as where you next use it.

Break the function up into smaller functions and then it's easier to test, each individual function has smaller scope, and is easier to read and debug.

Potential Trade-offs
The trade off is lots of small functions. This does have some overhead in terms of the CPU, Jumps, Cache Misses and so on. Badly done, you have to go to each function to look and see what it does. But if the name is good, and if the function does one thing, and respects it's name, you probably don't have to look in all the functions, you can quickly find the functions you need to know about.

In real terms, I don't think that I have ever sat down and gone through code and thought to myself - "If only (s)he hadn't broken the code up into such small functions". There are very limited and specialized areas where the time to make the extra function calls matters. But they are few and far between.

Thursday, September 15, 2011

I Could Just

You know the one, Someone comes over with a question, perhaps you've even done it yourself (many years ago).

The question usually starts out with a justification.... Some prevarication.... and then an explanation of why it's ok to do X in this case, even though normally it would not be acceptable.

"You see I need this data here, and I'd have to change all this code, so I could just.."

The phrase "I could just" is really the dead giveaway.

And everyone already knows the answer, the guy asking it knows, you know even as you hear the "You see" part.

The answer is inevitably "No, it's not OK".

Then there's "We'll go back and fix it later" - as though next week things won't be as busy as right now. If someone can explain to me why they think next week will be different than last week or the week before, fire away.

"I Could Just" is a wrong thing.

What he should be saying is "This is going to take longer than we planned, I just I'd let you know now".

edit - I've just realised that this is a repost of the same idea. It's still true though.

Tuesday, September 6, 2011

Singletons

Singletons are useful, but cause all sorts of problems, the are basically Globals with better PR.

We can however change Singletons a little to make them more testable and more maintainable.

Forget the MyDatabaseSingleTon.GetInstance() Function. Instead, use a Factory, IMyDatabaseFactory, which can return a database instance.

If you want to unit test some code, and pass in a Mock Database Instance Factory, which returns a Mock Database Instance to the client, happy days. Now your unit test does not need a real database.

If you want to later on update the code to use multiple database instances, you have the beginnings of the factory, and all your code knows about it. It becomes a smaller re factoring job.

It's the same old trade off, you introduce indirection == complexity and you get flexibility == maintainability.

Too much of that indirection can however introduce confusion != flexibility or maintainability.

The very structure of the code.

If I have a simple socket interface

public interface ISocket
{
public bool Connect(IPAddress address);
public bool Read(Buffer b, int size);
public bool Write(Buffer b, int size);
}

It can already be abused, The calling code could Read or Write without doing a Connect.

What if I change the very structure of the code to reflect the constraints I want to impose

public interface ISocket
{
public bool Read(Buffer b, int size);
public bool Write(Buffer b, int size);
}

public interface ISocketFactory
{
public IScoket Connect(IPAddress address);

}

Now, once you have an ISocket, you KNOW it's was at one point connected, It has an associated IPAddress.
The other side may have disconnected since, but that's a whole other matter.

We have now made it more or less impossible to read or write to a socket which is uninitialised.

Of course you could put the IPAddress in the constructor of the Socket, but for our nice generalised code base, that is a problem, since we now have to know which type of socket we want to create. Once we start doing "new IPSocket(serverAddress)" in our class, we can change to using some other type of connection, eg TestSocket() or QueueSocket() which we might want to use if the other side is in process.