LivingWithSourceCode: 2011

Friday, December 30, 2011

Branches and Code Quality

A while back I posted about not reformatting code and the problems you incur when merging branches. After some thinking, reading and discussion, I'd like to partially recant that advice.

I still think that moving brackets around in the code is pretty much a waste of time compared to things you could tidy up which make code easier to understand.

But more and more I think that the cost of a merge made a little more difficult is repaid many times by improving the code quality.

Martin Fowler has quite a lot to say about branches.

But in general we read code a lot more often than we write it, so I'm more and more prepared to amortize the cost of fixing poor and mediocre code for the benefit of less confusion, less bugs, more tests.

Thursday, December 29, 2011

The Partial Class anti-pattern

In c# you can have a partial class.

This has a very specific function. It allows you to separate "Widget Generated Code" from the rest of your code. Then if you run the widget again, it won't overwrite all the additional code you have added. Think Windows Forms.

Of course when a class gets too large, you could* break up the class into multiple files, and use partial classes.

But this is very much brushing the problem under the rug.

Now you have made things worse.

If you have succeeded in breaking up the class into two or more partial classes which do not access private variables or private functions from the remainder of the partial class, then make it a brand new stand alone class.

If not, you now have a mini program within a large program, where you essentially have Global variables and functions (albeit marked as private) which can be changed by code from another file. You can no longer see what is going on.

OK, in fairness you can use Resharper or some such tool to "find usages", but really you have just acknowledged the class was too big. Break it up into separate classes.

This

makes it more testable.
reduces the scope of private and protected members.
may well separate out dependencies.
allows you to have higher level code in the primary class and hides the details of "the other stuff".

*if you do this, allow me to suggest that you don't go boasting about it to all your friends.

Wednesday, December 28, 2011

if (fred)

if (fred) {
...
}

Cryptic?

How much better is

if (preloadCaches()){
}

You would probably guess that this returns false for an error. An exception might be a better way of handling this.

So what about

if (reloadCaches()){
}

Any guesses when this would return true or false? maybe true if this is the first time it's loaded the cache? or perhaps true if anything changed? or perhaps true if there were no errors?

If a function name does not give away the result, don't return a bool.

bool isValidId()
bool isEnabled()
bool isOverDrawn()

The above are just fine.

bool setAccountOverdraftLimit()
bool preInitialisationCheck()
bool fred()

These all require me to go looking into the function to see what it is all about.

I may be looking for a bug in the account balance, but when I see the code.

if (setAccountOverdraftLimit(10000)){
....
}

Then I don't know if I'm going into the if() or not. Now I need to read setAccountOverdraftLimit at the outset. Every time I come across this ambiguity I am forced to read more and more code which in all probability has no relation to the bug.

And wait for the fun that happens when someone overrides setAccountOverdraftLimit() and one instance returns true/false for the limit has been set/not set, and the other instance returns true/false for the limit has been increased compared to the previous limit.

Thursday, December 15, 2011

Cut and Paste is not a valid design pattern

You would think in this day and age, that no-one would copy a 200 line function from one class to another.

Because if they did then surely one version would have changes made to it.

Then some poor fool would be left wondering if the two versions were actually slightly different, or if a fix was applied to one, not the other. Or if a change was simply because the functions were called in slightly different ways.

Of course having a 200 line function which looks like 20 x 10 lines blocks cut and paste into a switch statement... that would be a wrong thing in the first place.

And no-one would do that, because if they did, their name would be against the check-in in the source code control system.

Wednesday, December 14, 2011

Do one thing.

Classes should do one thing.

If they have "other stuff" to do, then they should use other classes.

Embedding the implementation of thread-safe, dictionaries of linked lists in your functions fails on several levels

Next time you need a "Multimap" you will write it all over again.
There's a name for this data type, that should be a hint not to write it yourself.
Your function is now implementing what it's supposed to, and it's implementing MultiMap.Add()
You have lock primitives in open code.
Your code is harder to read. You now have many lines of code in your function. These lines could be replaced by Multimap.Add(). That would simplify your code and make it easier to see the main functionality.
If just above the section of code you have a comment saying "add this to a thread safe multimap" then shame on you, you already knew it should not be there.
If you have enclosed it in a #region (c#) and collapsed that region to a single line comment, then was this not a big hint that it could be taken out and put in it's own class.
Your code is now much harder to test. How do you test the MultiMap code which is embedded in your function.
If you need to update the Dictionary of Lists for performance, it's going to be fun fixing it in many different places.
And sooner or later someone will read or update your dictionaryOfLists without noticing the dictionaryOfListsLockObject.

I do appreciate that a dictionary of lists is simple, and that you have probably implemented it correctly. In fact you have implemented it so many times by now, you can do it in your sleep.

Please appreciate that this is not a good thing to boast about.

For further reading google "principle of single responsibility" or look here

http://en.wikipedia.org/wiki/SOLID_(object-oriented_design)

Friday, November 11, 2011

Concurrency Profiling

On a daily basis we are working on machines with 4, 8, 12 or even more cores. Not so long ago, 12 cores was considered exotic hardware.

Making apps run well on a multi-threaded environment is a whole study in itself.

One thing that can help along the way is to understand where threads in your application are "stuck" waiting on other threads.

Rather than re-iterate what has already been said quite well - look here http://msdn.microsoft.com/en-us/magazine/ff714587.aspx

Be aware that if you have worker threads blocking on queues, or waiting for some form of I/O they will show up as "concurrency problems" - ie threads that are blocked.

For your first pass, pretty much ignore these points in your code, the threads are supposed to be blocked there. Focus on the points where you do not expect threads to be blocked.

For you second pass, you may want to consider if you need to think about how many threads you have "waiting" for things to happen. Do you end up loosing too much of you CPU to swapping threads in and out?
Once you get to that level, a short blog post is not going to help you much though.

Wednesday, November 2, 2011

Is your VS2010 running slowly?

VS2010 is nice.

It keeps files open for you.

Even when you shut down and then start up again.

Unless you periodically close all windows, they build up.

Then things start to slow down.

Solution, [Alt-W] L

Ie Close all windows.

Note - I have Resharper running, this may or may not be related. The solution is so trivial, and life without Resharper would be unthinkable. As such, I'm not planning on un-installing Resharper and testing to see if it's part of the problem or not, if you want to run this test, please do, and post a comment with the result.

Friday, October 28, 2011

Code should look Trivial.

Good code is deceptive. It looks like it's not doing much. It looks like it was written as an example for a new guy. It looks like it was written for a child.

It does not read like Ulysses.
Ulysses was a book written by James Joyce.
Ulysses is difficult to read.
Good Code has short sentences.
The sentences in good code are simple.
Each sentence in Good Code says one thing.

It does not have long rambling sentences with several clauses which expound at length in complex form, and in confounding nature. It is neither overly verbose, nor decorative nor does it repeat itself and repeat itself and repeat itself. It tends not to say the same thing in different ways. It display a sense of consistency of expression.

Some would argue that you need a degree in literature, and a local knowledge of Dublin Culture to read Joyce.

Try to ensure that you code can be read without such preconditions.

Thursday, October 6, 2011

Consistency Of Abstraction

I'm reading through Robert C Martin's "Clean Code" at the moment. It's a good book that gives examples of what not to do, explains why and show how to do it better.

Many of the things he says are, much like many great ideas, utterly obvious, after they have been pointed out. Yet clearly, they are not universally adhered to.

Consistency of Abstraction
A simple point he makes is that we should have Consistency of Abstraction in a function.

Simply put, if you have high level calls like "GetAvailableGridHosts()" then in the same function there is no place for low level code, such as iterating through the HostObjects and adding the HostNames of the hosts to a formatted string buffer.

Do the right thing and take the loop out into a separate function. Give it a GOOD NAME. Spend a while thinking about the name. Call if FormatHostNamesAsPrettyString() or FormatHostNamesAsCSV(), you may have naming conventions, you may think these names are rotten, but if you at least start to think about the name, and why you like these, or don't, then you are heading in the right direction.

Now your function is dealing with ideas at a consistent level of abstraction. The reader is not going from thinking about the high ideas to low level details and back.

Readability
Part of the advantage of small, clear, and concise functions is reuse, but a huge part of the advantage is the improved readability of the code.

Scope
This loop that does "stuff" in a larger function also has a scope problem. It's not clear exactly what data it operates on. It could be using or modifying any local variables in that function. It's scope is too large.

Testing
It's harder to test. How do you write a unit test for a loop in the middle of a bigger function?

The Downhill Slope
Add another loop or two, and suddenly the function becomes very hard to follow. Now the function does not fit on the screen. Now you can't see where a variable is set at the same time as where you next use it.

Break the function up into smaller functions and then it's easier to test, each individual function has smaller scope, and is easier to read and debug.

Potential Trade-offs
The trade off is lots of small functions. This does have some overhead in terms of the CPU, Jumps, Cache Misses and so on. Badly done, you have to go to each function to look and see what it does. But if the name is good, and if the function does one thing, and respects it's name, you probably don't have to look in all the functions, you can quickly find the functions you need to know about.

In real terms, I don't think that I have ever sat down and gone through code and thought to myself - "If only (s)he hadn't broken the code up into such small functions". There are very limited and specialized areas where the time to make the extra function calls matters. But they are few and far between.

Thursday, September 15, 2011

I Could Just

You know the one, Someone comes over with a question, perhaps you've even done it yourself (many years ago).

The question usually starts out with a justification.... Some prevarication.... and then an explanation of why it's ok to do X in this case, even though normally it would not be acceptable.

"You see I need this data here, and I'd have to change all this code, so I could just.."

The phrase "I could just" is really the dead giveaway.

And everyone already knows the answer, the guy asking it knows, you know even as you hear the "You see" part.

The answer is inevitably "No, it's not OK".

Then there's "We'll go back and fix it later" - as though next week things won't be as busy as right now. If someone can explain to me why they think next week will be different than last week or the week before, fire away.

"I Could Just" is a wrong thing.

What he should be saying is "This is going to take longer than we planned, I just I'd let you know now".

edit - I've just realised that this is a repost of the same idea. It's still true though.

Tuesday, September 6, 2011

Singletons

Singletons are useful, but cause all sorts of problems, the are basically Globals with better PR.

We can however change Singletons a little to make them more testable and more maintainable.

Forget the MyDatabaseSingleTon.GetInstance() Function. Instead, use a Factory, IMyDatabaseFactory, which can return a database instance.

If you want to unit test some code, and pass in a Mock Database Instance Factory, which returns a Mock Database Instance to the client, happy days. Now your unit test does not need a real database.

If you want to later on update the code to use multiple database instances, you have the beginnings of the factory, and all your code knows about it. It becomes a smaller re factoring job.

It's the same old trade off, you introduce indirection == complexity and you get flexibility == maintainability.

Too much of that indirection can however introduce confusion != flexibility or maintainability.

The very structure of the code.

If I have a simple socket interface

public interface ISocket
{
public bool Connect(IPAddress address);
public bool Read(Buffer b, int size);
public bool Write(Buffer b, int size);
}

It can already be abused, The calling code could Read or Write without doing a Connect.

What if I change the very structure of the code to reflect the constraints I want to impose

public interface ISocket
{
public bool Read(Buffer b, int size);
public bool Write(Buffer b, int size);
}

public interface ISocketFactory
{
public IScoket Connect(IPAddress address);

}

Now, once you have an ISocket, you KNOW it's was at one point connected, It has an associated IPAddress.
The other side may have disconnected since, but that's a whole other matter.

We have now made it more or less impossible to read or write to a socket which is uninitialised.

Of course you could put the IPAddress in the constructor of the Socket, but for our nice generalised code base, that is a problem, since we now have to know which type of socket we want to create. Once we start doing "new IPSocket(serverAddress)" in our class, we can change to using some other type of connection, eg TestSocket() or QueueSocket() which we might want to use if the other side is in process.

Friday, September 2, 2011

Pull on a Thread

Multi-threaded programming has gotten a bad rep for being difficult. It's not easy, but we are often our own worst enemies.

You can't lock what you don't own
You can't lock something unless you own it completely. If someone passes you a list as a parameter, locking it is nothing more than a decoration. Someone in some other code somewhere will see that list, and won't even know your lock exists.

If you allow someone to get that list in any way, you can be sure they will either update it or iterate through it without bothering to lock it.

Avoid thread primitives in open code.
Don't create a dictionary and then lock the dictionary. Someday, someone will come in and update the dictionary without locking it. Instead Create, or use a Thread Safe Dictionary. This will encapsulate the locking and make it downright hard from someone to mess up your code.

Name lock objects with care. Lets say that you do write your own Thread Safe data structure. If there's a List in it that looks like this : List myObjects; then call the lock object myObjectsLock.
A member variable called lockObject is less than helpful when the new guys joins in 3 months and finds code accessing some of the data without locking "lockObject". So... did lockObject cover this list, or that one. Was that list created and used in one thread only, or only modified at startup.

Write it once, test it well.

Likewise, don't use monitor, pulse and wait in open code. Write or Use a Thread Safe Blocking Queue. Do it once, test it lots. It's remarkably easy to mess up that code. And the semantics of threading primitives are badly misunderstood. Even if YOU understand it perfectly, want to bet the new guy will? Or even the old guy who only has 1 day to understand the code, add a feature and push it to production?

Use Immutable Objects where possible

If your object is immutable, and it's not visible to other threads until it is fully constructed, then it does not need locking. Happy days. You still need to lock the collections of objects, or the references to them, but no the object itself.

Understand what happens when objects "go away"
A big problem in MT apps is unsubscribing from a publisher. Think carefully through what happens when you unsubscribe. Does the publisher still have a reference to your callback function? Can a callback be in progress as you unsubscribe? If you come up with a very elaborate mechanism to deal with this, it's probably wrong. Set up your listener so that you can tell it to stop listening before you unsubscribe it. Stop Listening, Unsubscribe, then you don't care about messages in flight.

Use Watchdogs
When threads die, they often leave a mess. This sometimes helps you find the problem. But it's nice to know if they hang too. Have a simple watchdog thread which your worker threads can signal each time around their loop to say "I'm Ok". If they stop signalling for a given amount of time you have a problem, but at least now you KNOW you have a problem.

Use Blocking Waits for Data
Avoid busy waiting loops. Use Blocking Waits for data with long timeouts, wait times in the order of seconds. If there's no data, fire off your "I'm still alive" message to whatever thread watchdog you have set up and then back to the blocking wait.
There's no need to do this 1000 times per second, unless you need to know if the thread has died within a millisecond. (If that's part of your requirements, then I hope all the above is old hat to you!)

A thought for the day:
If you can find nothing wrong with the code you wrote two years ago..... then you have not learned much in the last two years.

Thursday, July 28, 2011

TLA, FLA, FLA, SLA, SLA, ELA

A long time ago in a Company that wrote telecoms software I remember having a discussion that ran along the lines of

"So the SMSC sends an SRI to the HLR, which returns the MSC, Then we Send the SM to the MSC which queries the VLR....."

The conversation went on in that vein for some ten minutes as we discussed where exactly the problem was, what was not talking to what and why not.

To anyone without very current domain knowledge, it was Dutch, or Latin, or perhaps Cherokee, but either way it was hard to follow.

Our code would have been a nightmare for someone to try to follow, it was littered with variable names which were TLAs, FLAs and *FLAs, and some times they were context dependent.

In this day and age, most editors have variable name completion, and **memory is not an issue, so for the love of peace and harmony, and for the sanity of the new guy who starts on Thursday and has to try to figure out the code and fix bugs in it, with partial domain knowledge, and incomplete documentation, at least give the variables and functions sensible names.

(*FLA = Four Letter Abbreviation, unless it has been preceded by FLA, in which case the latter FLA means FIVE Letter Abbreviation)

(**I did work on a program ones, where I had to change all the variable names to shorter names to save memory. That was a long, long, long time ago, when 8k was a whole world of ram, and the program was stored as text, and continually interpreted, and goto was 2nd the most common keyword after if.)

Friday, July 22, 2011

Unit testing is hard....

The main reason that unit testing is hard it that it requires many of the things that good design requires.

Built in dependencies

It’s much easier to hard code things, strings, external dependencies, statics and so on. It’s more effort to coordinate gathering up the various external requirements for a class in a sensible place, and then pass the external requirements on to that class.

In c# for example it is easy to have a class directly call:

featureEnabledString = ConfigurationManager.AppSettings["EnableFeatureX"];

Of course, now in order to run unit tests, you need an App.config.

And if you want to write a test where that feature is disabled, you need to get creative.

Perhaps you set up a derived test class to set the value of base.featureEnabledString so that you can test the thing that needs testing.

That’s all well and good for adding tests to existing code, but now, you are testing your derived test class, as well as your production class. And they are more or less the same but not exactly the same. (That’s a pretty good definition of a bug, it’s more or less the same as I what I want, but not exactly.)

Complexity

As you class starts to do more and more things, it gets harder and harder to manage the testing.

As the complexity of your class goes up, and internally a, b and c all interact together, it gets harder and harder to manage testing.

But the underlying problem here is not testing the class. The fact that the class is becoming harder and harder to test is a really good indication that the class is too big and too complex.

If you can’t test it, it’s probably too complex to work in the real world too.

If you need to set up 20 different things to get a test to work, it’s probably too complex to work in the real world too.

Don’t shoot the Messenger

The unit testing is not the problem. It’s just bringing the bad news. Your code needs some reworking, refactoring, rethinking.

Zapping the unit tests, checking it all in leaving the mess for someone else is more befitting of an elected representative than a Software Engineer.

Wednesday, July 20, 2011

The class within

If you are ever working on a class, and it seems like a good idea to group a few functions and member variables together, stop.

Right about now, you have just reached the point where you should seriously consider breaking out the functionality into a separate class.

Of all the times I've looked through code (including my own code), I've usually had unkind thoughts about the author of classes that were too big. I don't often remember thinking "why did he bother creating a class for that?"

Wednesday, July 13, 2011

return bool

someFunction(true, false, false, true);

or

someFunction(Sensor.Enabled, Filtering.Disabled, Normalising.Disabled, Smoothing.Exponential);

I know which one is

easier to read
easier to debug
harder to get wrong when you call it
slightly more effort initially.

Tuesday, June 21, 2011

The iddy biddy interface

Tools like resharper allow you to extract an interface from a class.

At a simple level you can just pull out all the public functions. Happy days, look Ma, I'm programming to an interface.

But what does that really buy you.

Lets say I have my AnalogTemperatureSensor class, and I extract an IAnalogTemperatureSensor interface.

So now I have a function that takes in readings, and gets some form of summary data, say an AverageWithOutliersRemoved .

Then when I want the same functionality for my WaterLevelSensor, I find that it does not share the Functions to choose Centigrade or Fahrenheit.... Opps, I can't reuse my code.

Also, there's no clear way to see what functions from IAnalogTemperatureSensor the AverageWithOutliersRemoved function will use.

IAnalogSensor

On the other hand, If I had simply extracted an IAnalogSensor with functions like TakeReading(), IsReadingValid(), then my AnalogTemperatureSensor and my WaterLevelSensor could both implement this.

AnalogTemperatureSensor might also implement ITemperatureDevice, which allows me to choose C or F.

Several Small Specific interfaces are a whole better than one big nasty one.

Thursday, May 19, 2011

Error Handling - An exercise for the reader....

So many times I read through books about programming, and the examples always skipped nicely over error handling. It was left as "an exercise for the reader"....

Even in college, error handling was a poor relative to algorithms, data structures, methodologies and such.

Of course in the real world, things go wrong all the time.

Opps, the file system is read only, someone set up the permissions wrong on the share.
Database connection, ah, well you see we had to restart the database.
Config files, "I only changed one setting" and so on.

I can't tell you how to handle errors, because, as always, that depends. Rather, I can offer some thoughts.

When a program goes wrong, you need to be able to figure out where and why it went wrong. That is hugely important. A failure that neatly tells you what is wrong, in such a way as you can sort it out is almost friendly.
A program which locks or crashes without giving you the slightest clue as to what is going on (or not going on for that matter) may cause all manner unkind thoughts.

It's nice too if your program can survive the little problems. If a file is read only, and you cannot save your data, then crashing out is a little rude, allowing the user to save somewhere else is more polite.

A whole pile of your error will go toward these two straight forward goals. And then you are back to the other requirements of consistency, simplicity, and not having 50% of your code taken up with dealing with errors. But now that you have goals, you can plan for the bigger picture. Too often in my early days, I wrote a whole pile of error handling code without really considering quite where I was going. Classic can't see the Wood for the Trees stuff.

Some things to consider:

Don't log errors that are not errors. When something real happens and I'm looking through the log file, if I have to try to figure out the "errors" that happen all the time from the real errors, I may not feel much love for you.
Exceptions are expensive, throw them for exceptional events. Don't make them part of the normal control of you program.
When you log an error, log as much as you can, and log it consistently. Allow for the fact that you might want to process logs files automatically. Comma Separated Fields are your friend here. Grep then Cut and Paste into a spread sheet, and off you go.
Consider third party logging tools eg log4net
Log "interesting" things as information, so that you know that X happened just before the crash. You can't trust users to remember exactly what they were doing, they were busy just before things went west.

This just touches on some of the ideas that shape the bigger picture. You may have other considerations too.

Tuesday, May 17, 2011

The why of unit testing

When you sit down to start writing test code, it's sometimes hard to see the wood for the trees. It's hard to make sense of writing a unit test for a piece of code which is maybe 20 lines long, and you can run it as part of the app and see that it works.

So far so good. It works, who needs tests.

Well, what about after someone else changes the code? Will it still work?

What about after someone changes the code in another branch, and a source code control system automagically merges your code. My change in this branch, your change in that branch, and an automated system decides the changes don't conflict, and puts them both in the merged version.

You can of course check each merged file, say all 40 source code files that you changed this week. You might spot the place where our changes don't play nice. But better still, you can run the unit tests. All of them, hundreds of them built up over time.

And that's the reason why you write unit tests, even for code that's obviously right.

Because things change, other people work on the code, some of them are new to the code base, some of them are new to the language.

So only about 5% of the value of units tests are to make sure the code works now, the real value of the unit tests are to make sure the code still works later.

There's other reasons too, but that's another post.

Thursday, April 21, 2011

Design Patterns - The WHY

Over time I have had the pleasure and indeed on rare occasions the misfortune to interview many software engineers.

One thing that you learn in interviews is how poorly some basic ideas are understood....

Factories
A very common misconception is the humble Factory Class.

A lot of people seem to miss the point. It's often been described as a place to collect together constructors, as if bunching them together in a single file was a goal in itself.

It's the WHY question that matters. Why on earth would you use a factory class?

The answer is quite simple. It's all about reducing dependencies.

It's all about dependencies
If I have a class ShapeManager that understands and cares about shapes, but has no particular need to know about or care about the particular type of a shape, then I want to preserve that separation. I am prepared to go to quite some lengths to prevent that class ever knowing about a Square, Circle or Triangle. As soon as it does know about individual shapes all is lost.

For example, if I suddenly see the error of my ways and change all the Simple Regular Polygons to a single class, my ShapeManager should ideally remain unchanged. I should not have to go into ShapeManger and look for all places where I have a particular instance of Square, Triangle, Pentagon, etc. and change them. That way lies madness.

Loading Shapes
Somewhere it the ShapeManger, I may care to load in Shape from a file. I could code it directly, but then my ShapeManager needs to start knowing about the types of Shape in the file. Madness I tell you.

What if I pass in an IShapeFactory, in this case an instance of ShapesFromAFileFactory()?

Now my ShapeManager sees an Interface which has a Function GetNextShape(), which unsurprisingly returns a Shape.

Now when I add a Simple Regular Polygon, or change the way shapes are stored, or even move the storage of the Shapes to the database, ShapeManager remains unchanged.

And I can also pass in an ShapesForTestingFactory() in my unit tests and run the tests without any external dependence on databases or Files.

The ShapeManager code is now more testable, it's less fragile, and please don't underestimate this part, it's more understandable to people who know about standard design patterns.
That bit matters a lot in a large software team. Depending on schedules, any of a half a dozen engineers with a passing knowledge of an area may be working on a change or a fix. If they see a Factory and know from roughly what is going on, then there's less "figuring out" to do before they can get started.

Monday, April 11, 2011

The State of things

The humble state machine has been implemented many many times, but enums are no longer the way to do it.

At the heart of things, you have some information about our current condition or state, and you have things that can happen, ie Events.

Some Events are valid for our current state, some are not.

If we use an object to record our state, then the object can have member functions to deal with events.

So, if we start with an AbstractState which cares about Enable, Disable, Pause, and Continue events. We give it virtual functions for each of these and in each case, we throw an "Invalid Event For State" exception, detailing the Name of the Current State and the Event.

Our Enabled State, derives from this, and implements Disable, and Pause.

IState Disable() {

return DisabledState; // this is just a static instance.

}

IState Pause() {

return PausedState; // this is just a static instance.

}

If you try to "Continue" an Enabled State, then you get an exception

For each state, you override the functions to deal with the possible events.

To process an Enable event, your code then calls

currentState = currentState.Enable();

And so on.

Do exercise care in deciding that you cannot go from state A to state B. Someone may take a personal dislike to you and all your children based on some artificial restriction that you enforce.

Friday, April 1, 2011

Names ending in 2

It's almost always a wrong thing to have a variable, class or function whose name ends in 2.

class SensorReadingTransform sensorReadingTransform;

class SensorReadingTransform2 sensorReadingTransform2;

Anyone care to make a guess as to the difference between the 2 classes above.

Does this code "speak to you"

Are you going to have to go look in the class to have any idea what the difference is?

If I'd renamed the first one, and created the second one as follows

class SensorReadingLinearTransform sensorReadingLinearTransform;

class SensorReadingNonLinearTransform sensorReadingNonLinearTransform;

Now you have at least a starting point. Yes, that's what I need, look further, or no, that's ok, I need to look somewhere else.

And truly, if the best name you can come up with is "Foo2" then that says a lot about the need for some rethinking.

Thursday, March 31, 2011

Simple Rules for Source Code Control.

Only check in one functional change at a time. Then when someone else has to merge the changes, or to undo one of them, or to cut a release, they will not be forced to question your progeny.

Add function overloads after the original function. Otherwise diff and merge can make an unholy mess of figuring out what changed where. Especially if anything else in the original function changed.

Put in a decent comment in the source code control. Yes someone can look at the files to see what you did, but the poor fool may be merging 6 months of work in one branch with 6 months of work in another. They really don't have time to play detective. You may even end up being that poor fool, and not remember every change that you coded 6 months ago. "Fixed bug 102312" is not an acceptable comment. Do you really want to have to log on to the bug tracking system and chase up each bug by reference number while you sort out an interesting part of a merge.

Don't manually apply the same change to 2 different branches. There is a special place in hell for programmers that do this. Use the Source Code Control System to push your change from one branch to another. Then the SCCS will have a history, and the shared base file will be updated. It may seem like a Royal PITA, but it will not be nearly as much as a PITA as the merge will be if you manually apply the same change to two branches, especially if you have add the functions in different places, or if the formatting is different because when you CUT & PASTE a function, the Environment automagically formats it.

SCCS should be a mandatory class in any degree that has any pretentions to Software Engineering. And one of the keystone projects should involve merging 2 code branches and producing a report of good practices that would make the merge easier. Ideally give each student changes to make to a branch, and randomly pair them and make each student merge to the other branch*.

(*Hey I never said I was a nice person)

Tuesday, March 8, 2011

Let the compiler do the work

If you have data that can be transposed it will surely be.

Look Mizuho, Instead of selling 1 share at ¥610,000, they sold 610,000 shares at ¥1 each.

A function that takes two parameters CalculateArea(double length, double width) will eventually be called with width and length. In this case, for most geometries that we care about, no harm done.

But what about some process function like :

void AddDilutedAcid(double ml, double PartsPerMillion)?

The following will compile, run, and cause all sorts of problems.

void MixUpDangerousChemical()
{
     double ml = getVolumeToAdd();
     double ppm = getConcentration();
     AddDilutedAcid(ppm, ml); // Nicely transposed. Should be ml, ppm
}

It would be nice to find that bug at compile time, not at runtime.

What if we had a simple Concentration class. At it's simplest, it would just contain and expose a double.

public class Concentration{
   public double ppm {get;set;}
}

void AddDilutedAcid(double ml, Concentration PartsPerMillion) {...}

void MixUpDangerousChemical()
{
     double ml = getVolumeToAdd();
     Concentration ppm = getConcentration();
     AddDilutedAcid(ppm, ml); // This throws a compiler error now !!
}

Let the compiler do the work. It always pay attention. Always.

Thursday, March 3, 2011

I'll Just....

If an electrician came to your house, and offered to save you running new wiring by using the fact that the Aluminium Window frames are pretty conductive....

So why do we do the same thing in software?

"I'll just add this feature in here" translates directly to "Man is this ever gonna be a mess in a few years time".

It's an invariant. The words "I'll Just" almost always precede a short term fix that will eventually pile up with all the other "I'll Just" fixes until the whole thing becomes a tangled mess of surprises.

You look in a function and find that it's also doing something else unrelated. "Ah yes, they wanted to be able to do.... so I just added it here"

There's a name for this one, "accidental coupling", and no, that has nothing to do with being drunk in a nightclub. Things get tied together because they happen together by coincidence.

Of course, later someone wants to do them separately. And in code, surprises are almost* never good things.

The PrintInvoice() function also sets the credit limit. Who'd have thought that? "Well we did that because we usually set the credit limit then..."

It was a wrong thing to do, and any and all justifications don't change that. Great, You've just used the Aluminium window Frame to carry the current to the Outside Lights. Nice.

There is a minimum amount of work that is required to do something right. You cannot avoid that work forever. You can put off that work, but when it comes to tidying up all the shortcuts, you may not like the interest and penalties that build up on all the work you should have done up front.

*Almost Never ~ ok, let's just say Never

Thursday, February 24, 2011

Ask Specific Questions....

You might think the two if statements below are the same thing...

But if myCollection conforms to an interface, which has both Count and IsEmpty properties declared, then use IsEmpty if you mean IsEmpty.

I'm not for a moment suggesting that Count==0 will not reliably determine that the collection is indeed empty. But what if the object that implements the interface is a linked list, which implements count by iterating over the nodes?

No-one would do that. Would they?

What if it was some form of virtual collection.... Count may be quite expensive. And all you care about are two cases Count == 0 and Count > 0.

It's an unlikely side effect.

But in general, if someone has gone to the trouble of giving you a specific function, use it. They may have had something in mind when they wrote it.

Your code will be, if anything slightly clearer, there's very little down side.

Say what you mean

Prefer Specific things.
Like Empty

It's a bit better than

It's most likely implemented as a static, so you don't have to allocate an empty list.
It shows clarity. You intend to return a empty collection. No one will ever wonder if you had omitted to add something to the list.

YAGNI

You Ain't Gonna Need It

Or perhaps, Einstein said it better "Make everything as simple as possible, but not simpler.".

There are many tools at our disposal when we write code. And once we get to know about a new tool, we tend to use it. We often use it a lot. Too much in fact.

To make it just that little bit more confusing...

Some people seem to have developed a habit of reversing comparisons to keep the Constant at the front.

So

If (itemCount > 0) {... becomes If (0 < itemCount) {...

Yield

Yield - does not play well with recursive algorithms

In c# they introduced a thing called yield. If you have not come across it before, then you can do such things as this:

public override IEnumerator<T> GetEnumerator()
{
   int count = Count;
   for (int i = 0; i < count; ++i) {
   yield return this[i];
   }
}

And now, c# automagically creates an enumerator object and manages the state of this enumerator for you. There's a small cost, but usually it does not matter.

The Confusion Index

At some point many months after you write you code, you will have to go back and change it. Or someone else will. And then a while later, someone else will change it again.
That someone will be picking their way through your code to find why it is broken, or to find the place to add a new feature.

As they pick their way along, your code may be clear, concise, or it may be misleading and confusing.

How Big should a Class be?

It's a common question - how big should the class be? Should I break it into smaller classes? Am I making things overly complex by splitting it out.

I will say that I am not sure I have ever seen source code where the classes were consistently too small.

Let us take a trivial example:

Say we have sensor, and it holds it's readings in a list. (I know there should be timestamps etc, but we are limited by how much I'm prepared to type, and by how much you will read)

Change

Way back a long time ago, I wrote some code to use a neural network to deal with the Kinematics of a robot arm in 2 dimensions. In nearly 20 years, no-one has ever needed to change a single line of that code.

However, that is entirely because it was my final year college project, and it has not seen the light of day since I showed it to the professors all those years ago.

If people use your code, then the requirements will change. And you will have to change the code. This is as certain as any physical law, and as true as any mathematical proof. It is unavoidable.

Don't reformat code....

So, the old brackets argument....

if (x==0) {
// Do Something here
}

or

if (x==0)

{

// Do Something here

}

Well here's the answer. They are both fine. Leave 'em be.

And here's why.

It depends....

Some details may have been changed to protect the ~~innocent~~ naive.

"So all I wanted was a simple program to send a formatted message over a particular network protocol. Just a little test program.
No problem. I have code that builds that message, it's in a utility library. It'll just take a moment.

Names are powerful things.

Read the list of colours, but try to call out the colour, not the word.

Red, Blue, Green, Yellow, Green, Red, Blue, Yellow

Names are powerful, they live in our head, and tell us things, we believe them, even against strong evidence.