On effort estimates and productivity

From the Erlang mailing list

Joe Armstrong

Once upon a very long time ago we did a project to compare the efficiency of Erlang to PLEX.

We implemented “the same things” (TM) in Erlang and PLEX and counted total man hours.

We did this for several different things.

Erlang was “better” by a factor of 3 or 25 (in total man hours) – the weighted average was a factor of 8.
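
As a rough illustration of how such a weighted average could be computed, here is a minimal Python sketch. The task names and hour figures are entirely hypothetical (the actual per-task numbers are not given here), and weighting each task's factor by its Erlang man hours is an assumption, not necessarily how the original figure was derived.

```python
# Hypothetical figures only: the real per-task hours are not given in the post.
tasks = [
    {"name": "task A", "plex_hours": 300, "erlang_hours": 100},
    {"name": "task B", "plex_hours": 500, "erlang_hours": 20},
    {"name": "task C", "plex_hours": 800, "erlang_hours": 80},
]


def factor(task):
    """How many times more man hours the PLEX version took."""
    return task["plex_hours"] / task["erlang_hours"]


def weighted_average_factor(tasks):
    """Average of the per-task factors, weighted by Erlang man hours.

    The choice of weight is an assumption; with this weight the result
    equals total PLEX hours divided by total Erlang hours.
    """
    total_weight = sum(t["erlang_hours"] for t in tasks)
    weighted_sum = sum(t["erlang_hours"] * factor(t) for t in tasks)
    return weighted_sum / total_weight


for t in tasks:
    print(t["name"], "factor:", factor(t))
print("weighted average factor:", weighted_average_factor(tasks))
```

With these made-up inputs the individual factors are 3, 25 and 10, and the weighted average comes out at 8, which shows how a handful of very different per-task ratios can collapse into one headline number.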

They asked “what is the smart programmer effect?”

We said “we don’t know”.

We revised the figure 8 down to 3 to allow for “the smart programmer effect” – this was too high to be credible, so we revised it down to 1.6. (The factors 3 and 1.6 were just plucked out of the air with no justification.)

Experiments that show that Erlang is N times better than “something else” won’t be believed if N is too high.

The second point to remember is that you *never* implement exactly the same thing in two different languages (or very rarely) – the second time you do something you have presumably learnt from the mistakes made the first time you do something.

If you implement the same thing N times in the same language, each implementation should take less effort and code than the last time you did it. What can you learn from this?

The difference in programmer productivity can vary by a factor of 80 (really it’s infinity, because some programmers *never* get some code right, so the factor 80 discounts the totally failed efforts). So given a productivity factor you have to normalize it by a factor that depends upon the skill and experience of the programmer.

There are people who claim that they can make models estimating how long a software project takes.

But even they say that such models have to be tuned, and are only applicable to projects which are broadly similar. After you’ve done almost the same thing half a dozen times it might be possible to estimate how long a similar project might take.

The problem is we don’t do similar things over and over again. Each new unsolved problem is precisely that, a new unsolved problem.

Most time isn’t spent programming anyway – programmer time is spent:

a) fixing broken stuff that should not be broken

b) trying to figure out what problem the customer actually wants solving

c) writing experimental code to test some idea

d) googling for some obscure fact that is needed to solve a) or b)

e) writing and testing production code

e) is actually pretty easy once a) – d) are fixed. But most measurements of productivity only measure lines of code in e) and man hours.

I’ve been in this game for many years now, and I have the impression that a) is taking a larger and larger percentage of my time. 30 years ago there was far less software, but the software there was usually worked without any problems – the code was a lot smaller and consequently easier to understand.

Again in the last 30 years programs have got hundreds to thousands of times larger (in terms of code lines) but programming languages haven’t got that much better and our brains have not gotten any smarter. So the gap between what we can build and what we can understand is growing rapidly.

Extrapolating a bit, I guess a) is going to increase – so in a few years we’ll have incredibly smart devices which almost work, and when they break nobody will be able to fix them, and programmers will spend 100% of their time fixing broken stuff that should not be broken.

And now I have to figure out why Firefox has suddenly stopped working – something is broken …

Cheers

/Joe

Mahesh Paolini-Subramanya

There are – at least – four orthogonal areas in which your software gets developed, each of which has different metrics for estimation, tracking, progress, etc., etc., etc. To throw some semantics at this:

1) Technical

When the specifics of the solution are clear, and it pretty much boils down to implementation. “I need to replace the valve on my Hot Tub” (With the same model, I’ve already done it once before, etc., etc.)

2) Engineering

When you need to solve the problem first, before implementing the solution. There _is_ a body of knowledge to draw on. “I need to install my Hot Tub” (Hmmm. There is no water line going out there. How do I get one? What about the electrical circuit? Do I need architectural permission? etc.)

3) Science

You need to invent a new class of solutions for the problem at hand. “I need to install my Hot Tub in an underground bunker in the marshlands of Florida” (How do I build an underground bunker in the marsh? Maybe I can freeze the ground and pour concrete? How do I keep the concrete from sinking? Hmmm. Time to start running experiments?)

4) Art

You need to get the intangibles correct, viz., is it maintainable? Supportable? Documented? Elegant? “Will some future Significant Other like my paranoid underground-bunker hot-tub?”

Note that each of the above is _different_.

– It’s fairly easy to Manage / Maintain / Monitor “Technical” work.

– There is a reasonable body of knowledge that helps in doing the same for “Engineering” stuff.

– For “Science”, it’s all pretty clearly made-up (the fusion reactor!).

– For “Art”, well, it really _is_ in the eye of the beholder.  (Good Documentation? For whom? What do you mean by “Good”? *I* understand it! And so does Jane!)

Which brings me back to the original point – much as we would like it to be that way, software pretty much never fits neatly into one of the buckets above – it is some combination of the four, with different parameters for {T, E, S, A}.

What’s worse, there is a time-variant aspect to this too – and the parameters are inter-related. For example, different “Engineering” solutions have different “Technical” impacts. In short, your development process is actually f(T, E, S, A, Time).

All this being said, there is a time-honored Academic way of solving f(T, E, S, A, Time), which basically consists of wishing away the unknowns (or making unrealistic assumptions about them) and then spending an inordinate amount of time on the remaining parameters. With some appropriate tweaking of parameters, one can quite successfully make this match some “real-world” results, which can then be trumpeted widely. Any failures can be blamed on the actors (ho-ho. “Actors”. In an Erlang post. I crack myself up.) who were clearly not qualified.

This may very well be the best option?

Cheers

Mahesh Paolini-Subramanya