Donald Knuth, one of the true great minds in the field of computer science, once wrote the following in his article “Structured Programming with go to Statements”:
“We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%. A good programmer will not be lulled into complacency by such reasoning, he will be wise to look carefully at the critical code; but only after that code has been identified.”
What he means is that programmers often start optimizing code before they truly understand its performance characteristics, and as a result optimize in the wrong places: not where the actual bottlenecks, the critical code, are. They end up wasting time rewriting code for no good reason, while also compromising the design, readability and maintainability of the code in search of more performance.
While he stresses that optimizing is a good thing (an engineer should always strive for best efficiency), and trading a bit of readability/maintainability in the critical code for better performance is a good tradeoff, the key is to make sure that you do it in the right place.
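Finding that “right place” is a measurement problem, not a guessing problem. As a minimal sketch (the function names here are invented for illustration), Python’s built-in cProfile can show where a program actually spends its time, so that only the genuinely critical code gets optimized:

```python
import cProfile
import io
import pstats

def slow_part(n):
    # deliberately inefficient: repeated string concatenation in a loop
    s = ""
    for i in range(n):
        s += str(i)
    return len(s)

def fast_part(n):
    # already efficient; optimizing here would be wasted effort
    return sum(range(n))

def program():
    fast_part(50_000)
    return slow_part(50_000)

# Profile one run of the program and print the most expensive calls.
profiler = cProfile.Profile()
profiler.enable()
program()
profiler.disable()

stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats("cumulative").print_stats(10)
report = stream.getvalue()
print(report)
```

The report ranks functions by time spent, which is exactly the “identification” step Knuth insists on before any rewriting begins.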
Although Knuth wrote this back in 1974, it is still very relevant today (as is a lot of his work). However, the world of software development has changed, and so has the average programmer (in fact, the term ‘programmer’ is not used all that often anymore; these days you hear terms such as ‘developer’ or ‘software engineer’). Optimization is not such a hot topic anymore for most programmers. On the one hand, we have much faster computers these days, so we reach the point where performance is ‘good enough’ much sooner. On the other hand, our operating systems and programming suites come with a lot of functionality bundled in pre-optimized libraries, which means that many of the bottlenecks have already been optimized for you. For example, writing an efficient decoder for video formats such as MPEG is quite difficult… but most programmers who need a video player in their software will not write and optimize their own. They will just use a standard pre-optimized library, which solves the critical code for them.
In fact, I believe that optimization is an art that is mostly lost on the new generation of programmers. As Donald Knuth describes, a programmer’s style changes and grows more refined over the years. As you optimize code, you learn from both your mistakes and your successes: you avoid repeating earlier inefficiencies, and you recognize situations for which you have already developed an efficient solution. This means that over time, your initial, ‘unoptimized’ code becomes more efficient as well. But you can only learn this by actually doing the optimization work and pursuing efficiency, and that is not something a lot of programmers do anymore, since most of the time they don’t have to. I often see, when a programmer needs to optimize something, that they don’t really know where to look or what to do.

You can’t learn to optimize code in just a few days; it’s an ongoing process. You need to be familiar with the performance characteristics of a lot of things: basic looping logic, the approximate cost of different operations (e.g. multiplication is faster than division), CPU caching, different data structures and algorithms, and so on. And be aware that these are moving targets as well. New advancements in CPU design might change the rules on what the fastest methods are. Currently we are seeing the move from CPU to GPU, where GPUs can be considerably faster than CPUs but generally require vastly different algorithms; translating a fast CPU algorithm to the GPU is not likely to work well. In a way, optimizing is a way of life.
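To make the multiply-versus-divide point concrete, here is a sketch of a classic strength-reduction rewrite: hoisting a division by a constant out of a loop and replacing it with multiplication by the precomputed reciprocal. (The function names are invented for illustration; modern compilers often perform this rewrite automatically, and note that for divisors that are not powers of two the two versions can differ in the last bit.)

```python
def scale_div(values, divisor):
    # straightforward version: one floating-point division per element
    return [v / divisor for v in values]

def scale_mul(values, divisor):
    # strength-reduced version: a single division up front,
    # then one (typically cheaper) multiplication per element
    reciprocal = 1.0 / divisor
    return [v * reciprocal for v in values]

# For a power-of-two divisor the two versions agree exactly,
# since 1.0 / divisor is then representable without rounding.
print(scale_div([2.0, 6.0, 10.0], 4.0))  # [0.5, 1.5, 2.5]
print(scale_mul([2.0, 6.0, 10.0], 4.0))  # [0.5, 1.5, 2.5]
```

The point is not this particular trick, but the habit: knowing the relative cost of operations lets you spot such rewrites in the critical code.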
But what about design?
What I really want to talk about today is not optimization however, it is design. I want to draw a few parallels with the situation of optimization. As I said, programming isn’t about optimization anymore, these days. What it is about, is design. Software design/architecture/engineering is a very important topic in today’s computer science courses. Which is only right, since good software design is the key to good software. Where ‘good’ can mean a number of things, such as maintainability, extensibility, longevity, and performance, to name a few.
However, although the younger generation seems to be aware of the need for good software design, and is familiar with various approaches to software engineering, I feel that they often do not quite understand the history of software engineering, or the state of software engineering today.
I think a reference to another famous old article on software is appropriate here: Frederick P. Brooks Jr.’s “No Silver Bullet: Essence and Accidents of Software Engineering”, from 1986. By that time, it had already become apparent that new software technologies came and went every few years. People would often see these new technologies as ‘silver bullets’, i.e. as solving problems orders of magnitude faster than was possible with the ‘old’ technology. Such new technologies would often arrive with a lot of hype, and programmers would embrace them… only to find out after a short while that they did not really deliver as promised.
In fact, one could argue that adopting new technologies too soon and too often will only diminish productivity, as the programmers need time to adjust to the new technology, rather than using their gathered experience with previous technologies (both mentally, and also in modifying existing code to fit the new technology).
Although Brooks’ article is more than two decades old, much of what he writes still applies today. I feel that many programmers, especially the ones who just start working after finishing their education, still have this idea of a ‘silver bullet’. That the design methods they were taught are the right way (perhaps even the only way they know?), perhaps even foolproof. When they start developing a new piece of software, they start by trying to design the entire program as a whole.
They may come up with an elaborate design, which may look quite good on the surface (you will see a lot of use of familiar textbook design patterns), but there is no guarantee that it will even work, let alone that it is any good. Besides, even in the unlikely event that you get the design right the first time, you can never be sure that the requirements you based the design on were the right ones. This is what I would call Premature Design, and I see a lot of similarities with the aforementioned premature optimization: design is being done before the requirements (not only functional, but also technical) have been properly identified. You should also understand that methodologies are better treated as ‘guidelines’ than followed to the letter. You should decide where and when certain aspects of a methodology may or may not be relevant. Don’t follow a single methodology by the letter; rather, get acquainted with a range of proven methodologies, and take bits of each where and when required. Writing applications is a creative process, so allow yourself to be creative.
What you should be doing is prototyping. ‘Grow’ the software, evolve it. Especially with less-than-trivial technologies, there often is no way to create a detailed design of the inner workings, without understanding the technology first. For example, if you need to interface with a piece of hardware… How do you communicate with the hardware? Is it event-based? Does it require polling? Do you need multiple threads to handle the input/output? Etc. Such issues can massively change how you design the software around it, so you need to know how the data is supposed to flow beforehand.
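As a sketch of what such a prototype might look like, suppose the hardware turns out to require polling. The device here is faked, and `read_fn`, the queue, and the polling interval are all invented for illustration; the point is that a few dozen lines answer the data-flow questions (own thread? queue to the rest of the program?) before any larger design is drawn up:

```python
import queue
import threading
import time

def poll_device(read_fn, out_queue, stop_event, interval=0.001):
    # Poll the (hypothetical) device until asked to stop,
    # handing every sample to the rest of the program via a queue.
    while not stop_event.is_set():
        sample = read_fn()
        if sample is not None:
            out_queue.put(sample)
        time.sleep(interval)

# Fake device standing in for real hardware: emits increasing integers.
counter = iter(range(1000))
def fake_read():
    return next(counter, None)

samples = queue.Queue()
stop = threading.Event()
worker = threading.Thread(target=poll_device, args=(fake_read, samples, stop))
worker.start()
time.sleep(0.05)   # let the prototype run briefly
stop.set()
worker.join()

received = []
while not samples.empty():
    received.append(samples.get())
print(received[:5])  # first few samples collected by the polling thread
```

Running this immediately tells you whether a dedicated polling thread plus a queue is a workable shape for the real system, which is exactly the knowledge the up-front design was missing.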
I always build a small prototype application first, which concentrates only on getting such core functionality working and finding out what’s what. This generally takes only a few days at most… and once it works, you can actually create a design with relevant knowledge of the technical requirements. (I actually got some pretty strange looks when I started on such a prototype, and someone told me “Shouldn’t you draw it out first?”, to which I simply answered: “I don’t work that way”. I don’t think he quite understood it.) You will know how to fit this part of the functionality into a larger framework effectively. Without this knowledge, you would likely have painted yourself into a corner with the earlier design, and by the time you caught the mistake, you would probably have had to redo a lot of the design and programming. Writing a prototype first will save you a lot of time.
As Brooks also mentions, having a working prototype is quite motivational. The project actually comes alive to the programmers, and they will be more enthusiastic about working on it. I also believe that once a part of a program works, it is easier to keep it working: if a new change breaks it, you can always look back at an earlier working version and find out where things went wrong. I personally have always felt uneasy about making large changes in one go. (Admittedly, some algorithms or optimizations by nature cannot be implemented in a step-by-step approach.) However, I find that many programmers just write a lot of code first, and then try to jump-start everything in one go. I think this should be avoided whenever possible, because of Murphy’s Law: “Anything that can go wrong will go wrong.” You will find yourself having to debug a lot of functionality at the same time, which gives a combinatorial growth in complexity. Even the best and most experienced programmers will often make some small mistakes in their first draft of the code, such as off-by-one errors here and there. These can be caught easily in isolated cases, but if you have to test a lot of code at once, the errors may propagate through the system in ways that are nearly impossible to track down.
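A tiny invented illustration of the kind of slip meant here: a first draft of a sliding-window sum with an off-by-one in the loop bound. Tested in isolation against one small case, the bug is obvious; buried inside a large untested system, its effect could surface anywhere downstream:

```python
def moving_sum_draft(xs, k):
    # first draft: the range stops one window too early (off-by-one)
    return [sum(xs[i:i + k]) for i in range(len(xs) - k)]

def moving_sum(xs, k):
    # corrected: len(xs) - k + 1 windows of size k fit in the list
    return [sum(xs[i:i + k]) for i in range(len(xs) - k + 1)]

# One small isolated case exposes the draft's missing last window:
print(moving_sum_draft([1, 2, 3, 4], 2))  # [3, 5]    -- last window missing
print(moving_sum([1, 2, 3, 4], 2))        # [3, 5, 7]
```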
By growing your software from working code, you can test and debug new code more easily, which means progress will be faster, and the project as a whole will have more reliable code. You will maintain a solid foundation throughout the project.
So to conclude: the problem is not so much that programmers aren’t taught the tools to design software, as that they aren’t taught when and how to use (or not to use) those tools. This is something you only learn by experience. It also helps to look at the past and read what people with a lot of experience have already said on the subject. As you can see, even the two very old articles I have referenced still carry an important message for today’s programmers. I hope that such knowledge will be ‘rediscovered’ by the new generation. The other thing is to give yourself the space to gain that experience. I think that is what the new generation should do more: when there’s a bit of technology that you’re not familiar with, isolate it and build a prototype application. Go out and experiment with a prototype from time to time; don’t worry about playing by the rules, and don’t be afraid to make mistakes. That is how you will gain that experience.
Update: I have found the blog of another programmer who has run into the trap of ‘premature design’ himself, and is now very much aware of this problem and sees it often as well, especially with junior developers. They use generalization techniques and design patterns without fully understanding the consequences. As he explains: you need to know the domain before you can make the generalization; you cannot anticipate all the aspects of the problem. He also suggests first making an implementation to get to know that domain, like the prototyping I advised, and using evolution… Start with working code, and generalize when you learn where and how it is appropriate.
He calls it ‘premature generalization’, but the problem is very similar to the one I have described here: http://myossdevblog.blogspot.com/2009/03/premature-generalization-is-root-of-all.html