One of the tasks I’ve set for myself this month is to document an application that is complex and undocumented. Not only is there no technical documentation for this application, but the specifications and requirements that determine whether or not it is working correctly also were never written down.
The yardstick used to measure whether this application is working correctly is: Do the customers think it is working correctly? And there are dozens of customers, who have been using the application in one incarnation or another for more than fifteen years.
In the beginning, when this application was first born, its internal technical architecture might have made sense. It might have had a logical flow. It might have been self-explanatory, and the emailed specs and hallway conversations that lead to its creation might have been fresh in the minds of the development team, who never imagined that they would forget what job the application was intended to do, and how it was intended to accomplish it.
Moreover, those original developers did not imagine that they would have since left for positions in other departments, or even left the company altogether, and that the application would pass on to other, different developers, who did not have the benefit of those original hallway design discussions and long-since deleted emails.
And so I find myself today looking at something that is not so much an application as a tapestry, woven together by different weavers over the course of a decade and a half. During that time, each successive developer strove to make whatever changes were mandated by the application’s users in the least intrusive way possible. Often, this meant that instead of modifying a function or a block of code, the developer would add a new function or block of code and an "if" statement. This helped to ensure that the change applied only to the behavior that the developer was intending to change, but those additional functions and blocks of code were also not documented, and the size of the application grew and grew.
Redundant code and code that no longer has a purpose are not the only crocodiles lurking in this swamp, though. Over the years, a few requirements have come along for applications that were similar in function but not exactly the same as this one.
Rather than create a new application to serve this disparate need, a decision was made to add more ifs and more blocks of sometimes-executed code to this one.
Because the code is now so convoluted and difficult to understand, it is difficult to test. Modern changes result in unexpected functionality that is only discovered by customers, who then report bugs. This is primarily why I am creating a technical document that describes what it does and how it works.
However, looking at the application now, I can’t help but think the application could have been written in a different way, one that would make it much easier to understand and maintain, and which might even have eliminated the need for the document on which I am now working. I think incomprehensible code can be written in any language, so here are my language / technolgy agnostic recommedations in retrospect.
First, the application could have been broken up into a core application, and libraries. This helps by making the core application logic easier to follow, it helps with testing, as functions implemented in libraries have defined interfaces and don’t use application-wide global variables. It also helps with reuse. If the application had been implemented in this way, when the requirement for a similar but slightly different application came along, that application could indeed have been a different application, and just shared the libraries.
Second As I alluded to above, application-level global variables are the devil’s pool balls. They make it very difficult to determine what changes what when, and to track down dependancies between functions. This has been considered to be a bad coding practice for a very long time.
Third It goes almost without saying that more in-line comments would have helped tremendously, especially when the flow of the application was branched to accomodate some mysterious requirement. I’d like to combine this with use of liberaly labeled constants in place of magic numbers (and letters).
Fourth Documentation of expected functionality. This is a nice-to-have, and it depends a lot on having good support for documentation, such as a well maintained and supported wiki. Documentation for each change that had made to the application over the past fifteen years, along with expected results and testing scenarios, would do much to eliminate bugs caused by programmers who guess wrong about how some functionity is supposed to work.
Fifth Avoid copying and pasting of code. Way too often I see code that has just been lifted and copied from somewhere else in the code, then modified slightly. It’s been my experience that this kind of practice leads to typo-bugs as it’s too easy to just turn your brain off and press Ctrl-C, Ctrl-V. Instead, the code should be moved to a function. You can copy paste and modify the function if you need. I see this a lot for in-line sql statements.
Sixth And that brings me to in-line sql statements. It’s very convenient to be able to build SQL on the fly, but it’s hard to optimize the performance of those in-line sql statements as the application scales. Plus there is the danger of sql injection hacks, not to mention that automatic testing of in-line sql statements using something like MSTEST or NUNIT is very difficult.
In conclusion, I’m aware that all of these suggestions require managment support as they take time to implement. Too often, the edict passed to the programmer is "get this done ASAP, the customer is depending on it!" and the result is code sludge, and eventually a task like the one I’m working on now, to map every tree stump and lilypad in the swamp.