Where do bugs come from?
“Our Software Sucks”
I was once party to a contentious meeting. A senior executive made the statement that “our software sucks!” When he said this, he was referring to all of the projects worked on by all of the developers at the company. He based this assessment on two points: first, that there were many bugs logged in our bug tracking software, and second, that other executives he met with, representing customers who used our software, were complaining to him about those bugs.
I found the statement a little offensive, although he wasn’t referring specifically to the project on which I had worked. At the time the meeting was held I was just finishing up a difficult project. I had worked hard over nights and weekends, worried about obscure performance issues in the morning shower, spent tense, nail-biting hours waiting to hear from testing that particularly complex and difficult to define issues were at last resolved. Did my work “suck”? I did not think so.
The application was complex, with obscure requirements. Users of the application interacted with external systems, and during development those systems were themselves under development. Documentation of the APIs supplied to us for those external systems was frequently out of date, and sometimes APIs were introduced but the servers weren’t updated. So there were times when our customers attempted to use our application and it crashed because the APIs were out of sync. For each crash, it wasn’t always clear whether the application or the external system was at fault. There were testers on my team, but customers were also testing our integration with external systems to which we didn’t have access. Over the two years that development was underway we worked closely with each other, regarding each issue as community property that we worked together to identify and resolve.
The system at last entered production, with most issues resolved. We had fought a tiger and won. Our software worked, it was speedy and flexible and easy to maintain. All of us who were involved felt proud, whether we worked primarily with the external systems or were involved in the application’s development.
At this company there were many projects underway like the one that I worked on. This was pretty much our business.
Where do bugs come from?
In my professional experience, and certainly for the project that I was working on at the time, there are four primary kinds of bugs:
- A true bug, which is a software function that is designed to do something and does not do that thing. For example, a point of sale system at a restaurant might be designed to suggest a tip, and instead of showing 15% of the value it shows 150% of the value.
- There are also misunderstood or unstated requirements. The same point of sale system might be coded to show a suggested tip based on 15% of the after-tax value of the meal. Testers might log a bug stating that the suggested tip should be based on the pre-tax amount. If this is not specified in the requirements, and is not communicated to the programmer, it may be a bug, but it is a requirements bug, not a software bug, and does not reflect on the quality of the programmer’s implementation. Performance issues, where a programmer thinks that software needs to support 10 transactions per hour but what is really needed is software that can support 1000 transactions per hour, fall into this category.
- There are training and documentation issues. This is particularly an issue with complicated software that has lots of options and a UI that is not self-explanatory. Sometimes when a customer thinks there is a problem with the software, the problem really is that they’ve chosen the wrong options.
- Finally there are bug-quirements, where users are unclear about the distinction between a bug and a new requirement. In the point of sale system example, a user may submit a bug that the percentage displayed cannot be adjusted and that it’s always a fixed 15%. This may be a bug, but it may also be a new requirement. To settle which it is, the programmer should defer to the business process owner and his or her manager. A bug-quirement like this may stay in the bug tracking system but at a low priority.
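To make the distinction concrete, here is a minimal sketch of the point of sale tip example in Python. The function names, the `include_tax` flag and the sample amounts are all my own inventions for illustration; they don't come from any real system.

```python
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")


def suggested_tip(pre_tax, tax, rate=Decimal("0.15"), include_tax=False):
    """Return the suggested tip, rounded to the nearest cent.

    Whether the tip is based on the pre-tax or after-tax amount is a
    requirements question, so it is an explicit (hypothetical) parameter.
    """
    base = pre_tax + tax if include_tax else pre_tax
    return (base * rate).quantize(CENT, rounding=ROUND_HALF_UP)


def buggy_suggested_tip(pre_tax):
    """A true bug: the rate was meant to be 15% but was coded as 150%."""
    return (pre_tax * Decimal("1.5")).quantize(CENT, rounding=ROUND_HALF_UP)


# Made-up sample values, for illustration only.
meal = Decimal("40.00")
tax = Decimal("3.20")

print(suggested_tip(meal, tax))                        # 6.00  (as designed)
print(buggy_suggested_tip(meal))                       # 60.00 (true bug)
print(suggested_tip(meal, tax, include_tax=True))      # 6.48  (requirements question)
print(suggested_tip(meal, tax, rate=Decimal("0.20")))  # 8.00  (bug-quirement: adjustable rate)
```

Only the second call is wrong in the sense that the code fails to do what it was designed to do. The third and fourth differ from the first because of decisions that belong to whoever owns the requirements.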
Swarms of bugs, oh my!
Are large numbers of bugs an issue for a project that’s under development? The answer is “maybe”.
For our project, we used the bug tracking system as a way to keep track of issues. It was convenient and accessible both to us and to those working on the external systems. Bugs provided a great way for us to keep track of issues as we worked, as some of those issues we’d resolve right away and others we’d defer. When a customer contacted us with an issue, even a simple one, we would request that they log the issue as a bug. Having issues logged as bugs meant that it was less likely that we’d lose track, and we wanted to know about every issue and have a chance to investigate it, to determine whether there was fault and if so what remedy was necessary and by whom.
We would then investigate and, when possible, close out the bugs. Many were training or configuration issues. Some were the fault of external systems and, once those systems were updated, were resolved through no effort of our own. Most of the rest were misunderstood functional requirements, sprinkled with some bug-quirements which we either implemented or de-prioritized for later development. And, to be honest, there were a few actual bugs, which we pulled out all the stops to resolve as quickly as possible.
On many occasions we spent nights and weekends exploring functional changes with our business process analyst and customers. They would describe the change that they thought was necessary, we’d investigate, perhaps propose a counter change, implement a solution, ask them to test, then rinse and repeat.
The Customer Factor
The senior executive also brought up customer complaints. This surprised me, as I felt that the direct users of our application were happy with our work, and we had lots of enthusiastic and complimentary emails as evidence of our positive relationship.
Thinking back on this now, what I suspect happened is this: Our senior executive was meeting not with our direct customers, but with management several levels up from them. Senior executives tend to be a bit removed from day-to-day work. So they operate on what gets passed up to them, and sometimes things get distorted in the passing.
For example: Our customers, like us, were working hard to meet deadlines. Together we encountered issues. Our customers told their manager “we encountered issues”. Their manager then, in an attempt to frame the issue as positively as possible, rephrased the report to this: “We encountered issues with the vendor’s software” and passed it on to his manager.
That final manager then, when meeting with our executive, had what he or she thought was a report condemning us. That manager also had an incentive to be as critical of us as possible, as a negotiating tactic, so what our executive heard was: “Your software sucks and is full of bugs. Why should we pay so much for it?”
Quality is not just requirements
I’ve been in meetings where “better requirements!” has been the answer when the question was asked “What should we do to improve quality?” Perfect requirements sound good in theory: if engineers and architects create blueprints prior to building bridges and skyscrapers, why shouldn’t software have blueprints too?
I think there are two answers to why having perfect requirements prior to starting a project is difficult.
The first is that software is different than bridges and skyscrapers in that software is usually constantly evolving. Imagine engineers hard at work on a suspension bridge over a rushing river. Footings have been prepared, cables strung when the order comes from above: “We’re going to double the width of the river, and the bridge should support trains.” Almost any software project that takes more than a couple of months to develop has to adapt to changing market and business conditions. That adaptation can negate the benefit of having those perfect up-front requirements.
The other is that gathering, collating and sorting requirements is a lot of work. Software is almost nothing but requirements, and not all of those requirements can be foreseen prior to starting the project. How should that temporary file be named? A temporary file? We didn’t realize that our third-party compression library wouldn’t work with streamed files. Should we check user data entry into a field for punctuation? We didn’t realize that the external system that we’re submitting that data to wouldn’t accept dashes or periods.
Quality is tomorrow, too
For most software projects, success is a matter of time. The software has been developed and delivered. Wonderful! Now, can it be maintained and enhanced?
It can be really difficult to attempt to make a change to a complex software application. There’s typically a lot of detective work involved. How does the application work? What other parts of the application will be affected by the change? How can the change be tested?
There are a couple of things that can be done at design time to make future maintenance easier.
- Segmented functionality. What’s important is that the developer should be able to change a function and know what parts of the system will be affected. For giant, monolithic systems this can be very, very difficult to do, which is why breaking functionality into components (separate applications, web services, stored procedures, whatever) is so helpful. There’s a temptation to design giant, monolithic systems. After all, doesn’t it make sense to have a business layer where all business logic resides? The problem is that over time, fashion changes. If the architecture says “all the logic should be in stored procedures”, along will come a web page where it’s just plain not possible to put the business logic in the database. Conversely, if the architecture says all business logic should be in web services, there will be a database trigger or stored procedure where it’s just not feasible to use a web service. It’s better to have components with clearly defined boundaries. Make it really clear where and how the stored procedure is used. Need to use almost the same logic somewhere else? It might be better to clone the procedure and modify the clone, rather than modify the original to serve two masters.
- Documentation. It doesn’t have to be super-detailed, but the best way to prevent training and documentation bugs is to have documentation. Good documentation aids developers who need to change the software by helping them come up to speed on what it’s supposed to do and how it’s supposed to do it. This matters especially after a few years, when the original developers have moved on to other projects and may not be accessible or, if they are, may not remember why, two-plus years ago, they made the design decisions they did. Documentation also helps directly prevent bugs. Too often, users have to depend on hearsay or tribal knowledge to know how to perform essential functions with their software. That can be really frustrating for users. If they have documentation to help them figure out how to use the software on their own, they’re a lot less likely to decide that “the software sucks.”
Complexity is the enemy of quality, and responsiveness is its friend
Early in my career, I worked on a large project that was organized according to the waterfall methodology. Our team of 12 developers spent upwards of 5 months working with a business analyst, end users and a technical writer to develop a thick binder of requirements. Once development began, we quickly found that our design could not deliver the necessary performance, and that the UI was too complex to develop and had to be simplified. We ended up delivering only a subset of the original functionality, and much later than had been promised.
My part of that project was the UI. This was back in the day when explorer-type windows forms applications were just coming on the scene, with treeviews and splitter bars and panels. I spent months getting my hand-made splitter bar and treeview components to work properly. Knowing then what I know now, I would have lobbied hard for a simpler UI. The UI that we delivered was beautiful, easy to use and a work of technical art, but a much simpler one would still have gotten the job done and would have been finished in half the time. This was an application intended for internal use, and that complicated, beautiful UI did need a lot more effort to maintain than a simpler one would have.
In contrast, the most successful projects that I’ve worked on in my career have had these elements:
- The ability to quickly deploy changes and updates.
This takes both infrastructure (perhaps “DevOps”) and an architecture that allows changes to be made separately and discretely.
- The ability to deploy the software in phases of ever increasing scope.
Don’t try to eat the whole sandwich all in one bite. Deliver (and focus on) critical items first. If there are important features that you’ve missed, there will be less that you’ll have to rearchitect. Phased delivery also gets something into the hands of users quickly, which provides a sense of momentum for users and project owners.
- Close integration with customers, who could provide feedback and guidance regarding necessary features.
Customers are the best way to determine if necessary features are there, and if they’re implemented in a friendly way. As a side benefit, involving customers provides them with a sense of control and ownership, which means that they’re more likely to talk in glowing terms about the software to their managers.
A note of caution
When that executive said “our software sucks” and based his assessment on the count of bugs, that room full of developers went back to their desks determined to reduce the bug count. That didn’t necessarily mean changing the way that they developed software, which was largely not under their control. Rather, they took steps to reduce the number of issues that were logged as bugs, either by working informally with customers or by not entering bugs that they themselves discovered (choosing instead to track them in spreadsheets).
By targeting the count of bugs rather than specific issues, that executive created a sort of “butterfly effect.” The count of recorded bugs was indeed reduced and the executive appeased, but at the expense of long-term quality, because information about some issues was no longer being entered into our tracking software.
It would have been better had he done some analysis of the kinds of bugs represented in those high bug counts, chosen a specific issue to target, and restated the issue in a less contentious manner. Instead of “our software sucks” he could have said, “our customers are having a lot of problems with install issues. What can we do to improve?”
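The kind of analysis I have in mind doesn’t need to be elaborate. As a rough sketch, assuming each bug has been given a category during triage, a few lines of Python can turn a raw bug count into a breakdown. The bug ids, the category labels and the records below are entirely made up for illustration; a real bug tracker would have its own export format.

```python
from collections import Counter

# Hypothetical triage results: (bug id, category assigned after investigation).
# These records are invented for illustration, not taken from a real project.
triaged = [
    (101, "training/documentation"),
    (102, "requirements"),
    (103, "external system"),
    (104, "training/documentation"),
    (105, "requirements"),
    (106, "true bug"),
    (107, "training/documentation"),
    (108, "bug-quirement"),
    (109, "requirements"),
    (110, "training/documentation"),
    (111, "external system"),
]


def count_by_category(bugs):
    """Count how many bugs fall into each triage category."""
    return Counter(category for _, category in bugs)


counts = count_by_category(triaged)
total = sum(counts.values())

# Print the categories from most to least common, with their share of the total.
for category, n in counts.most_common():
    print(f"{category}: {n} of {total}")
```

With a breakdown like this in hand, the conversation can be about a specific category of issue rather than about a single alarming number.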