Quality: The Process, Part III [Think Quality.]

[continued from Gamma testing?]

Think Quality.

The most important aspect of the entire quality assurance process is attitude. We all need to strive for quality and refuse to accept anything less. There is no process or technique that can overcome a lack of commitment to quality, but the right mindset will make the process and decisions that much easier.

As [independent software developers], we can work together to improve our own products and, in so doing, raise the bar for other software.

Think Quality.

Gregg Seelhoff is an independent game developer, and the results of [a previous] beta test can be found at www.goodmj.com.

Quality: The Process, Part III [Gamma testing?]

[continued from Something different]

Gamma testing?

Software is not complete when it is released, and this is especially true for shareware offerings, since updates are relatively simple when compared to retail products. Some quality assurance professionals use the term “gamma testing” to refer to the process of improving and evolving products after the initial release. Unfortunately, this also refers to checking for radiation, so I only use the term jokingly.

The concept, however, is fundamentally sound. It can be summed up with the following saying:
The Customer is always right, even when he is wrong.

This phrase is often taken to suggest that one must appease every customer, and while this is a reasonable goal for good customer relations, it is not the only interpretation.

This saying also means that any feedback is valid, no matter how unreasonable it seems. In terms of software, it means that every time a customer has a complaint or comment, it indicates a portion of the product or process that could be improved. As much as you know about your own software, you can never be the customer, so you need to listen to the feedback. For every person who contacts you about a problem, there are possibly hundreds of others with the same problem who do not bother.

For the same reasons, all reviews are beneficial. Good reviews are nice, but poor reviews, in fact, can do more to help you improve the quality of your product, if not your bottom line. Any negative aspect raised in a review can be corrected, which will improve the software. Even where the reviewer makes an incorrect statement, such as overlooking a feature, this just shows that the interface or documentation should be improved to prevent that mistake from happening.

Note that there is no obligation to distort your software according to the whims of customers and reviewers. In fact, this can have detrimental effects on the product. You should be the “keeper of the vision” for your product and reject inappropriate suggestions. However, it is imperative to listen and consider.

[continued in Think Quality.]

Quality: The Process, Part III [Something different]

[continued from Standard treatment]

Something different

There are a number of other testing techniques that are used during development, and I want to touch briefly on a few.

One essential technique is known as “compatibility testing”. As the name implies, this is testing the software for compatibility on a variety of different system configurations. There are companies that will perform extensive compatibility testing, but this is not inexpensive. Alpha and beta testing should cover a range of systems, but it will be far from comprehensive.

For a Windows product, one must test on some flavors of Win9x and NT, at an absolute minimum, and preferably on every supported operating system. Game and multimedia products need to be tested with different video cards and sound cards. Products with printing features need to be checked on different types of printers, including at least a color inkjet and a laser printer, from different manufacturers. In short, you must cover as much of your target audience as absolutely possible.
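One way to see how much of the target audience has been covered is to list the configurations explicitly. The following is a minimal sketch in Python, not a prescribed method; the operating system and hardware names are placeholders, and the function name is hypothetical.

```python
from itertools import product

# Hypothetical example values; substitute the systems your audience uses.
OPERATING_SYSTEMS = ["Win98", "WinME", "WinNT4", "Win2000"]
VIDEO_CARDS = ["video card A", "video card B"]
SOUND_CARDS = ["sound card A", "sound card B"]


def untested_configurations(tested):
    """Return every OS/video/sound combination that is not in `tested`."""
    all_configs = product(OPERATING_SYSTEMS, VIDEO_CARDS, SOUND_CARDS)
    return [config for config in all_configs if config not in tested]


# Configurations already covered by alpha and beta testers (invented data).
tested = {
    ("Win98", "video card A", "sound card A"),
    ("Win2000", "video card B", "sound card A"),
}

remaining = untested_configurations(tested)
print(f"{len(remaining)} configurations still untested")
```

Even a short list of options multiplies quickly, which is why the system information gathered from testers is worth recording.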

Another external testing technique, related to compatibility testing, is product certification. This involves submitting your software for certification according to the rules of some program. Instead of checking different system configurations, product certification programs check other criteria, depending on the goals of the particular certification. These range in cost from free to very expensive.

For a slightly less formal review of the usability and general quality of the software, one can conduct “focus group” testing. Focus groups are essentially a collection of people in the target audience who are brought together in one location specifically to give their opinions and feedback. Professional firms can conduct such groups with quasi-scientific questionnaires, hidden cameras, and written analysis, for a tidy sum.

The easier and, in my experience, no less effective method to perform focus group testing is to find a location, such as the computer lab in a local school, and advertise free pizza and drinks for computer users who will show up and try your new product. I cannot comment on how this would work for business products, but it works well for games.

Finally, throughout the entire testing process, you need to conduct “regression testing”. Regression testing is a method of making sure that bugs that were fixed are not reintroduced into the program. This concept is really as simple as trying to reproduce each of the fixed bugs and making certain that they have not reappeared.

My first exposure to regression testing was a spiral notebook into which every bug was written as it was reported and checked as it was solved. Before we would send a game build to the publisher, we simply tested each item in the notebook as part of the test plan. It hardly needs to be more complicated than that.
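The notebook translates directly into code. The sketch below, in Python, is only an illustration under assumed names: the bugs, the `format_score` routine, and the list layout are all invented. Each entry pairs a fixed bug with a check that returns True for as long as the bug stays fixed.

```python
def format_score(score):
    """Hypothetical routine under test: format a score for display."""
    return f"{max(score, 0):,}"


# One entry per fixed bug: number, description, and a check that
# returns True while the bug remains fixed. All entries are invented.
FIXED_BUGS = [
    (1, "negative score shown to the player",
     lambda: format_score(-5) == "0"),
    (2, "large score missing separators",
     lambda: format_score(1234567) == "1,234,567"),
]


def run_regression_tests(fixed_bugs):
    """Re-check every fixed bug; return the numbers of any that reappeared."""
    return [number for number, _description, check in fixed_bugs
            if not check()]


reappeared = run_regression_tests(FIXED_BUGS)
print("regressions:", reappeared)
```

Running the whole list before each build goes out takes the place of working through the notebook by hand.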

[continued in Gamma testing?]

Quality: The Process, Part III [Standard treatment]

[continued from Beta move on]

Standard treatment

In most cases, companies use closed beta testing, limiting and controlling the distribution of beta versions of the software. Finding and managing beta testers becomes an issue, and finding good testers is a difficult challenge, so we need to discuss the closed beta process in more detail.

The unfortunate fact is that few users know how to properly test software, so if you are lucky enough to find a good tester, make certain that you keep that person happy. Useful feedback should be rewarded with a free copy of the program, at a minimum, and the tester should always be invited to participate in future beta tests. A good tester will outperform a dozen mediocre testers and, therefore, is very valuable.

A related problem is that many prospective testers will not provide any feedback at all, so it is necessary to invite more beta testers than you expect to need. You can anticipate that roughly half of the beta testers in a closed beta will not report anything at all, and some of the others will not be useful. In most cases, it is difficult to find enough beta testers, so it is unlikely that a product will get too many volunteers.

When looking for beta testers, cast a wide net. It is important to have as large a range of experience levels, methods of use, and system configurations as possible. It is a good idea to ask potential beta testers not only for contact information, but also about system configurations and software experience.

Remember, some of your potential customers are likely to be struggling with computer illiteracy, so it makes sense to have some less experienced testers as well. Knowledgeable users will often figure out how to do something, or find a workaround, on their own without indicating that there may be a problem. Neophytes, on the other hand, will ask questions that customers would ask. Do not rely solely on other developers for testing unless your product can only be used by programmers.

The best means of communication for a closed beta process is a beta forum of some kind, in which beta testers can interact with each other. This helps establish a sense of community that works to support tester involvement and breeds loyalty to the product. From a practical standpoint, this also allows problems to be independently verified by other testers, and they will often work together to help you replicate a bug. There should also be an email address for bug reports, but forum participation should be encouraged.

It is important to remember that beta testing is not an adversarial process. Let me say that again. Beta testing is not an adversarial process. It can sometimes be very difficult to take criticism, but you must be certain not to get defensive. Always wear a (virtual) smile. Beta testers are there to help you, and it is far better to hear about problems now rather than after release.

All feedback is beneficial, so you should listen to everything that is reported. Try to respond to every report so that testers know you are listening and involved, which gives a psychological incentive to do a better job. Avoid being dismissive, as that discourages participation. Also, make it clear that you appreciate the reports, even the negative ones, since some testers are reluctant to report bugs or bad impressions if they feel that you will be insulted. Many reports are preceded by apologies.

One technique for keeping testers involved is to provide a means of communication that does not necessarily involve bug reports. Informal surveys about aspects of the program or system hardware questionnaires give testers a chance to participate even if they cannot find any bugs (which is the goal, after all). In my last beta test, I decided to try a little contest. I found three unreported bugs in different areas of the game and challenged the testers to find them. The number of valid bug reports increased measurably.

[continued in Something different]

Quality: The Process, Part III [Beta move on]

[continued from Greek to me]

Beta move on

When the program is feature complete, or approaching that stage, it is time to consider taking the next step. One step from alpha is beta, so we should now look at “beta testing”.

Beta testing is the most recognized form of black box testing, in which the software is submitted to users outside the company for additional testing and feedback. Generally, these testers are not professionals, but rather should represent a typical cross-section of potential customers and users.

Since beta testing is often the first external exposure of your product, it is important that the alpha testing and glass box techniques have produced a reasonably solid program. It may be a cliché, but there is not a second chance to make a first impression. When a tester’s first experience with a product is lousy, he or she will be less likely to get comfortable with it. If you know that there are lots of bugs, then your software is probably not ready for beta testing.

A practical reason for making sure the software already shows a standard of quality when beta testing begins is that obvious bugs will be reported multiple times, and less severe bugs will be overlooked. When a tester finds a number of problems, he or she may relax the reporting or assume that one bug is caused by another. Also, some bugs do cause a multiplicity of symptoms, and tracking becomes more convoluted.

There are two primary forms of beta testing, “open” and “closed”. In open beta testing, the developer announces the availability of a “public beta” version of the software, and any interested party can download and test the software. For closed beta testing, the developer provides a “private beta” version of the software to a limited number of known testers.

Companies may use either or both forms of beta testing. The main advantage of open beta testing is that the software can be tested by lots of people to cover a wide array of systems and uses, at the expense of control and a possible impact on the marketing plan. On the other hand, closed beta testing provides the developer with better control of the process, but the disadvantage is that it is hard to find testers.

Some companies use both forms of beta testing, starting with a closed beta and then expanding to an open beta program once the program is closer to release. Microsoft, for example, runs an extensive closed beta testing program for DirectX, including the SDK and the runtimes, which lasts for several months for each version, but near the end of this process, the beta runtimes are made available for public download. [Note: Microsoft has since ceased proper testing of DirectX SDK releases and is now a counterexample, not to be followed.]

For either form of beta testing, you should insert a “drop dead” date in the code, so the program will not run after a certain fixed date. This prevents the beta from entering general circulation and reduces testing of outdated versions. Note that this technique should never be used for release versions, so you must remember to remove it before the final version. You must also remember to update the date with each new testing version, lest a valid beta time out prematurely.
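A drop dead date can be as simple as one comparison at startup. This is a minimal sketch in Python under stated assumptions: the flag, the date, the message, and the function names are all hypothetical, and a real product would put the check wherever its startup code lives.

```python
import datetime
import sys

# Hypothetical values: update BETA_EXPIRES for every new testing version,
# and set IS_BETA_BUILD to False (or remove the check) for the release.
IS_BETA_BUILD = True
BETA_EXPIRES = datetime.date(2003, 3, 1)


def beta_has_expired(today, is_beta=IS_BETA_BUILD, expires=BETA_EXPIRES):
    """Return True if this is a beta build and `today` is past the expiry date."""
    return is_beta and today > expires


def check_beta_expiry():
    """Call once at startup; exits if the beta is past its drop dead date."""
    if beta_has_expired(datetime.date.today()):
        sys.exit("This beta version has expired. Please get the latest version.")
```

Keeping the date and the beta flag together in one place makes both reminders easier to honor: there is a single line to update for each testing version, and a single flag to turn off for the release.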

Just as a feature complete product signals the approaching end of the alpha testing phase, the impending completion of the beta testing phase is signaled by a “release candidate”. A release candidate is a version of the product that is potentially the release version of the software. At this point, testers should be instructed to report every bug they find, even if they have reported it previously, since all bugs should have been eliminated. If bugs are corrected, another release candidate should be created and tested.

For the first release of a product, the traditional beta version numbers start at 0.90 and approach 1.0, the release version. I know of one game product, on which I did not work, that had so many beta versions that the producer gave the team shirts that read “Version 0.99999999…” with the nines running all of the way down one of the sleeves.

[continued in Standard treatment]

Quality: The Process, Part III [Greek to me]

[continued from Quality: The Process, Part III]

Greek to me

Every program with more than seven lines of source code has bugs. It is important that software developers do whatever is feasible to eliminate bugs. With mass market software, one can be confident that even rare bugs, when multiplied by thousands of users, will be discovered. Bugs in some specialized and vertical market software could actually cause damage or injury. In any case, when distributing shareware, bugs will cost you sales, so quality will directly help your bottom line.

The most innovative approach to elimination of bugs, which I must credit to Barry James Folsom, involved a simple corporate proclamation. As the new President, he called for a meeting and all of the several dozen developers in the company were gathered. After an introduction, he declared that none of our software would have “bugs”. From that point forward, it could only have “defects”.

It may not be terribly practical to simply redefine terms and create quality, but this dubious proclamation did have a point. When a customer or, in the case of shareware, a potential customer is using the software and it fails to work properly, that is a problem. “All software has bugs,” is not comforting, so we need to look at the software from the perspective of a user.

Let’s start at the very beginning, with alpha, or more specifically, “alpha testing”.

Alpha testing is a form of black box testing that is performed in-house. In practical terms, alpha testing is simply the developer using the software in the same way that a customer would, prior to making the software available to others.

After each version of the software is ready, I close all my development tools, clear the registry and data files, and pretend to be a user seeing the program for the very first time. I start by running the program installer, and then launching the game (in our case) using the installed shortcut, as opposed to the debugger. I will then just play the game for a while, recording any problems that arise.

Once I am comfortable that the program is working as intended on my development system, I then copy the installer to at least one other test system. Rather than install the software myself, though, I enlist somebody else to do it. This can be a colleague, friend, spouse, child, parent, pet, or benevolent stranger. I provide no other instruction, and note where any questions are asked. Any problems witnessed here will also be experienced by users on a larger scale.

In a formal testing environment, alpha testing involves testers systematically checking the software according to the specified test plan, combined with actual use of the software. In a corporate environment, the test plan is executed by the QA department. In small businesses, it generally falls on the programmers to follow the test plan. In either case, anybody willing should try using the software. In a larger company, I would throw an “open house” to show the software to other employees. As an independent, simply having the game available for play is sufficient.

Alpha testing should begin as soon as the software is usable, and this will necessarily overlap with program development. At some point during the alpha phase, the software should become “feature complete”. This means that all intended features for this version are in the program and functional. It does not mean that the performance is optimized, nor does it mean that the interface is finalized, but it should do everything that it was intended to do.

[continued in Beta move on]

Quality: The Process, Part III

[This article was originally published in the January 2003 issue of ASPects.]

Good things come in threes. Literature is rife with examples. Jack (of Beanstalk fame) received exactly three magic beans for a reason. However, with deference to Sigmund Freud, sometimes an article is just an article.

In the first installment of this trilogy, I introduced some foundational concepts for testing, including planning, some quality assurance terminology, and classification and tracking of bugs. The second part, the story bridge, covered general tools and techniques that can be utilized during product development. In this, the conclusion, I will discuss testing methods used as the software reaches a functional stage.

[continued in Greek to me]

Quality: The Process, Part II [Getting some help]

[continued from Automatic or manual]

Getting some help

Up to this point, I have discussed a variety of methods for improving the quality of software that can be implemented solely by the programmer during the development. However, as the program gets closer to completion, it becomes important to enlist the help of others for black box testing and feedback. That will be the topic for my next installment.

In the meantime, there is an opportunity to implement some of the above tools and practices into your development process.

Gregg Seelhoff is an independent game developer and charter member of the Association for Professional Standards [now defunct].

Quality: The Process, Part II [Automatic or manual]

[continued from Beyond the build]

Automatic or manual

At each development stage, an application has some new or updated features that will need to be tested thoroughly, beyond a quick execution of the program. Certainly, the code should be pretty solid after having passed through some of these tools, but there is still no guarantee that the results produced are actually correct, except to the extent that they are manually checked.

It is very important that you test your application to make sure that it withstands unusual input and produces correct results, or fails gracefully, especially if your software can be used for mission critical operation. This will often involve checking more input and output than a team of testers can conveniently generate, so this is where automated testing tools can help you with quality assurance.

One type of automated testing tool interacts directly with your source code and automatically generates special code, known as a “test harness”, which deliberately throws unusual parameter values at routines and monitors the results to make certain that the routines handle unexpected values reasonably. These tools have a number of different configuration options, but their general nature prevents them from having specific knowledge about a particular program.
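To make the idea concrete, here is a small hand-written sketch in Python of what such a harness does; it is not the output of any particular tool. The list of unusual values, the `safe_sqrt` routine, and the `run_harness` function are all assumptions made for illustration.

```python
import math

# Values that routines often mishandle. This list is illustrative only.
UNUSUAL_VALUES = [0, -1, 1, 2**31 - 1, -2**31, 0.0, -0.0, 1e308,
                  math.inf, -math.inf, math.nan, "", None]


def safe_sqrt(value):
    """Hypothetical routine under test: accept only finite, non-negative numbers."""
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        raise ValueError("value must be a number")
    if math.isnan(value) or math.isinf(value) or value < 0:
        raise ValueError("value must be finite and non-negative")
    return math.sqrt(value)


def run_harness(routine, values, expected_errors=(ValueError,)):
    """Call `routine` with each value and collect any unexpected exception."""
    failures = []
    for value in values:
        try:
            routine(value)
        except expected_errors:
            pass  # Rejecting bad input cleanly counts as handling it.
        except Exception as error:  # Anything else is worth recording.
            failures.append((value, repr(error)))
    return failures


print(run_harness(safe_sqrt, UNUSUAL_VALUES))
print(run_harness(math.sqrt, UNUSUAL_VALUES))
```

The harness only knows which exceptions count as acceptable; as noted above, it has no knowledge of whether the values returned are actually correct.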

Another type of automated tool interacts with the interface of a program, essentially providing a somewhat more sophisticated approach to what we used to call “keyboard testing,” which was just banging randomly and rapidly on the keyboard in an (often successful) attempt to crash or confuse the program. This type of testing is more appropriate for some types of applications than others. We have never investigated using this approach for testing our games, though a young child is a good substitute.
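A rough software equivalent of keyboard testing can be sketched in a few lines of Python. Everything here is hypothetical: the key names, the `TinyMenu` class standing in for a program's input handler, and the `keyboard_test` function. The one design choice worth noting is the fixed random seed, which makes a crashing sequence repeatable.

```python
import random

# Hypothetical key names; a real program would use its own input codes.
KEYS = list("abcdefghijklmnopqrstuvwxyz0123456789") + [
    "ENTER", "ESC", "LEFT", "RIGHT", "UP", "DOWN"]


class TinyMenu:
    """Hypothetical stand-in for a program's input handler."""

    def __init__(self):
        self.items = ["New Game", "Options", "Quit"]
        self.selected = 0

    def handle_key(self, key):
        if key == "UP":
            self.selected = (self.selected - 1) % len(self.items)
        elif key == "DOWN":
            self.selected = (self.selected + 1) % len(self.items)
        # All other keys are ignored.


def keyboard_test(handler, presses, seed):
    """Send random keys to `handler`; return the keys sent and any error raised."""
    rng = random.Random(seed)  # Same seed, same sequence of keys.
    history = []
    for _ in range(presses):
        key = rng.choice(KEYS)
        history.append(key)
        try:
            handler(key)
        except Exception as error:
            return history, error  # The history shows how to reproduce it.
    return history, None


menu = TinyMenu()
history, error = keyboard_test(menu.handle_key, presses=1000, seed=1)
print("error:", error)
```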

Developers can, and should, provide this type of glass box testing for their own products. You can write test harnesses that explicitly call routines with certain parameters and check for valid results. One excellent method for doing this, especially during optimization, is to have two separate routines that use different techniques for generating the desired results, and then run both routines, comparing results. This also allows you to profile both routines under the same conditions and ultimately use the better one.
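As an illustration of the two-routine method, the Python sketch below checks an optimized integer square root against a slow but obviously correct one, then times both. The routines and function names are invented for the example; any pair of routines that should produce identical results would do.

```python
import timeit


def isqrt_simple(n):
    """Reference version: count up until the next square would exceed n."""
    root = 0
    while (root + 1) * (root + 1) <= n:
        root += 1
    return root


def isqrt_fast(n):
    """Optimized version: binary search for the same answer."""
    low, high = 0, n + 1  # Invariant: low*low <= n < high*high
    while high - low > 1:
        mid = (low + high) // 2
        if mid * mid <= n:
            low = mid
        else:
            high = mid
    return low


def compare_routines(reference, candidate, inputs):
    """Return (input, reference result, candidate result) for every mismatch."""
    mismatches = []
    for value in inputs:
        expected, actual = reference(value), candidate(value)
        if expected != actual:
            mismatches.append((value, expected, actual))
    return mismatches


def profile(routine, value, repeats=100):
    """Time `repeats` calls of `routine(value)`, in seconds."""
    return timeit.timeit(lambda: routine(value), number=repeats)


mismatches = compare_routines(isqrt_simple, isqrt_fast, range(5000))
print("mismatches:", mismatches)

if __name__ == "__main__":
    for routine in (isqrt_simple, isqrt_fast):
        print(routine.__name__, profile(routine, 10**6))
```

The slow routine stays in the test harness after the fast one ships, so later changes to the fast one can still be checked against it.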

For interface testing, you can use a standard macro recorder, software that records and can replay keyboard and mouse input into a program. Although this does not allow for random actions, it does allow a test sequence to be developed and verified on a regular basis. Also, testing an application with a macro recorder makes it possible to reproduce bugs simply by using the macro.
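The record and replay idea can be shown in miniature. This Python sketch is not a real macro recorder, which works at the level of the operating system; it only models the principle, and the `MacroRecorder`, `replay`, and `Counter` names are hypothetical.

```python
class MacroRecorder:
    """Pass input events through to a handler while remembering them."""

    def __init__(self, handler):
        self.handler = handler
        self.events = []

    def send(self, event):
        self.events.append(event)
        self.handler(event)


def replay(events, handler):
    """Feed a recorded sequence of events to a handler, in order."""
    for event in events:
        handler(event)


class Counter:
    """Hypothetical program under test: '+' and '-' adjust a total."""

    def __init__(self):
        self.total = 0

    def handle(self, event):
        if event == "+":
            self.total += 1
        elif event == "-":
            self.total -= 1


# Record a session once...
first_run = Counter()
recorder = MacroRecorder(first_run.handle)
for event in ["+", "+", "-", "+"]:
    recorder.send(event)

# ...then replay it against a fresh instance and compare the outcome.
second_run = Counter()
replay(recorder.events, second_run.handle)
print(first_run.total, second_run.total)
```

If the two totals ever differ, the recorded sequence is already a reproduction case for the bug.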

[continued in Getting some help]

Quality: The Process, Part II [Beyond the build]

[continued from Expanding our repertoire]

Beyond the build

The most powerful programs for glass box testing include source code analysis, runtime checking, and automated testing tools. These are not generally included in compiler packages, so they need to be obtained separately, and can often be somewhat expensive.

Source code analysis tools, better known as “lint” tools in C and C++ development, are utilities that examine your source code and produce warnings for potential problems. The output is similar to that from a compiler, except that the tool performs deeper checks, even emulated code walkthroughs, and has a larger and more specific set of issues to check.

A decent source code analysis tool would likely be your best investment of any glass box testing tool. Unlike a compiler, which merely needs to produce object code for a specific platform, a lint tool can check for a whole range of problems, from portability to standards compliance, and some coding guidelines. The details of potential problems can even help a programmer to better understand nuances of the language.

Lint tools produce many more warnings and errors than a compiler, but they also provide great flexibility to disable individual warnings, even for specific lines of code. It is unlikely that a non-trivial program could pass through such a tool at the highest level without warnings (and sometimes thousands of them), but each issue or type of warning identifies a pitfall that can be considered and resolved.

When developing, I run source code analysis on a regular basis to catch potential errors that the compiler missed. In this way, I can remain confident that my code is relatively free of silly errors, so I can instead concentrate on the logic of the overall code, not individual mistakes. Also, anywhere that my code does something unusual, there is, by necessity, a comment indicating a suppressed lint warning.

Another way of performing some rudimentary source code analysis, especially for a cross-platform project, is to compile the source code under two different development environments. It is somewhat inconvenient, particularly during the initial setup, but if code can build and work correctly from two different compilers, chances are pretty good that the code is solid.

Runtime checking tools include a variety of programs that automatically monitor the behavior of the program as it executes. Often, these tools check memory or resource usage, but they can also watch for invalid pointers and range errors, verify API parameters and return values, and report on code coverage. The most common benefit of these tools is to identify memory and resource leaks.
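As a small taste of what such tools report, the sketch below uses Python's standard `tracemalloc` module to measure memory that is still allocated after an operation has been repeated. It stands in for the commercial tools discussed here only loosely, and the leaky and clean operations are invented for the example.

```python
import tracemalloc

_cache = []  # Hypothetical leak: entries are added and never removed.


def leaky_operation():
    """Keeps a reference to every buffer it creates."""
    _cache.append(bytearray(10_000))


def clean_operation():
    """Creates a buffer and lets it go when the call returns."""
    buffer = bytearray(10_000)
    return len(buffer)


def memory_growth(operation, repeats=100):
    """Return the bytes still allocated after running `operation` repeatedly."""
    tracemalloc.start()
    before, _peak = tracemalloc.get_traced_memory()
    for _ in range(repeats):
        operation()
    after, _peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return after - before


print("leaky:", memory_growth(leaky_operation), "bytes")
print("clean:", memory_growth(clean_operation), "bytes")
```

The leaky operation should show growth of roughly a megabyte over 100 calls, while the clean one should stay close to zero.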

A comprehensive runtime checking tool serves as an ideal supplement to a source code analysis tool. While the latter catches potential problems with the code itself, the runtime checker highlights problems with the logic of the application during execution. Some tools can insert extra code and information during the build, in a process known as “instrumentation”, and this improves the runtime testing even more.

One issue with runtime checking is that it tends to slow program execution significantly, so it is definitely not intended for a release version, nor for every debugging build. Nevertheless, like other testing techniques, it is best to use the available tools early and often. The earlier a bug is detected and identified, the easier and less costly it will be to fix.

In my development process, I use my source code analysis tools after writing or modifying no more than a couple of routines. I use my runtime checking tools, at the highest detection level, after every major feature update, or before every delivery to a client. This glass box testing takes place in the background while I do black box testing of the application and, especially, new or updated features. If any problems appear, I address those problems right away before considering the feature to be done.

[continued in Automatic or manual]