How to start QA – Part I – Blackbox Testing

Unboxing black-box testing.

When you are considering starting a QA department or outsourcing QA, there are three key factors you should focus on immediately if you want it to succeed: completeness, correctness and performance. If you can nail down these key metrics, you will be on the path to success.

There is a classic misunderstanding when development teams talk about QA: they generally end up talking about quality control.

Quality control is the set of tests and procedures you carry out before releasing the product, so that users do not experience the defects in the software.

QA tools are divided into two categories: black-box and white-box.

You should start with black-box testing and later automate it, focusing on regression. Otherwise you will most likely spend your time building two things at once, the testing tools and the actual product, and both will be unstable most of the time.

Black-box testing is a method of software testing that examines the functionality of an application without peering into its internal structures or workings. This method of testing can be applied to virtually every level of software testing: unit, integration, system and acceptance.

I would like to start by telling you my story from the trenches of QA, and hopefully give you some ideas about where to start.

In the early days of my career I worked as a release manager on the Glacier 2 engine, which powered Hitman: Absolution and several other games. I was part of the early team, and my role was to release the engine internally to the game development team.

The Glacier engine is a very complex piece of software, consisting of over a million lines of code. The editor, written in C#, lets designers manipulate and set up the game even while it is running. There is a complex and powerful resource server with several packers. The render engine, written in C++, sits inside the editor and supports PC, Xbox and PlayStation simultaneously.

We were building the engine and the game at the same time. My responsibilities focused on releasing the “build” to the game team and making sure builds were delivered throughout the day.

When I started, we were doing one build a month. There were close to 40 testers in the company, but all of them were testing the game. QA was just about playing the game and finding unexpected behavior. The code was also shared between the engine team and the Hitman game team, and there were game programmers within the game team working on the same code as the Glacier 2 team.

The release procedure was to merge the code between the two branches, compile it, and build the engine executable, the resource server, the editor and several other executables. Once that was done, the build was released to the team so that everybody got the latest code and executables.

Guess what happens if you do this? You can’t even compile the code, never mind run the executables.

As a result of my experience I decided to implement some QA procedures.

The first thing I did was to bring in a game tester with a background in computer science, someone I had worked with before.

The first thing you do when you are running a test, in the lab or anywhere else, is to create a control group. The question was: how could we make a control group?

 A scientific control is an experiment or observation designed to minimize the effects of variables other than the single independent variable.[1] This increases the reliability of the results, often through a comparison between control measurements and the other measurements. Scientific controls are a part of the scientific method.

In order to have a control group, we needed to focus on what was working. Things we knew worked!

We created an Excel sheet listing all the features that were implemented. We then clicked through everything in the editor and called this Excel sheet the functional test.

We then interviewed all the users and asked them which of these features were critical to their workflow. We prioritized everything according to its importance to the users, and set a rule that a release would not go ahead unless everything on the list also worked correctly on the new build. In order to streamline the process, we created a release branch and conducted all the testing on that branch.
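The release rule can be pictured as a small gate over the functional test sheet. This is only a minimal sketch in Python, not the tooling we actually used: the `FunctionalCheck` record, the numeric priority scale and the feature names are all made up for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FunctionalCheck:
    feature: str   # one row of the functional test sheet (hypothetical names below)
    priority: int  # assumption: 1 = most critical to the users' workflow
    passed: bool   # result of checking the feature on the candidate build


def failing_checks(checks: List[FunctionalCheck]) -> List[FunctionalCheck]:
    """Return the failed checks, most critical first, so they get fixed first."""
    return sorted((c for c in checks if not c.passed), key=lambda c: c.priority)


def release_allowed(checks: List[FunctionalCheck]) -> bool:
    """The rule from the text: no release unless everything on the list works."""
    return not failing_checks(checks)


checks = [
    FunctionalCheck("Open scene in editor", 1, True),
    FunctionalCheck("Edit entity while game is running", 1, True),
    FunctionalCheck("Export packed resources", 2, False),
]

if __name__ == "__main__":
    print("release allowed:", release_allowed(checks))
    for check in failing_checks(checks):
        print("blocked by:", check.feature)
```

The priority does not loosen the gate; it only decides the order in which failures are reported and fixed.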

We implemented continuous integration, with build robots that compiled everything and ran some automated tests as well. Simple stuff:

  • Start the engine and editor.
  • Load a test scene – check the resource server.
  • Shut down the engine.
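A smoke test like this is just an ordered list of steps that stops at the first failure. Below is a minimal sketch in Python of that idea. The three step functions are placeholders that always succeed; in a real setup they would launch the executables, load a scene and check the resource server, and none of the names here come from the actual Glacier tooling.

```python
from typing import Callable, List, Tuple

Step = Tuple[str, Callable[[], bool]]


def run_smoke_test(steps: List[Step]) -> Tuple[bool, List[str]]:
    """Run the steps in order and stop at the first one that fails."""
    log: List[str] = []
    for name, action in steps:
        try:
            ok = action()
        except Exception as exc:  # a crash in a step counts as a failure
            log.append(f"FAIL {name}: {exc}")
            return False, log
        if not ok:
            log.append(f"FAIL {name}")
            return False, log
        log.append(f"PASS {name}")
    return True, log


# Hypothetical stand-ins for the real steps; each returns True on success.
def start_engine_and_editor() -> bool:
    return True


def load_test_scene() -> bool:
    return True


def shut_down_engine() -> bool:
    return True


SMOKE_STEPS: List[Step] = [
    ("start engine and editor", start_engine_and_editor),
    ("load test scene", load_test_scene),
    ("shut down engine", shut_down_engine),
]

if __name__ == "__main__":
    passed, log = run_smoke_test(SMOKE_STEPS)
    print("\n".join(log))
    print("smoke test passed" if passed else "smoke test failed")
```

Stopping at the first failure keeps the report short: if the engine does not start, there is no point in reporting that the scene did not load either.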

We agreed with the development team that they would drop everything and fix bugs immediately. It was also easier for the programmers to fix the bugs: instead of going through month-old or week-old code, they now had about 10 change-lists to look at.

We dramatically increased the quality (stability) in the first 45 days.

We then decided to implement a QA team. I spoke with our QA manager and convinced him to lend me some testers from the QA department.

The QA testers all sat on the entrance floor. I still remember going into the room and saying:

“We are building an Engine QA team, if you are interested in learning how to build games and learn software testing, we have a place for you.”

We started a new QA team with 6 testers. The goal was to create an acceptance test that covered all the functionality used by the game team.

For new features, we created a template and called it a claim test. We called it a claim test because we took the view that anything programmers call “done” is just a claim.

Programmers shelved their change-lists and claim testing began in the afternoon, with the morning spent on the acceptance test. The test build was given the name “candidate”. I later realized how important it was to name the different builds to avoid confusion.

We were able to release builds every day. On average, 4.8 builds per 5 working days.

Let’s break down what we did in claim tests.

Everything newly done is a claim, and it has to pass certain criteria.


Let’s say that the user story or requirement calls for 10 functionalities in the “feature”. The tester tests the feature against the document and reports that 8 out of the 10 functionalities are implemented. That means the feature is 80% complete.
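The completeness figure is simply the share of required functionalities that are implemented. A tiny sketch in Python, using the numbers from the example above (the function name is my own):

```python
def completeness(implemented: int, required: int) -> float:
    """Percentage of the required functionalities that are implemented."""
    if required <= 0:
        raise ValueError("a feature needs at least one required functionality")
    if not 0 <= implemented <= required:
        raise ValueError("implemented must be between 0 and required")
    return 100.0 * implemented / required


if __name__ == "__main__":
    print(completeness(8, 10))  # 80.0
```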


The tester then takes the implemented functionalities, does exploratory testing and later some boundary testing, and finds 3 major bugs. The correctness of that feature is 63.5%.


We also checked and reported the results under 4 categories: pass (100%), workable (75%), needs more work (50%), do not release (< 50%).

We built our KPIs around 4 things.

Stability – Completeness – Correctness – Performance.
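Mapping a score to one of the four categories could look like the sketch below. The category percentages come from the list above, but how scores between those values are treated is my assumption here: a score counts for the highest category whose threshold it reaches.

```python
def release_category(score: float) -> str:
    """Map a percentage score to one of the four report categories.

    Assumption: the listed percentages are lower bounds, so anything
    from 75 up to (but not including) 100 is "workable", and so on.
    """
    if not 0 <= score <= 100:
        raise ValueError("score must be a percentage between 0 and 100")
    if score == 100:
        return "pass"
    if score >= 75:
        return "workable"
    if score >= 50:
        return "needs more work"
    return "do not release"


if __name__ == "__main__":
    # Under this assumption, the 80% complete feature is "workable"
    # and the 63.5% correctness score lands in "needs more work".
    for score in (100, 80, 63.5, 40):
        print(score, "->", release_category(score))
```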


Increasing stability: QA sits between the engine team and the game team. Don’t release any build without confirming its stability. If QA misses a major crash bug, we roll the build back to the previous one, which is only a day old.

Increasing completeness: If any team working on the engine has a low score, it means we need to focus more on requirements management and on aligning expectations. You can see this clearly when QA reports back to developers and product owners; developers often say, “Well, I didn’t know that we needed to do that as well.” Divide and conquer, or simply spending enough time on aligning expectations, can reduce misunderstandings a lot. Simple feature cards are enough (we are not fans of documentation in game development).


Other things your product owners should focus on to increase the quality of the requirements are:

  • Educating developers, managers, and customers about requirements engineering practices and the application domain

  • Establishing a collaborative customer-developer partnership for requirements development and management

  • Understanding the different kinds of requirements and classifying customer input into the appropriate categories

  • Taking an iterative and incremental approach to requirements development

  • Using standard templates for your vision and scope, use case, and SRS documents

  • Holding formal and informal reviews of requirements documents

  • Writing test cases against requirements

  • Prioritizing requirements in some analytical fashion

  • Instilling the team and customer discipline to handle requirements changes consistently and effectively

Karl Wiegers Describes 10 Requirements Traps to Avoid

Increasing correctness: This is a really good tool for giving feedback to the domain leads. They can go through the lists of bugs, see patterns in the pitfalls programmers are running into, create awareness, and even start working with individual programmers to increase their competence.


Part II will be about white-box testing.

Before that I strongly recommend reading this nice fable about testing

Thanks for reading.

