Automated Testing in games development

I work as a programmer in the boring, non-game-related industry, and there's something I've always wondered about what happens on the other side of the fence, as it were. Do game companies make use of Test Driven Development practices such as unit testing, or do they rely entirely on manual QA testers?

In my work, if we release code with bugs in calculations, it can lead to some really nasty real world effects, such as someone building a multi-million pound wind farm in a place with no wind because a bug in our calculation told them it was going to be really windy (yes this has really happened). As a result, all our code goes through very rigorous automated unit tests every time it is built to ensure that whatever is released to the users is giving them the right answers.

Now, I know that game development doesn't need to be quite as rigorous, as it isn't expected to precisely model the real world. However, the reason I bring this up is the issue Elemental has had with the elemental shards not increasing the power of spells. This is the sort of thing that would be really easy to encapsulate in a unit test, leading me to believe that either Stardock doesn't use unit testing, or that its coverage is really poor. While unit testing isn't going to help with UI issues, AI failures or memory leaks, it would certainly help with these annoying little calculation errors that seem to crop up a lot in games (not just Elemental either).

Is it just because TDD is seen as 'boring' by games devs, who are so excited about putting in all the bells and whistles that they don't have the time to make sure that things actually work?

Reply #1

Depends on what you're testing for, who (company-wise and position-wise) is testing, and the resources available. In general, though, most games are fairly unique outside of long-lived series and are built under heavy resource and time constraints, so any sort of automated testing is likely to take as much time to write up (on a massive scale) and implement over the course of development as manual testing. Remember as well that things tend to change - the right and wrong value can change from build to build, as can how it's all figured out. Video game development tends to be fairly unique from project to project, unlike other sorts of development. It's not like building a car, where there's a foundation to build upon and things are fairly consistent from project to project.

At best, I usually hear about tracking of player actions (player looked at these coordinates for X secs) and other player metrics, as opposed to trying to get a program to be a player, and even then it's as much a matter of the developers sitting and watching as anything else. The only other cases I can think of are simulating something that simply can't be done by players, such as network conditions (artificially inflating ping times and such), or something like soak testing, where the point isn't to test something specific but to see how the game holds up as a whole.

Not to mention that dev tools and such are usually pretty unfriendly to begin with. And to misquote something, "Crunch time is a heck of a drug." With the greater uncertainty involved in video game development and the flow of money, resources can be scarce, as can attention and energy. A notable game developer mentioned that it's a lot like having to re-invent yourself every other year - if you aren't focused on the next project, if you aren't doing something good... you're left in the dust and out of a job.

Not that it's an excuse. Just a reality. It's a pretty high pressure industry no matter how it may seem on the outside.

Reply #2

There are several different problems when testing a game compared to an application. The two main ones are that you'll see a much more diverse set of hardware in the field, and that testing can tell you whether the code is working, but it can't tell you whether what it's doing is any good. You might pick up shards aren't working properly, but you still don't know if the shard system as a whole works as a game mechanic.

Customer expectation is a problem too. When I worked at Sun we'd happily release 'unfinished' products to customers since they're paying for an X year license, and one of the most important aspects of the development cycle was getting customer feedback - we'd have an app which could potentially do X, Y, Z but you need your customers to tell you precisely which route(s) they'd like development to go down. You can't do that in the games industry unfortunately.

Reply #3

Quoting Archonsod, reply 2
we'd have an app which could potentially do X, Y, Z but you need your customers to tell you precisely which route(s) they'd like development to go down. You can't do that in the games industry unfortunately.
End of Archonsod's quote

Right now you can. There's the internet, and forums, you know :D

Reply #4

Well, personally I think that cutting out unit testing because it saves time is a false economy. Sure, you save some upfront costs, but unit testing pays for itself many times over when things get to the debugging stage. Not only do the tests quickly tell you that something is wrong, but they can often lead you to the exact part of the code WHERE it went wrong.

When making a change, unit tests are actually a time saver, not a time waster. Take my example of the shard damage calculator. Say you wanted to change the shards to give a 1.5x bonus instead of a 2x bonus. You'd go to the shard damage unit test and change the expected result to the new value, causing the test to fail. Then you change the code to make the test pass. This gives you two advantages:

  1. You know that you've successfully made the change which you wanted
  2. You know your change hasn't broken anything else (because your 'damage based on int' unit test is still passing), or if your change has broken something else, you know you need to work out why and fix it
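For anyone who hasn't seen this workflow, here is a minimal sketch of it in Python. Everything in it is assumed for illustration: the damage formula, the names SHARD_BONUS, int_damage and spell_damage, and the numbers. Nobody outside Stardock knows what Elemental's real calculation looks like.

```python
import unittest

# Hypothetical multiplier per shard. The change described above is
# editing this from 2.0 to 1.5 after updating the expected values below.
SHARD_BONUS = 1.5


def int_damage(base_damage, intelligence):
    """Made-up 'damage based on int' rule: each point of int adds 2 damage."""
    return base_damage + 2 * intelligence


def spell_damage(base_damage, intelligence, shards):
    """Made-up spell damage: int-based damage, multiplied once per shard."""
    return int_damage(base_damage, intelligence) * SHARD_BONUS ** shards


class SpellDamageTest(unittest.TestCase):
    def test_damage_based_on_int(self):
        # Untouched by the shard change, so it should keep passing.
        self.assertEqual(spell_damage(10, 5, shards=0), 20)

    def test_one_shard_gives_bonus(self):
        # Expected value edited first (40 under the old 2x rule, 30 now).
        self.assertEqual(spell_damage(10, 5, shards=1), 30)

    def test_shards_stack(self):
        self.assertEqual(spell_damage(10, 5, shards=2), 45)
```

Saved in a file, this runs with `python -m unittest`. If SHARD_BONUS were still 2.0, the two shard tests would fail while the int test kept passing, which is the "you know what broke and what didn't" point from the list above.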

Now I'm not suggesting that you can unit test everything. As I said, UI, memory leaks and so on, not to mention 'the fun factor', still need to be dealt with by human testers, but unit tests can really help iron out the fundamental mechanics issues in a program, and I don't see how that's any different in game or application development.

Reply #5

Quoting Archonsod, reply 2
You might pick up shards aren't working properly, but you still don't know if the shard system as a whole works as a game mechanic.
End of Archonsod's quote

I'm going to agree with this, because it mirrors my experience in business software too. You can unit test if a button does what the coder thinks it should do. You can't unit test if the users understand when they should be clicking it.

Games are similar. You can unit test if the mechanics and the UI work the way you intended. You can't unit test to see if they actually form a fun and coherent game that users can understand.

Reply #6

Er... well yes. I already said that (twice!). The question was, why isn't it being used to guard against the stuff that it DOES work on?

Reply #7

Now I'm not suggesting that you can unit test everything. As I said, UI, memory leaks and so on, not to mention 'the fun factor', still need to be dealt with by human testers
End of quote

Why do memory leaks require human testers? That is one of the absolute most important things automation and unit tests need to detect. Humans suck at it.

Reply #8

It's just that there's no real way (AFAIK) to work out whether your memory has been freed up or not in a unit test. I suppose you could check the overall memory usage before and after doing something, but since you don't know exactly what's using the memory, it's hard to tell whether or not it's legitimate memory use. It really needs a human eye.
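The "check memory usage before and after" idea can at least be sketched. The version below uses Python's standard tracemalloc module rather than anything from a game engine, and leaky_operation, clean_operation and memory_growth are hypothetical names invented for the example. It only shows the measurement; deciding how much growth is legitimate is still a judgment call.

```python
import tracemalloc

_cache = []  # hypothetical leak: a module-level list that is never cleared


def leaky_operation():
    """Keeps a reference to every buffer it allocates."""
    _cache.append(bytearray(100_000))


def clean_operation():
    """Allocates the same buffer but lets it go when the call returns."""
    data = bytearray(100_000)
    return len(data)


def memory_growth(operation, repeats=50):
    """Return the net bytes still allocated after repeating operation."""
    tracemalloc.start()
    operation()  # warm-up call, so one-off setup allocations aren't counted
    before, _peak = tracemalloc.get_traced_memory()
    for _ in range(repeats):
        operation()
    after, _peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return after - before
```

A test could then assert that memory_growth(clean_operation) stays under some chosen threshold, while memory_growth(leaky_operation) comes out at roughly five megabytes. tracemalloc can also take snapshots and compare them grouped by source line, which goes some way towards answering "what's using the memory".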

Reply #9

Hard to say whether unit testing, or test driven development for that matter, would have helped without knowing how Stardock develops or how their QA works. IMHO all we can do is speculate unless one of the devs feels like sharing.

Reply #10

Five year veteran of game QA here. At the publisher I work for, we do testing based on actual gameplay, and also do systems (unit) testing. In other words, if the game has a gun, we've got a test plan that goes something like:

Get gun X.
Examine gun X in inventory.
Equip gun X.
Draw gun X.
Fire gun X at target, using decals to evaluate spread.
Holster gun X.
Switch perspectives, and draw gun X.
Rotate the camera.

And on and on and on.  Each step of this would also convey what is expected.  If something doesn't work as expected, it's bugged.  Naturally, we also do play testing, and if the gun is picked up by a tester there and they notice problems, they'd be expected to bug that, too, even though their primary assignment is to complete Quest Y, or earn all achievements, or whatever.

The same sort of plans exist for just about everything.  Interfaces, minigames, systems, etc.  Naturally, I can't speak for Stardock's testing procedures.

Reply #11

You can do a lint checker. Every new needs a corresponding delete, every malloc needs a corresponding free. You can put assertions in your destructors that record the fact that they have been run, and you can put in assertions which define parent/child relationships between objects. And by parent/child I don't mean inheritance; I mean this object's constructor (or other methods) creates other objects with new. Those are the children. And it's better that the guy who embeds the assertions in the code is not the same guy who wrote the program.

Java does garbage collection. So clearly, freeing objects as they get orphaned can be done. It's just that when you go C++ you can be gentler about it: your "garbage collection" prints a warning instead of actually freeing anything, and you only include this stuff in your debug builds, not release builds. Plus C++ lets you break the rules from time to time, and when you do you need some kind of waiver for the rule check.
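As a rough illustration of the "record what got constructed, warn about what never got destroyed" idea, here is a sketch in Python rather than C++, so treat it as an analogy only. The Tracked base class, the Level and Projectile classes and leak_warnings are all hypothetical names; in C++ the bookkeeping would live in constructors and destructors compiled into debug builds.

```python
import gc
import weakref


class Tracked:
    """Debug helper: remembers every instance that is still alive."""

    # Weak references, so the registry itself never keeps an object alive.
    _live = weakref.WeakSet()

    def __init__(self):
        Tracked._live.add(self)

    @classmethod
    def live_count(cls):
        """Number of instances of cls (or a subclass) that still exist."""
        gc.collect()  # also clear out anything only held by reference cycles
        return sum(1 for obj in Tracked._live if isinstance(obj, cls))


class Projectile(Tracked):
    pass


class Level(Tracked):
    def __init__(self):
        super().__init__()
        # The "children": objects created and owned by this parent.
        self.projectiles = [Projectile() for _ in range(3)]


def leak_warnings(cls):
    """Report survivors instead of freeing them, like the warning-only GC."""
    gc.collect()
    return [f"warning: {type(obj).__name__} still alive"
            for obj in Tracked._live if isinstance(obj, cls)]
```

After a level is created and dropped, Projectile.live_count() should be back to zero. If some other system is still holding on to one of the level's projectiles, leak_warnings(Projectile) lists it, and a unit test can assert that the list is empty.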

Reply #12

I was a business application developer for 14 years and have just recently made the transition into the games industry. I have only been working for a professional game studio for a short period of time. However, when I was a business application developer we made extensive use of agile software development processes that included things like TDD, pair programming, design patterns, etc. Many people at my studio are not big fans of TDD or automated unit testing in general. It's just a different industry. My opinions on why TDD and other such development processes are not used are as follows.

1. Business applications are built and maintained over many years. Unless a game has a server component, it is built, shipped, and maybe patched a few times. The code base is not maintained or reused, save for maybe the engine and a few other things. This limits the usefulness of unit tests. This is not true for all games but probably for most.

2. I think the biggest one is time. I know that in every agile training seminar, not having enough time to unit test is always scoffed at, with 100 good reasons why you should make time. However, there's a difference between what's said in a seminar and the real world. Many times I have days to implement a feature for a game that I would be given weeks to do on a business app. Having made extensive use of automated unit testing in my career, I have been on projects where the unit test code base is the same size as or bigger than the code base of the actual software application. There simply isn't enough time in the schedule to account for maintaining two code bases. You might be saying, well, you're gonna have to waste a lot more time fixing bugs if you don't unit test up front. Yes, this is true. However, it's far more important to finish enough of the game to see if it's fun than it is to have a code base backed up by unit tests, only to find out the design team has decided the game is not fun and we have to start over.

3. This kind of ties into item 2. On business applications you might have a few features change during the development cycle. However, I've been on projects where they scrap the entire direction of the game because certain things aren't fun. This can happen many times during the dev cycle. Trying to maintain an army of unit tests under those conditions is unrealistic. The most important thing is getting the game to a point where you can see if what you're trying to do is fun. It's kind of like working on a prototype real quickly to see if something works. You typically don't unit test that. The problem is that once you get to a point where something is fun, there is a large portion of the code written and not enough time to go back and write a bunch of unit tests. The bottom line is that the game industry seems to take on the risk of not unit testing to try and mitigate the risk of building a lot of infrastructure code around game play features that aren't fun and need to be thrown out.

That said, if there is a server component to a game then I would argue that the server side of things should be unit tested, because you will probably be maintaining the code base over a period of years. At this point one can make a compelling argument for unit tests. This is probably true for the engine as well. However, I do think it's a waste of time for the vast majority of game play code, which is written once and never really maintained after that.

In the game industry, FUN is the highest priority and it's really the design team that drives this, not the programmers, so things like automated unit testing tend to take a back seat.

Reply #13

Quoting ddd888, reply 3

Quoting Archonsod, reply 2
we'd have an app which could potentially do X, Y, Z but you need your customers to tell you precisely which route(s) they'd like development to go down. You can't do that in the games industry unfortunately.

Right now you can. There's the internet, and forums, you know
End of ddd888's quote

Gamers can tell you what they want. They cannot tell you how to get there. Saying "I want Gears of War but in a fantasy world" doesn't really tell you very much in terms of how to actually develop that sort of game. Even with more specific feedback, outside of pure playtesting and iteration, video games (or any sort of creative industry) are hardly formulaic. A lot of what makes a game isn't features or pretty pictures - it's the intangibles. The design of the levels, the use of art and sound, the refinement of the engine and controls. Things that gamers rarely ever mention or think about.

And these things are vitally important for development.

No amount of internet whining and forum feedback will tell you that moving the collision box on all ledges to be half a foot away from the art makes the game more fun. At best, you'll know after the fact when they say "Screw you! You're terrible devs! Jumping on to ledges is too damn hard!"

No amount will tell the artists how to make the art distinct and have it work with the level design to create mood and draw the player to certain things. At best, they'll say "Art sucks. I want more brown."

No amount will tell the programmers why the AI works or doesn't work. At best, you'll hear "The AI is too passive/too good". And that doesn't tell you how to fix it or what to change it to, because it can be any number of things.

One of the examples I always use is FEAR and something one of the devs talked about in an article. One of the things people always love about FEAR is the AI. The way they would flank you and surround you. They would take cover and suppress you. However, this is merely an illusion. In actuality, all the AI knows how to do is move from cover to cover. That's it. The reason they flank you is purely a result of the good level design.

But that's not something any sort of player feedback will be able to guide you to doing.

To add on to JJ Guzz's really good explanation, imagine if shards were changed not to be a damage bonus but to provide typed mana (fire mana, water mana), with damage based on how much of that mana was used in a spell. The automated check no longer applies in the least - it would need to be rewritten. And yet this is the sort of change that could happen fairly frequently. After all, within the span of a month, we're going to go from individual mana pools to a global one.

A more extreme example of what JJ said is Borderlands. Three years into development, the -entire- art direction was scrapped completely. Every single art asset was thrown out. And this was ultimately a good thing. While that example is about art, this is the sort of thing that happens - Valve and Blizzard will build multiple prototypes and even work on games but never finish them because they're not fun. StarCraft: Ghost was essentially developed almost to completion five times before being cancelled because Blizzard just didn't feel it was up to par and fun enough.

Reply #14

The bottom line is that the game industry seems to take on the risk of not unit testing to try and mitigate the risk of building a lot of infrastructure code around game play features that aren't fun and need to be thrown out.
End of quote

I really like the intelligent responses. Are you sure that is an attribute of the game industry, or is it just an attribute of a poor development process? You're lucky you actually get days to implement a feature--in my job, by the time I get asked, it's already too late. I guess I'm supposed to implement a mind-reading feature that anticipates tickets before they come in. I already just ignore some tickets now, because I know I can just wait till Tuesday--they'll change their mind by then.

I still think that by the time you release a product, you need to freeze the code and let QA hammer it. Only bug fixes are permitted. And then your product manager can judge it ready to release when the rate of new priority-1 bugs filed slows to a trickle or zero. That's where the problem is--people have decided that time-to-market is more important, so they slash the QA cycle. And then guess who gets to be your new QA team: your customer base.

Reply #15

Quoting tetleytea, reply 14

I still think that by the time you release a product, you need to freeze the code and let QA hammer it. Only bug fixes are permitted. And then your product manager can judge it ready to release when the rate of new priority-1 bugs filed slows to a trickle or zero. That's where the problem is--people have decided that time-to-market is more important, so they slash the QA cycle. And then guess who gets to be your new QA team: your customer base.
End of tetleytea's quote

This practice IS used in the game industry. Not always, but I'm 100% with you that it's best practice. When it doesn't happen, there ARE consequences. For a console game, where the submission/certification process can be two weeks long, not locking code can mean unexpected bugs discovered at the last second, which can result in a delayed submission, which has consequences of its own on the business end. Games for Windows certification is actually quite simple (http://www.microsoft.com/gfwcertification/en/us/default.aspx), to the point that you self-test and send MS the results. Consoles are a different story...

Reply #16

Throw out every rulebook you own. Video game programming is ad hoc, down and dirty: make it work and worry about the little problems later.

Use free code available on the internet whenever possible, copy and paste whenever possible, never spend too much time on any one thing.

This is what I learned from being a student, and from being the lead programmer in a video game startup.

However, that isn't to say you shouldn't code intelligently.

In my game, everything is modular, so I can swap stuff out easily. The AI I just wrote doesn't rely on static information; it calculates its priorities dynamically in real time. I thought it was very strange that the Elemental AI was using outdated stats for equipment and things like that. They knew they were going to do patches, so why didn't they design the AI to adapt automatically?
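To make "doesn't rely on static information" concrete, here is a tiny Python sketch. The equipment table, its numbers and the best_weapon function are all invented for illustration and have nothing to do with how Elemental or any real game stores its data. The only point is that the AI reads the same stats table the rest of the game uses, instead of carrying its own hard-coded copy.

```python
# Hypothetical equipment table, shared by the game rules and the AI.
EQUIPMENT = {
    "club": {"attack": 3, "cost": 10},
    "sword": {"attack": 6, "cost": 25},
    "axe": {"attack": 8, "cost": 40},
}


def best_weapon(equipment, gold):
    """Pick the highest-attack weapon the AI can currently afford.

    The stats are looked up at decision time, so a patch that edits the
    table changes the AI's choice without touching the AI code.
    """
    affordable = [name for name, stats in equipment.items()
                  if stats["cost"] <= gold]
    if not affordable:
        return None
    return max(affordable, key=lambda name: equipment[name]["attack"])
```

With 30 gold and the table above, the AI buys the sword. If a patch buffs the club's attack to 7, the same call picks the club, with no AI change needed. It is also the kind of small, stable rule that is cheap to cover with a unit test.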

I can't judge Stardock since I don't know their situation (I wish I did... let's have a chat, devs!), but video games are the complete opposite of government contract work. If we throw 100 exceptions / second... we don't care. If our sound cuts out randomly... we don't care. If the values in the GUI are wrong... we don't care. As long as people have fun, as long as the problems don't intrude on the overall experience, it really doesn't matter to us or to the majority of customers! Make it work just barely enough and you have won the battle!

Edit - About Microsoft and their certification process: It's terribly slow. I believe that Fallout 3 had its release date delayed by Microsoft, and the first couple of patches to fix critical bugs, like a crash to desktop every 20-30 minutes, were delayed by the certification process. Don't subject your customers to week-long delays for important patches; don't use an archaic quality assurance program like Games for Windows Live. If Elemental had used GFWL we would still be on patch 1.03 instead of 1.07... (also, before suing me for defamation, Microsoft employee: this is pure speculation based on my memory of news articles)

Reply #17

I think TDD slows down development by 20 or 30% (I think these are Microsoft's figures), and I think the game companies want to get prototypes (features) out early rather than stable versions early. I'm not sure if there are good reasons for that, but it's certainly easier to change the gameplay if you get the features earlier.

So, yes, game companies prefer to churn out lots of (bugged) features and then play with them, throw many of these away, come up with new algorithms/mechanisms and test them quickly, and only then iron out the final design.

But they do use unit testing in some areas, where the code requirements are stable enough (think network protocol handling, core game engine features, not gameplay).

Reply #18

Very insightful posts here. I work in the non-game software industry and I always thought the only development method the game industry knew was "throw everything at the wall and see what sticks". Seems I'm not close enough. Game engines such as UDK, Gamebryo or Kumquat are a different story though.

Reply #19

Quoting Seboss, reply 18
Very insightful posts here. I work in the non-game software industry and I always thought the only development method the game industry knew was "throw everything at the wall and see what sticks". Seems I'm not close enough. Game engines such as UDK, Gamebryo or Kumquat are a different story though.
End of Seboss's quote

To a certain extent, it varies by company and by discipline. Designing (whether levels, mechanics, or whatever) tends to be highly iterative and chaotic. Programming is much more straightforward. Art is somewhere in the middle. Especially in the current climate, systems like agile development and Scrum are popular because they help wrangle the design into something more tangible and manageable.