Level Up! – How Much Testing?

Welcome to another round of Level Up! For today’s article, we’re going to be giving our answers to a question:

How much testing does it take before you feel OK about a deck?

For today’s responses, we have Michael, Travis, Arin, and Felix.

It’s something we have written about before: testing is good! Testing allows you to see what in your deck is working and what isn’t, provided you do it enough. Here’s the premise we went in with to write our responses:

There are many schools of thought when it comes to testing. Some players opt for a high-test approach, where combinations of cards are discussed and then tested, kind of like an A/B test. Others simply theorycraft a deck to bits and run with whatever feels best. Occasionally, there are moments of brilliance when a deck just works as built. But that shouldn’t stop people from feeling the need to test decks. So how much testing does it take before a draft is considered good to go, or junk?


When it comes to testing, or even in this case brewing, I consider the following:


  • What utility does this set have?
  • What finishers do I have access to?
  • What ways are there to deal with -insert situation here-?
  • What plus combos does the set have to offer?
  • What tech options are there?

Once I have those questions answered, I start the brewing process. Each build I have goes through roughly 20-30 games before I make any major changes such as combo swapping. Until then, at most I adjust card quantities, and if one of the cards has a major flaw, I remove it.

TL;DR Play blue, true meta. Can’t have green without blue.


I usually just go with the first draft of whatever deck I build. If there are obvious fixes, I’ll make them, but otherwise I’ll just run with it. If the deck sucks, I might try something different, but I rarely change decks.


Testing is difficult. You need to have a good scrim partner or group to get the best testing possible. Without one, you’ll hit the value ceiling much sooner, as you would with a less experienced group. This might sound elitist, but consider this: let’s say you’re in a chess club and you’re the only player with a rating higher than 2000. If the remaining players are all rated ~1400, you’re not going to be learning as much from your games as your opponents are. Similarly, if you try to playtest with a newer player, you may find some time sunk into explaining the mechanics of the game or other intermediate concepts that you’ve known for some time.

So let’s say you have a good testing group, and a new deck idea. You’ve drafted it up and want to play it out. How many games do you play?

In a “perfect” world, we would have all the time we need to play hundreds or even thousands of games. But as we become more experienced, we may notice that we need fewer games to really know if a deck will work or not. We may also find ourselves with less time on our hands. Since WS does not have a pro player scene (and never will), the idea of dedicating entire weeks or months to testing, as players in other games do, will probably remain foreign.

For me, because my time for testing is so limited, I try to record as much data as possible from a set of three games, or even a single game. I review each decision I made during the game that could have been improved or changed, usually with input from my opponent. In a pinch, I goldfish, and play a hypothetical game against an opponent who attacks for 2/2/2 almost every turn, and clears at least 1 character. This kind of goldfishing represents a very narrow range of games, but it’s mostly to prepare for using CX combos. I don’t recommend it as anyone’s sole method of testing, and can’t recommend using it frequently.

In practice, I probably echo Arin’s sentiment: I’ll just go with a deck idea and try it out at a couple of tournaments as my ‘testing’. Between weeks, if something didn’t work well, I may make an adjustment. Otherwise, I’ll be patient and give it another try if I find that my misplays were more responsible for my losses than my luck. If I can’t make a deck do reasonably well within 3 events (somewhere between 15 and 20 games), I’ll make more major changes, or just scrap the deck.


So there are many approaches to deck testing, ranging from 0 to infinity games played. It is possible for testing to never be done, as a deck can continually go through updates and refinement. This pertains largely to games with a constant stream of new sets and eternal formats, such as Magic: The Gathering, Hearthstone, Shadowverse, and to a lesser extent WS. WS applies to a lesser extent because, while the game has no rotation, not every set gets updated regularly, which severely limits innovation in deckbuilding. The other listed games allow older decks to be improved to compete with newer archetypes that pop up, unless that archetype involves Spawn of the Abyss. In Magic, you can have a Modern/Legacy/EDH deck that you constantly add to and improve as new cards get released. When you swap out cards, new testing needs to be done to verify that the changes are good. In essence, the number of test games played drops to 0 because it is a different configuration. As other decks change and adapt to a newer metagame, you will need to test against those newer decks, which effectively drops your count of test games played to 0 again. In this sense, you will never be done with testing until the creators stop releasing cards (unless you just jam 3x Spawn of the Abyss in your deck, in which case you’re basically done testing).


In a more middle-of-the-road case, with a relatively stale metagame, the number of testing games can be finite. For example, say 31 other players consistently go to your local Legacy FNM. You know what everyone else is playing, so when you test anything, you can proxy up a gauntlet of decks that you know you will be facing. In this case, assuming your opponents do not make major deck updates, after about 1000 games against each unique deck every time you change something, you should have a good enough data set to draw some conclusions regarding your configuration. Good luck if all 31 people play unique decks!
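
To put rough numbers on that, here is a minimal Python sketch. It assumes every game against a given deck is an independent win/loss trial, which real games are not quite, and it uses the standard normal approximation for the margin of error on a win rate. The function names are hypothetical and only there for illustration.

```python
import math


def total_gauntlet_games(unique_decks: int, games_per_deck: int) -> int:
    """Games needed to run the full gauntlet once (hypothetical helper)."""
    return unique_decks * games_per_deck


def win_rate_margin(games: int, win_rate: float = 0.5, z: float = 1.96) -> float:
    """Approximate 95% margin of error on an observed win rate.

    Assumption: games are independent trials with a fixed win chance,
    so the usual formula z * sqrt(p * (1 - p) / n) applies.
    """
    return z * math.sqrt(win_rate * (1 - win_rate) / games)


# Worst case from the example: all 31 opponents play unique decks.
print(total_gauntlet_games(31, 1000))   # games per deck change
print(round(win_rate_margin(25), 3))    # margin after 25 games vs one deck
print(round(win_rate_margin(1000), 3))  # margin after 1000 games vs one deck
```

Under those assumptions, a full gauntlet is 31,000 games per change. A 25-game sample pins a roughly even matchup down only to within about 20 percentage points, while 1000 games narrows that to about 3.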


Lastly, let us discuss the most practical way to test. Play 1 game with your deck. If any cards underperformed, take them out and put something else in. Play 1 game with your new deck. Repeat until you get a game where no card underperformed. Deck done. Repeat the entire process every time you change anything. Then, the night before any large/important tournament, right before you go to bed, completely change any and all decks you will be playing to something you’ve never played before and proceed to either smash or get smashed.


Tl;dr play aggro/mono red/+2 soul rush and avoid testing completely and have fast games for sanity of mind.

