Matchups and Win Rates: Top Tier Decks (Part 1)

Posted on May 4, 2015 by Sheridan Lardner

One of the more frustrating limitations of Wizards-published dailies is the lack of matchup information. You get lists, you get standings, you get win percentages, but you don't actually know which decks beat which decks en route to their finish. This is fine if you just want to describe the Modern metagame, but far less helpful if you are actually trying to figure out what decks are good. That's where the "MTGO Deep Dive" dataset comes in. I've focused on this project (recording dailies from the client in their entirety) in two of my last articles, the first showing overall match win percentages (MWPs) for different top decks, and the second highlighting some overperforming decks that were not necessarily top-tier. Now it's time to take the analysis one step further and see how decks match up against each other. Is Abzan truly strong against Twin? Is Affinity vs. Burn just a race? The data will give us some answers to these questions and more.

This article uses the MTGO Deep Dive dataset to get the win rates of different decks in different matchups. This is very much in line with a similar analysis done by reddit user dafrk3in, who calculated matchups using data from Pro Tour Fate Reforged. Using data from MTGO, I run a similar analysis of matchups in our current March-April metagame. In the interest of space and of only presenting reliable results, I'm only going to discuss matchups among the top decks of MTGO. These decks are both pillars of Modern and have suitably high Ns for us to make conclusions from. This analysis will give us a sense as to how decks succeed or fail against each other, and how that knowledge can be used to make informed decisions going into events.

Dataset and Methods

As in my last two articles, I'm using the so-called MTGO Deep Dive dataset. This project is the result of collaboration between me and a few other MTG friends from the MTGSalvation community. In essence, it's a set of 16 dailies recorded in their entirety. It includes not just the 4-0/3-1 finishes we see online, but also the 2-2 or worse finishes that do not get published. And, of course, it also includes the matchups between decks. Although 16 may seem like a small N, these events span dozens of decks, hundreds of players, and thousands of games. The end result is a wealth of matchup data we can use to calculate the "true" MWPs of various decks in Modern.
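
To make the idea of matchup data concrete, here is a minimal sketch of how a matchup table could be tallied from raw match records. The record layout, the sample rows, and the function name tally_matchups are all assumptions for illustration; the actual Deep Dive records are not reproduced in this article.

```python
from collections import defaultdict

# Hypothetical match records: (deck_a, deck_b, winner).
# These rows are made up for illustration; they are not Deep Dive data.
matches = [
    ("UR Twin", "Burn", "UR Twin"),
    ("Burn", "UR Twin", "Burn"),
    ("UR Twin", "UR Twin", "UR Twin"),  # mirror match: skipped below
    ("Abzan", None, "Abzan"),           # bye: skipped below
    ("Jund", "Burn", "Jund"),
]

def tally_matchups(records):
    """Return {(deck, opponent): [wins, matches]}, counted from both sides."""
    table = defaultdict(lambda: [0, 0])
    for deck_a, deck_b, winner in records:
        # Byes and mirrors say nothing about a cross-deck matchup.
        if deck_a is None or deck_b is None or deck_a == deck_b:
            continue
        for deck, opp in ((deck_a, deck_b), (deck_b, deck_a)):
            table[(deck, opp)][1] += 1
            if winner == deck:
                table[(deck, opp)][0] += 1
    return dict(table)

table = tally_matchups(matches)
for (deck, opp), (wins, n) in sorted(table.items()):
    print(f"{deck} vs. {opp}: {wins / n:.1%} ({wins}/{n})")
```

Counting every match from both players' perspectives is what makes the two sides of a matchup mirror each other: if one deck is 6/8 against another, the other deck must be 2/8 in return.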

In calculating MWPs, I have already adjusted for all byes, mirror matches, drops, draws, and other elements that could affect the accuracy of an MWP. This applies both to the overall MWP of any given deck and to its MWP against any other deck. All overall MWPs have also been compared to the "average" MWP of all MTGO decks in the sample, a weighted average that is also adjusted for the elements above. Based on its value relative to that average, each deck's MWP receives a P value to indicate how likely it is that the deck is truly above or below average. A high P value (> .1) indicates the deck is probably within expected variance and not truly above or below average. A low P value (< .1) starts to suggest a deck is above or below the average MTGO performance.
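
For readers who want to see what a P value like this involves, here is a rough sketch using an exact two-sided binomial test against the sample-wide average MWP of 49.25%. This is an assumed method, not necessarily the test behind the numbers in this article, and the win count in the example is back-calculated from a published percentage, so the result need not match the reported P values exactly.

```python
from math import comb

def binom_two_sided_p(wins, n, baseline):
    """Exact two-sided binomial p-value against a baseline win probability."""
    pmf = [comb(n, k) * baseline**k * (1 - baseline) ** (n - k) for k in range(n + 1)]
    observed = pmf[wins]
    # Add up every outcome that is at least as unlikely as the observed one.
    total = sum(p for p in pmf if p <= observed * (1 + 1e-9))
    return min(1.0, total)

BASELINE = 0.4925  # MTGO-wide average MWP reported in this article

# Assumption: a 60.6% MWP over 104 matches corresponds to 63 wins.
p = binom_two_sided_p(63, 104, BASELINE)
print(f"63/104 vs. a {BASELINE:.2%} baseline: p = {p:.3f}")
```

The same function works for any deck once you have a win count and a match count. The smaller the match count, the harder it is for even a lopsided record to produce a low P value.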

Finally, as is always the case with statistics in these articles, all other data disclaimers about the perils and pitfalls of statistics apply!

Matchups and Win Rates: Top-Tier Decks

Today's article focuses on the top-tier decks of MTGO as defined and shown on the Top Decks page. For each deck, I give its prevalence in both the Top Decks and Deep Dive datasets, along with its overall MWP and the significance of that MWP. After that, it's all matchup win-loss rates for the different top-tier decks. I'll end each section with a summary of the stats and the takeaways I view as important.

It is important to NOT use these numbers as set-in-stone benchmarks for matchups. Rather, they should be checked against your own testing and game experience to see how they can confirm or challenge your own conclusions. This is the mix of quantitative and qualitative methods we want to see when looking at this kind of data. Remember: a lot of these MWPs could well be higher or lower than the "true" MWP if we had a much larger N, so we need to view this as a starting point rather than an ending one. Some of these numbers will make perfect sense (e.g. UR Twin vs. Grixis Delver). Others seem odd and demand further investigation (e.g. UR Twin vs. Abzan). Either way, it is up to us to interpret the data, not to just categorically accept or reject it based on a few datapoints.
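
One way to see how much a small N limits these numbers is to put an interval around each win rate. The sketch below uses the Wilson score interval, a standard choice for proportions from small samples. It is my own illustration rather than part of the Deep Dive methodology, and the 95% level (z = 1.96) is an assumed convention.

```python
from math import sqrt

def wilson_interval(wins, n, z=1.96):
    """Wilson score interval for a win rate; z=1.96 gives roughly 95% coverage."""
    p_hat = wins / n
    denom = 1 + z**2 / n
    center = (p_hat + z**2 / (2 * n)) / denom
    half = z * sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half

# Records taken from the UR Twin section of this article.
for label, wins, n in [("UR Twin vs. Abzan", 6, 8), ("UR Twin vs. Burn", 16, 32)]:
    lo, hi = wilson_interval(wins, n)
    print(f"{label}: {wins}/{n}, plausible range {lo:.0%} to {hi:.0%}")
```

For the 6/8 record against Abzan, this gives a range of roughly 41% to 93%, which is wide enough to include both "slightly unfavorable" and "blowout". That is the sense in which these figures are a starting point and not a verdict.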

As one last point of reference, the average MTGO-wide MWP is 49.25%. Keep this baseline in mind when thinking about the different decks below.

UR Twin

Top Decks prevalence: 8.1%
Deep Dive prevalence: 8% (76)
Deep Dive matches: 242
MWP: 52.1% (p=.38)

vs. Abzan: 75% (6/8)
vs. Affinity: 58.3% (14/24)
vs. Burn: 50% (16/32)
vs. Jund: 50% (4/8)
vs. Amulet Bloom: 36.4% (4/11)
vs. Grixis Delver: 18.8% (3/16)

Twin has been called Modern's best deck, and although there is reason to suspect that is true, we don't really see it in the numbers. I think this is due in large part to the popularity of Twin. It's the third most-played deck on MTGO, but it's also the most expensive of the top three decks. This suggests players are gravitating towards Twin just because they think it is "good", not because it is cheap (like Burn) or the hot new thing (like Grixis Delver). This is why Twin's MWP of 52.1% is probably lower than it would be in the hands of a skilled pilot. More so than Burn or Grixis Delver, Twin rewards tight play and punishes bad pilots, which explains why Twin's MWP is one of the lowest of the top-tier decks (while still being above average).

As for matchups, I am surprised the Abzan matchup is so heavily in Twin's favor. To me, this result says more about player inexperience than about the matchup itself: Twin is by no means a "good" matchup for Abzan, but it's also not quite this bad. The Grixis Delver MWP is very interesting, because if true, it would go a long way to explaining why Grixis Delver is so successful right now on MTGO. The rest are about expected, although I was curious to see such an even Burn matchup. My guess is Burn wins any game where Twin is on the draw or misses the turn 4 combo, and Twin wins when it is on the play and can go Exarch/Twin on t3/t4.

Burn

Top Decks prevalence: 9%
Deep Dive prevalence: 10.4% (99)
Deep Dive matches: 299
MWP: 53.9% (p=.11)

vs. Abzan: 63.6% (14/22)
vs. Affinity: 50% (13/26)
vs. Jund: 28.6% (4/14)
vs. UR Twin: 50% (16/32)
vs. Amulet Bloom: 36.4% (4/11)
vs. Grixis Delver: 81.8% (9/11)

Twin may be Modern's "best" deck, but Burn is its most-played. Its prevalence is actually higher in the Deep Dive dataset than the Top Decks one, which suggests there is even more Burn out there online than we see in public dailies. That said, the Deep Dive also shows Burn might have more going for it than just prevalence. With a P of .11 on its 53.9% MWP, Burn is actually pushing the upper edge of the expected variance of MTGO MWPs. Indeed, of all the tier 1 decks, Burn is the only deck that gets this close (although not quite making it). So Burn isn't just popular: it's also very strong. This makes sense to me because Burn is so dang linear, which is going to give you a lot of random wins against opponents who are too slow or too interactive.

Turning to matchups, none of the Abzan, Twin, or Affinity matchups should surprise anyone: the latter two are races and the first is the matchup that put Burn on the map to begin with. I am very interested by Burn's apparent strength against Grixis Delver and weakness against Jund, two decks rising up for very different reasons. We know two big reasons Jund is seeing more play: it is less painful than Abzan (thanks, Blackcleave Cliffs) and it can use Lightning Bolt to stave off early threats. As for Grixis Delver, this also makes a lot of sense. Terminate is just awful against Burn, Gitaxian Probe is often free damage, and the deck has lots of fetches/shocks with no lifegain. So these are two more numbers supported by our theoretical understanding of the matchups.

Affinity

Top Decks prevalence: 6.9%
Deep Dive prevalence: 7.7% (73)
Deep Dive matches: 220
MWP: 52.7% (p=.31)

vs. Abzan: 40% (4/10)
vs. Burn: 50% (13/26)
vs. Jund: 60% (3/5)
vs. UR Twin: 41.7% (10/24)
vs. Amulet Bloom: 50% (4/8)
vs. Grixis Delver: 37.5% (6/16)

Like both Twin and Burn, Affinity does push the upper edge of the MWP range, but not with any degree of significance. And like Burn, the true prevalence of Affinity might be higher than the observed prevalence in the published dailies, which makes sense given the enduring popularity of Affinity in Modern. When I look at matchups, almost all the numbers above are in line with our expectations. Abzan is rough because of Stony Silence, and although 10 matches isn't the big N I would like to see, 40/60 seems about right for this matchup given all the factors. Burn makes perfect sense as a straight up 50/50 race, and Twin seems right at about 40/60 due to the dual pressures of efficient red removal (including the powerful Electrolyze) and a combo finish Affinity can't interact with in game 1 short of 3-4 Galvanic Blast. Of course, the Grixis Delver matchup is the most interesting, because it suggests a further reason as to why Grixis Delver is doing so well in this current metagame. Not only is the deck beating Twin, but it's also killing it against Affinity. This also makes sense from a theoretical perspective: Affinity would definitely struggle with the efficient removal of Grixis Delver, the efficient countermagic to stop cards like Plating/Thoughtcast, and fast and durable clocks they can't kill. This would only get worse in games 2/3 after Grixis Delver brought in anti-artifact effects.

Abzan

Top Decks prevalence: 4.9%
Deep Dive prevalence: 5.2% (49)
Deep Dive matches: 152
MWP: 53.3% (p=.324)

vs. Affinity: 60% (6/10)
vs. Burn: 36.4% (8/22)
vs. Jund: 71.4% (5/7)
vs. UR Twin: 25% (2/8)
vs. Amulet Bloom: 80% (4/5)
vs. Grixis Delver: 90% (9/10)

To me, the most interesting Abzan fact is not the win rates, but rather the metagame share. This deck has completely tanked across MTGO, a trend shared in paper but not nearly to the same extent. How did a deck that was 25% of the recent PT fall down to about 5% of the MTGO metagame in less than 3 months? The MWP doesn't explain it. 53.3% is at the upper end of top-tier deck performance, and although it's not quite statistically significant, it's still exactly where the so-called 50-50 "police deck" of Modern should be performing. So why is Abzan's share dropping if its overall performance is basically fine?

The matchup data gives us two possible explanations for this. The first is Burn. Burn is rampant on MTGO, and you don't want to be the deck with a bad Burn matchup, especially when that deck costs about $2000. The second is Twin: it gets even worse when you are losing to a deck Abzan is supposed to beat. To be fair, I don't think the actual matchup between Twin/Abzan is this lopsided. Yes, there are reasons to believe Twin is actually favored in this matchup (or it is close to even), but this number doesn't make a lot of sense. Even so, assuming Abzan's true Twin matchup is closer to 40/60 or 50/50, that's still not enough to offset a crappy Burn matchup. About the only saving grace for Abzan is its Grixis Delver matchup, which is exactly what we would expect of a BGx deck against a Delver deck. This number just has to be overstated, but even if it's just a 60/40 or 70/30 matchup, that's a big boost in Abzan's favor.
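
One simple way to formalize pulling a small-sample record back toward an even matchup is to blend it with a handful of imaginary 50/50 matches. The sketch below does that with 20 pseudo-matches. The function name and the choice of 20 are my own assumptions, picked only to illustrate the idea, and this is not how the MWPs in this article were calculated.

```python
def shrunk_win_rate(wins, n, prior_matches=20, prior_rate=0.5):
    """Blend an observed record with pseudo-matches played at prior_rate."""
    return (wins + prior_rate * prior_matches) / (n + prior_matches)

# Abzan records from this article.
for label, wins, n in [
    ("Abzan vs. Grixis Delver", 9, 10),
    ("Abzan vs. UR Twin", 2, 8),
    ("Abzan vs. Burn", 8, 22),
]:
    raw = wins / n
    adj = shrunk_win_rate(wins, n)
    print(f"{label}: observed {raw:.1%}, shrunk {adj:.1%}")
```

Under that assumption, the 9/10 Grixis Delver record shrinks to about 63% and the 2/8 Twin record rises to about 43%, which lands close to the qualitative adjustments described above. A larger prior_matches value pulls harder toward 50%, and a smaller one trusts the observed record more.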

Jund

Top Decks prevalence: 3.6%
Deep Dive prevalence: 3.6% (34)
Deep Dive matches: 115
MWP: 52.2% (p=.54)

vs. Abzan: 28.6% (2/7)
vs. Affinity: 40% (2/5)
vs. Burn: 71.4% (10/14)
vs. UR Twin: 50% (4/8)
vs. Amulet Bloom: 66.7% (4/6)
vs. Grixis Delver: 50% (4/8)

As Abzan has fallen across Modern, Jund has gradually risen to take its place. Jund went from basically 0% at the time of the TC banning to about 4%-5% of paper and MTGO. The current MTGO prevalence is a little lower now than before, but Jund is still a very viable deck that is showing up everywhere. Indeed, it's getting the tier 1 bump in this most recent metagame update, and these MTGO stats give some explanation about why. From an MWP perspective, Jund is pretty average for the top-tier decks, which suggests it's not the overall deck MWP driving its rise. To see where Jund is successful, we need to look at matchups.

Our N is a bit small for some of these matchups, so I am hesitant to draw strong conclusions from much of this data. Two exceptions to this are Abzan and Burn. It makes a lot of sense Jund is weak to Abzan: Jund can't do anything about Spirit swarms, Path gives Abzan the midrange edge, and Bolt isn't very useful as removal. Dark Confidant is an easy way to improve this matchup, and my guess is if we controlled for Jund decks running Bob and those not running it, we would see bad Abzan matchups mostly in Bobless decks. But Bob himself is at odds with Jund's best matchup: Burn. I think this is one of the big reasons the deck is enjoying success these days, for similar reasons as to why Abzan is declining. A less painful manabase and Bolt go a long way to keeping Burn at bay. Expect to see more Jund as the format keeps evolving, based largely on this Burn matchup.

Amulet Bloom

Top Decks prevalence: 4.1%
Deep Dive prevalence: 3.3% (31)
Deep Dive matches: 104
MWP: 60.6% (p=.03**)

vs. Abzan: 20% (1/5)
vs. Affinity: 50% (4/8)
vs. Burn: 63.6% (7/11)
vs. Jund: 33.3% (2/6)
vs. UR Twin: 63.6% (7/11)
vs. Grixis Delver: 71.4% (5/7)

I don't really care too much about these specific matchups. The positive Twin matchup is nice, the unfavorable Abzan/Jund matchups make sense (but seem overstated based on my own experience with the decks), and the strong Grixis Delver/Burn matchups are totally in line with Amulet Bloom's gameplan. But again, the most interesting data here is not in the matchup section. The real takeaway is the MWP and its corresponding P value. Of every deck in the dataset, Amulet Bloom is the only deck with more than 20 matches to have a P value so low. The .03 means that if Amulet Bloom were truly an average MTGO deck, we would expect to see an MWP this far from the average only about 3% of the time. And boy, is it above average! 60.6% is well over the 49.25% average and even the expected variance. It's also well over what other decks are doing. Yes, this data has a lot of limitations, both statistical (e.g. the size of N) and contextual (e.g. it's MTGO data). But this MWP is still so far above and beyond other MWPs that it's impossible to ignore. If someone were to ask what the best deck in Modern is, I'd probably have to say Amulet Bloom. We had reasons to suspect this in the past, and this is yet another datapoint confirming the theory. This still leads to questions about its relatively small metagame share if its MWP is so high, but those questions don't undercut the MWP and its significance.

Grixis Delver

Top Decks prevalence: 8.4%
Deep Dive prevalence: 7.2% (68)
Deep Dive matches: 213
MWP: 48.4% (p=.79)

vs. Abzan: 10% (1/10)
vs. Affinity: 62.5% (10/16)
vs. Burn: 18.2% (2/11)
vs. Jund: 50% (4/8)
vs. UR Twin: 81.2% (13/16)
vs. Amulet Bloom: 28.6% (2/7)

I could write a whole article on how awesome this deck is. Oh wait... Grixis Delver is one of Modern's hottest new decks and these stats give some context to its rise. The prevalence is obviously striking: this is a deck that went from 0% to 7%-8% in about 2-3 months without a single pro player or major paper event driving that rise. This is an MTGO community special, developed more or less independently by players across the community. This homespun approach is reflected in the MWP, which is solidly average and one of the lowest of all the different top-tier deck MWPs. To me, this is expected behavior. For one, there is no established Grixis Delver baseline list for players to use. Two, because the deck has such a flavor-of-the-month feel, lots of players are picking it up without necessarily much experience. Both of these factors will bring down the deck's "true" MWP.

Grixis Delver's matchups are probably the most interesting of all the different matchups we have seen so far. If you want to know why the deck is successful, look no further than the Affinity and Twin matchups. Beating Affinity is good, but absolutely trouncing Twin is something special. Few decks can do this, and I think this is a huge reason for Grixis Delver's success and popularity. By contrast, the Abzan and Burn matchups are just terrible, which presents an odd metagame tension when deciding whether or not to play this deck. When comparing these quantitative measures to our qualitative theories for why Grixis Delver might be good or bad, we see a lot of overlap. For example, the deck's disruption is awesome against Twin and Affinity but (with the exception of Terminate) totally underwhelming against Abzan and just plain bad against Burn. Although I think all of these numbers sit at the extreme end of their actual range, their general thrust is about right, so you can pull them back toward even by 10% or so and probably be closer to accurate.

Next Steps

If you are an MTGO or Modern regular, you are probably wondering about Merfolk, Scapeshift, RG Tron, and other decks we would expect to see matchup data for. I'll discuss these matchups, and some additional takeaways from today's article, when I revisit the Deep Dive dataset next week. Just considering the numbers today, I can't emphasize enough that these are not immovable benchmarks. We should not read these numbers and say "Grixis Delver has an 80% win rate against Twin". That's both a misuse of the dataset and a misunderstanding of how Modern matchups work. Rather, we should say "Grixis Delver seems strongly favored against Twin. What are some interactions and cards that could explain this on both sides of the table? Does this line up with my experience with the decks?" This is the way to use matchup data like we looked at today.

Join me on Wednesday when I give some metagame updates and talk about some new changes coming to the Top Decks page. Until then, enjoy those new MM2015 previews (Noble Hierarch confirmed at RARE!!).

21 thoughts on “Matchups and Win Rates: Top Tier Decks (Part 1)”

Roland F. Rivera Santiago says:

May 4, 2015 at 9:47 am

As always, very interesting article. I have to say that the fall of Abzan (and the corresponding rise of Jund) took me a bit by surprise at first, but your analysis definitely points out a couple of reasons why that may be the case. My curiosity was piqued by your mention of Dark Confidant, though – is there truly a niche for him in Jund, given that Tasigur is just as good in that deck as it is in Abzan? I feel that (for the most part) big Tas both eclipses him and makes running Bob supremely risky (flipping a 6-drop = not fun). Thoughts?

1. Daniel says:
  
  May 4, 2015 at 11:07 am
  
  I think with Jund’s total average curve being lower, playing bob with 2 tasigur is ok. Abzan curve average is brought up by the full playset of rhinos. I think the general consensus is that jund is better against the field, where junk is better in the BGx mirror, but having access to bob definitely improves the mirror match. He’s bad in the burn matchup, but can still trade with a goblin guide. Sure lingering souls is a great card in grindy matchups, but against a bolt-less opponent (junk) bob still puts in great work, at the very worst giving your opponent 1 less path/decay to point at your goyfs/scoozes.
  
2. Sheridan Lardner says:
  
  May 5, 2015 at 7:00 am
  
  I would agree with Daniel’s assessment. The big reasons you can’t run Bob in Abzan are the 4 Rhino on top of the 1-2 Tas, and the manabase. With respect to the creatures, that’s just too many cards that can randomly kill you. With respect to the lands, it’s not very likely to resolve a turn 2 Bob with more than 16-17 life as Abzan, just because you have to run shocks and fetches to enable that turn 1 IoK, Path, or Hierarch. With Jund, it’s totally plausible to resolve a Bob on 18-19 life, which makes a big difference against decks that take an aggressive posture. Jund has the added benefit here of being able to open turn 1 Bolt into turn 2 Bob against an opponent who played an aggressive one drop. Abzan has to Path that turn 1 play, which just guarantees a turn 2 play that is ahead of the curve. These are just some of the reasons Bob is a much more viable card in Jund than in Abzan.
  
3. Josh says:
  
  May 10, 2015 at 6:49 pm
  
  In line with the other comments, and as a Jund player/BGx player in general, we can get away with Bob in Jund for a generally lower curve, a slightly less painful mana base, more lifegain cards up the curve, and somewhere in the range of only 3-5 cards with a CMC of four or more.
  
  Most Jund decks are running 2-3 Scavenging Ooze, 2-4 Kitchen Finks, 0-2 Huntmaster of the Fells, and 0-1 Batterskull. Also TS has become much more of a 0-2 of in the main deck of most BGx builds lately causing even less main deck life loss. There are definitely still games where you flip Tasigur or Batterskull for a loss, but even in a long game those odds are pretty slim…but I’m not going to lie, I’ve found myself bolting/terminating/abruptly decaying my own Dark Confidant on occasion just to avoid the chance in a few tight games 😀
  
Mr Pink says:

May 4, 2015 at 10:09 pm

Is there no presence of the Soul sisters deck in the meta game now?

1. Sheridan Lardner says:
  
  May 5, 2015 at 7:01 am
  
  It definitely comes up, and until the most recent dataset update, was actually the deck with the most statistically significant MWP of all decks with a reasonable N. It’s gone down a little since then, but it still remains a very viable deck on MTGO along with Abzan Liege, Infect, RUG Twin, and Esper Mentor. Expect to see more on these decks in the next article in the series!
  
Jacob says:

May 5, 2015 at 5:09 am

Can you do Little Kid Abzan for your next article as well? I’m extremely interested in the MUs and viability of the deck as a whole. Also, big fan of this website. I check back everyday just to see if there are any updates.

1. Sheridan Lardner says:
  
  May 5, 2015 at 7:03 am
  
  Abzan Liege is coming up next week! It has some very decent matchups across the board and is one of MTGO’s more viable decks.
  
  Glad to hear that you enjoy the content! Let us know if there’s anything you want to see more/less of, or any general changes.
  
Kathal says:

May 5, 2015 at 5:49 am

Heyho,

a very interesting article about the match-ups from various decks based on pure data (although for some match-ups the N is quite low, but I still think, you can take those values as guidelines for those match-ups).

There are 2 things which surprised me:

1) The prevalence of both Grixis Delver and Amulet Bloom. If I’m correct, both decks have only a “few” players (relative low percentage of the deep dive metagame) compared to the top end finishes (which is the first prevalence of the data). This means for me, that those 2 decks have more 4-0/3-1 finishes than other decks with a lower player base (so the player base from Grixis Delver and Amulet Bloom 4-0/3-1 more dailies than the player bases from other decks). The exact opposite of this is Burn (10,4% to 9%) and Affinity (7,7% to 6,9%).

2) The variations of the match-ups between Jund and Junk. I never thought that some match-ups can swing that much around.

In the end, I have a small question. Would it be possible to create a metagame sheet from the “deep dive” metagame, since I think there would be several “huge” shifts compared to the high end finishes metagame spreadsheet (would be especially interesting because of Sideboard cards on MODO).

Greetings,
Kathal

1. Sheridan Lardner says:
  
  May 5, 2015 at 8:28 am
  
  Re: Delver/Bloom
  I think this is true of Amulet Bloom but not necessarily true of Grixis Delver. That deck really is everywhere, and I wouldn’t be surprised if the true prevalence were anywhere between 7%-9% if you had a full picture of the Daily metagame. But for Amulet, your theory makes a lot of sense. We already know this deck doesn’t have that many unique pilots (at least, not relative to other decks), and that it has a crazy MWP. This suggests it is indeed overperforming based on a few players’ successes.
  
  Re: Junk/Jund
  Because the Ns are a bit smaller for Jund, the only swingy matchup we can reliably comment on is that Burn matchup. This actually makes a lot of sense in my experience, because Bolt is so much better than Path here, and because the manabase gives you 2-3 free life over the course of a game. This is the sort of difference that really adds up over the course of a match.
  
  As for the metagame sheet, I’m working on getting something together to make the numbers more viewable for our readers. No ETA yet, but it’s a work in progress! I have some other Top Deck page/metagame sheet updates coming tomorrow too.
  
  1. Kathal says:
    
    May 6, 2015 at 11:21 am
    
    Thanks man and again great work!
    
    Greetings,
    Kathal
    
Bill says:

May 5, 2015 at 7:07 am

i am a bit confused with certain BGx match-ups, as they do not confirm my experience at all (i know my experience is statistically insignificant and i am by no means pro but i am not a new player either and i have done considerable playtest and research)

of course i recognize that i might be simply wrong, i do not claim to be better than the testers in fact i might as well be a worse magic player/deckbuilder

Abzan 60% vs affinity due to stony silence for instance? game 1 is most of the times lost and Stony Silence is 3? sideboard slots, i will mulligan mediocre hands without it, but aggressive mulligan is not an option unless we run 4 (never seen anyone doing this so far), also Abzan’s 90% vs Grixis Delver seems too much : UR delver is indeed a walk in the park but now they have big creatures of their own to rival Goyf/Rhino and relevant black removal, it’s still favourable but in my experience (which is of course not that much of a sample) more like 60-40 in Abzan’s favour, lastly 50% vs Twin seems too low, i don’t feel that this deck can win without Blood Moon, their combo is vulnerable and i am certainly not afraid of Keranos which mostly provides Bolts vs a Boltproof deck

lastly Jund’s match-up with burn seems abnormal, if that’s true i am switching to Jund on the spot, but burn has always been a bad match-up for Jund and now it’s stronger than ever, in my meta i am facing lists running 4 Atarka’s Command and 2-3 Skullcracks and life gain has never been more unreliable against them and will generally not work unless paired with disruption aiming for these effects

1. Sheridan Lardner says:
  
  May 5, 2015 at 8:35 am
  
  As I mention in the article, I agree with you on many of these points. It’s a matter of checking our quantitative results against experiences. For instance, with just 10 matches total and a win rate of 6/10, we can’t safely say that Abzan’s MWP against Affinity is just 60%. It’s probably anywhere from 40%-60%, all things considered. Think of it this way: If one Affinity pilot made a single misplay in one of their games (e.g. bad combat math, which isn’t that hard to screw up), then that immediately switched the matchup from 5/10 (50% – “even”) to 6/10 (60% – “favorable”). So the only thing we would want to definitely conclude for Abzan vs. Affinity is that the true MWP is somewhere around 50% and maybe trending positive.
  
  For Grixis Delver, I mention that Abzan’s true win rate is almost definitely not 90% and is probably closer to 60%-70%. But that’s still a very favorable matchup for the Abzan player, all things considered: most matchups are about 45%-55% in Modern, so getting up to 60%+ is huge. Again, the takeaway would not be that the deck has a 90% win rate, but rather that it has a “very favorable” matchup here that is probably at least 60%.
  
  We can use a similar framework for analyzing the different Burn matchups with Abzan and Jund. In reality, Jund’s Burn matchup is probably around 60%/40% (Again, trending the observed MWP back to 50%). But that’s still pretty solid for a BGx deck, especially if it could be as high as 65%/35% or better. This one makes more sense than the Abzan vs. Affinity MWP because we can identify more qualitative reasons to explain it (e.g. Bolt, manabase, etc.), because the N is larger, and because we have seen a rise in Jund online and this could be a big factor.
  
  So overall, I think you are right to question quantitative results like these based on your experience. The key is not to think of them as concrete benchmarks, which is something I talk about. Instead, it’s to use them as checks on your own experiences and then to kind of build an MWP interval out from 50/50 based on your experiences and the size of N.
  
Nate Hofmann says:

May 5, 2015 at 8:05 pm

This stuff always makes my head spin with how in depth you can go with cardboard. Anyway great article as a first time reader, truly the best Modern analysis I have seen anywhere especially matchups. The only big question I have to ask is do you ever just play a pet deck instead of the best deck in the meta?

Kim Josefsen says:

May 21, 2015 at 6:55 am

It may be a bit off-topic, but is there a way to volunteer to gather info from the dailies?

1. Sean Ridgeley says:
  
  May 21, 2015 at 9:03 am
  
  Email Sheridan.
  
Rob says:

May 21, 2015 at 7:04 am

Great article!

Anonymous says:

May 23, 2015 at 7:50 pm

When are we going to see an expanded data set with more matchups? Some of these matchups have N values that are frighteningly low, and despite your warning about drawing conclusions, people have been throwing out these win percentages as if they are suddenly ancient truths.

1. Sheridan Lardner says:
  
  May 23, 2015 at 7:54 pm
  
  5 more dailies have been added to the dataset since this article, so you can expect updated numbers either this Wednesday or next Monday! It might even be up to 6-7 by then.
  
vandrwll says:

May 24, 2015 at 3:49 pm

The thing that stands out to me as most worthy of questioning is the Amulet Bloom vs. Twin data. That matchup is horrific for Amulet if the Twin pilot is any good at all. Even with a near perfect draw, the games can be virtually unwinnable for Bloom Titan if Twin kept even a decent hand. I can’t believe the data suggest otherwise – and in such a lopsided fashion…

1. Sheridan Lardner says:
  
  May 24, 2015 at 6:33 pm
  
  I think it’s mostly a function of who plays Bloom and who plays Twin. Many of the Bloom players on MTGO are incredibly experienced with their decks: I’ve seen them playing those decks for 2-3 years now. This is reflected in the data itself, where many of the Bloom results were on the backs of a relatively few number of pilots. Not so with Twin, a deck that attracts both veteran players and also newer/less experienced ones who are just trying to play the “best” tier 1 deck in Modern. But those players are going to struggle against the Bloom pilots who are more experienced with the deck. Also, they will struggle because Bloom is a harder deck to play against if you don’t know what to disrupt. We only needed to see PT FRF coverage for evidence of that, and that was at the highest level of the format. All of those factors are probably at play in that particular matchup.
  