## Introduction

In Trainer Battles (PvP), players choose their Pokémon without knowing what their opponents will choose beforehand. As Ryan pointed out, this emphasizes the Rock-Paper-Scizor aspect of the Pokémon game. Considering the new battle system still revolves around tapping, picking the right Pokémon will be more important than execution.

The million PokéCoin question is, what are the best Pokémon to pick? Intuitively, Pokémon that can handle the widest variety of threats will be the meta. But then, once that information is known by everyone, players will start to look for their best counters and use the counters more often, forming the "anti-meta". After that, the "anti-anti-meta", and so on and so forth. In this sense, the metagame will be evolving.

This article provides a methodology for deriving the final meta, the equilibrium meta, by applying Game Theory.

## Terminology

### Meta

A **Meta** is defined as:

$$ (p_{1}, w_{1}), (p_{2}, w_{2}), ..., (p_{n}, w_{n}) $$

where $w_i$ is the weight or likelihood that Pokémon $p_i$ is going to be used in battle, and the weights sum to 1.

### Battle Score

The **Battle Score** of Pokémon $x$ against Pokémon $y$, $F(x,y)$, is a number that measures how well $x$ performs against $y$ in a battle.

A "good" battle score should satisfy the following properties:

- $F(x,y) > 0$ implies $x$ wins.
- $F(x,y) < 0$ implies $y$ wins.
- $F(x,y) = 0$ implies a draw.
- $F(x_1,y) > F(x_2, y)$ implies $x_1$ performs better than $x_2$ against $y$.
- $F(x,y) + F (y,x) = 0$

This way, our game will be zero-sum and symmetric, which is easier to solve.

An example of a good battle score is the **logarithm of the KD-ratio**:

$$ F(x, y) = \log KD(x, y) $$

The KD-ratio $KD(x,y)$ is the total number of $y$ that $x$ can defeat before fainting.

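Another good battle score is the **TDO percentage difference**:

$$ F(x, y) = TDO\%(x, y) - TDO\%(y, x) $$

Here $TDO\%(x, y)$ is the total damage output of $x$ against $y$, expressed as a fraction of $y$'s HP. The TDO percentage difference bounds the score between -1 and 1 and, unlike log KD, is well defined in the edge case where $x$ beats $y$ without taking any damage.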
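The two scores can be sketched in a few lines of Python. This is only an illustration: the function names and inputs are hypothetical, and the TDO% of a Pokémon is assumed here to be the damage it deals divided by the opponent's HP, capped at 100%.

```python
import math


def log_kd_score(kd_xy: float) -> float:
    """Log KD score: positive when x defeats more than one y before fainting."""
    if kd_xy <= 0:
        raise ValueError("log KD is undefined for a non-positive KD-ratio")
    return math.log(kd_xy)


def tdo_pct_diff(dmg_x_to_y: float, hp_y: float,
                 dmg_y_to_x: float, hp_x: float) -> float:
    """Share of y's HP removed by x minus share of x's HP removed by y."""
    tdo_pct_x = min(dmg_x_to_y / hp_y, 1.0)  # assumption: capped at 100% of HP
    tdo_pct_y = min(dmg_y_to_x / hp_x, 1.0)
    return tdo_pct_x - tdo_pct_y


# Made-up numbers: x faints y (150 damage into 150 HP) and loses 60 of its 200 HP.
score_xy = tdo_pct_diff(150, 150, 60, 200)
score_yx = tdo_pct_diff(60, 200, 150, 150)

# Edge case: x wins without taking damage. The TDO% difference is simply 1,
# whereas the KD-ratio would be unbounded.
flawless = tdo_pct_diff(150, 150, 0, 200)
```

Swapping the two Pokémon flips the sign of either score, which is the $F(x,y) + F(y,x) = 0$ property listed above.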

## Solving for the Equilibrium Meta

We define a simple two-player game $G = (P, P, M)$, where $P$ is the pool of Pokémon available to each player and $M$ is the payoff matrix. Each player chooses one Pokémon to battle. The payoff for Player 1 when they use $x$ and Player 2 uses $y$ is the battle score of $x$ against $y$, $F(x, y)$. This way, the utilities of all Pokémon are factored in, and the optimal strategy for each player is a list of probabilities, one per Pokémon, which approximates the equilibrium meta.

Such a game is finite, since there are finitely many Pokémon with finitely many moves, so the total number of combinations is finite. By Nash's Theorem, every finite game has at least one equilibrium in mixed strategies, so at least one optimal strategy must exist. Our goal is to find the optimal strategy given the payoff matrix (battle score matrix).

The first step is to set a pool of Pokémon, $P$, with moves specified. Ideally we would include every combination of Pokémon and moves, but that would be too many. We might start with a dozen meta-relevant Pokémon. The more Pokémon we include, the more accurate the result becomes.

Then we need to obtain the battle score matrix, $M$. $M_{i, j}$ is the battle score of Pokémon $p_{i}$ fighting against Pokémon $p_{j}$. This can be done easily by using the "Battle Matrix" tool of GoBattleSim.
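As a rough sketch of what the matrix looks like as data, the snippet below builds $M$ from a pairwise scoring function and checks the zero-sum property. The pool names and scores are made up for illustration and `toy_score` is a hypothetical stand-in; in practice the numbers would come from a simulator such as GoBattleSim.

```python
from typing import Callable, List, Sequence


def build_score_matrix(pool: Sequence[str],
                       score: Callable[[str, str], float]) -> List[List[float]]:
    """M[i][j] is the battle score of pool[i] fighting against pool[j]."""
    return [[score(x, y) for y in pool] for x in pool]


def is_antisymmetric(M: List[List[float]], tol: float = 1e-9) -> bool:
    """Check F(x, y) + F(y, x) = 0 for every pair, including the diagonal."""
    n = len(M)
    return all(abs(M[i][j] + M[j][i]) <= tol
               for i in range(n) for j in range(n))


# Made-up scores for three placeholder Pokémon; not simulation results.
TOY_SCORES = {("A", "B"): 0.4, ("B", "C"): 0.2, ("C", "A"): 0.5}


def toy_score(x: str, y: str) -> float:
    if x == y:
        return 0.0  # a mirror match is treated as a draw
    if (x, y) in TOY_SCORES:
        return TOY_SCORES[(x, y)]
    return -TOY_SCORES[(y, x)]  # the reverse matchup gets the opposite score


pool = ["A", "B", "C"]
M = build_score_matrix(pool, toy_score)
```

Running `is_antisymmetric(M)` before solving is a cheap sanity check that the chosen battle score really makes the game zero-sum and symmetric.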

Next, we solve the game. One can adapt the **Pivot Method** introduced by Prof. Thomas S. Ferguson from UCLA in his amazing work Game Theory, Second Edition, 2014. There are also some game solving tools available online such as this one.

The solution to this game, namely the optimal strategy, will be

$$ (p_{1}, w_{1}), (p_{2}, w_{2}), ..., (p_{n}, w_{n}) $$

where $w_{i}$ is the probability of using Pokémon $p_{i}$. This is then our solution to the equilibrium meta.
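For readers who want to experiment without implementing the Pivot Method, here is a minimal sketch of a different and much simpler technique, fictitious play: repeatedly pick the best response to everything picked so far, then read the weights off the pick frequencies. It only approximates the equilibrium and converges slowly, and it assumes the score matrix is antisymmetric so that both players can share one history. The function name is hypothetical.

```python
from typing import List


def fictitious_play(M: List[List[float]], iterations: int = 20000) -> List[float]:
    """Approximate equilibrium weights of a symmetric zero-sum game.

    M[i][j] is the battle score of Pokémon i against Pokémon j and is
    assumed to satisfy M[i][j] == -M[j][i].
    """
    n = len(M)
    counts = [0] * n      # how often each Pokémon has been picked so far
    payoff = [0.0] * n    # cumulative score of each pick against that history
    for _ in range(iterations):
        # Best response to the empirical meta; ties go to the lowest index.
        best = max(range(n), key=lambda i: payoff[i])
        counts[best] += 1
        for i in range(n):
            payoff[i] += M[i][best]
    return [c / iterations for c in counts]


# Rock-Paper-Scissors as a toy pool: each row is that pick's score per opponent.
RPS = [
    [0, -1, 1],   # Rock: loses to Paper, beats Scissors
    [1, 0, -1],   # Paper: beats Rock, loses to Scissors
    [-1, 1, 0],   # Scissors: loses to Rock, beats Paper
]
weights = fictitious_play(RPS)
```

On this toy game the weights drift toward one third each, which is the known equilibrium. For an exact answer on a real battle score matrix, the Pivot Method or a linear programming solver is the better choice.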

## Case Study: Master Tier

To better illustrate how everything works, let’s first study some meta-relevant Pokémon in Master Tier:

- **Groudon**, Mud Shot, Earthquake, Solar Beam/Fire Blast
- **Giratina (Altered Forme)**, Dragon Breath/Shadow Claw, Dragon Claw, Ancient Power/Shadow Sneak
- **Lugia**, Dragon Tail, Sky Attack, Hydro Pump
- **Melmetal**, Thunder Shock, Rock Slide, Flash Cannon/Thunderbolt
- **Dragonite**, Dragon Breath, Dragon Claw, Outrage/Hurricane
- **Latios**, Dragon Breath, Dragon Claw, Solar Beam
- **Metagross**, Bullet Punch, Meteor Mash, Earthquake
- **Mewtwo**, Psycho Cut, Shadow Ball, Ice Beam/Flamethrower/Thunderbolt
- **Gyarados**, Dragon Breath/Waterfall/Bite, Crunch, Outrage
- **Tyranitar**, Bite/Smack Down, Crunch, Stone Edge
- **Swampert**, Mud Shot, Surf, Earthquake
- **Heatran**, Fire Spin, Stone Edge, Iron Head
- **Raikou**, Thunder Shock, Wild Charge, Shadow Ball
- **Kyogre**, Waterfall, Hydro Pump, Blizzard

Using GoBattleSim, we obtain the battle score matrix here. The battle score is defined as the average TDO% difference of four different shield strategy combinations (0 shield and 1 shield for each side).

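Next, we use the Pivot Method to solve the game. In the result, each Pokémon is prefixed with the initials of its fast move and two charged moves (for example, BP.MM.E stands for Bullet Punch, Meteor Mash, Earthquake):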

Pokémon | Weight |
---|---|
BP.MM.E Metagross | 0.14385 |
W.HP.B Kyogre | 0.13124 |
DB.DC.O Dragonite | 0.11782 |
DB.DC.AP Giratina (Altered Forme) | 0.09649 |
B.C.SE Tyranitar | 0.09264 |
TS.WC.SB Raikou | 0.08466 |
MS.E.SB Groudon | 0.08387 |
DT.SA.HP Lugia | 0.08344 |
DB.DC.SS Giratina (Altered Forme) | 0.04016 |
TS.RS.T Melmetal | 0.03622 |
SD.C.SE Tyranitar | 0.0055 |

You might be wondering: where are the missing Pokémon such as Latios and Heatran? Well, their weight is 0, implying that they are not in the optimal meta. It can be verified that if your opponent announces the strategy above, no strategy of yours does better against it than that same strategy; in other words, it is a **best response** to itself. This is, again, strictly derived from Game Theory.
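The best-response claim can be checked mechanically. The sketch below is an illustration with hypothetical function names; it assumes an antisymmetric score matrix, so the value of the game is 0 and no single pick should score above 0 against an equilibrium meta.

```python
from typing import List


def payoffs_against(M: List[List[float]], weights: List[float]) -> List[float]:
    """Expected battle score of each pure pick against an opponent who
    draws their Pokémon according to `weights`."""
    return [sum(M[i][j] * w for j, w in enumerate(weights))
            for i in range(len(M))]


def is_equilibrium(M: List[List[float]], weights: List[float],
                   tol: float = 1e-6) -> bool:
    """True if no pure pick beats the announced meta on average.

    Assumes M[i][j] == -M[j][i]; the value of such a game is 0.
    """
    return max(payoffs_against(M, weights)) <= tol


# Toy check with Rock-Paper-Scissors (rows: Rock, Paper, Scissors).
RPS = [
    [0, -1, 1],
    [1, 0, -1],
    [-1, 1, 0],
]
uniform_ok = is_equilibrium(RPS, [1 / 3, 1 / 3, 1 / 3])
all_rock_ok = is_equilibrium(RPS, [1.0, 0.0, 0.0])  # Paper exploits this meta
```

Applied to the real matrix, every Pokémon with positive weight should score about 0 against the meta, and every excluded Pokémon, such as Latios or Heatran, should score 0 or less.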

## Bias Analysis

The observed final meta might differ from the theoretical equilibrium meta for several reasons. There are three implicit assumptions we made.

The first assumption is that everyone is well-informed on the fair value of each Pokémon. While such fair values must exist, they could be difficult to assess, because the real battles will be in teams of 3 Pokémon and thus the same Pokémon’s utility might vary in different teams. An improvement suggested by u/yourcalcprof is to look at sets of three Pokémon instead of individuals, though it would be computationally expensive to derive the meta that way.

The second assumption is that everyone is willing to optimize their team. This might not hold, because not every player cares that much about winning in PvP. In fact, according to one survey, more than half of the player-base "simply don't care about competitive PvP". Most players do it for the items.

The third assumption is that everyone is able to optimize their team. The reality is, it takes a considerable amount of stardust and Rare Candy to raise the optimal team. Think about Melmetal. It is part of the optimal strategy. It takes at least 618 Meltan candies to evolve into Melmetal, max it and buy a 2nd move. For players who don't have a Switch or a friend with a Switch, this could be very costly.

Even if the aforementioned Hardcore Assumptions hold, our method does not tell us how long it takes to reach the equilibrium meta. It could happen as part of the thought process within minutes, or it could take weeks as the information spreads.

## Conclusion

With all that said, the equilibrium meta does exist in theory and, even if other players don't employ the optimal strategy, you can, and by doing so you will triumph. It's also worth noting what Dondon (u/dondon151) said: that “dominant builds will be dominant no matter how many mechanics you stir into the pot”. The meta observed in Pokémon Let’s Go is already very centralized even though its mechanics are more complex, and it’s been out for only a month.

Does our job in solving the meta take away the fun in PvP? Not quite. First off, solving the meta itself is fun. Moreover, a meta at equilibrium might still provide variety and hence some level of uncertainty, which makes Trainer Battles fundamentally fun. Finally, think about Rock-Paper-Scizor. It has been around for so long and yet it's still popular, because people use it for something, such as to determine who’ll buy the next round of beers. Imagine how cool it is to have Pokémon Go PvP integrate into players’ daily lives!

Are you ready for PvP? See you in battle!