League of Legends: Skill Difference Analysis

League of Legends: Skill Difference Analysis

Creating a player rating model for League of Legends in Python

You can find this full article here!


Overview

Often in video games, when outplaying their opponent in their position, will type "(position) diff". For example, when I dominate the tank player on the other team, I often type "tank diff" in the chat to let them know.

Today we'll be looking at a dataset that contains data on different player's performances in the 2022 professional season for League of Legends. The dataset is very large, and contains data from all the different professional leagues. It has almost every stat you can think of in the game, such as kills, deaths, assists, barons, towers, gold earned per minute, and more. The data can be found here

In a game like League of Legends, which is very complex and intricate, the range of skill levels of players will vary greatly. The difference between the "skill floor" and the "skill ceiling" is a very big range, at all levels of the game. But this brings up an interesting question: is the average player in the professional leagues closer to the "skill floor" or the "skill ceiling"? Or in other words:

The Question🤔

Is the gap between the best player(s) and the average players larger than the gap between the average players and the worst players?

To answer this question, we will first find the MVP(s) and "LVPs" (Least Valuable Players) of the league, as well as the average players. Then we will perform a hypothesis test to see which gap is larger.

Of course, like in other sports, the best and worst players can be subjective. But I plan to use data to create a performance metric that represents a player's overall performance. This will be a combination of statistics that we will find are most important to winning.

Procedure✔️

We will do the following steps in this project:

  1. Retrieve the data

  2. Perform missingness analysis to see what values are missing from the data and why they are

  3. Clean the data and get them in the form we want

  4. Use bivariate analysis to create a performance metric

  5. Use univariate analysis and our performance metric to choose the best, average, and worst players in the league

  6. Perform a hypothesis test to answer our question


Missingness Analysis🤷‍♂️

After retrieving the data, we perform missingness analysis on the data to see what columns are missing values.

We found that for the data there were many columns, such as doublekills, triplekills, quadrakills, and many more were NMAR, (Not Missing At Random).

We find all rows missing any of those values are missing all of them. Furthermore, we run permutation tests to test missingness between doublekills and triplekills, and got a p-value of 0, indicating that there was a strong correlation between the missingness of those columns. We also run a similar test between doublekills and url, but found no correlation.

Next, we find that there were only 4 leagues that were missing those values. These are the leagues that have at least 1 missing value for doublekills:

For reference, there are over 40 total leagues in the dataset, and only 4 of them were missing values in doublekills, and those 4 are missing a lot.

Thus, it seems like the columns in columns_with_missing are not missing at random (NMAR), as we can determine if a value is missing by looking at the league column.

This is partially why we will only be looking at two leagues when evaluating our players today: LCS and LCK. This makes it so we don't have to deal with missing values in those columns.

We end up keeping only the columns that are relevant to our question, and as it turns out, there are no missing values for these columns in the LCS and LCK!


Data Cleaning😶‍🌫️

Besides the steps described in the missingness analysis section, there were many other steps in the data cleaning process.

First, we create a dataframe called players that has the season statistics for each player.

The numerical statistics are their per-game averages, and we write some extra code to get the non-numerical statistics in this dataframe. We also add other statistics that we calculate ourselves, such as "games played", "minutes played", and "KDA", which is (Kills+Assists)/Deaths.

Other steps included:

  • Renaming the result column to winrate

  • Rounding the numerical values to 2 decimals

Here is part of what our final players dataset looks like:

Players Dataframe

playernameteampositionKDAgamesplayed
Abbedagge100 Thievesmid3.670
AblazeoliveGolden Guardiansmid2.3743
AimingKT Rolsterbot4.7193
AriaKT Rolstermid3.4438
ArrowImmortalsbot2.6710
BautHanwha Life Esportssup2.075
BddNongshim RedForcemid3.2382
BerserkerCloud9bot4.8259
BeryLDRXsup2.4794
BibleDWG KIAsup1.615

Next, note that when thinking about creating a performance metric, we must find the variables that are the most important to winning games. We will find the variables with the highest correlation with winrate.

However, League of Legends is a team sport. In baseball, the LA Angels have arguably the two best players in the league and haven't even made the playoffs since 2019. So finding a perfect "performance metric" may not be accurate if we base it solely on individual performance.

So what we will do is create a dataframe called teams, which has the average statistics for each team. Then we will find the variables with the highest correlation with winrate, and create our performance metric from there.

We made similar adjustments to what we did for the players dataframe. Here is part of what our final teams dataset looks like:

Teams Dataframe

teamnamewinrateKDAgamesplayed
100 Thieves0.5663.94476
100 Thieves Academy0.5813.522117
100 Thieves Next0.7665.0477
1907 Fenerbahçe Academy0.6673.8213
3000.4173.08312
42 Gaming0.3752.1318
5 Ronin0.312.0642
5 Ronin Academy0.1761.89434
9z Team0.4132.79446
AGD E-Sports01.273

And that concluded our data cleaning!


Bivariate Analysis: Making a Performance Metric

Now we are ready to start creating our performance metric using our teams dataframe.

The first variable that comes to mind when winning games it getting kills and not dying. So of course, we look at the correlation between KDA and winrate.

We can see a lot of teams have a winrate of 1, which is impressive! But this is because many teams only played a couple games. After filtering out teams that didn't reach the cutoff I made, this was the results:

As expected, there is a very clear correlation between KDA and winrate, so we will use KDA in our performance metric. This makes sense, since killing the enemy and dying less definitely makes winning the game easier.

We used the following code to get the correlation coefficients for each column:

correlation_values = teams.select_dtypes(include='number').corr()['winrate']
# Sort the correlation values in descending order
sorted_correlation = correlation_values.abs().sort_values(ascending=False)
sorted_correlation.head(30)

Deciding The Performance Metric: W-SCORE

So based on the correlation coefficients calculated above, our performance metric will be:

(STD KDA X 0.9) + (STD EARNED GPM X 0.88) + (STD GOLD DIFF AT 15 X 0.75) + (STD XP DIFF AT 15 X 0.73) + (STD CS DIFF AT 15 X 0.64)

This is the sum of the top 5 player-specific statistics in terms of correlation coefficient with winrate, standardized, and weighted by their respective correlation coefficients. Of course, this isn't a perfect performance metric, as there are many intangible things that players can do to be better. But this is the best analysis we can do using the statistics we have.

Going forward, we'll refer to this statistic as "W-Score".


Univariate Analysis: Who is the best? (and worst?)

So now that we have a performance metric, we can find out the best and worst players in the LCS in terms of W-Score.

First of all, we don't want outliers in our data, so we will set a minimum amount of games played as 10.

Then, we calculated the W-Score of each player and added it to the dataframe.

playernameteampositionWSCORE
Abbedagge100 Thievesmid0.130822
AblazeoliveGolden Guardiansmid-1.37417
AimingKT Rolsterbot4.03204
AriaKT Rolstermid-2.12278
ArrowImmortalsbot-3.79693
BddNongshim RedForcemid-0.360007
BerserkerCloud9bot3.7277
BeryLDRXsup0.149048
BiofrostDignitassup-1.48645
BjergsenTeam Liquidmid5.21788

Below is the distribution of the WSCOREs for the league:

Of course, since we used standardized data, our WSCORE is roughly normally distributed.

Interesting Aggregates: Positions

Now we want to want to choose the players with the best WSCORE of the bunch. But there's a problem: League of Legends has different roles, or positions. We can expect that a player in the support role may not have the same amount of kills as someone from the ACD role, for example.

Let's take a look at the distribution of WSCORE for support players.

As we can see, the WSCORE for support players is a lot lower than that of other roles. So when we choose the best, worst, and average players, we must choose the best, worst, and average players in each role.

positionWSCOREKDAearned gpmgolddiffat15xpdiffat15csdiffat15
bot0.9049180.4539280.94434-0.217684-0.0825527-0.173609
jng-0.07959930.0552768-0.310440.1371010.01126280.0512364
mid1.100150.4616630.6267480.06585770.08656140.0320888
sup-1.39659-0.165469-1.635830.08633440.02360020.171689
top-0.350066-0.7079080.434024-0.0394479-0.0266024-0.0716953

It seems that mid and bot (ADC) players have the highest statistics, while sup and jng (jungle) players seem to have the lowest. But that doesn't mean that support and jungle players are worse, it just means their role tends to do less of these statistics. So we must adjust WSCORE to compare players to their individual roles, not just the league averages.

So now we will calculated PAWSCORE, or Position Adjusted W-Score. We use the same formula as W-Score, but when standardizing, we standardize using the player's position instead of the entire dataset.

Here is a look at some our data with PAWSCORE added:

playernameWSCOREPAWSCORE
Aiming4.032043.03437
Arrow-3.79693-4.47901
Berserker3.72772.61146
Cheoni-4.78286-5.12006
Danny3.66192.81346
Deft0.864622-0.0148235
FBI0.320103-0.319738
Ghost-1.00998-1.94507
Gumayusi4.797183.76698
Hans SamD-0.385392-1.11788

As you can see, there is a big difference between WSCORE and PAWSCORE. We can take a look at the distribution of PAWSCORE

This distribution is a little less normal than the WSCORE distribution, which makes sense.

The player on the far right is a player named Chovy.

It seems that statistically, Chovy was the best player by a wide margin in PAWSCORE, at least for the 2022 season. A quick Google search says that Chovy is certainly one of the best mid players in the league, so this is a good sign for our PAWSCORE metric. In the season we are looking at, Chovy lead his team to the finals but ended up losing.

Now that we have the PAWSCORE metric, we can answer our question: Is the gap between the best player(s) and the average players larger than the gap between the average players and the worst players?


Hypothesis Testing: Answering the Question

These are our hypotheses:

H0 (null hypothesis): The gap between the best player and the average player is not larger than the gap between the average player and the worst players.

H1 (alt. hypothesis): The gap between the best player and the average player is larger than the gap between the average player and the worst players.

We will use a significance level of 5%.

Our test statistic is going to be:

(gap between top and average) - (gap between average and worst)

\=

(mean PAWSCORE of top players - league average PAWSCORE) - (league average PAWSCORE - mean PAWSCORE of bottom players)

This is the difference between the gap between the best and average players, and the gap between the average and worst players.

This can be simplified knowing that the league average PAWSCORE is 0:

(mean PAWSCORE of top players) - abs(mean PAWSCORE of bottom players)

Finding the Best and Worst

We will look at the top 30 and bottom 30 players in terms of PAWSCORE. Note that the players in the top 30 all have a positive PAWSCORE, and the players in the bottom 30 all have a negative PAWSCORE. 30 players in either tier is a reasonable amount of players to be considered the "best of the league" and the "worst of the league". (Note: changing these numbers didn't change the final result anyway)

Calculating the gaps: for top_players, the gap is just their PAWSCORE. For the bottom_players, the gap is the absolute value of their PAWSCORE. We can store these values to compare in ABS PAWSCORE.

Our observed test statistic is the absolute difference between the means of ABS PAWSCORE for top_players and bottom_players. This value ended up being 0.334001.

We performed a permutation test, shuffling values to get the distribution of the test statistic.

n_repetitions = 100

differences = []
for _ in range(n_repetitions):
    # Step 1: Shuffle the PAWSCORES and store them in a DataFrame
    with_shuffled = combined_players.assign(Shuffled_PAW=np.random.permutation(combined_players['ABS PAWSCORE']))

    # Step 2: Compute the test statistic
    group_means = (
        with_shuffled
        .groupby('top')
        .mean()
        .loc[:, 'Shuffled_PAW']
    )    
    difference = group_means.diff().iloc[-1]

    differences.append(difference)

The distribution of the test statistic we found can be found below. The red line is the observed statistic.

Computing our p-value:

p_value = np.mean(differences >= observed_diff)

We obtained our p-value = 0.19.


Conclusion

This means that in this case, we fail to reject the null hypothesis. In other words, there is not sufficient evidence to support the claim that the gap between the best players and the average player is higher than the gap between the average player and the worst players, specifically in the LCS and LCK 2022 season.

What does this mean? This doesn't necesarilly mean that the average player in the league is as close to the bottom as they are to the top, we can't know for sure. The gap between the highest player in the league and the average player could still be higher, but there is not sufficient enough evidence to point to this.

Thus, the relative locations of the "skill floor" and "skill ceilings" remain unknown.

However, we have at least come up with an interesting performance metric to rate League of Legends players! Here is the sorted list of the top 25 players from the 2022 LCS and LCK season, based on the PAWSCORE metric, if you're curious:

playernameteampositionPAWSCORE
ChovyGen.Gmid10.8312
PrinceLiiv SANDBOXbot6.79801
SummitCloud9top6.07203
OnerT1jng5.97414
huhi100 Thievessup5.39983
Hans SamaTeam Liquidbot5.26718
BjergsenTeam Liquidmid5.17724
BlaberCloud9jng5.10206
RulerGen.Gbot4.73252
PeanutGen.Gjng4.48142
KeriaT1sup4.36877
ZeusT1top4.25127
BwipoTeam Liquidtop4.24592
GumayusiT1bot3.76698
DoranGen.Gtop3.73132
Ssumday100 Thievestop3.6527
PridestalkrGolden Guardiansjng3.62718
MohamKwangdong Freecssup3.54426
NuguriDWG KIAtop3.37927
InspiredEvil Geniusesjng3.30361
LehendsGen.Gsup3.15099
JensenCloud9mid3.08008
AimingKT Rolsterbot3.03437
BeryLDRXsup2.83118
DannyEvil Geniusesbot2.81346

Thanks for reading!

I hope you enjoyed this data science article!