You can find this full article here!

Overview

Often in video games, when outplaying their opponent in their position, will type "(position) diff". For example, when I dominate the tank player on the other team, I often type "tank diff" in the chat to let them know.

Today we'll be looking at a dataset that contains data on different player's performances in the 2022 professional season for League of Legends. The dataset is very large, and contains data from all the different professional leagues. It has almost every stat you can think of in the game, such as kills, deaths, assists, barons, towers, gold earned per minute, and more. The data can be found here

In a game like League of Legends, which is very complex and intricate, the range of skill levels of players will vary greatly. The difference between the "skill floor" and the "skill ceiling" is a very big range, at all levels of the game. But this brings up an interesting question: is the average player in the professional leagues closer to the "skill floor" or the "skill ceiling"? Or in other words:

The Question🤔

Is the gap between the best player(s) and the average players larger than the gap between the average players and the worst players?

To answer this question, we will first find the MVP(s) and "LVPs" (Least Valuable Players) of the league, as well as the average players. Then we will perform a hypothesis test to see which gap is larger.

Of course, like in other sports, the best and worst players can be subjective. But I plan to use data to create a performance metric that represents a player's overall performance. This will be a combination of statistics that we will find are most important to winning.

Procedure✔️

We will do the following steps in this project:

Retrieve the data
Perform missingness analysis to see what values are missing from the data and why they are
Clean the data and get them in the form we want
Use bivariate analysis to create a performance metric
Use univariate analysis and our performance metric to choose the best, average, and worst players in the league
Perform a hypothesis test to answer our question

Missingness Analysis🤷‍♂️

After retrieving the data, we perform missingness analysis on the data to see what columns are missing values.

We found that for the data there were many columns, such as doublekills, triplekills, quadrakills, and many more were NMAR, (Not Missing At Random).

We find all rows missing any of those values are missing all of them. Furthermore, we run permutation tests to test missingness between doublekills and triplekills, and got a p-value of 0, indicating that there was a strong correlation between the missingness of those columns. We also run a similar test between doublekills and url, but found no correlation.

Next, we find that there were only 4 leagues that were missing those values. These are the leagues that have at least 1 missing value for doublekills:

For reference, there are over 40 total leagues in the dataset, and only 4 of them were missing values in doublekills, and those 4 are missing a lot.

Thus, it seems like the columns in columns_with_missing are not missing at random (NMAR), as we can determine if a value is missing by looking at the league column.

This is partially why we will only be looking at two leagues when evaluating our players today: LCS and LCK. This makes it so we don't have to deal with missing values in those columns.

We end up keeping only the columns that are relevant to our question, and as it turns out, there are no missing values for these columns in the LCS and LCK!

Data Cleaning😶‍🌫️

Besides the steps described in the missingness analysis section, there were many other steps in the data cleaning process.

First, we create a dataframe called players that has the season statistics for each player.

The numerical statistics are their per-game averages, and we write some extra code to get the non-numerical statistics in this dataframe. We also add other statistics that we calculate ourselves, such as "games played", "minutes played", and "KDA", which is (Kills+Assists)/Deaths.

Other steps included:

Renaming the result column to winrate
Rounding the numerical values to 2 decimals

Here is part of what our final players dataset looks like:

Players Dataframe

playername	team	position	KDA	gamesplayed
Abbedagge	100 Thieves	mid	3.6	70
Ablazeolive	Golden Guardians	mid	2.37	43
Aiming	KT Rolster	bot	4.71	93
Aria	KT Rolster	mid	3.44	38
Arrow	Immortals	bot	2.67	10
Baut	Hanwha Life Esports	sup	2.07	5
Bdd	Nongshim RedForce	mid	3.23	82
Berserker	Cloud9	bot	4.82	59
BeryL	DRX	sup	2.47	94
Bible	DWG KIA	sup	1.61	5

Next, note that when thinking about creating a performance metric, we must find the variables that are the most important to winning games. We will find the variables with the highest correlation with winrate.

However, League of Legends is a team sport. In baseball, the LA Angels have arguably the two best players in the league and haven't even made the playoffs since 2019. So finding a perfect "performance metric" may not be accurate if we base it solely on individual performance.

So what we will do is create a dataframe called teams, which has the average statistics for each team. Then we will find the variables with the highest correlation with winrate, and create our performance metric from there.

We made similar adjustments to what we did for the players dataframe. Here is part of what our final teams dataset looks like:

Teams Dataframe

teamname	winrate	KDA	gamesplayed
100 Thieves	0.566	3.944	76
100 Thieves Academy	0.581	3.522	117
100 Thieves Next	0.766	5.04	77
1907 Fenerbahçe Academy	0.667	3.821	3
300	0.417	3.083	12
42 Gaming	0.375	2.131	8
5 Ronin	0.31	2.06	42
5 Ronin Academy	0.176	1.894	34
9z Team	0.413	2.794	46
AGD E-Sports	0	1.27	3

And that concluded our data cleaning!

Bivariate Analysis: Making a Performance Metric

Now we are ready to start creating our performance metric using our teams dataframe.

The first variable that comes to mind when winning games it getting kills and not dying. So of course, we look at the correlation between KDA and winrate.

We can see a lot of teams have a winrate of 1, which is impressive! But this is because many teams only played a couple games. After filtering out teams that didn't reach the cutoff I made, this was the results:

As expected, there is a very clear correlation between KDA and winrate, so we will use KDA in our performance metric. This makes sense, since killing the enemy and dying less definitely makes winning the game easier.

We used the following code to get the correlation coefficients for each column:

correlation_values = teams.select_dtypes(include='number').corr()['winrate']
# Sort the correlation values in descending order
sorted_correlation = correlation_values.abs().sort_values(ascending=False)
sorted_correlation.head(30)

Deciding The Performance Metric: W-SCORE

So based on the correlation coefficients calculated above, our performance metric will be:

(STD KDA X 0.9) + (STD EARNED GPM X 0.88) + (STD GOLD DIFF AT 15 X 0.75) + (STD XP DIFF AT 15 X 0.73) + (STD CS DIFF AT 15 X 0.64)

This is the sum of the top 5 player-specific statistics in terms of correlation coefficient with winrate, standardized, and weighted by their respective correlation coefficients. Of course, this isn't a perfect performance metric, as there are many intangible things that players can do to be better. But this is the best analysis we can do using the statistics we have.

Going forward, we'll refer to this statistic as "W-Score".

Univariate Analysis: Who is the best? (and worst?)

So now that we have a performance metric, we can find out the best and worst players in the LCS in terms of W-Score.

First of all, we don't want outliers in our data, so we will set a minimum amount of games played as 10.

Then, we calculated the W-Score of each player and added it to the dataframe.

playername	team	position	WSCORE
Abbedagge	100 Thieves	mid	0.130822
Ablazeolive	Golden Guardians	mid	-1.37417
Aiming	KT Rolster	bot	4.03204
Aria	KT Rolster	mid	-2.12278
Arrow	Immortals	bot	-3.79693
Bdd	Nongshim RedForce	mid	-0.360007
Berserker	Cloud9	bot	3.7277
BeryL	DRX	sup	0.149048
Biofrost	Dignitas	sup	-1.48645
Bjergsen	Team Liquid	mid	5.21788

Below is the distribution of the WSCOREs for the league:

Of course, since we used standardized data, our WSCORE is roughly normally distributed.

Interesting Aggregates: Positions

Now we want to want to choose the players with the best WSCORE of the bunch. But there's a problem: League of Legends has different roles, or positions. We can expect that a player in the support role may not have the same amount of kills as someone from the ACD role, for example.

Let's take a look at the distribution of WSCORE for support players.

As we can see, the WSCORE for support players is a lot lower than that of other roles. So when we choose the best, worst, and average players, we must choose the best, worst, and average players in each role.

position	WSCORE	KDA	earned gpm	golddiffat15	xpdiffat15	csdiffat15
bot	0.904918	0.453928	0.94434	-0.217684	-0.0825527	-0.173609
jng	-0.0795993	0.0552768	-0.31044	0.137101	0.0112628	0.0512364
mid	1.10015	0.461663	0.626748	0.0658577	0.0865614	0.0320888
sup	-1.39659	-0.165469	-1.63583	0.0863344	0.0236002	0.171689
top	-0.350066	-0.707908	0.434024	-0.0394479	-0.0266024	-0.0716953

It seems that mid and bot (ADC) players have the highest statistics, while sup and jng (jungle) players seem to have the lowest. But that doesn't mean that support and jungle players are worse, it just means their role tends to do less of these statistics. So we must adjust WSCORE to compare players to their individual roles, not just the league averages.

So now we will calculated PAWSCORE, or Position Adjusted W-Score. We use the same formula as W-Score, but when standardizing, we standardize using the player's position instead of the entire dataset.

Here is a look at some our data with PAWSCORE added:

playername	WSCORE	PAWSCORE
Aiming	4.03204	3.03437
Arrow	-3.79693	-4.47901
Berserker	3.7277	2.61146
Cheoni	-4.78286	-5.12006
Danny	3.6619	2.81346
Deft	0.864622	-0.0148235
FBI	0.320103	-0.319738
Ghost	-1.00998	-1.94507
Gumayusi	4.79718	3.76698
Hans SamD	-0.385392	-1.11788

As you can see, there is a big difference between WSCORE and PAWSCORE. We can take a look at the distribution of PAWSCORE

This distribution is a little less normal than the WSCORE distribution, which makes sense.

The player on the far right is a player named Chovy.

It seems that statistically, Chovy was the best player by a wide margin in PAWSCORE, at least for the 2022 season. A quick Google search says that Chovy is certainly one of the best mid players in the league, so this is a good sign for our PAWSCORE metric. In the season we are looking at, Chovy lead his team to the finals but ended up losing.

Now that we have the PAWSCORE metric, we can answer our question: Is the gap between the best player(s) and the average players larger than the gap between the average players and the worst players?

Hypothesis Testing: Answering the Question

These are our hypotheses:

H0 (null hypothesis): The gap between the best player and the average player is not larger than the gap between the average player and the worst players.

H1 (alt. hypothesis): The gap between the best player and the average player is larger than the gap between the average player and the worst players.

We will use a significance level of 5%.

Our test statistic is going to be:

(gap between top and average) - (gap between average and worst)

(mean PAWSCORE of top players - league average PAWSCORE) - (league average PAWSCORE - mean PAWSCORE of bottom players)

This is the difference between the gap between the best and average players, and the gap between the average and worst players.

This can be simplified knowing that the league average PAWSCORE is 0:

(mean PAWSCORE of top players) - abs(mean PAWSCORE of bottom players)

Finding the Best and Worst

We will look at the top 30 and bottom 30 players in terms of PAWSCORE. Note that the players in the top 30 all have a positive PAWSCORE, and the players in the bottom 30 all have a negative PAWSCORE. 30 players in either tier is a reasonable amount of players to be considered the "best of the league" and the "worst of the league". (Note: changing these numbers didn't change the final result anyway)

Calculating the gaps: for top_players, the gap is just their PAWSCORE. For the bottom_players, the gap is the absolute value of their PAWSCORE. We can store these values to compare in ABS PAWSCORE.

Our observed test statistic is the absolute difference between the means of ABS PAWSCORE for top_players and bottom_players. This value ended up being 0.334001.

We performed a permutation test, shuffling values to get the distribution of the test statistic.

n_repetitions = 100

differences = []
for _ in range(n_repetitions):
    # Step 1: Shuffle the PAWSCORES and store them in a DataFrame
    with_shuffled = combined_players.assign(Shuffled_PAW=np.random.permutation(combined_players['ABS PAWSCORE']))

    # Step 2: Compute the test statistic
    group_means = (
        with_shuffled
        .groupby('top')
        .mean()
        .loc[:, 'Shuffled_PAW']
    )    
    difference = group_means.diff().iloc[-1]

    differences.append(difference)

The distribution of the test statistic we found can be found below. The red line is the observed statistic.

Computing our p-value:

p_value = np.mean(differences >= observed_diff)

We obtained our p-value = 0.19.

Conclusion

This means that in this case, we fail to reject the null hypothesis. In other words, there is not sufficient evidence to support the claim that the gap between the best players and the average player is higher than the gap between the average player and the worst players, specifically in the LCS and LCK 2022 season.

What does this mean? This doesn't necesarilly mean that the average player in the league is as close to the bottom as they are to the top, we can't know for sure. The gap between the highest player in the league and the average player could still be higher, but there is not sufficient enough evidence to point to this.

Thus, the relative locations of the "skill floor" and "skill ceilings" remain unknown.

However, we have at least come up with an interesting performance metric to rate League of Legends players! Here is the sorted list of the top 25 players from the 2022 LCS and LCK season, based on the PAWSCORE metric, if you're curious:

playername	team	position	PAWSCORE
Chovy	Gen.G	mid	10.8312
Prince	Liiv SANDBOX	bot	6.79801
Summit	Cloud9	top	6.07203
Oner	T1	jng	5.97414
huhi	100 Thieves	sup	5.39983
Hans Sama	Team Liquid	bot	5.26718
Bjergsen	Team Liquid	mid	5.17724
Blaber	Cloud9	jng	5.10206
Ruler	Gen.G	bot	4.73252
Peanut	Gen.G	jng	4.48142
Keria	T1	sup	4.36877
Zeus	T1	top	4.25127
Bwipo	Team Liquid	top	4.24592
Gumayusi	T1	bot	3.76698
Doran	Gen.G	top	3.73132
Ssumday	100 Thieves	top	3.6527
Pridestalkr	Golden Guardians	jng	3.62718
Moham	Kwangdong Freecs	sup	3.54426
Nuguri	DWG KIA	top	3.37927
Inspired	Evil Geniuses	jng	3.30361
Lehends	Gen.G	sup	3.15099
Jensen	Cloud9	mid	3.08008
Aiming	KT Rolster	bot	3.03437
BeryL	DRX	sup	2.83118
Danny	Evil Geniuses	bot	2.81346

Thanks for reading!

I hope you enjoyed this data science article!

Jeremy Quinto