The role for machine learning in player recruitment

Recruitment in football is underpinned by judgements about how a given player will perform in future, at a given club and in a given role. In some cases it will be shaped by the quality of career he is likely to have beyond his time with that club, and the future value his registration may therefore hold.

Those judgements are informed by what that player has done in the past, as well as the performances and career trajectories of other players who are – for any number of reasons – considered comparable.

So, put simply, we are using the past performances of a number of players to guide our judgement on the future performance of a specific individual. As we do so, we apply what can be considered context. For each player we think about: who they played for and in which division(s); the styles of teams they played for, and their roles within those teams; the evolution of their performances over time; the injuries they had, their recurrences and the number of absences they caused; and the transfers they made, and how all of the above changed as a result of each move.

Many of the things we consider contextual are those where our view is difficult to quantify or categorise. Judgements are shaped by a large number of individual cases over many seasons, and it is not straightforward to ‘reverse engineer’ how they were formed or to substantiate them clearly.

Equally, we take into account elements of past performance which can easily be quantified. For example: the number of starts a player has made over recent seasons; the finishing position of their team in a league; and the number of goals the player has scored.

In evaluating both sets of considerations, it is readily accepted that just because something happened before it does not mean it will happen again. Equally, we accept that knowing these things improves our decisions. Put another way, the past performances of the player in question and comparable players do not give us complete certainty about future performance, but they give us more certainty than if we did not know them.

It is for that reason that the experience, craft knowledge and judgement of scouts is valuable. And it is also for that reason that we should welcome machine learning into the conversation.

Many of the factors we consider contextual can be quantified or categorised to some extent. These derived metrics will be imperfect, but can be better than having nothing at all. Examples might include: team strength ratings; categorisations of team or individual playing style; league strength ratings; and player contribution to a team.

We can therefore develop metrics to represent (imperfectly) a wide array of ‘contextual’ factors, and put these alongside the more easily quantifiable factors. The output can be a richer dataset, which is much more representative of a player’s performance and situation. Feeding this into a machine and asking it to predict what will happen in future for potential transfer targets is also an imperfect process. But the machine can process a volume of information at a speed that a person cannot, and uncover patterns that are hard to spot.

The machine can only make predictions based on the information it has been fed. Assuming the model is well built, if we know what data it has been fed and we understand the main workings of the model – such as how much impact different metrics have on its predictions (‘feature importance’) – and we understand how the model has performed on examples we have tested it on, then why not see what it says?
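As a rough illustration of what ‘feature importance’ and ‘performance on examples we have tested it on’ can mean in practice, the sketch below uses entirely synthetic data and a deliberately simple nearest-neighbour model. The feature names (goals_per_90, league_strength, noise_feature), the way the future outcome is generated and the choice of permutation importance are all assumptions made for the example; they do not describe any particular club’s or vendor’s model.

```python
import random
from statistics import mean

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical feature names, invented for this illustration.
FEATURES = ["goals_per_90", "league_strength", "noise_feature"]


def make_player():
    """Create one synthetic player: (feature values, future output)."""
    goals = rng.uniform(0.0, 1.0)
    league = rng.uniform(0.5, 1.0)
    noise = rng.uniform(0.0, 1.0)  # deliberately unrelated to the outcome
    future = goals * league + rng.gauss(0, 0.02)  # assumed relationship
    return [goals, league, noise], future


players = [make_player() for _ in range(400)]
train, test = players[:300], players[300:]  # hold back unseen examples


def predict(x, k=5):
    """Average the outcomes of the k most similar players in the training set."""
    nearest = sorted(
        train, key=lambda p: sum((a - b) ** 2 for a, b in zip(p[0], x))
    )[:k]
    return mean(outcome for _, outcome in nearest)


def mean_abs_error(rows):
    """Average size of the prediction error over a set of players."""
    return mean(abs(predict(x) - y) for x, y in rows)


# How the model performs on players it has not seen.
baseline = mean_abs_error(test)


def permutation_importance(col):
    """Increase in held-out error when one feature's values are shuffled."""
    shuffled = [x[col] for x, _ in test]
    rng.shuffle(shuffled)
    permuted = [
        (x[:col] + [v] + x[col + 1:], y) for (x, y), v in zip(test, shuffled)
    ]
    return mean_abs_error(permuted) - baseline


importance = {
    name: permutation_importance(i) for i, name in enumerate(FEATURES)
}

print(f"held-out mean absolute error: {baseline:.3f}")
for name, value in importance.items():
    print(f"{name}: {value:+.3f}")
```

The larger the increase in error when a feature is shuffled, the more the model was relying on it. A feature with no bearing on the outcome, such as noise_feature here, should score close to zero.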

The threshold for allowing the machine learning model into the conversation should not be ‘is it right all the time?’ – that is not the threshold for allowing an individual into the conversation. ‘Can the model’s predictions improve the understanding of decision makers as to what might happen in future?’ would be a more reasonable challenge.

When the opportunity is approached in that way, and the strengths and weaknesses of the model’s output are known and understood, surely the question becomes ‘why wouldn’t you give the machine a voice at the table?’.

Left Field Football Consulting partners with DataRobot, the leading end-to-end enterprise AI platform, to deliver machine learning capabilities to football organisations.
