# AI won't tell you who wins the World Cup, but it's already transforming football

**Authors:** Adeline Bertin
**Categories:** Data & AI
**Tags:** Analytics
**Last Updated:** 2026-06-11T13:40:38.169Z
**Reading Time:** 12 min read

---

## Summary

From tracking players to scouting talent, AI is quietly reshaping football. Selim Kebaier, founder of STAT12, explains how data performance analytics works, what it can already do, and what comes next as the 2026 World Cup kicks off today.

---

# Interview: Selim Kebaier

[![Selim-Kebaier.png](https://i.postimg.cc/ncKtGyHK/Selim-Kebaier.png)](https://postimg.cc/N50VgP2L)

*Selim Kebaier is the founder of [STAT12](https://stat12.com/), an AI-powered football analytics startup dedicated to democratising performance analysis across all levels of the game. He also sits on the board of [French Tech Aix-Marseille](https://lafrenchtech-aixmarseille.fr/), a partner of Albert School. Albert School will be organising a special [AI Discovery Day, dedicated to AI and the 2026 World Cup](https://www.albertschool.com/events) online in early July.*

---

## What can AI actually do to improve team performance?

When people talk about AI, they tend to think of generative AI: text, images, video, the kind produced by ChatGPT or Claude. That is not what we do, and it is not the central topic when it comes to data and AI in football either. Our specialism is data performance, which covers everything related to on-pitch performance, through the use of **computer vision.** We have built an **AI model capable of generating advanced statistical reports from any match video.** We trained the model on videos we had first annotated manually. The broader, more relevant and cleaner the dataset, the more effective the model. There is a great deal of research in this area right now. That is the real AI story here, more than predictive analytics.


## What data do you collect?

We do tracking. We identify players and then follow them continuously throughout the match. This allows us to capture data on **distance covered, intensity, sprints, dribbles, tackles and passes.**
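As an illustration of how tracking output can become physical statistics, here is a minimal Python sketch. It assumes positions have already been converted to pitch coordinates in metres, with one sample per video frame. The function name, the default of 25 frames per second and the 7 m/s sprint threshold are assumptions made for the example, not STAT12's actual pipeline or definitions.

```python
import math


def distance_and_sprints(positions, fps=25.0, sprint_speed=7.0):
    """Summarise one player's track.

    positions: list of (x, y) pitch coordinates in metres, one per frame.
    Returns (total distance in metres, number of sprints).
    """
    dt = 1.0 / fps  # seconds between two consecutive frames
    total = 0.0
    sprints = 0
    sprinting = False
    for (x0, y0), (x1, y1) in zip(positions, positions[1:]):
        step = math.hypot(x1 - x0, y1 - y0)
        total += step
        fast = step / dt >= sprint_speed
        if fast and not sprinting:
            sprints += 1  # count each entry into a sprint once
        sprinting = fast
    return total, sprints


# Toy track sampled once per second: two bursts separated by a slow step.
track = [(0, 0), (3, 0), (11, 0), (19, 0), (20, 0), (28, 0)]
total, sprints = distance_and_sprints(track, fps=1.0)
```

On real footage the positions would be noisy, so some smoothing would probably be needed before speeds are computed.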

Our goal is **to democratise access to data,** making it available to academies and semi-professional clubs. Top-tier competitions are covered by banks of cameras; our main difficulty is working from a single camera positioned, in a fairly standard way, at the side of the pitch. The upside is that for tracking purposes, a wide-angle view of the pitch is actually preferable. The downside is **the risk of occlusion, when one player passes in front of another.** In those cases, there is a strong chance that the AI makes an error and an ID swap occurs: the player wearing number 4, say, takes on the identifier of number 8, who has just crossed in front of him. The challenge, then, is to detect players and track them continuously without losing them.

A great deal of research is under way on this problem. For our part, we collaborate with a laboratory at EPFL, the Swiss Federal Institute of Technology in Lausanne, which has allowed us to partly resolve the occlusion problem. It is now a process of **continuous improvement:** we add data, analyse matches, identify edge cases, then go back and correct them. This trains the model to be as effective as possible, with the ultimate goal of achieving 100% automated, error-free detection. The single-camera constraint is compounded by image quality: the sharper the image, the more precise the detection.
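To make the occlusion risk concrete, the sketch below flags frames in which two tracked players overlap enough for an ID swap to be plausible. It uses intersection over union (IoU) between bounding boxes, a standard overlap measure in computer vision. The box format, the function names and the 0.3 threshold are assumptions for illustration; this is not the method developed with EPFL.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def occlusion_pairs(boxes, threshold=0.3):
    """Return pairs of track ids whose boxes overlap in a single frame.

    boxes: dict mapping track id -> (x1, y1, x2, y2) in pixels.
    """
    ids = sorted(boxes)
    return [
        (i, j)
        for k, i in enumerate(ids)
        for j in ids[k + 1:]
        if iou(boxes[i], boxes[j]) >= threshold
    ]


# Number 8 crosses in front of number 4; number 10 is far away.
frame = {4: (0, 0, 10, 20), 8: (4, 0, 14, 20), 10: (50, 0, 60, 20)}
risky = occlusion_pairs(frame)
```

Frames flagged this way are natural candidates for the manual review of edge cases described above.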

My estimate is that **within one to three years, we will have highly reliable models** that, given a video as input, will outperform humans in generating the statistics needed for match analysis.

## How is that data then used?

Once the data has been collected, the analysis phase begins. We have developed an analysis platform, accessible to coaches and managers, which brings together all the actions of a match. After each game, they can consult the **general statistics dashboard, covering both collective and individual performance:** possession, shots, offsides, kilometres covered, high-intensity runs, number of sprints and match momentum, showing which team dominated as the game progressed. There are also dedicated sections for defensive analysis, attacking analysis and set pieces. You can generate statistics on every dribble attempted by each player, with success rates, across a single match or an entire season. You can go further still, pulling up the number of dribbles made in the opposition half, to refine the analysis, since a dribble in defence carries a very different value from one in attack.

Pushing the analysis this far answers the objections of sceptics about data use in football, which often stem from a lack of familiarity with the subject. The standard argument is: data is all very well, but on its own it means nothing. If a player completes 70% of his passes, you might conclude he is a good passer; but if all of those passes went backwards, the figure is meaningless. Yet it is entirely possible to analyse forward passes and backward passes separately.
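The 70% example can be reproduced in a few lines of Python. This is a minimal sketch with a hypothetical pass format (`start_x`, `end_x` and `completed` are field names invented for the example), where `x` is measured in metres along the pitch and increases towards the opponent's goal.

```python
from collections import defaultdict


def completion_by_direction(passes):
    """Return the completion rate for forward and backward passes.

    Square passes (end_x == start_x) are counted as backward here,
    which is a simplification.
    """
    counts = defaultdict(lambda: [0, 0])  # direction -> [completed, attempted]
    for p in passes:
        direction = "forward" if p["end_x"] > p["start_x"] else "backward"
        counts[direction][1] += 1
        counts[direction][0] += int(p["completed"])
    return {d: done / tried for d, (done, tried) in counts.items()}


# Seven completed backward passes and three failed forward passes.
passes = (
    [{"start_x": 40, "end_x": 30, "completed": True}] * 7
    + [{"start_x": 40, "end_x": 55, "completed": False}] * 3
)
overall = sum(p["completed"] for p in passes) / len(passes)
rates = completion_by_direction(passes)
```

Here `overall` is 0.7, yet `rates` shows 100% for backward passes and 0% for forward ones, which is exactly the distinction the headline figure hides.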

## Beyond match strategy, what progress does data collection and analysis enable in youth recruitment and injury prevention?

In recruitment, the space where we operate, **whoever has the data has the market.** If we can build a network across academies in France and Africa and collect data at scale, we will be able to scout and identify talent. We are developing a scouting platform on which you will be able to run very precise searches: for example, a left back under 18 who ranks in the top 5% for aerial duels and the top 10% on another specific metric. The platform will return the players matching the search.
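A search of that kind could be sketched as follows. The player records, the metric names (`aerial_duels_won`, `progressive_passes`) and the choice to rank players against the whole pool rather than against their position are all assumptions for the example; the real platform is still in development and may work differently.

```python
import math


def top_fraction(players, metric, fraction):
    """Names of the players in the top `fraction` of the pool for `metric`.

    Ties at the cut-off are not handled specially.
    """
    ranked = sorted(players, key=lambda p: p[metric], reverse=True)
    keep = max(1, math.ceil(len(ranked) * fraction))
    return {p["name"] for p in ranked[:keep]}


def search(players, position, max_age, criteria):
    """Filter by position and age, then by rank on each metric.

    criteria: dict mapping metric name -> fraction (0.05 means top 5%).
    """
    names = None
    for metric, fraction in criteria.items():
        top = top_fraction(players, metric, fraction)
        names = top if names is None else names & top
    return [
        p["name"]
        for p in players
        if p["position"] == position
        and p["age"] < max_age
        and (names is None or p["name"] in names)
    ]


pool = [
    {"name": "A", "position": "LB", "age": 17, "aerial_duels_won": 0.9, "progressive_passes": 16},
    {"name": "B", "position": "LB", "age": 17, "aerial_duels_won": 0.5, "progressive_passes": 15},
    {"name": "C", "position": "CB", "age": 19, "aerial_duels_won": 0.8, "progressive_passes": 3},
    {"name": "D", "position": "LB", "age": 21, "aerial_duels_won": 0.4, "progressive_passes": 14},
]
# With only four players, "top 50%" stands in for the top 5% or 10% of a real pool.
matches = search(pool, "LB", 18, {"aerial_duels_won": 0.5, "progressive_passes": 0.5})
```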

Injury prevention relies more on the **physical statistics generated by the GPS devices worn by players on the pitch.** Collecting that data makes it possible to measure and then manage their workload, determining how many high-intensity runs they can handle and at what pace. It sits closer to sports medicine and physical preparation. AI, drawing on the statistical data from previous seasons, helps identify players in the red zone and can recommend rest. Each player has their own profile, of course; some have greater cardiovascular capacity. Coaches know this. The idea is to be as precise as possible.
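One simple way to express the "red zone" idea in code is to compare a player's latest week with their own recent average. The sketch below is an assumption-laden illustration, not a validated sports-medicine rule and not necessarily how any given club does it: the four-week baseline and the 1.3 ratio are arbitrary example values.

```python
from statistics import mean


def in_red_zone(weekly_runs, baseline_weeks=4, max_ratio=1.3):
    """Flag a player whose latest load is well above their own baseline.

    weekly_runs: high-intensity run counts per week, oldest first.
    """
    if len(weekly_runs) < baseline_weeks + 1:
        return False  # not enough history to build a baseline
    latest = weekly_runs[-1]
    baseline = mean(weekly_runs[-(baseline_weeks + 1):-1])
    return baseline > 0 and latest / baseline > max_ratio


spike = in_red_zone([40, 42, 38, 40, 60])   # latest week is 1.5x the baseline
steady = in_red_zone([40, 42, 38, 40, 44])  # latest week is 1.1x the baseline
```

Because the baseline is computed per player, the same weekly count can be routine for one profile and a warning sign for another.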

There is also everything related to the **supporter experience.** Having a large fan base is a real asset for a football club, because it means potentially collecting a great deal of information about those supporters. The use cases are almost limitless: the aim is to gather as much data as possible upfront. New use cases then emerge from that data. It feeds on itself.

## What comes next?

For us, the next step will be to deploy a chat interface for coaches: **an AI-powered augmented assistant they can interact with to pull up charts, request a replay of a specific sequence or retrace the goalkeeper's distribution patterns.** This could offer the coach a second opinion, for example on which player to substitute, and confirm what is already visible to the naked eye: a player who has been covering a kilometre every five minutes at a steady rate and who starts to tire around the sixtieth minute. The same applies to failed and successful dribbles, as well as duels won.
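The tiring-player example lends itself to a short sketch: split the match into five-minute windows and look for the first window in which the distance covered falls clearly below the player's own running average. The function name, the 80% drop threshold and the three-window warm-up are assumptions chosen for the example.

```python
def first_fatigue_minute(window_distances, window_minutes=5, drop=0.8, warmup=3):
    """Return the starting minute of the first window showing a clear drop.

    window_distances: metres covered in each consecutive window.
    A window counts as a drop when it falls below `drop` times the average
    of all earlier windows. Returns None if no such window exists.
    """
    for i in range(warmup, len(window_distances)):
        baseline = sum(window_distances[:i]) / i
        if window_distances[i] < drop * baseline:
            return i * window_minutes
    return None


# A kilometre every five minutes for an hour, then a visible drop.
distances = [1000] * 12 + [700, 650]
minute = first_fatigue_minute(distances)
```

In this toy case the function returns 60, matching the sixtieth-minute scenario; the substitution decision itself stays with the coach.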

On the prediction side, once a model has been properly trained on a clean, precise, relevant and well-stocked dataset, it is perfectly conceivable to have a consultative model on a tablet that can tell you, depending on context: this strategy has a 96% success rate, that one 65%, and the last 35%. **The coach will obviously have the final say** and will make the decision alone. You already see a lot of American football coaches with headsets and tablets showing dashboards: they are better equipped and already use video and match data to build their strategy.

## How long has data performance been developing in sport?

The field is relatively young. **Data performance first emerged in baseball in the United States, roughly twenty years ago.** That story is told in the film *Moneyball*, released in 2011 and based on the true story of a sporting director, [Billy Beane](https://en.wikipedia.org/wiki/Billy_Beane), played by Brad Pitt: a former baseball player turned general manager of the Oakland Athletics. Beane hires as his assistant a young Yale-educated data enthusiast, Peter Brand, played by Jonah Hill, to overhaul the club's entire sporting operation. The two of them **bet on player performance data to build the best possible team** on a tight budget. The team they assemble goes on to win twenty consecutive games, an American League record at the time, and to reach the play-offs. Since that achievement, everyone has followed suit and data has become standard practice in American baseball.

The pioneer sports in this field, **baseball and American football in particular, have the advantage of being highly sequenced.** Every ten or twenty seconds, play stops and a new strategy is put in place. **Football, by contrast, is far less segmented:** there is a kick-off, then the game flows. Players move around, some change position, it is more fluid and less stop-start. The manager does not halt play every ten seconds to review strategy. There is a game plan at the start, some adjustments at half-time perhaps, but for the rest of the time it is up to the players to perform on the pitch. That is why data performance took longer to arrive in European football.

## Could a well-trained generative AI eventually replace coaches?

You could imagine that in a few years, an AI with sufficient data at its disposal could outperform any coach: which tactic to deploy, which formation, which player combinations, and so on. If all of that data can be captured, AI models will be able to **detect things that humans do not necessarily pick up.** For now, though, access to that data is still limited.

The question is more pressing when it comes to refereeing. Several technologies already exist that enhance a referee's capacity for judgement: VAR (Video Assistant Referee) and Goal Line Technology are both extremely precise. These technologies rely on cameras, sensors and pitchside equipment. Players also wear GPS vests, which allow them to be geolocated to the nearest millimetre.

At that point, **judgement becomes binary: the machine says offside or not offside, goal or not goal, and the referee follows the decision.** We will reach a stage where the machine can take decisions in place of the referee. We are already in a transitional phase, with technology-assisted refereeing at major competitions.

The new [FIFA ball](https://www.fifa.com/en/tournaments/mens/worldcup/canadamexicousa2026/official-match-ball), launched for the 2026 World Cup, contains numerous sensors to capture as much information as possible about the power of each shot, the rotation of the ball and its speed. **If players have GPS and the ball has chips, it becomes possible to reconstruct the entire match,** millimetre by millimetre, from start to finish, and produce augmented reality. You could imagine watching the match in 3D on your coffee table at home. It is more of a futuristic novelty for now, but the potential use cases are plentiful.

## Models trained on historical patterns have so far proved poor at anticipating the unexpected. Genius and surprise elude them. Is that likely to change soon?

It is the question that interests every punter: **is it possible to build a model that predicts the winner?** Opta, the benchmark provider of football statistics, holds vast amounts of data and has models of varying quality for predicting results. At this stage, I think it is largely a novelty, mainly good for generating buzz, because we lack the data needed to build truly precise models. The more fine-grained and precise the data, the better the predictive models will become.

Long before AI evolved from machine learning to deep learning, **there were mathematical models for predicting a favourite based on recent results.** These old-school predictive models, if one can call them that, already produced predictions. If they genuinely worked and consistently beat bookmakers and betting sites, every user would be making a fortune!

Uncertainty, surprises, emotion: these are what make sport beautiful. Not everyone welcomes the idea of turning the discipline into a science, which is understandable. Even if an AI becomes, by design, as accurate as possible in its predictions, **it will not eliminate surprises.** There have always been matches where a team judged far superior to its opponent ends up losing. That is the beauty of sport, and it will never disappear.

## In conclusion, can we say that AI is revolutionising football?

**AI is revolutionising recruitment in football.** Traditionally, a club would send scouts all over the world: to Africa, Latin America, Asia. They had extensive networks. As soon as they heard about an exceptional player, they would bring him back to their country to develop. What Billy Beane revolutionised in baseball was precisely the application of data to that process. Once we have data on every football match, which is our ambition, we will get there too.

The real challenge right now is collecting the maximum amount of reliable data, across multiple dimensions: physical, technical, tactical, and at scale. From that, we will be able to identify the best talent, anticipate injuries more effectively, and deepen the level of analysis available to coaches. AI will also make it possible **to measure the compatibility between players,** drawing on a comprehensive performance assessment for each individual alongside a personality study. The more data we have, the more we can refine searches using a trained AI model.

That said, while technology makes things possible, **what we choose to do with it remains decisive.** Progress is faster in American sports than in football, partly because many clubs remain resistant. The new generation of coaches is more open and more curious about these subjects. It will happen gradually.

## Which teams will you be watching closely at the 2026 World Cup?

I am Franco-Tunisian, so I will naturally be following both teams. I would say France and Spain are the strong favourites, followed by Argentina and Portugal, with Germany, the Netherlands and Belgium in a second tier of contenders.



## Key Takeaways

1. The real AI story in football is not generative AI but computer vision and data performance, turning match videos into advanced statistical insights
2. Access to data is becoming the decisive asset in recruitment; whoever builds the most comprehensive dataset will have the market
3. AI integration cannot be opportunistic. It requires structured use cases, rigorous diagnosis and progressive validation before any scaling
4. The coach of tomorrow will not be replaced by AI but augmented by it, using predictive models as a second opinion while retaining full decision-making authority
5. Uncertainty, surprise and emotion are what make sport beautiful, and no predictive model, however precise, will ever eliminate them


---

*Article from [Albert's Deep Dive](https://deepdive.albertschool.com) - Albert School's Journal*
