TLDR: Researchers have developed a new AI framework called Expandable Decision-Making States (EDMS) for analyzing soccer tactics. This system combines detailed relational variables (like space score and passing lanes) with an action-masking scheme to create player models that are both tactically interpretable and robust. EDMS significantly improves the accuracy of predicting player actions and evaluating their value, offering deeper insights into team strategies and individual player movements across various real-world datasets.
Analyzing the intricate dynamics of team sports like soccer has long posed a significant challenge for quantitative methods. With 22 players constantly interacting on a shared field, the resulting state space is incredibly complex and high-dimensional. Traditional rule-based analyses, while intuitive for coaches, often fall short in scope, covering only specific situations. Modern machine learning models, on the other hand, frequently perform pattern-matching without explicitly representing individual players as decision-making agents.
A recent research paper, titled “Expandable Decision-Making States for Multi-Agent Deep Reinforcement Learning in Soccer Tactical Analysis,” by Kenjiro Ide, Taiga Someya, Kohei Kawaguchi, and Keisuke Fujii, addresses this very problem. The authors propose a novel framework designed to build player-level agent models from real-world data, ensuring that the learned values and policies are both tactically interpretable and robust across diverse data sources.
Introducing Expandable Decision-Making States (EDMS)
The core of the proposal is Expandable Decision-Making States (EDMS), a semantically rich state representation that goes beyond the raw positions and velocities of the players and the ball. EDMS augments this basic information with relational variables that carry tactical meaning. These include the ‘space score,’ ‘pass score,’ and ‘shot score,’ which quantify the value of different actions and areas on the field.
A key aspect of EDMS is its ‘expandable’ nature. It’s built from modular, per-player descriptors, allowing practitioners to easily extend the state representation to include more players, richer tactical cues, or even provider-specific variables without altering the underlying reinforcement learning architecture. This flexibility ensures the system can adapt to evolving analytical needs.
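As a rough illustration of what ‘expandable’ could mean in practice, the sketch below builds a flat state vector by concatenating per-player descriptor functions. The function names, the dictionary layout, and the two descriptors are hypothetical and are not taken from the paper or from OpenSTARLab; the point is only that adding a descriptor lengthens the state without touching the code that consumes it.

```python
import numpy as np

# Hypothetical per-player descriptors; each returns a 1-D feature array.
def kinematics(player, ball):
    """Raw position and velocity of one player."""
    return np.array([player["x"], player["y"], player["vx"], player["vy"]])

def distance_to_ball(player, ball):
    """A simple relational feature: Euclidean distance to the ball."""
    return np.array([np.hypot(player["x"] - ball["x"], player["y"] - ball["y"])])

def build_state(players, ball, descriptors):
    """Concatenate every descriptor's output for every player into one vector."""
    per_player = [
        np.concatenate([d(p, ball) for d in descriptors]) for p in players
    ]
    return np.concatenate(per_player)

players = [
    {"x": 0.0, "y": 0.0, "vx": 1.0, "vy": 0.0},
    {"x": 3.0, "y": 4.0, "vx": 0.0, "vy": -1.0},
]
ball = {"x": 0.0, "y": 0.0}

base = build_state(players, ball, [kinematics])
expanded = build_state(players, ball, [kinematics, distance_to_ball])
print(base.shape, expanded.shape)  # (8,) (10,)
```

Extending the state here means appending another function to the descriptor list; a downstream network would only need its input size updated.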
Action Masking for Realistic Decision-Making
In addition to EDMS, the researchers introduce an action-masking scheme that assigns distinct action sets to players depending on whether they have the ball (on-ball agents) or not (off-ball agents). For instance, only the ball carrier can choose ‘pass,’ ‘dribble,’ or ‘shot,’ while off-ball teammates choose among movement options such as ‘support moves’ or ‘moving in eight directions.’ Masking rules out unrealistic choices, such as an off-ball player attempting a shot. According to the authors, this makes the learned action values easier to interpret and improves learning efficiency.
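A minimal sketch of this kind of masking, assuming a single shared action list and a Q-value vector per player. The action names and helper functions are illustrative assumptions, not the paper's exact action space: invalid actions are set to negative infinity before the greedy choice, so they can never be selected.

```python
import numpy as np

# Assumed action vocabulary: three on-ball actions plus eight movement directions.
ON_BALL = ["pass", "dribble", "shot"]
OFF_BALL = ["move_" + d for d in ["n", "ne", "e", "se", "s", "sw", "w", "nw"]]
ACTIONS = ON_BALL + OFF_BALL

def action_mask(has_ball):
    """Boolean mask over ACTIONS: True where the action is allowed."""
    allowed = ON_BALL if has_ball else OFF_BALL
    return np.array([a in allowed for a in ACTIONS])

def masked_greedy_action(q_values, has_ball):
    """Pick the highest-valued action among those the mask allows."""
    masked_q = np.where(action_mask(has_ball), q_values, -np.inf)
    return ACTIONS[int(np.argmax(masked_q))]

# Raw Q-values decrease along the list, so "pass" has the highest raw value.
q = np.linspace(1.0, 0.0, len(ACTIONS))
print(masked_greedy_action(q, has_ball=True))   # pass
print(masked_greedy_action(q, has_ball=False))  # move_n
```

The same mask can be applied when computing training targets, so that value estimates are never bootstrapped from actions a player could not have taken.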
Interpretable Tactical Concepts
Unlike previous models that relied on raw coordinate features, EDMS ties learned value functions and action policies to human-interpretable tactical concepts. The model’s outputs can therefore be read in terms of ‘marking pressure,’ ‘passing lanes,’ and ‘ball accessibility,’ which makes the insights more useful for coaches and analysts. The system also aligns agent choices with the rules of play: for example, it accounts for the offside rule by adjusting player values according to whether a player is in a legal position.
How EDMS Works: A Closer Look at State Variables
The EDMS framework categorizes state variables into ‘Absolute State’ (static game context like offside lines and formations) and ‘Relative State’ (player-specific, dynamic information). The Relative State is further divided into ‘On-ball’ and ‘Off-ball’ states.
For off-ball players, variables include the distance to the ball, the time for the nearest opponent to reach the player, and the time for an opponent to reach a potential pass lane. A crucial variable is the ‘space score,’ which quantifies the value of space based on Voronoi regions (considering player positions and velocities) and the importance of field areas (higher near the opponent’s goal and in the center). The ‘pass score’ combines these elements to evaluate passing opportunities.
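To make the space score idea concrete, here is a simplified grid-based sketch. It is not the paper's formula: the velocity-aware Voronoi assignment is approximated by projecting each player forward by an assumed `lookahead` time, and the importance weight (rising toward the opponent's goal at `x = length` and toward the central channel) is an assumed functional form chosen only for illustration.

```python
import numpy as np

def space_scores(pos, vel, length=105.0, width=68.0, step=1.0, lookahead=1.0):
    """Share of weighted pitch area that each player is closest to.

    pos, vel: arrays of shape (n_players, 2); the attack runs toward x = length.
    """
    # Centres of a regular grid covering the pitch.
    xs = np.arange(step / 2, length, step)
    ys = np.arange(step / 2, width, step)
    gx, gy = np.meshgrid(xs, ys)
    cells = np.stack([gx.ravel(), gy.ravel()], axis=1)

    # Velocity-aware Voronoi approximation: nearest projected position owns a cell.
    projected = pos + lookahead * vel
    dist = np.linalg.norm(cells[:, None, :] - projected[None, :, :], axis=2)
    owner = dist.argmin(axis=1)

    # Assumed importance: higher near the opponent's goal and in the centre.
    weight = (cells[:, 0] / length) * (1.0 - np.abs(cells[:, 1] - width / 2) / width)

    scores = np.bincount(owner, weights=weight, minlength=len(pos))
    return scores / weight.sum()

pos = np.array([[26.25, 34.0], [78.75, 34.0]])
vel = np.zeros((2, 2))
scores = space_scores(pos, vel)
print(scores)  # the player nearer the opponent's goal receives the larger share
```

A pass score in this spirit could then combine the receiver's space score with the time an opponent needs to reach the pass lane, though how the paper weights those terms is not stated here.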
On-ball players’ states include the time for the nearest opponent to reach the ball, distance and angle to the opponent’s goal, a ‘dribble score’ (change in space score from moving), a ‘shot score’ (probability of a shot being blocked), and a ‘long ball score.’
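The geometric on-ball features are simple to compute. The sketch below assumes a 105 m by 68 m pitch with the opponent's goal centred at `(105, 34)`, and interprets ‘angle’ as the opening angle subtended by the two goalposts; the paper may define the angle differently (for example, as a bearing to the goal centre), so treat this as one plausible reading.

```python
import math

def goal_distance_and_angle(x, y, goal_x=105.0, goal_y=34.0, goal_width=7.32):
    """Distance to the goal centre and the angle between the two posts (radians)."""
    distance = math.hypot(goal_x - x, goal_y - y)
    to_far_post = math.atan2(goal_y + goal_width / 2 - y, goal_x - x)
    to_near_post = math.atan2(goal_y - goal_width / 2 - y, goal_x - x)
    return distance, abs(to_far_post - to_near_post)

# Ball carrier on the penalty spot, 11 m from the goal line.
distance, angle = goal_distance_and_angle(94.0, 34.0)
print(distance, math.degrees(angle))
```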
Experimental Validation and Real-World Applicability
The researchers conducted extensive experiments, primarily using a proprietary J.League dataset, and compared EDMS against a baseline model (PVS – Position and Velocity States) that only uses raw physical information. The results consistently showed that EDMS, especially when combined with action masking, significantly reduced both action-prediction loss and temporal-difference (TD) error. This indicates that EDMS leads to more accurate predictions of player actions and a better understanding of the value of different game situations.
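For readers unfamiliar with the metric, the temporal-difference error measures how far a value estimate is from its one-step bootstrapped target. The sketch below uses the textbook Q-learning form, with the maximum taken over valid next actions only; the exact target used in the paper (and its discount factor) may differ, and all values here are made up for illustration.

```python
import numpy as np

def td_error(q_sa, reward, next_q, next_mask, gamma=0.99, done=False):
    """One-step TD error: (r + gamma * max over valid a' of Q(s', a')) - Q(s, a)."""
    if done:
        target = reward
    else:
        # Ignore actions the next state's mask forbids.
        target = reward + gamma * np.max(np.where(next_mask, next_q, -np.inf))
    return target - q_sa

next_q = np.array([0.9, 0.2, 0.4])
next_mask = np.array([False, True, True])  # the 0.9 action is not available
delta = td_error(q_sa=0.5, reward=0.0, next_q=next_q, next_mask=next_mask, gamma=0.9)
print(delta)  # roughly -0.14
```

A smaller average magnitude of this quantity on held-out data indicates value estimates that are more consistent from one step to the next.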
Qualitative case studies further demonstrated EDMS’s ability to highlight high-risk, high-reward tactical patterns, such as fast counterattacks and defensive breakthroughs. For instance, the model correctly assigned low Q-values to players in offside positions, reflecting the tactical invalidity of such moves.
A significant contribution of this work is its integration into the open-source OpenSTARLab RLearn library. This integration allowed for cross-provider evaluation and reproducible experiments using multiple commercial and open datasets, including LaLiga data (from StatsBomb and SkillCorner), the FIFA World Cup 2022 dataset, and SoccerNet Game State Reconstruction resources. EDMS consistently outperformed the baseline across these datasets, which supports its generalizability and robustness.
The study also explored the practical applicability of EDMS by visualizing Q-values on actual broadcast footage from the SoccerNet dataset. This capability could support fan engagement and richer match commentary, in addition to serving as a tactical feedback tool for coaches and players.
Conclusion and Future Directions
The research concludes that EDMS, coupled with action masking, offers a superior framework for multi-agent deep reinforcement learning in soccer tactical analysis. The effectiveness stems from its high-quality, semantically rich state representation, which, when combined with appropriate action constraints, allows the AI to accurately evaluate situations and learn optimal policies more efficiently.
Future work aims to explore more advanced multi-agent reinforcement learning models, such as those based on value decomposition methods, to better capture cooperative actions. The introduction of a game-theoretic approach is also envisioned to model the complex strategic dependencies and decision-making among players, further deepening the evaluation of play.


