# Asmodee Business Deep Dive: Mapping Games and Players on Board Game Arena

**Authors:** Guillaume Rabeau
**Categories:** Business Deep Dives
**Tags:** Asmodee, BGA, BGG, graph theory, cosine similarity, clustering, Gaussian Mixture, recommendation
**Last Updated:** 2025-11-05T15:55:43.620Z
**Reading Time:** 2 min read

---

## Summary

Using more than 200 million rows of match history, we model the links between 1,105 games and their players on Board Game Arena, enriched via the BoardGameGeek API, to map the ecosystem.

---

Asmodee's Business Deep Dive was arguably one of the most technical we've done at Albert School. With a massive database of more than 200 million rows, we had to find a way to understand the relationships between games and players on the online platform Board Game Arena. For this deep dive, I was fortunate to be in a group with Anna Spira, whose knowledge of board games proved very helpful.

We first sought to understand the relationships among the games themselves. By analyzing match histories, we compiled a list of the 1,105 games played in this dataset. Using the BoardGameGeek API, we then built a second database focused solely on the games. To visualize them, we turned to graph theory, which is well suited to illustrating relationships between games.

Each node represented a game, and an edge connected two games that shared a common category. Based on the cosine similarity between games, we could then determine which pairs were complementary and which cannibalized one another. In the final visualization, this distinction is shown as a green or red edge.
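
To make the idea concrete, here is a minimal sketch of how such edges could be computed. It is not our production code: the game names, the category sets, the `cannibal_threshold` value, and the color convention (red for high overlap, green otherwise) are all assumptions chosen for illustration. Each game is treated as a binary vector over categories, so cosine similarity reduces to the number of shared categories divided by the geometric mean of the two category counts.

```python
from itertools import combinations
from math import sqrt

# Hypothetical category sets, standing in for BoardGameGeek metadata.
GAME_CATEGORIES = {
    "Game A": {"Card Game", "Fantasy"},
    "Game B": {"Card Game", "Fantasy", "Fighting"},
    "Game C": {"Economic", "Fantasy"},
    "Game D": {"Abstract Strategy"},
}


def cosine_similarity(a: set, b: set) -> float:
    """Cosine similarity between two binary category vectors."""
    if not a or not b:
        return 0.0
    return len(a & b) / sqrt(len(a) * len(b))


def build_edges(games: dict, cannibal_threshold: float = 0.7) -> list:
    """Return (game1, game2, similarity, color) for games sharing a category.

    Assumed convention: similarity at or above the threshold is flagged
    red (possible cannibalization); anything below is green.
    """
    edges = []
    for g1, g2 in combinations(sorted(games), 2):
        if not games[g1] & games[g2]:
            continue  # no shared category, so no edge
        sim = cosine_similarity(games[g1], games[g2])
        color = "red" if sim >= cannibal_threshold else "green"
        edges.append((g1, g2, round(sim, 3), color))
    return edges


EDGES = build_edges(GAME_CATEGORIES)
for edge in EDGES:
    print(edge)
```

In this toy data, "Game D" shares no category with the others and therefore stays isolated. Adding a new game to the graph amounts to adding one entry to the dictionary and rebuilding the edges.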

Finally, because the goal was to deliver an interactive tool to Asmodee, we coded the ability to add a game to the graph and visualize its interactions with the others without the user having to write a single line of code.

With this first deliverable in place, we turned to the players. Who are they? How do they play? To answer these questions, we implemented a two-step segmentation. An initial exploratory analysis using a dendrogram, built with hierarchical clustering and the Ward linkage method, allowed us to visualize the proximity between user profiles.
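
As an illustration of this exploratory step, the sketch below runs Ward-linkage hierarchical clustering with SciPy on synthetic data. The two features (games played, hours spent), the two artificial groups, and the choice to cut the tree into two clusters are assumptions made for the example, not the features or cut we used on the real data.

```python
import numpy as np
from scipy.cluster.hierarchy import dendrogram, fcluster, linkage

rng = np.random.default_rng(0)

# Synthetic player features: [games played, hours spent].
casual = rng.normal(loc=[5, 2], scale=1.0, size=(20, 2))
heavy = rng.normal(loc=[200, 150], scale=10.0, size=(20, 2))
X = np.vstack([casual, heavy])

# Standardize so that no single feature dominates the Euclidean distances.
X_scaled = (X - X.mean(axis=0)) / X.std(axis=0)

# Ward linkage merges, at each step, the pair of clusters whose union
# gives the smallest increase in within-cluster variance.
Z = linkage(X_scaled, method="ward")

# Cut the tree into a chosen number of flat clusters (2 here, by assumption).
labels = fcluster(Z, t=2, criterion="maxclust")

# Compute the dendrogram layout without plotting; pass the result to
# matplotlib (or drop no_plot) to draw it.
tree = dendrogram(Z, no_plot=True)
print(labels)
```

The dendrogram is mainly a reading aid: the height at which two branches merge indicates how dissimilar the merged groups are, which helps when choosing a plausible number of segments.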

We then enriched this approach with a Gaussian Mixture model, more flexible and probabilistic, capable of revealing latent clusters without imposing hard boundaries. This work surfaced six very distinct player segments: casual players who try things without committing; highly active expert competitors; inactive or newly registered profiles; regular players without a focus on performance; ultra-committed players with a strong reputation; and, finally, moderately experienced players, steady in their engagement without being extreme. Each segment was analyzed through variables such as number of games played, time spent, prestige earned, and community karma.
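
The sketch below shows one way a Gaussian Mixture model can be fitted and sized with scikit-learn. It uses synthetic data with three artificial groups and two made-up features, and it selects the number of components by BIC; these are assumptions for illustration, and the six segments described above came from the real BGA data, not from this toy setup.

```python
import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(42)

# Synthetic stand-ins for per-player features: [games played, hours spent].
groups = [
    rng.normal(loc=[5, 3], scale=[2, 1], size=(100, 2)),
    rng.normal(loc=[80, 60], scale=[10, 8], size=(100, 2)),
    rng.normal(loc=[300, 400], scale=[30, 40], size=(100, 2)),
]
X = StandardScaler().fit_transform(np.vstack(groups))

# Fit one model per candidate number of components and keep its BIC.
candidates = {}
for k in range(1, 6):
    gmm = GaussianMixture(
        n_components=k, covariance_type="full", n_init=3, random_state=0
    )
    gmm.fit(X)
    candidates[k] = (gmm.bic(X), gmm)

# Lower BIC is better: it rewards fit and penalizes extra parameters.
best_k = min(candidates, key=lambda k: candidates[k][0])
best_model = candidates[best_k][1]

# Soft assignments: each row gives a player's probability of belonging
# to each component, which is what "no hard boundaries" means in practice.
memberships = best_model.predict_proba(X)
hard_labels = memberships.argmax(axis=1)
print(best_k, memberships[0].round(3))
```

Keeping the membership probabilities, rather than only the most likely label, makes it possible to treat a player who sits between two segments differently from one who clearly belongs to a single segment.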

Finally, building on these insights, we constructed a personalized recommendation algorithm. It draws both on the structure of the game graph (to detect similar or complementary titles) and on the player's profile derived from clustering. By combining product similarity, user behavior, and a few business rules (such as non-redundancy or prioritizing new releases), the algorithm can propose games suited to each type of player. It steers novices toward simple, engaging games, supports progression for regular profiles, and challenges the most experienced players.
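
A minimal sketch of this kind of hybrid scoring is shown below. Everything in it is hypothetical: the catalog, the similarity values, the per-segment target complexity, the equal 0.5/0.5 weighting, and the size of the new-release bonus are placeholders chosen to show how the three ingredients could be combined, not the weights or rules of our actual algorithm.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Game:
    name: str
    complexity: float  # hypothetical scale: 1 (light) to 5 (heavy)
    is_new: bool


CATALOG = {
    g.name: g
    for g in [
        Game("Light Cards", 1.5, False),
        Game("Family Tiles", 2.0, True),
        Game("Heavy Euro", 4.0, False),
        Game("Deep Strategy", 4.5, True),
    ]
}

# Hypothetical graph-based similarities between pairs of games.
SIMILARITY = {
    frozenset({"Light Cards", "Family Tiles"}): 0.6,
    frozenset({"Light Cards", "Heavy Euro"}): 0.1,
    frozenset({"Light Cards", "Deep Strategy"}): 0.1,
    frozenset({"Family Tiles", "Heavy Euro"}): 0.3,
    frozenset({"Family Tiles", "Deep Strategy"}): 0.2,
    frozenset({"Heavy Euro", "Deep Strategy"}): 0.7,
}

# Hypothetical preferred complexity for each player segment.
SEGMENT_TARGET_COMPLEXITY = {"casual": 1.5, "expert": 4.0}


def recommend(played: set, segment: str, k: int = 2,
              new_release_bonus: float = 0.1) -> list:
    """Rank unplayed games by graph similarity, segment fit, and novelty."""
    target = SEGMENT_TARGET_COMPLEXITY[segment]
    scores = {}
    for game in CATALOG.values():
        if game.name in played:
            continue  # business rule: non-redundancy
        graph_score = max(
            (SIMILARITY.get(frozenset({game.name, p}), 0.0) for p in played),
            default=0.0,
        )
        # Segment fit: 1.0 at the target complexity, 0.0 four points away.
        fit = 1.0 - abs(game.complexity - target) / 4.0
        bonus = new_release_bonus if game.is_new else 0.0  # business rule
        scores[game.name] = 0.5 * graph_score + 0.5 * fit + bonus
    return sorted(scores, key=scores.get, reverse=True)[:k]


print(recommend({"Light Cards"}, "casual", k=1))
print(recommend({"Heavy Euro"}, "expert"))
```

A player with no history falls back on segment fit and the new-release bonus alone, since the graph term defaults to zero when `played` is empty.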

This project allowed us to combine advanced data science techniques with strong product thinking. The goal was not just to produce visualizations or models, but to offer a truly personalized experience, one that can grow player engagement by accounting for each player's profile, their implicit preferences, and the dynamics of the gaming ecosystem as a whole.

## Key Takeaways

1. Map relationships across titles with **graph theory** by building a **game graph** from Board Game Arena data enriched via the **BoardGameGeek API**, then assess overlap using **cosine similarity** to spot **complementarity** vs. **cannibalization**.
2. Uncover nuanced **player segments** by combining **hierarchical clustering (Ward linkage)** with a **Gaussian Mixture Model (GMM)** using engagement, **prestige**, and **karma** signals.
3. Ship an **interactive tool** that lets non-technical users add a game and instantly visualize edges and portfolio impact without writing code.
4. Deliver a **hybrid recommendation algorithm** that fuses the **game graph** with **player profiles**, guided by **business rules** like non-redundancy and new-release priority.
5. Validate decisions with **offline metrics** (silhouette, BIC, MAP@K) and **online A/B tests** (CTR, retention, playtime) to drive measurable lifts in engagement.

## Frequently Asked Questions

### What is a game graph, and why use graph theory for Board Game Arena data?

A game graph models each board game as a node and connects games (edges) that share categories or mechanics. Graph theory makes it easy to visualize clusters, overlaps, and relationships, helping publishers spot discovery paths, portfolio gaps, and cross-promotion opportunities on Board Game Arena.

### How does cosine similarity reveal complementary or cannibalizing board games?

Cosine similarity compares games as vectors built from shared categories/tags; higher similarity signals strong overlap that can cannibalize attention, while lower similarity suggests complementary discovery paths. In practice, set business thresholds and validate with behavioral data (e.g., switching rates, co-play) before labeling edges.

### How were player segments built using hierarchical clustering and a Gaussian Mixture Model?

Ward linkage (hierarchical clustering) provides an interpretable view of user proximity, then a Gaussian Mixture Model captures probabilistic, overlapping behaviors. Inputs typically include games played, time spent, prestige, karma, recency/frequency, and possibly win rate, with scaling and outlier handling applied beforehand to stabilize the model.

### How can Asmodee or publishers apply these segments in product and marketing on Board Game Arena?

Use segments to tailor onboarding, recommend suitable game complexity, and promote expansions or tournaments to high-skill cohorts. They also inform retention campaigns, community recognition (karma/prestige), and targeted launches that minimize cannibalization across similar titles.

### How do personalized recommendations combine the game graph and player profiles?

A hybrid approach ranks games by graph-based similarity and adjusts scores using the player's segment preferences and recent activity. Business rules (e.g., non-redundancy, new-release priority, diversity of mechanics) improve novelty and reduce recommendation fatigue, while cluster priors help with cold start by giving new players reasonable defaults until they build up a history.

### What data and tools are needed to enrich BGA data with the BoardGameGeek API?

Join BGA match histories to BoardGameGeek API metadata (categories, mechanics, families) to build content vectors per game. Store enriched data in a graph database or columnar warehouse, map taxonomy consistently, and handle API rate limits with caching or batch ingestion.

### How do you evaluate segmentation and recommendation quality?

Use silhouette or Davies–Bouldin for clustering coherence and BIC/AIC for Gaussian Mixture Model selection. For recommendations, track offline metrics like MAP@K and hit rate, then confirm impact via A/B tests on CTR, session length, retention, prestige earned, and downstream play conversion.
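
For the offline recommendation metric, here is a small sketch of MAP@K in plain Python. Several slightly different definitions of average precision at K are in use; this version normalizes by `min(len(relevant), k)`, which is one common choice and an assumption on our part. The user IDs and item names are made up.

```python
def average_precision_at_k(recommended: list, relevant: set, k: int) -> float:
    """Average precision over the top-k recommendations for one user."""
    hits = 0
    score = 0.0
    for rank, item in enumerate(recommended[:k], start=1):
        if item in relevant:
            hits += 1
            score += hits / rank  # precision at this rank
    denom = min(len(relevant), k)
    return score / denom if denom else 0.0


def map_at_k(recs_by_user: dict, relevant_by_user: dict, k: int) -> float:
    """Mean of per-user average precision at k."""
    if not recs_by_user:
        return 0.0
    total = sum(
        average_precision_at_k(recs, relevant_by_user.get(user, set()), k)
        for user, recs in recs_by_user.items()
    )
    return total / len(recs_by_user)


# Hypothetical data: what we recommended vs. what each user went on to play.
recs = {"u1": ["a", "b", "c"], "u2": ["x", "y", "z"]}
played_later = {"u1": {"a", "c"}, "u2": {"y"}}

MAP3 = map_at_k(recs, played_later, k=3)
print(round(MAP3, 4))
```

The metric rewards placing relevant games near the top of the list, which matters when only the first few recommendations are actually shown to the player.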

### What are the scalability and privacy considerations with 200M+ rows of gameplay data?

Scale preprocessing with distributed compute (e.g., Spark), vectorization, and approximate nearest neighbors for similarity, plus precomputed embeddings and cached edges. Anonymize user IDs, minimize sensitive attributes, provide opt-outs, and ensure GDPR-compliant data governance.


---

*Article from [Albert's Deep Dive](https://deepdive.albertschool.com) - Albert School's Journal*
