Dice Vision System

A computer vision solution that uses machine learning to detect, track, and read dice rolls in real time, then reports the results to a centralized game server.

Client:

EM games, Germany

Project Overview

A gaming company sought to develop an AI-powered vision system to automate dice recognition during gameplay. The solution needed to reliably detect dice in a controlled environment, determine when they stopped rolling, read the values shown, and transmit the results to a Unity-based game server. The system had to operate in real time, ensuring accuracy and responsiveness without relying on generative AI outputs.

Challenge

The project faced multiple technical and operational challenges:

  • Real-Time Precision: Detecting dice movement and final stability within a strict 3-second window.

  • Face Recognition: Accurately reading the face value of each die from limited training data (initially only 101 images of two-dice combinations).

  • Integration: Ensuring seamless communication with the Unity gaming server and maintaining consistent performance in live conditions.

  • Constraints: The client prohibited the use of generative AI, requiring a fully deterministic ML pipeline.

  • Commercial Risks: The client requested low-risk payment stages, necessitating incremental, testable deliverables.

Tech Stack

  • Programming & Frameworks: Python 3.9+, PyTorch (neural network framework), Ultralytics YOLOv8 (object detection), OpenCV (video & movement tracking).

  • Annotation Tools: CVAT, Label Studio.

  • Communication: REST API for score reporting, RTSP/UDP for video streaming, USB for local setups.

  • Hardware Environment: Gaming box with uniform lighting, 1080p top-down camera (30+ FPS), local server (Intel Core Ultra 5, 32 GB RAM) running the Unity engine.
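
To make the REST score reporting listed above more concrete, here is a minimal sketch of how a score message could be assembled and posted. The field names (roll_id, dice, total), the six-sided-dice check, and the endpoint URL passed to post_score are illustrative assumptions; the actual API contract with the Unity server is not described in this case study.

```python
import json
import urllib.request
from typing import Any, Dict, List


def build_score_payload(roll_id: str, faces: List[int]) -> Dict[str, Any]:
    """Validate the face values and build the message body.

    Assumes standard six-sided dice; field names are hypothetical.
    """
    if not faces:
        raise ValueError("at least one die is required")
    for value in faces:
        if not 1 <= value <= 6:
            raise ValueError(f"invalid die face: {value}")
    return {"roll_id": roll_id, "dice": list(faces), "total": sum(faces)}


def encode_payload(payload: Dict[str, Any]) -> bytes:
    """Serialize the payload as UTF-8 JSON with stable key order."""
    return json.dumps(payload, sort_keys=True).encode("utf-8")


def post_score(url: str, payload: Dict[str, Any], timeout: float = 1.0) -> int:
    """POST the payload to a (hypothetical) game-server endpoint.

    Returns the HTTP status code of the response.
    """
    request = urllib.request.Request(
        url,
        data=encode_payload(payload),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

A call such as build_score_payload("roll-1", [3, 5]) yields a body with the two face values and their total; post_score would then send it to whatever URL the game server exposes.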

Solution

Axis designed a modular CV pipeline composed of four stages:

  1. Movement Detection: OpenCV + optical flow to identify dice rolling events.

  2. Dice Detection: YOLOv8 models trained to locate dice within the frame.

  3. Stop Detection: Tracking each die's region of interest (ROI) until movement ceases within the 3-second stability window.

  4. Face Recognition & Scoring: Classifier to read pip values on each die, calculate totals, and send results to the Unity game server.
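
The movement and stop detection stages above can be sketched as a small state machine driven by one motion score per frame. This is an illustrative sketch, not the project's implementation. It assumes the score is something like the mean optical-flow magnitude over the dice ROIs (for example, computed from OpenCV's cv2.calcOpticalFlowFarneback output), and it reads the 3-second window as a deadline measured from the start of the roll. The class name, thresholds, and event names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StopDetector:
    """Turns per-frame motion scores into roll events.

    All default values are assumptions for illustration only.
    """
    fps: float = 30.0              # camera frame rate
    motion_threshold: float = 0.5  # scores above this count as "moving"
    still_seconds: float = 0.5     # required quiet time before "stopped"
    window_seconds: float = 3.0    # deadline for settling after roll start

    rolling: bool = False
    still_frames: int = 0
    frames_in_roll: int = 0

    def update(self, motion_score: float) -> Optional[str]:
        """Feed one frame's score; return an event name or None."""
        moving = motion_score > self.motion_threshold

        if not self.rolling:
            if moving:
                # First motion after an idle period starts a new roll.
                self.rolling = True
                self.still_frames = 0
                self.frames_in_roll = 0
                return "roll_started"
            return None

        self.frames_in_roll += 1
        # Count consecutive quiet frames; any motion resets the count.
        self.still_frames = 0 if moving else self.still_frames + 1

        if self.still_frames >= round(self.still_seconds * self.fps):
            self.rolling = False
            return "stopped"   # safe to run face recognition now
        if self.frames_in_roll >= round(self.window_seconds * self.fps):
            self.rolling = False
            return "timeout"   # dice did not settle inside the window
        return None
```

In use, the detector would receive one score per camera frame. A "stopped" event would trigger the face recognition stage, and a "timeout" could be surfaced to the game server as a roll that needs to be repeated.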

The project was divided into four commercial stages:

  1. Dataset preparation.

  2. Delivery of trained detection and classification models.

  3. Testing on unseen data.

  4. Integration with the game engine.
