UEFA Champions League AI Model Audit - Week 8
Llama 4 Maverick led UEFA Champions League predictions with 2.06 avg points/match, followed by Mistral 7B v0.2 (2.00) and Llama 3.1 405B Turbo (1.89). Models achieved 45.86% correct tendency overall. Biggest upset: Atletico Madrid vs Bodo/Glimt (1-2).
Top 10 Models
| # | Model | Matches | Total Points | Avg Pts/Match | Tendency % | Exact % |
|---|---|---|---|---|---|---|
| 1 | Llama 4 Maverick (Meta) | 18 | 37 | 2.06 | 61.1% | 11.1% |
| 2 | Mistral 7B v0.2 (Mistral) | 14 | 28 | 2.00 | 64.3% | 7.1% |
| 3 | Llama 3.1 405B Turbo (Meta) | 18 | 34 | 1.89 | 55.6% | 11.1% |
| 4 | Llama 3.2 3B Turbo (Meta) | 18 | 34 | 1.89 | 50.0% | 11.1% |
| 5 | DeepSeek R1 (Reasoning) | 16 | 30 | 1.88 | 50.0% | 12.5% |
| 6 | Qwen 2.5 72B Turbo (Alibaba) | 18 | 33 | 1.83 | 55.6% | 11.1% |
| 7 | Llama 3 8B Lite (Meta) | 18 | 32 | 1.78 | 55.6% | 11.1% |
| 8 | Mistral 7B v0.3 (Mistral) | 18 | 32 | 1.78 | 61.1% | 5.6% |
| 9 | Cogito v2 405B (Deep Cogito) | 17 | 28 | 1.65 | 47.1% | 11.8% |
| 10 | Llama 4 Scout (Meta) | 18 | 28 | 1.56 | 61.1% | 5.6% |
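The Avg Pts/Match column can be reproduced by dividing Total Points by Matches. The following is a minimal Python sketch using three rows copied from the table above; the `rows` list and the `avg_points` helper are illustrative names, not part of the audit tooling.

```python
# Rows copied from the Top 10 table: (model, matches, total_points)
rows = [
    ("Llama 4 Maverick", 18, 37),
    ("Mistral 7B v0.2", 14, 28),
    ("Llama 3.1 405B Turbo", 18, 34),
]


def avg_points(total_points: int, matches: int) -> float:
    """Average points per match, rounded to two decimals as in the table."""
    return round(total_points / matches, 2)


for model, matches, total in rows:
    print(f"{model}: {avg_points(total, matches):.2f}")
```

Because the ranking uses the average rather than the total, Mistral 7B v0.2 (28 points over 14 matches) places above models that scored 34 points over 18 matches.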
Match-by-Match Audit
- Napoli vs Chelsea: 79.2% correct tendency (19/24 models)
- Pafos vs Slavia Praha: 19.2% correct tendency (5/26 models)
- Atletico Madrid vs Bodo/Glimt: 0.0% correct tendency (0/25 models)
- Manchester City vs Galatasaray: 48.0% correct tendency (12/25 models)
- PSV Eindhoven vs Bayern München: 83.3% correct tendency (20/24 models)
- Athletic Club vs Sporting CP: 61.5% correct tendency (16/26 models)
- Ajax vs Olympiakos Piraeus: 57.7% correct tendency (15/26 models)
- Union St. Gilloise vs Atalanta: 0.0% correct tendency (0/26 models)
- Arsenal vs Kairat Almaty: 100.0% correct tendency (23/23 models)
- Liverpool vs Qarabag: 76.0% correct tendency (19/25 models)
- Barcelona vs FC Copenhagen: 61.5% correct tendency (16/26 models)
- Benfica vs Real Madrid: 0.0% correct tendency (0/25 models)
- Borussia Dortmund vs Inter: 51.9% correct tendency (14/27 models)
- Monaco vs Juventus: 32.0% correct tendency (8/25 models)
- Paris Saint Germain vs Newcastle: 39.1% correct tendency (9/23 models)
- Eintracht Frankfurt vs Tottenham: 52.0% correct tendency (13/25 models)
- Club Brugge KV vs Marseille: 16.0% correct tendency (4/25 models)
- Bayer Leverkusen vs Villarreal: 48.0% correct tendency (12/25 models)
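The overall 45.86% figure appears to be the unweighted mean of the per-match rates listed above, not the pooled share of all individual predictions. The sketch below computes both from the counts in the list; the variable names are illustrative, and the aggregation method is inferred from the numbers, not stated by the audit.

```python
# (correct, total) per match, copied in order from the Match-by-Match Audit list
counts = [
    (19, 24), (5, 26), (0, 25), (12, 25), (20, 24), (16, 26),
    (15, 26), (0, 26), (23, 23), (19, 25), (16, 26), (0, 25),
    (14, 27), (8, 25), (9, 23), (13, 25), (4, 25), (12, 25),
]

# Unweighted mean of the per-match rates: every match counts equally
mean_of_rates = sum(c / n for c, n in counts) / len(counts)

# Pooled rate: every individual prediction counts equally
pooled = sum(c for c, _ in counts) / sum(n for _, n in counts)

print(f"mean of per-match rates: {mean_of_rates:.2%}")   # 45.86%
print(f"pooled over all predictions: {pooled:.2%}")      # 45.45%
```

The two figures differ slightly because the number of participating models varies from 23 to 27 per match.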
Biggest Consensus Misses
- Benfica vs Real Madrid (4-2) | Consensus: A (92.0%)
- Union St. Gilloise vs Atalanta (1-0) | Consensus: A (88.5%)
- Atletico Madrid vs Bodo/Glimt (1-2) | Consensus: H (76.0%)
- Monaco vs Juventus (0-0) | Consensus: A (68.0%)
- Pafos vs Slavia Praha (4-1) | Consensus: A (50.0%)
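A consensus figure of this kind can be derived by taking the most common predicted tendency and its share of all predictions. The sketch below is one plausible way to do that; the `consensus` helper is hypothetical, and the sample list is reconstructed from the published figures for Benfica vs Real Madrid (25 models, 92.0% on A, none on H), not taken from raw prediction data.

```python
from collections import Counter


def consensus(predictions: list[str]) -> tuple[str, float]:
    """Return the most common tendency (H/D/A) and its share of all predictions."""
    label, count = Counter(predictions).most_common(1)[0]
    return label, count / len(predictions)


# Reconstructed sample: 23 of 25 models on A; since none picked H,
# the remaining two are assumed to have predicted a draw.
benfica_real = ["A"] * 23 + ["D"] * 2

label, share = consensus(benfica_real)
print(f"Consensus: {label} ({share:.1%})")  # Consensus: A (92.0%)
```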
Methodology
Avg points/match is a model's total points divided by the number of matches it predicted. Correct tendency means predicting the correct match outcome: home win (H), draw (D), or away win (A). Exact score hits count the predictions that matched the final scoreline. Points are awarded as follows: 3 points for the correct tendency, plus an additional 3 points for the exact scoreline.
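The scoring rule can be sketched in Python as follows. This follows the rule only as worded above; the audit's actual scoring code is not shown, so the function names and any details beyond the two stated point awards are assumptions.

```python
def tendency(home: int, away: int) -> str:
    """Map a scoreline to its outcome: H (home win), D (draw), or A (away win)."""
    if home > away:
        return "H"
    if home < away:
        return "A"
    return "D"


def score_prediction(pred: tuple[int, int], actual: tuple[int, int]) -> int:
    """3 points for the correct tendency, plus 3 more for the exact scoreline."""
    points = 0
    if tendency(*pred) == tendency(*actual):
        points += 3
        if pred == actual:
            points += 3
    return points


# Atletico Madrid vs Bodo/Glimt finished 1-2 (away win)
actual = (1, 2)
print(score_prediction((2, 0), actual))  # 0: wrong tendency
print(score_prediction((0, 1), actual))  # 3: correct tendency only
print(score_prediction((1, 2), actual))  # 6: exact scoreline
```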
Generation cost: $0.0032
Tokens: 7,640 input + 1,326 output