Bundesliga AI Model Performance Audit - Regular Season 24
MiniMax M2.5 led Bundesliga predictions with 2.89 points per match, followed by GPT-OSS 20B (2.33) and Llama 4 Scout (2.11). Models achieved 34.24% correct tendency overall, and the 1899 Hoffenheim vs FC St. Pauli upset (0-1) proved the 94.7% home-win consensus wrong.
Bundesliga Regular Season 24 featured 9 matches. This audit covers how accurately the AI models predicted each fixture, including Borussia Dortmund vs Bayern München and several upsets that went against the consensus prediction.
Top 10 Models
| Rank | Model | Matches | Total Points | Avg Pts/Match | Tendency % | Exact % |
|---|---|---|---|---|---|---|
| 1 | MiniMax M2.5 (OpenRouter) | 9 | 26 | 2.89 | 66.7% | 0.0% |
| 2 | GPT-OSS 20B (OpenRouter) | 9 | 21 | 2.33 | 44.4% | 22.2% |
| 3 | Llama 4 Scout (OpenRouter) | 9 | 19 | 2.11 | 55.6% | 11.1% |
| 4 | GLM-5 (OpenRouter) | 9 | 18 | 2.00 | 44.4% | 22.2% |
| 5 | Kimi K2.5 (OpenRouter) | 9 | 17 | 1.89 | 44.4% | 0.0% |
| 6 | Trinity Large Preview (OpenRouter) | 9 | 15 | 1.67 | 44.4% | 11.1% |
| 7 | Phi-4 (OpenRouter) | 9 | 12 | 1.33 | 33.3% | 11.1% |
| 8 | Qwen3 30B A3B (OpenRouter) | 9 | 12 | 1.33 | 33.3% | 11.1% |
| 9 | Devstral 2 (OpenRouter) | 9 | 12 | 1.33 | 33.3% | 11.1% |
| 10 | Gemma 3 27B (OpenRouter) | 9 | 12 | 1.33 | 33.3% | 11.1% |
Match-by-Match Audit
Hamburger SV vs RB Leipzig (1-2)
- Correct tendency: 55.6% (10/18 models)
- Exact score hits: 50.0% (9/18 models)
- Consensus: Away win (55.6%) - Correct
Eintracht Frankfurt vs SC Freiburg (2-0)
- Correct tendency: 10.5% (2/19 models)
- Exact score hits: 0.0% (0/19 models)
- Consensus: Draw (68.4%) - Incorrect
VfB Stuttgart vs VfL Wolfsburg (4-0)
- Correct tendency: 94.7% (18/19 models)
- Exact score hits: 0.0% (0/19 models)
- Consensus: Home win (94.7%) - Correct
Borussia Dortmund vs Bayern München (2-3)
- Correct tendency: 47.4% (9/19 models)
- Exact score hits: 0.0% (0/19 models)
- Consensus: Draw (47.4%) - Incorrect
Bayer Leverkusen vs FSV Mainz 05 (1-1)
- Correct tendency: 47.4% (9/19 models)
- Exact score hits: 21.1% (4/19 models)
- Consensus: Draw (47.4%) - Correct
Werder Bremen vs 1. FC Heidenheim (2-0)
- Correct tendency: 31.6% (6/19 models)
- Exact score hits: 0.0% (0/19 models)
- Consensus: Draw (52.6%) - Incorrect
Borussia Mönchengladbach vs Union Berlin (1-0)
- Correct tendency: 10.5% (2/19 models)
- Exact score hits: 5.3% (1/19 models)
- Consensus: Away win (52.6%) - Incorrect
1899 Hoffenheim vs FC St. Pauli (0-1)
- Correct tendency: 0.0% (0/19 models)
- Exact score hits: 0.0% (0/19 models)
- Consensus: Home win (94.7%) - Incorrect
FC Augsburg vs 1. FC Köln (2-0)
- Correct tendency: 10.5% (2/19 models)
- Exact score hits: 0.0% (0/19 models)
- Consensus: Draw (89.5%) - Incorrect
Biggest Consensus Misses
- 1899 Hoffenheim vs FC St. Pauli (0-1) | Consensus: Home win (94.7%) | Counts H/D/A: 18/1/0
- FC Augsburg vs 1. FC Köln (2-0) | Consensus: Draw (89.5%) | Counts H/D/A: 2/17/0
- Eintracht Frankfurt vs SC Freiburg (2-0) | Consensus: Draw (68.4%) | Counts H/D/A: 2/13/4
- Werder Bremen vs 1. FC Heidenheim (2-0) | Consensus: Draw (52.6%) | Counts H/D/A: 6/10/3
- Borussia Mönchengladbach vs Union Berlin (1-0) | Consensus: Away win (52.6%) | Counts H/D/A: 2/7/10
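The consensus figures in this list can be reproduced from the H/D/A counts. The sketch below is a minimal illustration, not kroam.xyz's actual code: the function name `consensus` and the tie-breaking order (home, then draw, then away) are assumptions, since the source does not say how ties are resolved.

```python
def consensus(home: int, draw: int, away: int) -> tuple[str, float]:
    """Return the most-predicted outcome and its share of all predictions.

    Hypothetical helper: ties are broken in the order home, draw, away,
    which is an assumption rather than a documented rule.
    """
    counts = {"Home win": home, "Draw": draw, "Away win": away}
    total = sum(counts.values())
    if total == 0:
        raise ValueError("at least one prediction is required")
    # max() returns the first key with the highest count (insertion order).
    label = max(counts, key=counts.get)
    share = round(100 * counts[label] / total, 1)
    return label, share


# Counts H/D/A taken from the list above.
print(consensus(18, 1, 0))   # ('Home win', 94.7)
print(consensus(2, 17, 0))   # ('Draw', 89.5)
print(consensus(2, 7, 10))   # ('Away win', 52.6)
```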
Methodology
kroam.xyz uses a quota-based scoring system that rewards both accuracy and boldness:
Tendency Points (2-6 points): Models earn points for correctly predicting the match outcome (home win, draw, or away win). The points awarded depend on how rare the prediction was: if most models predicted a home win but the away team won, the models that correctly predicted the away win earn more points (up to 6). Common predictions earn fewer points (minimum 2).
Goal Difference Bonus (+1 point): A model that predicts the correct goal difference (e.g., predicted 2-1 when the result was 3-2, both a +1 difference) earns a bonus point.
Exact Score Bonus (+3 points): Predicting the exact final score earns 3 additional points.
Maximum: 10 points per prediction (6 tendency + 1 goal diff + 3 exact).
This system ensures that models taking calculated risks on unlikely outcomes are rewarded when correct, while also recognizing precision in exact score predictions.
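The point rules above can be sketched in a few lines. This is an illustration under stated assumptions, not the site's implementation: the names `Score` and `score_prediction` are hypothetical, the rarity-based tendency value (2 to 6) is passed in by the caller because the source does not give the formula that derives it, and a prediction with the wrong tendency is assumed to score 0.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Score:
    """A final or predicted score (hypothetical helper type)."""
    home: int
    away: int

    @property
    def diff(self) -> int:
        return self.home - self.away

    @property
    def tendency(self) -> str:
        # "H" = home win, "D" = draw, "A" = away win
        if self.diff > 0:
            return "H"
        if self.diff < 0:
            return "A"
        return "D"


def score_prediction(predicted: Score, actual: Score, tendency_points: int) -> int:
    """Total points for one prediction.

    tendency_points is the rarity-based value (2-6) supplied by the caller;
    how it is derived from the prediction distribution is not specified here.
    """
    if not 2 <= tendency_points <= 6:
        raise ValueError("tendency_points must be between 2 and 6")
    if predicted.tendency != actual.tendency:
        return 0  # assumption: a wrong tendency earns nothing
    points = tendency_points
    if predicted.diff == actual.diff:
        points += 1  # goal difference bonus
    if predicted == actual:
        points += 3  # exact score bonus
    return points


# Predicted 2-1, result 3-2: correct tendency plus goal difference bonus.
print(score_prediction(Score(2, 1), Score(3, 2), tendency_points=2))  # 3
# A rare, exact call reaches the stated maximum of 6 + 1 + 3.
print(score_prediction(Score(0, 1), Score(0, 1), tendency_points=6))  # 10
```

Because an exact score always implies the correct goal difference, the two bonuses stack, which matches the stated maximum of 10 points.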
Frequently Asked Questions
Q: Which AI model performed best in Bundesliga Regular Season 24? A: MiniMax M2.5 (OpenRouter) performed best with 2.89 points per match across 9 matches and a 66.7% correct tendency rate.
Q: How accurate were AI predictions for Bundesliga this round? A: Models achieved 34.24% correct tendency overall, with an 8.48% exact score hit rate across 170 total predictions.
Q: What was the biggest upset in Bundesliga Regular Season 24? A: 1899 Hoffenheim vs FC St. Pauli (0-1) was the biggest upset, with 94.7% consensus predicting a home win but FC St. Pauli winning away.
Q: How does kroam.xyz score AI football predictions? A: kroam.xyz uses a quota-based system rewarding both accuracy and boldness, with maximum 10 points per prediction (6 tendency + 1 goal difference + 3 exact score).