Premier League Week 25 AI Model Predictions & Accuracy Audit
Rnj-1 Instruct led Premier League predictions with 3.10 avg points/match, followed by Qwen 2.5 7B Turbo (2.90) and MiniMax M2 (2.60). Models achieved 34.70% correct tendency overall, with Fulham vs Everton (1-2) as the biggest consensus miss.
Premier League Regular Season - 25 featured 10 matches, including high-profile fixtures like Liverpool vs Manchester City. AI prediction accuracy is critical for benchmarking model performance in competitive scenarios. This audit provides a statistical breakdown of the round's outcomes.
Top 10 Models
| # | Model | Matches | Total Points | Avg Pts/Match | Tendency % | Exact % |
|---|---|---|---|---|---|---|
| 1 | Rnj-1 Instruct (Essential AI) | 10 | 31 | 3.10 | 70.0% | 10.0% |
| 2 | Qwen 2.5 7B Turbo (Alibaba) | 10 | 29 | 2.90 | 70.0% | 10.0% |
| 3 | MiniMax M2 (Synthetic) | 10 | 26 | 2.60 | 60.0% | 20.0% |
| 4 | GPT-OSS 120B (Synthetic) | 5 | 11 | 2.20 | 40.0% | 20.0% |
| 5 | Llama 3.1 8B Turbo (Meta) | 10 | 21 | 2.10 | 50.0% | 10.0% |
| 6 | Marin 8B Instruct (Marin Community) | 10 | 19 | 1.90 | 50.0% | 10.0% |
| 7 | Nemotron Nano 9B v2 (NVIDIA) | 9 | 14 | 1.56 | 44.4% | 11.1% |
| 8 | MiniMax M2.1 (Synthetic) | 9 | 14 | 1.56 | 44.4% | 11.1% |
| 9 | Llama 4 Maverick (Meta) | 10 | 15 | 1.50 | 40.0% | 10.0% |
| 10 | Mistral Small 3 24B (Mistral) | 10 | 15 | 1.50 | 40.0% | 10.0% |
Match-by-Match Audit
- Liverpool vs Manchester City (1-2): Correct tendency: 42.3%, exact score hits: 34.6%, consensus: D (46.2%) incorrect.
- Brighton vs Crystal Palace (0-1): Correct tendency: 8.7%, exact score hits: 0.0%, consensus: D (73.9%) incorrect.
- Newcastle vs Brentford (2-3): Correct tendency: 34.6%, exact score hits: 0.0%, consensus: D (46.2%) incorrect.
- Wolves vs Chelsea (1-3): Correct tendency: 77.8%, exact score hits: 11.1%, consensus: A (77.8%) correct.
- Burnley vs West Ham (0-2): Correct tendency: 20.0%, exact score hits: 0.0%, consensus: D (76.0%) incorrect.
- Arsenal vs Sunderland (3-0): Correct tendency: 65.4%, exact score hits: 3.8%, consensus: H (65.4%) correct.
- Fulham vs Everton (1-2): Correct tendency: 0.0%, exact score hits: 0.0%, consensus: D (96.0%) incorrect.
- Bournemouth vs Aston Villa (1-1): Correct tendency: 35.7%, exact score hits: 35.7%, consensus: A (50.0%) incorrect.
- Manchester United vs Tottenham (2-0): Correct tendency: 50.0%, exact score hits: 0.0%, consensus: H (50.0%) correct.
- Leeds vs Nottingham Forest (3-1): Correct tendency: 12.5%, exact score hits: 0.0%, consensus: D (54.2%) incorrect.
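The round-level figures quoted in this audit appear to be unweighted means of the per-match rates listed above. The short check below is only a sketch under that assumption (the site's actual aggregation is not documented here); the `matches` list and the helper name `unweighted_mean` are illustrative. Averaging the rounded per-match tendency rates reproduces 34.70%, while the exact-score rates average to about 8.52%, close to the reported 8.53%; the small gap is consistent with rounding of the per-match inputs.

```python
# Per-match rates copied from the list above:
# (fixture, correct tendency %, exact score hits %)
matches = [
    ("Liverpool vs Manchester City", 42.3, 34.6),
    ("Brighton vs Crystal Palace", 8.7, 0.0),
    ("Newcastle vs Brentford", 34.6, 0.0),
    ("Wolves vs Chelsea", 77.8, 11.1),
    ("Burnley vs West Ham", 20.0, 0.0),
    ("Arsenal vs Sunderland", 65.4, 3.8),
    ("Fulham vs Everton", 0.0, 0.0),
    ("Bournemouth vs Aston Villa", 35.7, 35.7),
    ("Manchester United vs Tottenham", 50.0, 0.0),
    ("Leeds vs Nottingham Forest", 12.5, 0.0),
]


def unweighted_mean(values):
    """Plain arithmetic mean; every match counts equally (an assumption)."""
    values = list(values)
    return sum(values) / len(values)


avg_tendency = unweighted_mean(t for _, t, _ in matches)
avg_exact = unweighted_mean(e for _, _, e in matches)

print(f"Average correct tendency: {avg_tendency:.2f}%")
print(f"Average exact score hits: {avg_exact:.2f}%")
```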
Biggest Consensus Misses
- Fulham vs Everton (1-2) | Consensus: D (96.0%) | Counts H/D/A: 1/24/0
- Burnley vs West Ham (0-2) | Consensus: D (76.0%) | Counts H/D/A: 1/19/5
- Brighton vs Crystal Palace (0-1) | Consensus: D (73.9%) | Counts H/D/A: 4/17/2
- Leeds vs Nottingham Forest (3-1) | Consensus: D (54.2%) | Counts H/D/A: 3/13/8
- Bournemouth vs Aston Villa (1-1) | Consensus: A (50.0%) | Counts H/D/A: 4/10/14
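The consensus percentages above follow directly from the H/D/A counts. As a minimal sketch (the function names `consensus` and `correct_share` are hypothetical, and ties are resolved arbitrarily in favour of the first outcome listed), the figures can be reproduced like this:

```python
def consensus(counts):
    """Return (label, share in %) of the most-predicted outcome.

    counts maps "H", "D", "A" to the number of models predicting each.
    On a tie, the first outcome in the dict wins (an arbitrary choice).
    """
    total = sum(counts.values())
    label = max(counts, key=counts.get)
    return label, round(100 * counts[label] / total, 1)


def correct_share(counts, actual):
    """Share (in %) of models whose predicted tendency matched the result."""
    total = sum(counts.values())
    return round(100 * counts[actual] / total, 1)


# Counts copied from the list above; the actual outcome comes from the final score.
fulham_everton = {"H": 1, "D": 24, "A": 0}       # finished 1-2, an away win
brighton_palace = {"H": 4, "D": 17, "A": 2}      # finished 0-1, an away win

print(consensus(fulham_everton), correct_share(fulham_everton, "A"))
print(consensus(brighton_palace), correct_share(brighton_palace, "A"))
```

For Fulham vs Everton this gives a draw consensus of 96.0% and a correct-tendency share of 0.0%, matching the audit entry.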
Methodology
kroam.xyz uses a quota-based scoring system that rewards both accuracy and boldness:
Tendency Points (2-6 points): Models earn points for correctly predicting the match outcome (home win, draw, or away win). The points awarded depend on prediction rarity: if most models predicted a home win but the away team won, models who correctly predicted the away win earn more points (up to 6). Common predictions earn fewer points (minimum 2).
Goal Difference Bonus (+1 point): If the model predicts the correct goal difference (e.g., it predicted 2-1 and the result was 3-2, both a +1 difference), it earns a bonus point.
Exact Score Bonus (+3 points): Predicting the exact final score earns 3 additional points.
Maximum: 10 points per prediction (6 tendency + 1 goal diff + 3 exact).
This system ensures that models taking calculated risks on unlikely outcomes are rewarded when correct, while also recognizing precision in exact score predictions.
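The point structure described above can be sketched in a few lines. This is an illustration, not kroam.xyz's implementation: the exact mapping from prediction rarity to the 2-6 tendency quota is not specified in this article, so the sketch takes it as an input (`tendency_points`). It also assumes that a wrong tendency scores zero and that an exact score collects the goal-difference bonus as well, which is what the stated 10-point maximum (6 + 1 + 3) implies. The function names are hypothetical.

```python
def tendency(home_goals, away_goals):
    """Classify a scoreline as home win (H), draw (D), or away win (A)."""
    if home_goals > away_goals:
        return "H"
    if home_goals < away_goals:
        return "A"
    return "D"


def score_prediction(predicted, actual, tendency_points):
    """Score one prediction under the rules described above.

    predicted, actual: (home_goals, away_goals) tuples.
    tendency_points: rarity-based quota in the range 2-6, supplied by the
        caller because the rarity formula is not given in the article.
    """
    if not 2 <= tendency_points <= 6:
        raise ValueError("tendency_points must be between 2 and 6")

    # Assumption: a wrong tendency earns nothing.
    if tendency(*predicted) != tendency(*actual):
        return 0

    points = tendency_points
    # Goal difference bonus: same margin, e.g. 2-1 predicted and 3-2 played.
    if predicted[0] - predicted[1] == actual[0] - actual[1]:
        points += 1
    # Exact score bonus, stacked on top of the goal difference bonus.
    if predicted == actual:
        points += 3
    return points


print(score_prediction((2, 1), (3, 2), tendency_points=2))  # common pick, right margin
print(score_prediction((1, 2), (1, 2), tendency_points=6))  # rare pick, exact score
print(score_prediction((1, 1), (1, 2), tendency_points=4))  # wrong tendency
```

Under these assumptions the three calls yield 3, 10, and 0 points respectively.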
Frequently Asked Questions
Q: Which AI model performed best in Premier League Regular Season - 25? A: Rnj-1 Instruct (Essential AI) performed best with an average of 3.10 points per match.
Q: How accurate were AI predictions for Premier League this round? A: The average correct tendency was 34.70%, and the exact score hit rate was 8.53%.
Q: What was the biggest upset in Premier League Regular Season - 25? A: Fulham vs Everton (1-2) was the biggest consensus miss, with 96.0% of models predicting a draw.
Q: How does kroam.xyz score AI football predictions? A: Points are awarded based on tendency accuracy (2-6 pts), goal difference bonus (+1 pt), and exact score bonus (+3 pts), up to 10 points per match.