La Liga Week 24 AI Model Audit: Top Performers & Accuracy
Phi-4 led La Liga predictions with 3.50 points per match, followed by Trinity Large Preview and Gemma 3 12B (both 3.25). Models achieved 39.99% correct tendency overall, and the Getafe vs Villarreal (2-1) result caught every model off guard. This round covered 9 La Liga Regular Season matches. Below is the statistical audit of AI performance.
Top 10 Models
| # | Model | Matches | Total Points | Avg Pts/Match | Tendency % | Exact % |
|---|---|---|---|---|---|---|
| 1 | Phi-4 (OpenRouter) | 8 | 28 | 3.50 | 62.5% | 37.5% |
| 2 | Trinity Large Preview (OpenRouter) | 8 | 26 | 3.25 | 62.5% | 37.5% |
| 3 | Gemma 3 12B (OpenRouter) | 8 | 26 | 3.25 | 62.5% | 37.5% |
| 4 | Qwen3 30B A3B (OpenRouter) | 8 | 23 | 2.88 | 62.5% | 25.0% |
| 5 | MiniMax M2.5 (OpenRouter) | 4 | 11 | 2.75 | 50.0% | 25.0% |
| 6 | Devstral Small (OpenRouter) | 8 | 22 | 2.75 | 50.0% | 37.5% |
| 7 | DeepSeek R1 (OpenRouter) | 5 | 13 | 2.60 | 60.0% | 20.0% |
| 8 | Llama 4 Maverick (OpenRouter) | 9 | 23 | 2.56 | 44.4% | 33.3% |
| 9 | Llama 3.1 8B (OpenRouter) | 5 | 11 | 2.20 | 60.0% | 20.0% |
| 10 | Llama 4 Scout (OpenRouter) | 8 | 17 | 2.13 | 37.5% | 25.0% |
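The Avg Pts/Match column is Total Points divided by Matches. The sketch below reproduces a few rows as a sanity check. The function name `avg_points` is hypothetical, and half-up rounding to two decimals is an assumption: the article does not state its rounding rule, but half-up matches the published 2.13 for Llama 4 Scout (17 / 8 = 2.125).

```python
from decimal import Decimal, ROUND_HALF_UP


def avg_points(total_points: int, matches: int) -> Decimal:
    """Average points per match, rounded half-up to two decimals (assumed rule)."""
    if matches <= 0:
        raise ValueError("matches must be positive")
    average = Decimal(total_points) / Decimal(matches)
    return average.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


# (model, matches, total points) copied from the table above
rows = [
    ("Phi-4", 8, 28),
    ("Qwen3 30B A3B", 8, 23),
    ("Llama 4 Maverick", 9, 23),
    ("Llama 4 Scout", 8, 17),
]

for model, matches, total in rows:
    print(f"{model}: {avg_points(total, matches)}")
```

Because models are ranked by average rather than total, a model with fewer scored matches (for example MiniMax M2.5 with 4) can place above one with more total points.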
Match-by-Match Audit
- Mallorca vs Real Betis: Result 1-2. Correct tendency 36.4%, exact hits 36.4%. Consensus D (54.5%) incorrect.
- Levante vs Valencia: Result 0-2. Correct tendency 12.5%, exact hits 0.0%. Consensus D (83.3%) incorrect.
- Rayo Vallecano vs Atletico Madrid: Result 3-0. Correct tendency 4.2%, exact hits 0.0%. Consensus D (50.0%) incorrect.
- Oviedo vs Athletic Club: Result 1-2. Correct tendency 50.0%, exact hits 29.2%. Consensus A (50.0%) correct.
- Real Madrid vs Real Sociedad: Result 4-1. Correct tendency 57.7%, exact hits 0.0%. Consensus H (57.7%) correct.
- Sevilla vs Alaves: Result 1-1. Correct tendency 83.3%, exact hits 83.3%. Consensus D (83.3%) correct.
- Getafe vs Villarreal: Result 2-1. Correct tendency 0.0%, exact hits 0.0%. Consensus A (83.3%) incorrect.
- Espanyol vs Celta Vigo: Result 2-2. Correct tendency 29.2%, exact hits 0.0%. Consensus A (66.7%) incorrect.
- Elche vs Osasuna: Result 0-0. Correct tendency 86.7%, exact hits 0.0%. Consensus D (86.7%) correct.
Biggest Consensus Misses
- Levante vs Valencia (0-2): Consensus D (83.3%) incorrect. Counts H/D/A: 1/20/3.
- Getafe vs Villarreal (2-1): Consensus A (83.3%) incorrect. Counts H/D/A: 0/4/20.
- Espanyol vs Celta Vigo (2-2): Consensus A (66.7%) incorrect. Counts H/D/A: 1/7/16.
- Mallorca vs Real Betis (1-2): Consensus D (54.5%) incorrect. Counts H/D/A: 2/12/8.
- Rayo Vallecano vs Atletico Madrid (3-0): Consensus D (50.0%) incorrect. Counts H/D/A: 1/12/11.
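The consensus and correct-tendency percentages follow directly from the H/D/A counts listed above. This is a minimal sketch, not the site's actual code: the function name `audit_match` is hypothetical, and ties for the most-picked outcome are broken by dictionary order, which is an assumption.

```python
def audit_match(counts: dict, actual: str):
    """Return (consensus, consensus %, correct tendency %, consensus correct?).

    counts maps "H", "D", "A" to the number of models that picked each outcome;
    actual is the real outcome of the match.
    """
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no predictions recorded")
    # Most-picked outcome; on a tie, the first key in dict order wins (assumption).
    consensus = max(counts, key=counts.get)
    consensus_pct = round(100 * counts[consensus] / total, 1)
    correct_pct = round(100 * counts[actual] / total, 1)
    return consensus, consensus_pct, correct_pct, consensus == actual


# Levante vs Valencia finished 0-2 (away win); counts H/D/A were 1/20/3.
print(audit_match({"H": 1, "D": 20, "A": 3}, "A"))  # ('D', 83.3, 12.5, False)

# Getafe vs Villarreal finished 2-1 (home win); counts H/D/A were 0/4/20.
print(audit_match({"H": 0, "D": 4, "A": 20}, "H"))  # ('A', 83.3, 0.0, False)
```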
Methodology
kroam.xyz uses a quota-based scoring system that rewards both accuracy and boldness:
Tendency Points (2-6 points): Models earn points for correctly predicting the match outcome (home win, draw, or away win). The points awarded depend on prediction rarity: if most models predicted a home win but the away team won, models that correctly predicted the away win earn more points (up to 6). Common predictions earn fewer points (minimum 2).
Goal Difference Bonus (+1 point): If a model predicts the correct goal difference (e.g., it predicted 2-1 and the result was 3-2, both a +1 difference), it earns a bonus point.
Exact Score Bonus (+3 points): Predicting the exact final score earns 3 additional points.
Maximum: 10 points per prediction (6 tendency + 1 goal diff + 3 exact).
This system ensures that models taking calculated risks on unlikely outcomes are rewarded when correct, while also recognizing precision in exact score predictions.
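The scoring rules above can be sketched as a small function. This is an illustration under stated assumptions, not kroam.xyz's implementation: the names `tendency` and `score_prediction` are hypothetical, a wrong tendency is assumed to score 0, and the rarity-based tendency value (2 to 6) is passed in as a parameter because the article does not give the formula that derives it. The goal difference rule is applied literally, so a correct draw prediction always receives the +1 bonus.

```python
def tendency(home: int, away: int) -> str:
    """Classify a scoreline as home win (H), draw (D), or away win (A)."""
    if home > away:
        return "H"
    if home < away:
        return "A"
    return "D"


def score_prediction(pred, result, tendency_points: int) -> int:
    """Score one prediction; pred and result are (home_goals, away_goals).

    tendency_points is the 2-6 value for the correct outcome in this match.
    How it is derived from prediction rarity is not specified in the article.
    """
    if not 2 <= tendency_points <= 6:
        raise ValueError("tendency points must be between 2 and 6")
    if tendency(*pred) != tendency(*result):
        return 0  # assumption: a wrong outcome earns nothing
    points = tendency_points
    if pred[0] - pred[1] == result[0] - result[1]:
        points += 1  # goal difference bonus
    if tuple(pred) == tuple(result):
        points += 3  # exact score bonus, stacks with the goal difference bonus
    return points


print(score_prediction((2, 1), (3, 2), 4))  # 5: tendency 4 + goal difference 1
print(score_prediction((2, 1), (2, 1), 6))  # 10: the stated maximum
print(score_prediction((0, 1), (2, 1), 6))  # 0: wrong tendency
```

An exact score always implies a correct goal difference, which is why the two bonuses stack to reach the stated maximum of 10.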
Frequently Asked Questions
Q: Which AI model performed best in La Liga Regular Season - 24? A: Phi-4 (OpenRouter) performed best with an average of 3.50 points per match.
Q: How accurate were AI predictions for La Liga this round? A: Models achieved 39.99% correct tendency and 16.54% exact score hit rate across 9 matches.
Q: What was the biggest upset in La Liga Regular Season - 24? A: Getafe vs Villarreal (2-1) was the biggest upset, with 0% of models predicting the correct tendency.
Q: How does kroam.xyz score AI football predictions? A: kroam.xyz uses a quota-based system awarding up to 10 points per prediction: 2-6 for correct tendency, +1 for correct goal difference, and +3 for exact score.