League Roundup

La Liga Week 24 AI Model Audit: Top Performers & Accuracy

February 16, 2026

Generated by: deepseek/deepseek-chat-v3.1

Phi-4 led La Liga predictions with 3.50 points per match, followed by Trinity Large Preview and Gemma 3 12B (both 3.25). Across all models, 39.99% of predictions had the correct tendency (home win, draw, or away win). Getafe vs Villarreal (2-1) was the largest miss: no model predicted the home win. The round covered 9 La Liga regular-season matches. Below is the statistical audit of AI performance.

Top 10 Models

#	Model	Matches	Total Points	Avg Pts/Match	Tendency %	Exact %
1	Phi-4 (OpenRouter)	8	28	3.50	62.5%	37.5%
2	Trinity Large Preview (OpenRouter)	8	26	3.25	62.5%	37.5%
3	Gemma 3 12B (OpenRouter)	8	26	3.25	62.5%	37.5%
4	Qwen3 30B A3B (OpenRouter)	8	23	2.88	62.5%	25.0%
5	MiniMax M2.5 (OpenRouter)	4	11	2.75	50.0%	25.0%
6	Devstral Small (OpenRouter)	8	22	2.75	50.0%	37.5%
7	DeepSeek R1 (OpenRouter)	5	13	2.60	60.0%	20.0%
8	Llama 4 Maverick (OpenRouter)	9	23	2.56	44.4%	33.3%
9	Llama 3.1 8B (OpenRouter)	5	11	2.20	60.0%	20.0%
10	Llama 4 Scout (OpenRouter)	8	17	2.13	37.5%	25.0%

Match-by-Match Audit

Mallorca vs Real Betis: Result 1-2. Correct tendency 36.4%, exact hits 36.4%. Consensus D (54.5%) incorrect.
Levante vs Valencia: Result 0-2. Correct tendency 12.5%, exact hits 0.0%. Consensus D (83.3%) incorrect.
Rayo Vallecano vs Atletico Madrid: Result 3-0. Correct tendency 4.2%, exact hits 0.0%. Consensus D (50.0%) incorrect.
Oviedo vs Athletic Club: Result 1-2. Correct tendency 50.0%, exact hits 29.2%. Consensus A (50.0%) correct.
Real Madrid vs Real Sociedad: Result 4-1. Correct tendency 57.7%, exact hits 0.0%. Consensus H (57.7%) correct.
Sevilla vs Alaves: Result 1-1. Correct tendency 83.3%, exact hits 83.3%. Consensus D (83.3%) correct.
Getafe vs Villarreal: Result 2-1. Correct tendency 0.0%, exact hits 0.0%. Consensus A (83.3%) incorrect.
Espanyol vs Celta Vigo: Result 2-2. Correct tendency 29.2%, exact hits 0.0%. Consensus A (66.7%) incorrect.
Elche vs Osasuna: Result 0-0. Correct tendency 86.7%, exact hits 0.0%. Consensus D (86.7%) correct.

Biggest Consensus Misses

Levante vs Valencia (0-2): Consensus D (83.3%) incorrect. Counts H/D/A: 1/20/3.
Getafe vs Villarreal (2-1): Consensus A (83.3%) incorrect. Counts H/D/A: 0/4/20.
Espanyol vs Celta Vigo (2-2): Consensus A (66.7%) incorrect. Counts H/D/A: 1/7/16.
Mallorca vs Real Betis (1-2): Consensus D (54.5%) incorrect. Counts H/D/A: 2/12/8.
Rayo Vallecano vs Atletico Madrid (3-0): Consensus D (50.0%) incorrect. Counts H/D/A: 1/12/11.
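
The percentages above follow directly from the H/D/A counts. The sketch below recomputes them for Levante vs Valencia; the function names and the tie-breaking behavior (first label wins on equal counts) are illustrative assumptions, not kroam.xyz's actual code.

```python
def tendency(home_goals, away_goals):
    """Map a scoreline to H (home win), D (draw), or A (away win)."""
    if home_goals > away_goals:
        return "H"
    if home_goals < away_goals:
        return "A"
    return "D"


def consensus_summary(counts, result):
    """Summarize how model picks compare with the actual result.

    counts: dict of pick counts keyed by "H", "D", "A".
    result: (home_goals, away_goals) of the final score.
    """
    total = sum(counts.values())
    # Assumption: on equal counts, the first label in the dict wins.
    pick = max(counts, key=counts.get)
    actual = tendency(*result)
    return {
        "consensus": pick,
        "consensus_share": round(100 * counts[pick] / total, 1),
        "correct_share": round(100 * counts[actual] / total, 1),
        "consensus_correct": pick == actual,
    }


# Levante vs Valencia (0-2), counts H/D/A: 1/20/3
summary = consensus_summary({"H": 1, "D": 20, "A": 3}, (0, 2))
print(summary)
```

For this match the sketch reproduces the figures reported above: a draw consensus at 83.3%, with 12.5% of models picking the away win that actually happened.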

Methodology

kroam.xyz uses a quota-based scoring system that rewards both accuracy and boldness:

Tendency Points (2-6 points): Models earn points for correctly predicting the match outcome (home win, draw, or away win). The points awarded depend on how rare the prediction was: if most models predicted a home win but the away team won, the models that correctly predicted the away win earn more points (up to 6). Common predictions earn fewer points (minimum 2).

Goal Difference Bonus (+1 point): If a model predicts the correct goal difference (e.g., it predicted 2-1 and the result was 3-2, both a +1 difference), it earns a bonus point.

Exact Score Bonus (+3 points): Predicting the exact final score earns 3 additional points.

Maximum: 10 points per prediction (6 tendency + 1 goal diff + 3 exact).

This system is designed so that models taking calculated risks on unlikely outcomes are rewarded when correct, while precision in exact score predictions is also recognized.
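
The scoring rules can be sketched as a small function. This is a minimal illustration under stated assumptions, not kroam.xyz's implementation: the article does not give the formula that maps prediction rarity to 2-6 tendency points, so `tendency_points` is passed in by the caller. The sketch also assumes a wrong tendency scores 0 and that the goal difference bonus applies to draws as well.

```python
def tendency(home_goals, away_goals):
    """Map a scoreline to H (home win), D (draw), or A (away win)."""
    if home_goals > away_goals:
        return "H"
    if home_goals < away_goals:
        return "A"
    return "D"


def score_prediction(predicted, actual, tendency_points):
    """Score one prediction under the rules described above.

    predicted, actual: (home_goals, away_goals) tuples.
    tendency_points: quota value (2-6) for the correct outcome.
        Hypothetical input: the rarity-to-points formula is not
        published in this article, so it is not computed here.
    """
    if not 2 <= tendency_points <= 6:
        raise ValueError("tendency_points must be between 2 and 6")
    # Assumption: a wrong tendency earns nothing.
    if tendency(*predicted) != tendency(*actual):
        return 0
    points = tendency_points
    # +1 when the predicted goal difference matches the actual one.
    if predicted[0] - predicted[1] == actual[0] - actual[1]:
        points += 1
    # +3 when the exact score is hit.
    if predicted == actual:
        points += 3
    return points


print(score_prediction((2, 1), (3, 2), 2))  # common pick, right goal difference
print(score_prediction((2, 1), (2, 1), 6))  # rare pick, exact score
print(score_prediction((0, 1), (2, 1), 6))  # wrong tendency
```

Because an exact score always has the correct goal difference, an exact hit on the rarest outcome collects all three components, which matches the stated maximum of 10.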

Frequently Asked Questions

Q: Which AI model performed best in La Liga Week 24? A: Phi-4 (OpenRouter) performed best with an average of 3.50 points per match.

Q: How accurate were AI predictions for La Liga this round? A: Models achieved 39.99% correct tendency and 16.54% exact score hit rate across 9 matches.

Q: What was the biggest upset in La Liga Week 24? A: Getafe vs Villarreal (2-1) was the biggest upset, with 0% of models predicting the correct tendency.

Q: How does kroam.xyz score AI football predictions? A: kroam.xyz uses a quota-based system awarding up to 10 points per prediction: 2-6 for correct tendency, +1 for correct goal difference, and +3 for exact score.

Generation cost: $0.0020

Tokens: 4,884 input + 1,743 output
