AutoBench Agronomy LLM Benchmark - December 2025
The first AutoBench run for the Agronomy domain, with models including Gemini 3 Pro, GPT 5.1, Grok 4.1, and Opus 4.5.
Past
Date: December 10, 2025
Version: 2025-12-10
Models: 40
New Models: 17
Run data
| Model | Average (All Topics) | Coding | Creative Writing | Current News | General Culture | Grammar | History | Logics | Math | Science | Technology |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 7.70 (#40) | - | - | - | - | - | - | - | - | - | - | |
| 7.31 (#39) | - | - | - | - | - | - | - | - | - | - | |
| 5.43 (#38) | - | - | - | - | - | - | - | - | - | - | |
| 3.95 (#37) | - | - | - | - | - | - | - | - | - | - | |
| 3.88 (#36) | - | - | - | - | - | - | - | - | - | - | |
| 3.41 (#35) | - | - | - | - | - | - | - | - | - | - | |
| 2.08 (#34) | - | - | - | - | - | - | - | - | - | - | |
| 1.95 (#33) | - | - | - | - | - | - | - | - | - | - | |
| 0.81 (#32) | - | - | - | - | - | - | - | - | - | - | |
| 0.80 (#31) | - | - | - | - | - | - | - | - | - | - | |
| 0.67 (#30) | - | - | - | - | - | - | - | - | - | - | |
| 0.43 (#29) | - | - | - | - | - | - | - | - | - | - | |
| 0.40 (#28) | - | - | - | - | - | - | - | - | - | - | |
| 0.36 (#27) | - | - | - | - | - | - | - | - | - | - | |
| 0.34 (#26) | - | - | - | - | - | - | - | - | - | - | |
| 0.33 (#25) | - | - | - | - | - | - | - | - | - | - | |
| 0.30 (#24) | - | - | - | - | - | - | - | - | - | - | |
| 0.21 (#23) | - | - | - | - | - | - | - | - | - | - | |
| 0.21 (#22) | - | - | - | - | - | - | - | - | - | - | |
| 0.16 (#21) | - | - | - | - | - | - | - | - | - | - | |
| 0.16 (#20) | - | - | - | - | - | - | - | - | - | - | |
| 0.13 (#19) | - | - | - | - | - | - | - | - | - | - | |
| 0.11 (#18) | - | - | - | - | - | - | - | - | - | - | |
| 0.10 (#17) | - | - | - | - | - | - | - | - | - | - | |
| 0.10 (#16) | - | - | - | - | - | - | - | - | - | - | |
| 0.10 (#15) | - | - | - | - | - | - | - | - | - | - | |
| 0.08 (#14) | - | - | - | - | - | - | - | - | - | - | |
| 0.08 (#13) | - | - | - | - | - | - | - | - | - | - | |
| 0.07 (#12) | - | - | - | - | - | - | - | - | - | - | |
| 0.07 (#11) | - | - | - | - | - | - | - | - | - | - | |
| 0.07 (#10) | - | - | - | - | - | - | - | - | - | - | |
| 0.07 (#9) | - | - | - | - | - | - | - | - | - | - | |
| 0.05 (#8) | - | - | - | - | - | - | - | - | - | - | |
| 0.03 (#7) | - | - | - | - | - | - | - | - | - | - | |
| 0.03 (#6) | - | - | - | - | - | - | - | - | - | - | |
| 0.03 (#5) | - | - | - | - | - | - | - | - | - | - | |
| 0.02 (#4) | - | - | - | - | - | - | - | - | - | - | |
| 0.02 (#3) | - | - | - | - | - | - | - | - | - | - | |
| 0.02 (#2) | - | - | - | - | - | - | - | - | - | - | |
| 0.01 (#1) | - | - | - | - | - | - | - | - | - | - |