A Princeton University study finds that benchmarks for AI agents do not account for cost and are prone to overfitting.
https://venturebeat.com/ai/ai-agent-benchmarks-are-misleading-study-warns/?utm_source=dlvr.it&utm_medium=blogger