APIEval-20

An open benchmark for AI agents that test APIs

4.5 (100 reviews)

Free

Launched in 2026

About

APIEval-20 is a black-box benchmark for API testing agents. Each agent gets only a JSON schema and one sample payload, then generates a test suite. We run those tests against live reference APIs with planted bugs and score bug detection, API coverage, and efficiency. Unlike LLM-as-judge evals, scoring is fully objective: a bug is either caught or it isn’t. Tasks span auth, errors, pagination, schemas, and multi-step flows. Open on Hugging Face.
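The three scored dimensions can be illustrated with a minimal sketch. The listing does not publish the benchmark's formulas, so everything here is an assumption made for illustration: the `CaseResult` record, the `score_suite` function, the bug IDs, the endpoint names, and the definitions of detection (share of planted bugs caught), coverage (share of endpoints exercised), and efficiency (bugs caught per test run).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CaseResult:
    """Outcome of one generated test (hypothetical record shape)."""
    endpoint: str              # endpoint the test exercised
    bugs_triggered: frozenset  # IDs of planted bugs this test exposed


def score_suite(results, planted_bugs, endpoints):
    """Return illustrative detection, coverage, and efficiency figures.

    These definitions are assumptions, not APIEval-20's published scoring.
    """
    caught = set()
    touched = set()
    for result in results:
        # Count a bug once, however many tests expose it.
        caught |= result.bugs_triggered & planted_bugs
        if result.endpoint in endpoints:
            touched.add(result.endpoint)
    detection = len(caught) / len(planted_bugs) if planted_bugs else 0.0
    coverage = len(touched) / len(endpoints) if endpoints else 0.0
    efficiency = len(caught) / len(results) if results else 0.0
    return {"detection": detection, "coverage": coverage, "efficiency": efficiency}


# Hypothetical run: four tests, four planted bugs, three endpoints.
results = [
    CaseResult("/login", frozenset({"B1"})),
    CaseResult("/items", frozenset()),
    CaseResult("/items", frozenset({"B2"})),
    CaseResult("/login", frozenset({"B1"})),  # redundant: B1 already caught
]
scores = score_suite(
    results,
    planted_bugs=frozenset({"B1", "B2", "B3", "B4"}),
    endpoints=frozenset({"/login", "/items", "/orders"}),
)
```

Because each bug counts as either caught or not, a redundant test such as the second `/login` case adds nothing to detection and lowers efficiency in this sketch.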

Pros

+Objective evaluation of API testing agents
+Covers tasks such as authentication, errors, and pagination
+Available on Hugging Face for easy access

Cons

−Requires technical knowledge to implement
−May not be suitable for every type of API

You may also like

Voice Agent API

Agentic API Grader by SaaStr.ai

SEEV-AI