
Show HN: LLMadness – March Madness Model Evals

4.2 (10,000 reviews)
Freemium · from US$20.00/mo.
Launched in 2026

About

I wanted to play around with the non-coding agentic capabilities of the top LLMs, so I built a model eval predicting the March Madness bracket.

After playing around a bit with the format, I went with the following setup:

- 63 single-game predictions v. full one-shot bracket
- Maxed out at 10 tool calls per game
- Upset-specific instruction in the system prompt
- Exponential scoring by round (1, 2, 4, 8, 16, 32)

There were some interesting learnings:

- Unsurprisingly, most brackets a
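
The exponential scoring by round listed in the setup can be sketched roughly as follows. This is only an illustrative sketch, not the project's actual code: the names `ROUND_POINTS`, `score_bracket`, and `max_score` are hypothetical, and the sketch assumes a 64-team field with picks and results stored as one list of winners per round.

```python
# Points per correct pick by round (round of 64 through championship),
# taken from the setup described above: 1, 2, 4, 8, 16, 32.
ROUND_POINTS = [1, 2, 4, 8, 16, 32]


def score_bracket(picks, results):
    """Score a bracket (hypothetical helper).

    picks and results are lists with one entry per round; each entry is a
    list of winning team names in the same game order.
    """
    total = 0
    for points, round_picks, round_results in zip(ROUND_POINTS, picks, results):
        # Count the games in this round where the pick matches the winner.
        correct = sum(p == r for p, r in zip(round_picks, round_results))
        total += points * correct
    return total


def max_score(field_size=64):
    """Best possible score, assuming the number of games halves each round."""
    games = field_size // 2
    total = 0
    for points in ROUND_POINTS:
        total += points * games
        games //= 2
    return total
```

Under these assumptions every round is worth the same 32 points in total (32 games at 1 point, 16 at 2, and so on), so a perfect 63-game bracket scores 192 and late-round picks carry far more weight per game than early ones.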

Pros

  • Ability to make March Madness predictions
  • LLM evaluation model
  • Option to make single-game predictions or a one-shot bracket

Cons

  • Limits on the number of tool calls per game
  • Need for upset-specific instructions
