What gets measured gets machine learned. A brief history of LLM Benchmarking and the devil in their details
Thanks so much for sharing these! I'm loving them!
Loving your work, guys.
Do you know of anyone who is independently benchmarking and reporting on the relative performance of the available open and closed models at the moment?
Guys, thank you so much for both the AI fundamentals episode and the link to the notes repo.