LMArena's leads on pioneering LLM evals with Chatbot Arena and MT-Bench, adjusting for human bias with Style Control, and replacing static benchmarks with dynamic evaluations.