Language Performance Benchmark

Italian Benchmark Evaluates Large Language Models, Includes AI Translation

A new community-driven initiative evaluates large language models using Italian-native tasks, with AI translation among the ...

VentureBeat

MLPerf 3.1 adds large language model benchmarks for inference

MLCommons is growing its suite of MLPerf AI benchmarks with the addition ...

Hosted on MSN

LiveBench: A Dynamic Benchmark for Large Language Models

In an article recently submitted to the arXiv server, researchers introduced LiveBench, a benchmark designed to prevent test set contamination and biases from large language model (LLM) judging and ...
