Why AI benchmarks are misleading

Dec 06, 2021

This issue was sponsored by Edge Impulse, the world’s easiest platform for embedded ML.

For decades, researchers have used benchmarks to measure progress in areas of artificial intelligence such as vision and language. Over the past few years especially, as deep learning has surged in popularity, benchmarks have become a narrow focus for many research labs and scientists. But while benchmarks can help compare the performance of AI systems on specific problems, they are often taken out of context, sometimes with harmful results.

In a paper accepted at the NeurIPS 2021 conference, scientists at the University of California, Berkeley, the University of Washington, and Google outline the limits of popular AI benchmarks. The scientists warn that progress on benchmarks is often used to make claims of progress toward general areas of intelligence, a leap that goes far beyond the tasks these benchmarks were designed for.

“We do not deny the utility of such benchmarks, but rather hope to point to the risks inherent in their framing,” the researchers write.

Read the full article on TechTalks.


Build embedded ML models in minutes with Edge Impulse! Sign up for your free account in December and you'll be automatically entered to win one of 100 Arduino Machine Vision bundles.
