A critical analysis of metrics used for measuring progress in artificial intelligence
[article]
2021
arXiv
pre-print
Comparing model performance on benchmark datasets is an integral part of measuring and driving progress in artificial intelligence. A model's performance on a benchmark dataset is commonly assessed based on a single metric or a small set of performance metrics. While this enables quick comparisons, it risks inadequately reflecting model performance if the metric does not sufficiently cover all performance characteristics. It is unknown to what extent this might impact benchmarking
arXiv:2008.02577v2
fatcat:u6phhkiwnvclhaksgwqudbpkw4