STATISTICAL SIGNIFICANCE IN MULTILINGUAL INFORMATION RETRIEVAL (MLIR) SYSTEM

Sadanandam Manchala
2012 IOSR Journal of Engineering  
The effectiveness of a retrieval system is measured by comparing performance on a common set of queries in Information Retrieval (IR) and MLIR systems. Significance tests are often used to estimate the reliability of such comparisons. In this paper, we revisit the question of how such significance tests should be used. We find that the t-test is more reliable than the sign and Wilcoxon tests, and is far more reliable than simply showing a large percentage difference in effectiveness
... s between IR and MLIR systems. Our results show that previous experimental work on significance tests overestimated the error of such tests. We also reconsider comparisons between the reliability of Average Precision (AP), Mean Average Precision (MAP), Average Mean Reciprocal Rank (AMRR) and Average Discounted Cumulative Gain (ADCG), arguing that past comparisons did not consider the assessor effort required to compute such measures. This research shows that assessor effort would be better spent building test collections with more queries, each assessed in less detail.
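
The abstract compares the t-test, the sign test and the Wilcoxon test on per-query scores from two systems. The following is a minimal sketch of how the three paired statistics could be computed, not the author's implementation: the function names, the example scores and the choice of a two-sided exact sign test are all assumptions made for illustration, and only the Python standard library is used.

```python
import math
from statistics import mean, stdev


def paired_t_statistic(a, b):
    """Paired t statistic for per-query scores of two systems.

    Assumes at least two queries and a non-zero variance of the differences.
    """
    diffs = [x - y for x, y in zip(a, b)]
    return mean(diffs) / (stdev(diffs) / math.sqrt(len(diffs)))


def sign_test_p_value(a, b):
    """Two-sided exact sign test p-value; tied queries are dropped."""
    diffs = [x - y for x, y in zip(a, b) if x != y]
    n = len(diffs)
    wins = sum(1 for d in diffs if d > 0)
    smaller = min(wins, n - wins)
    # Probability of a split at least this uneven under a fair coin.
    tail = sum(math.comb(n, i) for i in range(smaller + 1)) / 2 ** n
    return min(1.0, 2 * tail)


def wilcoxon_signed_rank(a, b):
    """Return (W+, W-), the signed-rank sums; ties share an average rank."""
    diffs = sorted((x - y for x, y in zip(a, b) if x != y), key=abs)
    w_plus = w_minus = 0.0
    i = 0
    while i < len(diffs):
        j = i
        while j < len(diffs) and abs(diffs[j]) == abs(diffs[i]):
            j += 1
        avg_rank = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        for d in diffs[i:j]:
            if d > 0:
                w_plus += avg_rank
            else:
                w_minus += avg_rank
        i = j
    return w_plus, w_minus


if __name__ == "__main__":
    # Hypothetical per-query Average Precision scores, invented for this example.
    mlir_ap = [0.42, 0.35, 0.61, 0.28, 0.50, 0.47]
    ir_ap = [0.38, 0.36, 0.52, 0.20, 0.44, 0.41]
    print("t statistic:", paired_t_statistic(mlir_ap, ir_ap))
    print("sign test p:", sign_test_p_value(mlir_ap, ir_ap))
    print("Wilcoxon (W+, W-):", wilcoxon_signed_rank(mlir_ap, ir_ap))
```

The t statistic uses the size of every per-query difference, the Wilcoxon statistic uses only their ranks, and the sign test uses only their direction. Converting the t and Wilcoxon statistics to p-values requires the corresponding reference distributions, which this sketch leaves out.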
doi:10.9790/3021-0204794802 fatcat:furywk2wzzbdbhyyuqzzenq5hu