Feature Selection for Learning-to-Rank using Simulated Annealing
International Journal of Advanced Computer Science and Applications
Machine learning is now applied in almost every corner of society. Large amounts of empirical data, coupled with well-designed statistical techniques, make it a natural choice for almost all prediction tasks. Information retrieval is the discipline that deals with fetching useful information from a large number of documents. Given that millions, even billions, of digital documents are available today, it is no surprise that machine learning can be tailored to this task. Learning-to-rank has thus emerged as a well-studied domain in which the system retrieves the relevant documents from a document corpus with respect to a given query. To succeed at this retrieval task, machine learning models need a highly useful set of features. To this end, meta-heuristic optimization algorithms may be utilized. The aim of this work is to investigate the applicability of a notable meta-heuristic algorithm, simulated annealing, to selecting an effective subset of features from the feature pool. Specifically, we apply the simulated annealing algorithm to well-known learning-to-rank datasets to methodically select the best subset of features. Our empirical results show that the proposed framework achieves gains in accuracy while using a smaller subset of features, thereby reducing training time and increasing the effectiveness of learning-to-rank algorithms.
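To make the idea of annealing over feature subsets concrete, the following is a minimal sketch and not the paper's implementation. The abstract does not specify the neighbourhood move, the acceptance rule, or the cooling schedule, so the sketch assumes a single-bit flip, the Metropolis acceptance criterion, and geometric cooling. The names `anneal_feature_subset`, `score_fn`, `toy_score`, and `USEFUL` are hypothetical. In a learning-to-rank setting, `score_fn` would presumably train a ranker on the selected features and return a validation metric; here a toy function stands in for it.

```python
import math
import random


def anneal_feature_subset(n_features, score_fn, n_iters=2000,
                          t0=1.0, cooling=0.995, seed=0):
    """Search for a boolean feature mask that maximizes score_fn(mask).

    Assumed design (not taken from the paper): single-bit-flip neighbours,
    Metropolis acceptance, geometric cooling.
    """
    rng = random.Random(seed)

    # Start from a random, non-empty subset of features.
    current = [rng.random() < 0.5 for _ in range(n_features)]
    if not any(current):
        current[rng.randrange(n_features)] = True
    current_score = score_fn(current)
    best, best_score = current[:], current_score

    temp = t0
    for _ in range(n_iters):
        # Neighbour: add or drop one randomly chosen feature.
        candidate = current[:]
        i = rng.randrange(n_features)
        candidate[i] = not candidate[i]

        if any(candidate):  # never evaluate an empty feature set
            delta = score_fn(candidate) - current_score
            # Always accept improvements; accept a worse subset with
            # probability exp(delta / temp), which shrinks as temp cools.
            if delta >= 0 or rng.random() < math.exp(delta / temp):
                current, current_score = candidate, current_score + delta
                if current_score > best_score:
                    best, best_score = current[:], current_score

        temp *= cooling  # geometric cooling schedule

    return best, best_score


# Toy stand-in for a ranking metric: features 0, 3 and 5 are "useful",
# and every selected feature carries a small cost.
USEFUL = {0, 3, 5}


def toy_score(mask):
    gain = sum(1.0 for i in USEFUL if mask[i])
    cost = 0.1 * sum(mask)
    return gain - cost


best_mask, best_score = anneal_feature_subset(10, toy_score)
selected = [i for i, keep in enumerate(best_mask) if keep]
print(selected, round(best_score, 2))
```

The per-feature cost in `toy_score` is one simple way to express a preference for smaller subsets; with a real ranker, the validation metric alone may already penalize uninformative features.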