The dimensions of indexing

W John Wilbur, Won Kim
2003 AMIA Annual Symposium Proceedings  
Indexing of documents is an important strategy intended to make the literature more readily available to the user. Here we describe several dimensions of indexing that are important if indexing is to be optimal. These dimensions are coverage, predictability, and transparency. MeSH terms and text words are compared in MEDLINE in regard to these dimensions. Part of our analysis consists of applying AdaBoost with decision trees as the weak learners to estimate how reliably index terms are being assigned and how complex the criteria are by which they are being assigned. Our conclusions are that MeSH terms are more predictable and more transparent than text words.
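As a rough illustration of the boosting step mentioned in the abstract, the following is a minimal sketch of AdaBoost over depth-one decision trees (stumps). It is not the authors' implementation: the paper's tree depth, features, and data are not given here, so the binary "word present" features, the toy labels, and all function names below are assumptions made for illustration only.

```python
import math

def stump_predict(x, feature, polarity):
    # A depth-one tree: predict `polarity` if the feature is present, else the opposite.
    return polarity if x[feature] else -polarity

def train_stump(X, y, w):
    # Return (weighted_error, feature, polarity) for the stump with the lowest weighted error.
    best = None
    for feature in range(len(X[0])):
        for polarity in (1, -1):
            err = sum(wi for xi, yi, wi in zip(X, y, w)
                      if stump_predict(xi, feature, polarity) != yi)
            if best is None or err < best[0]:
                best = (err, feature, polarity)
    return best

def adaboost(X, y, rounds):
    # Labels must be +1 / -1. Returns a list of (alpha, feature, polarity) triples.
    n = len(X)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        err, feature, polarity = train_stump(X, y, w)
        err = min(max(err, 1e-10), 1 - 1e-10)  # guard against log(0)
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, feature, polarity))
        # Increase the weight of misclassified examples, decrease the rest, renormalize.
        w = [wi * math.exp(-alpha * yi * stump_predict(xi, feature, polarity))
             for xi, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def predict(ensemble, x):
    score = sum(alpha * stump_predict(x, f, p) for alpha, f, p in ensemble)
    return 1 if score >= 0 else -1

# Hypothetical toy data: each row marks the presence of three words in a document;
# the index term is "assigned" (+1) when at least two of the three words occur.
X = [(a, b, c) for a in (0, 1) for b in (0, 1) for c in (0, 1)]
y = [1 if sum(x) >= 2 else -1 for x in X]
model = adaboost(X, y, rounds=3)
accuracy = sum(predict(model, x) == label for x, label in zip(X, y)) / len(X)
print(accuracy)
```

In this framing, the accuracy a boosted model reaches gives a sense of how predictably a term is assigned, and the number and size of the trees it needs hints at how complex the assignment criteria are.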
pmid:14728266 pmcid:PMC1480214 fatcat:qjqs25tanfegzlsrhqv3dkch5e