An evaluation of statistical approaches to MEDLINE indexing

Y Yang
1996 Proceedings : a conference of the American Medical Informatics Association. AMIA Fall Symposium  
Whether or not high-accuracy classification methods can be scaled to large applications is crucial for the ultimate usefulness of such methods in text categorization. This paper applies two statistical learning algorithms, the Linear Least Squares Fit (LLSF) mapping and a Nearest Neighbor classifier named ExpNet, to a large collection of MEDLINE documents. With the use of suitable dimensionality reduction techniques and efficient algorithms, both LLSF and ExpNet successfully scaled to this very large problem with a result significantly outperforming word-matching and other automatic learning methods applied to the same corpus.
pmid:8947688 pmcid:PMC2233015 fatcat:6ol3ci44mbfwhcvua42woighpy