Learning Domain Terms - Empirical Methods to Enhance Enterprise Text Analytics Performance
2020
Proceedings of the 28th International Conference on Computational Linguistics: Industry Track
unpublished
The performance of standard text analytics algorithms is known to degrade substantially on consumer-generated data, which are often very noisy. These algorithms also do not work well on enterprise data, which differ markedly in nature from news repositories, storybooks, or Wikipedia data. Text cleaning is a mandatory step that aims at noise removal and correction to improve performance. However, enterprise data need special cleaning methods since they contain many domain terms which appear to
doi:10.18653/v1/2020.coling-industry.18
fatcat:afs4j7lflzahtl7ebzjhevi4rq