Characteristics of document similarity measures for compliance analysis

Asad Sayeed, Soumitra Sarkar, Yu Deng, Rafah Hosn, Ruchi Mahindru, Nithya Rajamani
2009 Proceeding of the 18th ACM conference on Information and knowledge management - CIKM '09  
Due to increased competition in the IT Services business, improving quality, reducing costs and shortening schedules have become extremely important. A key strategy being adopted for achieving these goals is the use of an asset-based approach to service delivery, where standard reusable components developed by domain experts are minimally modified for each customer instead of creating custom solutions. One example of this approach is the use of contract templates, one for each type of service offered. A compliance checking system that measures how well actual contracts adhere to standard templates is critical for ensuring the success of such an approach. This paper describes the use of document similarity measures (cosine similarity and Latent Semantic Indexing) to identify the top candidate templates on which a more detailed (and expensive) compliance analysis can be performed. A comparison of the results obtained with the different methods is presented.
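The template-ranking step mentioned in the abstract can be illustrated with a minimal sketch of cosine similarity over term-count vectors. This is not the authors' implementation: the tokenizer, the use of raw term counts (rather than a weighting such as TF-IDF), the function names, and the sample template texts are all assumptions made for illustration, and Latent Semantic Indexing is not shown.

```python
import math
import re
from collections import Counter


def term_counts(text):
    """Lowercase the text and count its alphabetic tokens (assumed tokenizer)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document is treated as similar to nothing
    return dot / (norm_a * norm_b)


def rank_templates(contract, templates, top_k=2):
    """Return the top_k (name, score) pairs, most similar template first."""
    contract_vec = term_counts(contract)
    scored = [
        (name, cosine_similarity(contract_vec, term_counts(text)))
        for name, text in templates.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical templates and contract text, invented for this example.
templates = {
    "hosting": "provider shall host customer servers and monitor uptime",
    "helpdesk": "provider shall staff a help desk and resolve user tickets",
    "network": "provider shall manage routers switches and network links",
}
contract = "the provider shall host the customer servers and report uptime monthly"
ranking = rank_templates(contract, templates)
print(ranking)
```

Only the templates returned by `rank_templates` would then be passed on to the more detailed compliance analysis, which is what keeps the expensive step limited to a few candidates.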
doi:10.1145/1645953.1646106 dblp:conf/cikm/SayeedSDHMR09 fatcat:4udjsossize5jiyv55sv4aan74