Automatic processing of large corpora for the resolution of anaphora references

Ido Dagan, Alon Itai
1990 Proceedings of the 13th conference on Computational linguistics - unpublished
Manual acquisition of semantic constraints in broad domains is very expensive. This paper presents an automatic scheme for collecting statistics on cooccurrence patterns in a large corpus. To a large extent, these statistics reflect semantic constraints and thus are used to disambiguate anaphora references and syntactic ambiguities. The scheme was implemented by gathering statistics on the output of other linguistic tools. An experiment was performed to resolve references of the pronoun "it" in sentences that were randomly selected from the corpus. The results of the experiment show that in most of the cases the cooccurrence statistics indeed reflect the semantic constraints and thus provide a basis for a useful disambiguation tool.
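The general idea of using cooccurrence counts to pick an antecedent can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the (verb, relation, noun) triple format, the tie-handling rule, and the toy counts below are all assumptions made for illustration. The sketch supposes that some parser has already extracted the triples from a corpus, and it prefers the candidate noun that was observed most often in the syntactic slot occupied by the pronoun.

```python
from collections import Counter


def count_patterns(triples):
    """Count how often each (verb, relation, noun) pattern occurs.

    `triples` is an iterable of hypothetical parser output, e.g.
    ("collect", "object", "money").
    """
    return Counter(triples)


def rank_candidates(counts, verb, relation, candidates):
    """Return (count, noun) pairs, most frequent pattern first."""
    scored = [(counts[(verb, relation, noun)], noun) for noun in candidates]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


def resolve_pronoun(counts, verb, relation, candidates, min_count=1):
    """Pick the candidate antecedent best supported by the counts.

    Returns None when there are no candidates, when the best candidate
    was seen fewer than `min_count` times, or when the top two
    candidates tie (an assumed rule: the statistics then do not decide).
    """
    ranked = rank_candidates(counts, verb, relation, candidates)
    if not ranked:
        return None
    best_count, best_noun = ranked[0]
    if best_count < min_count:
        return None
    if len(ranked) > 1 and ranked[1][0] == best_count:
        return None
    return best_noun


# Toy data, invented for illustration only (not taken from the paper).
toy_triples = (
    [("collect", "object", "money")] * 3
    + [("collect", "object", "tax")] * 2
    + [("collect", "subject", "government")] * 4
)
counts = count_patterns(toy_triples)

# "it" appears as the object of "collect"; three nouns are candidate antecedents.
antecedent = resolve_pronoun(
    counts, "collect", "object", ["money", "collection", "government"]
)
```

With these toy counts, "money" is the only candidate observed as an object of "collect", so it is selected. When the counts do not separate the candidates, the sketch returns None rather than guessing.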
doi:10.3115/991146.991209 fatcat:q3x4xqsdyvaqvjiqhaeptixdfa