The DOP Estimation Method Is Biased and Inconsistent
2002
Computational Linguistics
A "Data-Oriented Parsing" or DOP model for statistical parsing associates fragments of linguistic representations with numerical weights, where these weights are estimated by normalizing the empirical frequency of each fragment in a training corpus (see Bod (1998) and references cited therein). This note observes that this estimation method is biased and inconsistent; i.e., that the estimated distribution does not in general converge on the true distribution as the size of the training corpus increases.
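The estimation step the abstract describes, normalizing empirical fragment frequencies, can be illustrated with a minimal sketch. It is not code from the paper. It assumes that each fragment occurrence is given as a (root label, fragment string) pair and that counts are normalized among fragments sharing the same root label. The function name `estimate_fragment_weights` and the toy fragments are hypothetical.

```python
from collections import Counter, defaultdict


def estimate_fragment_weights(fragment_occurrences):
    """Relative-frequency weights for tree fragments.

    fragment_occurrences: iterable of (root_label, fragment) pairs, one pair
    per fragment occurrence extracted from a training corpus.
    Assumption: counts are normalized among fragments with the same root label.
    """
    counts = Counter(fragment_occurrences)

    # Total number of fragment occurrences for each root label.
    root_totals = defaultdict(int)
    for (root, _fragment), n in counts.items():
        root_totals[root] += n

    # Weight = fragment count divided by the total for its root label.
    return {key: n / root_totals[key[0]] for key, n in counts.items()}


# Hypothetical toy data: four fragment occurrences, three distinct fragments.
toy_occurrences = [
    ("S", "(S NP VP)"),
    ("S", "(S NP VP)"),
    ("S", "(S (NP she) VP)"),
    ("NP", "(NP she)"),
]
weights = estimate_fragment_weights(toy_occurrences)
for (root, fragment), w in sorted(weights.items()):
    print(f"{root:3s} {fragment:18s} {w:.3f}")
```

The sketch shows only how the weights are computed. It does not demonstrate the bias or inconsistency that the note establishes, which depends on how fragment weights combine into a distribution over whole trees.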
doi:10.1162/089120102317341783
fatcat:hsx5shojtrevnn2fqsjm2qwqyy