A copy of this work was available on the public web and has been preserved in the Wayback Machine. The capture dates from 2018; you can also visit the original URL.
This paper investigates semi-supervised methods for discriminative language modeling, whereby n-best lists are "hallucinated" for given reference text and are then used for training n-gram language models with the perceptron algorithm. We perform controlled experiments on a very strong baseline English CTS system, comparing three methods for simulating ASR output, and compare the results with training on "real" n-best list output from the baseline recognizer. We find that methods based on

doi:10.1109/icassp.2012.6289043 dblp:conf/icassp/SagaeLPXGKKRSSBCCHHKLPR12 fatcat:emi5pcwldrgdrlvi26wtyh6aqy
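The abstract describes training n-gram language models over n-best lists with the perceptron algorithm. A minimal sketch of that idea, perceptron reranking of an n-best list with unigram and bigram count features, is shown below. This is an illustration under stated assumptions, not the paper's implementation; the function names, feature set, and toy data are all invented for the example.

```python
from collections import Counter

def ngram_features(words, n=2):
    """Count unigram and bigram features of a hypothesis (a list of words).
    Unigram keys are strings; bigram keys are 2-tuples of strings."""
    feats = Counter(words)  # unigram counts
    for i in range(len(words) - n + 1):
        feats[tuple(words[i:i + n])] += 1  # bigram counts
    return feats

def score(weights, feats):
    """Dot product of a sparse weight vector with a sparse feature vector."""
    return sum(weights.get(f, 0.0) * v for f, v in feats.items())

def perceptron_train(nbest_lists, references, epochs=5):
    """Structured perceptron over n-best lists (illustrative sketch).

    nbest_lists: list of n-best lists, each a list of hypotheses
                 (each hypothesis a list of words).
    references:  the target (reference) hypothesis for each n-best list.
    """
    weights = {}
    for _ in range(epochs):
        for hyps, ref in zip(nbest_lists, references):
            # Pick the hypothesis the current model prefers.
            best = max(hyps, key=lambda h: score(weights, ngram_features(h)))
            if best != ref:
                # Standard perceptron update: promote the reference's
                # features, demote the wrongly chosen hypothesis's features.
                for f, v in ngram_features(ref).items():
                    weights[f] = weights.get(f, 0.0) + v
                for f, v in ngram_features(best).items():
                    weights[f] = weights.get(f, 0.0) - v
    return weights
```

In the semi-supervised setting the paper studies, the n-best lists fed to such a trainer would be simulated ("hallucinated") from reference text rather than produced by a real recognizer.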