Spelling Correction with Denoising Transformer
[article]
2021
arXiv
pre-print
This procedure is used to train the production spelling correction model based on a transformer architecture. This model is currently served in the HubSpot product search. ...
We present a novel method of performing spelling correction on short input strings, such as search queries or individual words. ...
Conclusion: We presented a novel method for spelling correction: a denoising autoencoder transformer based on a noise generation procedure which generates artificial spelling mistakes in a realistic manner ...
arXiv:2105.05977v1
fatcat:utgqze55lncojepfrfykofbnse
Combining a Context Aware Neural Network with a Denoising Autoencoder for Measuring String Similarities
[article]
2018
arXiv
pre-print
The experimental results show that the resulting metric succeeds in 85.4% of the cases in finding the correct version of a non-standard spelling among the closest words, compared to 63.2% with the established ...
Non-Standard and standard spellings of the same words, and (2) the context of the words. ...
The next example in Table 2 presents a non-standard spelling for which the approach with the denoising autoencoder fails to recognize the correct version among the five closest words: The correct version of ...
arXiv:1807.06414v1
fatcat:2tcv2b56qvgkjb5dwa2pt5f6ka
Contextual Text Denoising with Masked Language Models
[article]
2019
arXiv
pre-print
We propose a new contextual text denoising algorithm based on the ready-to-use masked language model. ...
Recently, with the help of deep learning models, significant advances have been made in different Natural Language Processing (NLP) tasks. ...
9 https://github.com/pytorch/fairseq/tree/master/exam such as CoNLL-2014, to further fine-tune the denoising model in a supervised way to improve the performance. ...
arXiv:1910.14080v1
fatcat:z6cqbph3jjdgzivxumiowwchsi
An Improved Text Extraction Approach with Auto Encoder for Creating Your Own Audiobook
2022
International Journal of Information Retrieval Research
Our result analysis demonstrates that with denoising and spell checking, our model has achieved an accuracy of 98.11% when compared to 84.02% without any denoising or spell check mechanism. ...
As an initial step, deep learning techniques are constructed to denoise the images that are fed to the system. This is followed by text extraction with the help of OCR engines. ...
From the bar plot, we can infer that the post-processing method, i.e. denoising with spell check, gives a significant accuracy of about 98.6% compared to 95% with only denoising and no spell check. ...
doi:10.4018/ijirr.289570
fatcat:zjmtlsoxzveuxfn5cw2dzv6wka
Context-aware Stand-alone Neural Spelling Correction
[article]
2020
arXiv
pre-print
Inspired by this, we address the stand-alone spelling correction problem, which only corrects the spelling of each token without additional token insertion or deletion, by utilizing both spelling information ...
On the contrary, humans can easily infer the corresponding correct words from their misspellings and surrounding context. ...
Having a surprisingly robust language processing system to denoise the scrambled spellings, humans can relatively easily solve spelling correction (Rawlinson, 1976). ...
arXiv:2011.06642v1
fatcat:bp5qmobl65fjhns6p5zzohlncm
Improving Grammatical Error Correction via Pre-Training a Copy-Augmented Architecture with Unlabeled Data
[article]
2019
arXiv
pre-print
We pre-train the copy-augmented architecture with a denoising auto-encoder using the unlabeled One Billion Benchmark and make comparisons between the fully pre-trained model and a partially pre-trained ...
Neural machine translation systems have become state-of-the-art approaches for Grammatical Error Correction (GEC) task. ...
We build a statistical-based spell error correction system and correct the spell errors in our training data. ...
arXiv:1903.00138v3
fatcat:z3tyvcqg5ndjbehxuuq5aa2hhy
Improving Grammatical Error Correction via Pre-Training a Copy-Augmented Architecture with Unlabeled Data
2019
Proceedings of the 2019 Conference of the North
We pre-train the copy-augmented architecture with a denoising auto-encoder using the unlabeled One Billion Benchmark and make comparisons between the fully pre-trained model and a partially pretrained ...
Neural machine translation systems have become state-of-the-art approaches for Grammatical Error Correction (GEC) task. ...
We build a statistical-based spell error correction system and correct the spell errors in our training data. ...
doi:10.18653/v1/n19-1014
dblp:conf/naacl/ZhaoWSJL19
fatcat:gnstwmpncfemhbhi4gqpynd2ni
Stacked DeBERT: All Attention in Incomplete Data for Text Classification
[article]
2020
arXiv
pre-print
In this paper, we propose Stacked DeBERT, short for Stacked Denoising Bidirectional Encoder Representations from Transformers. ...
These intermediate features are given as input to novel denoising transformers which are responsible for obtaining richer input representations. ...
Our approach consists of obtaining richer input representations from input tokens by stacking denoising transformers on an embedding layer with vanilla transformers. ...
arXiv:2001.00137v1
fatcat:7malbdga7jd5bj6pzej6naos5y
Improving Robustness of Neural Machine Translation with Multi-task Learning
2019
Proceedings of the Fourth Conference on Machine Translation (Volume 2: Shared Task Papers, Day 1)
Our model achieves a BLEU score of 32.8 on the shared task French to English dataset, which is 7.1 BLEU points higher than the baseline vanilla transformer trained with clean text. ...
In this work, we propose a multitask learning algorithm for transformer-based MT systems that is more resilient to this noise. ...
Denoising text: Sakaguchi et al. (2017) proposes semi-character level recurrent neural network (scRNN) to correct words with scrambling characters. ...
doi:10.18653/v1/w19-5368
dblp:conf/wmt/ZhouZZAN19
fatcat:7uwf4rvgzrcatcx2qczpqs5eq4
Read, Listen, and See: Leveraging Multimodal Information Helps Chinese Spell Checking
[article]
2021
arXiv
pre-print
Chinese Spell Checking (CSC) aims to detect and correct erroneous characters for user-generated text in the Chinese language. ...
However, these methods use either heuristics or handcrafted confusion sets to predict the correct character. ...
., 2019) with the Transformer library (Wolf et al., 2020). ...
arXiv:2105.12306v1
fatcat:hzjipz5y4va5dpx7tqve7urzu4
Pre-Training-Based Grammatical Error Correction Model for the Written Language of Chinese Hearing Impaired Students
2022
IEEE Access
Via the re-ranking strategy, our model can correct various kinds of errors including spelling and complex syntax errors. ...
The comparison experiments with baseline models show that our model obtains superior performance either in the hearing impaired students' grammatical error correction or in a common grammatical error correction ...
model + Spelling correction S1 denotes the N-gram language model for correcting the spelling errors in section 3.1. ...
doi:10.1109/access.2022.3159676
fatcat:fhez37kyovbmlnr3rjnmxkayhe
Denoising of Disturbed Signal using Reconstruction Technique of EMD for Railway Bearing Condition Monitoring
2020
Zenodo
This study presents an effective method for denoising noisy signals for bearing condition monitoring. ...
The Hilbert-Huang transform (HHT) spectrum of the reconstructed signal was generated by applying the Hilbert transform. ...
HHT with the denoised signal using the reconstruction technique works more efficiently than HHT without the denoised signal for bearing condition monitoring. Fig. 1: IMF component ...
doi:10.5281/zenodo.4418876
fatcat:2k4rz5cnezg7vkxfkwr7i2ztsy
Denoising of Disturbed Signal using Reconstruction Technique of EMD for Railway Bearing Condition Monitoring
2020
Zenodo
This study presents an effective method for denoising noisy signals for bearing condition monitoring. ...
The Hilbert-Huang transform (HHT) spectrum of the reconstructed signal was generated by applying the Hilbert transform. ...
HHT with the denoised signal using the reconstruction technique works more efficiently than HHT without the denoised signal for bearing condition monitoring. Fig. 1: IMF component ...
doi:10.5281/zenodo.4418872
fatcat:ufzbzzsqlzcxvhqz4no6qrdckq
Denoising of Disturbed Signal using Reconstruction Technique of EMD for Railway Bearing Condition Monitoring
2020
Zenodo
This study presents an effective method for denoising noisy signals for bearing condition monitoring. ...
The Hilbert-Huang transform (HHT) spectrum of the reconstructed signal was generated by applying the Hilbert transform. ...
HHT with the denoised signal using the reconstruction technique works more efficiently than HHT without the denoised signal for bearing condition monitoring. Fig. 1: IMF component ...
doi:10.5281/zenodo.4418875
fatcat:fsbm6mapc5ewzaqhkcx7ctbkoe
Denoising based Sequence-to-Sequence Pre-training for Text Generation
2019
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)
We conduct experiments on two text generation tasks: abstractive summarization, and grammatical error correction. ...
Meanwhile, we design a hybrid model of Transformer and pointer-generator networks as the backbone architecture for PoDA. ...
Simple spelling errors are corrected based on edit distance. The dataset statistics are shown in Table 5. ... and GLEU score from 56.52 to 59.02 (+2.50) for JFLEG. ...
doi:10.18653/v1/d19-1412
dblp:conf/emnlp/WangZJLL19
fatcat:b6w57svbkjhrziz5it6wo472ua