A copy of this work was available on the public web and has been preserved in the Wayback Machine. The capture dates from 2019; you can also visit the original URL.
The file type is application/pdf
.
Filters
Neural Chinese Word Segmentation with Dictionary Knowledge
[article]
2018
arXiv
pre-print
Chinese word segmentation (CWS) is an important task for Chinese NLP. Recently, many neural network based methods have been proposed for CWS. ...
The experimental results on two benchmark datasets validate that our approach can effectively improve the performance of Chinese word segmentation, especially when training data is insufficient. ...
Luckily, many of these rare words are included in Chinese dictionary. If the neural model is aware of that "人工智能" is a Chinese word, then it can better segment the aforementioned sentence. ...
arXiv:1807.05849v1
fatcat:2urhgejndjdzbauc2repbrjxrm
Neural Word Segmentation Learning for Chinese
[article]
2016
arXiv
pre-print
Most previous approaches to Chinese word segmentation formalize this problem as a character-based sequence labeling task where only contextual information within fixed sized local windows and simple interactions ...
In this paper, we propose a novel neural framework which thoroughly eliminates context windows and can utilize complete segmentation history. ...
Chinese
idiom dictionaries. 3
Table 5 : 5 Comparison with previous neural network models. ...
arXiv:1606.04300v2
fatcat:g6d5dnzul5c7fmzssdccbmkss4
Neural Word Segmentation Learning for Chinese
2016
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Most previous approaches to Chinese word segmentation formalize this problem as a character-based sequence labeling task so that only contextual information within fixed sized local windows and simple ...
In this paper, we propose a novel neural framework which thoroughly eliminates context windows and can utilize complete segmentation history. ...
Chinese
idiom dictionaries. 3
Table 5 : 5 Comparison with previous neural network models. ...
doi:10.18653/v1/p16-1039
dblp:conf/acl/CaiZ16
fatcat:dinvkgncqbge3fhrh6eft35nnu
Segmenting Chinese Microtext: Joint Informal-Word Detection and Segmentation with Neural Networks
2017
Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence
word detection can be helpful for microtext processing.In this work, we investigate it under the neural setting, by proposing a joint segmentation model that integrates the detection of informal words ...
State-of-the-art Chinese word segmentation systems typically exploit supervised modelstrained on a standard manually-annotated corpus,achieving performances over 95% on a similar standard testing corpus.However ...
We combine the advantages of both work, aiming to enhance the segmentation of Chinese microtext. On the one hand, we construct training examples automatically with the help of an external dictionary. ...
doi:10.24963/ijcai.2017/591
dblp:conf/ijcai/ZhangFY17
fatcat:ylgslg5q6fbhvnnktutkd7hkum
Text Window Denoising Autoencoder: Building Deep Architecture for Chinese Word Segmentation
[chapter]
2013
Communications in Computer and Information Science
On the PKU dataset of Chinese word segmentation bakeoff 2005, applying this method decreases the F1 error rate by 11.9% for deep neural network based models. ...
We are the first to apply deep learning methods to Chinese word segmentation to our best knowledge. ...
We demonstrated that deep neural networks for Chinese word segmentation can be effectively trained with this model. ...
doi:10.1007/978-3-642-41644-6_1
fatcat:e47ermthx5gkxmv3ilzzs2vf2a
Stochastic Tokenization with a Language Model for Neural Text Classification
2019
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics
Sentences are usually segmented with words or subwords by a morphological analyzer or byte pair encoding and then encoded with word (or subword) representations for neural networks. ...
For unsegmented languages such as Japanese and Chinese, tokenization of a sentence has a significant impact on the performance of text classification. ...
(Cai et al., 2017) proposed a similar architecture to the caching mechanism for neural Chinese word segmentation. ...
doi:10.18653/v1/p19-1158
dblp:conf/acl/HiraokaSM19
fatcat:i5afsmop2nck7fgurnj7faq4qm
Generating Abbreviations for Chinese Named Entities Using Recurrent Neural Network with Dynamic Dictionary
2016
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing
It combines recurrent neural network (RNN) with an architecture determining whether a given sequence of characters can be a word or not. ...
To address this problem, we propose a novel neural network architecture to perform task. ...
Hence, most of the Chinese natural language processing methods assume a Chinese word segmenter is used in a pre-processing step to produce word-segmented Chinese sentences as input. ...
doi:10.18653/v1/d16-1069
dblp:conf/emnlp/ZhangQGZH16
fatcat:n2htt6dtyjh5fcrhfv6genl3re
Intelligent diagnosis with Chinese electronic medical records based on convolutional neural networks
2019
BMC Bioinformatics
Therefore, effective word segmentation, word representation and model architecture are the core technologies in the literature on Chinese EMRs. ...
Chinese language compared with English. ...
Furthermore, there were not
Discussion
Impact of the Chinese medical dictionary on word segmentation With the dictionary-based word segmentation method incorporating our pediatric medical dictionary ...
doi:10.1186/s12859-019-2617-8
fatcat:vsddtl6yhrhrhh34u33kqsndra
Bidirectional Gated Recurrent Unit Neural Network for Chinese Address Element Segmentation
2020
ISPRS International Journal of Geo-Information
This method uses the Bi-GRU neural network to generate tag features based on Chinese word segmentation and then uses the Viterbi algorithm to perform tag inference to achieve the segmentation of Chinese ...
Coupled with the diversity and complexity in Chinese address expressions, the segmentation of Chinese address elements is a substantial challenge. ...
Currently, the most commonly used Chinese word segmentation methods are as follows: (1) The dictionary-based string-matching method matches the strings to be segmented with a dictionary library one by ...
doi:10.3390/ijgi9110635
fatcat:7da2vdilnrcufmked3nazktnyu
Shrinking
2019
Proceedings of the 2019 Conference of the North
For languages without natural word boundaries, like Japanese and Chinese, word segmentation is a prerequisite for downstream analysis. ...
Morphological analyzers are trained on data hand-annotated with segmentation boundaries and part of speech tags. ...
A neural model with only the unigram character input can solve word segmentation and POS tagging only if it builds some knowledge about the dictionary internally. ...
doi:10.18653/v1/n19-1281
dblp:conf/naacl/TolmachevKK19
fatcat:pv2kqfv3yfaj7kayx7hfrl3thy
A New Chinese Word Segmentation Method Based on Maximum Matching
2018
Journal of Information Hiding and Multimedia Signal Processing
However, Chinese unique composition determines the Chinese is far more complicated than English. So in this paper, we propose a new Chinese word segmentation method based on maximum matching. ...
Automatic Chinese word segmentation is a hot issue in information extraction, machine translation, information retrieval, automatic text categorization, speech recognition, and the voice conversion, natural ...
Chinese word segmentation, i.e., a Chinese character sequence is segmented into words according to certain rules. ...
dblp:journals/jihmsp/ZhaoLYS18
fatcat:q5engxoupjd35fk4jnufqui35y
Vietnamese Word Segmentation
2001
Natural Language Processing Pacific Rim Symposium
We evaluate the performance by comparing its word segmentation results with the manually annotated corpus and its performance proves to be very good. ...
This word segmentation system is applied to Text-to-speech of Vietnamese and POS-tagger of Vietnamese. ...
We apply WFST model for Chinese Word segmentation into our task as follows (Richard Sproat, 1996) : We represent the dictionary D as a Weighted Finite State Transducer. ...
dblp:conf/nlprs/DienKT01
fatcat:ghy3j4jwyvftply4ckwh7n6kqu
Morphological Analysis for Unsegmented Languages using Recurrent Neural Network Language Model
2015
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing
We present a new morphological analysis model that considers semantic plausibility of word sequences by using a recurrent neural network language model (RNNLM). ...
In unsegmented languages, since language models are learned from automatically segmented texts and inevitably contain errors, it is not apparent that conventional language models contribute to morphological ...
Neural network based models have been proposed for Chinese word segmentation and POS tagging (Pei et al., 2014) or word segmentation (Mansur et al., 2013) . ...
doi:10.18653/v1/d15-1276
dblp:conf/emnlp/MoritaKK15
fatcat:zyshrvbbkbfdbkbkbr3ad4swuu
Chinese News Text Classification Based on Convolutional Neural Network
2022
Journal on Big Data
This paper introduces a combinedconvolutional neural network text classification model based on word2vec and improved TF-IDF: firstly, the word vector is trained through word2vec model, then the weight ...
With the explosive growth of Internet text information, the task of text classification is more important. ...
Chinese Word Segmentation The common means of text word segmentation are: using word segmentation tools to segment the text directly, using existing dictionaries to segment words, and establishing word ...
doi:10.32604/jbd.2022.027717
fatcat:lovtogkvgrcynnlrnbg7euonde
Pre-screening Textual Based Evaluation for the Diagnosed Female Breast Cancer (WBC)
2019
Revue d'intelligence artificielle : Revue des Sciences et Technologies de l'Information
Keywords: virtual assistance, sequence to sequence neural network, bigram and trigram Neural network for word segmentation (WS) CWS is usually refer as Chinese-based labelling. For each ...
The integrated models are critical to text-based Chinese word segmentation (CWS). The sequence-to-sequence learning was introduced to covert the CWS into a framework of sequence classification. ...
Recently, a different number of neural network approaches models have been proposed for Chinese word segmentation (CWS). ...
doi:10.18280/ria.330401
fatcat:ada3hfbnd5a5zmhazyepc7coxq
« Previous
Showing results 1 — 15 out of 4,629 results