Neural Chinese Word Segmentation with Dictionary Knowledge [article]

Junxin Liu, Fangzhao Wu, Chuhan Wu, Yongfeng Huang, Xing Xie
2018 arXiv pre-print
Chinese word segmentation (CWS) is an important task for Chinese NLP. Recently, many neural network based methods have been proposed for CWS.  ...  The experimental results on two benchmark datasets validate that our approach can effectively improve the performance of Chinese word segmentation, especially when training data is insufficient.  ...  Luckily, many of these rare words are included in Chinese dictionary. If the neural model is aware of that "人工智能" is a Chinese word, then it can better segment the aforementioned sentence.  ... 
arXiv:1807.05849v1 fatcat:2urhgejndjdzbauc2repbrjxrm

Intelligent diagnosis with Chinese electronic medical records based on convolutional neural networks

Xiaozheng Li, Huazhen Wang, Huixin He, Jixiang Du, Jian Chen, Jinzhun Wu
2019 BMC Bioinformatics  
Therefore, effective word segmentation, word representation and model architecture are the core technologies in the literature on Chinese EMRs.  ...  Chinese language compared with English.  ...  Furthermore, there were not  ...  Discussion: Impact of the Chinese medical dictionary on word segmentation. With the dictionary-based word segmentation method incorporating our pediatric medical dictionary  ... 
doi:10.1186/s12859-019-2617-8 fatcat:vsddtl6yhrhrhh34u33kqsndra

Text Window Denoising Autoencoder: Building Deep Architecture for Chinese Word Segmentation [chapter]

Ke Wu, Zhiqiang Gao, Cheng Peng, Xiao Wen
2013 Communications in Computer and Information Science  
We are the first to apply deep learning methods to Chinese word segmentation to our best knowledge.  ...  On the PKU dataset of Chinese word segmentation bakeoff 2005, applying this method decreases the F1 error rate by 11.9% for deep neural network based models.  ...  To our best knowledge, we are the first to apply deep learning methods to Chinese word segmentation.  ... 
doi:10.1007/978-3-642-41644-6_1 fatcat:e47ermthx5gkxmv3ilzzs2vf2a

DUTIR at the CCKS-2018 Task1: A Neural Network Ensemble Approach for Chinese Clinical Named Entity Recognition

Ling Luo, Nan Li, Shuaichi Li, Zhihao Yang, Hongfei Lin
2018 China Conference on Knowledge Graph and Semantic Computing  
., stroke, word segmentation and dictionary features) are adopted.  ...  In this task, we presented a neural network ensemble approach, which combines five individual neural network models (i.e., CNN-CRF, BiLSTM-CRF, BiLSTM-CNN-CRF, BiLSTM+CNN-CRF and Lattice LSTM).  ...  ., word embedding, dictionary feature and stroke feature) are introduced into the model. Then with the embeddings as input, five neural network models are trained by the annotated training set.  ... 
dblp:conf/ccks/LuoLLYL18 fatcat:e6r6g53xajgenjcdohwedebzuu

Neural Word Segmentation Learning for Chinese [article]

Deng Cai, Hai Zhao
2016 arXiv pre-print
Most previous approaches to Chinese word segmentation formalize this problem as a character-based sequence labeling task where only contextual information within fixed sized local windows and simple interactions  ...  In this paper, we propose a novel neural framework which thoroughly eliminates context windows and can utilize complete segmentation history.  ...  Chinese idiom dictionaries. Table 5: Comparison with previous neural network models.  ... 
arXiv:1606.04300v2 fatcat:g6d5dnzul5c7fmzssdccbmkss4

Radical-Enhanced Chinese Character Embedding [article]

Yaming Sun, Lei Lin, Duyu Tang, Nan Yang, Zhenzhou Ji, Xiaolong Wang
2014 arXiv pre-print
We develop a dedicated neural architecture to effectively learn character embedding and apply it on Chinese character similarity judgement and Chinese word segmentation.  ...  However, existing Chinese processing algorithms typically regard word or character as the basic unit but ignore the crucial radical information.  ...  Results and Analysis Chinese Word Segmentation In this part, we apply character embedding as features for Chinese word segmentation using neural CRF.  ... 
arXiv:1404.4714v1 fatcat:kvmujth2kjhuvpkbxjplgyqz3u

A New Chinese Word Segmentation Method Based on Maximum Matching

Yue Zhao, Hang Li, Shoulin Yin, Yang Sun
2018 Journal of Information Hiding and Multimedia Signal Processing  
However, Chinese unique composition determines the Chinese is far more complicated than English. So in this paper, we propose a new Chinese word segmentation method based on maximum matching.  ...  Automatic Chinese word segmentation is a hot issue in information extraction, machine translation, information retrieval, automatic text categorization, speech recognition, and the voice conversion, natural  ...  Chen [7] proposed adversarial multicriteria learning for Chinese word segmentation by integrating shared knowledge from multiple heterogeneous segmentation criteria.  ... 
dblp:journals/jihmsp/ZhaoLYS18 fatcat:q5engxoupjd35fk4jnufqui35y

Using Domain Knowledge for Low Resource Named Entity Recognition [article]

Yuan Shi
2022 arXiv pre-print
We use dictionary information for each word to strengthen its word embedding and domain labeled data to reinforce the recognition effect.  ...  To solve these problems, enlightened by a processing method of Chinese named entity recognition, we propose to use domain knowledge to improve the performance of named entity recognition in areas with  ...  In pre-trained word vector, in addition to word level knowledge, character-aware neural language model is also introduced to extract character level knowledge.  ... 
arXiv:2203.14738v1 fatcat:qz4k4rbjvzg35ahltt4vjrpwvm

Neural Word Segmentation Learning for Chinese

Deng Cai, Hai Zhao
2016 Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)  
Most previous approaches to Chinese word segmentation formalize this problem as a character-based sequence labeling task so that only contextual information within fixed sized local windows and simple  ...  In this paper, we propose a novel neural framework which thoroughly eliminates context windows and can utilize complete segmentation history.  ...  Chinese idiom dictionaries. Table 5: Comparison with previous neural network models.  ... 
doi:10.18653/v1/p16-1039 dblp:conf/acl/CaiZ16 fatcat:dinvkgncqbge3fhrh6eft35nnu

Generating Abbreviations for Chinese Named Entities Using Recurrent Neural Network with Dynamic Dictionary

Qi Zhang, Jin Qian, Ya Guo, Yaqian Zhou, Xuanjing Huang
2016 Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing  
It combines recurrent neural network (RNN) with an architecture determining whether a given sequence of characters can be a word or not.  ...  To address this problem, we propose a novel neural network architecture to perform task.  ...  Hence, most of the Chinese natural language processing methods assume a Chinese word segmenter is used in a pre-processing step to produce word-segmented Chinese sentences as input.  ... 
doi:10.18653/v1/d16-1069 dblp:conf/emnlp/ZhangQGZH16 fatcat:n2htt6dtyjh5fcrhfv6genl3re

Constructing Bi-order-Transformer-CRF with Neural Cosine Similarity Function for power metering entity recognition

Kaihong Zheng, Jingfeng Yang, Lukun Zeng, Qihang Gong, Sheng Li, Shangli Zhou
2021 IEEE Access  
Specifically, to alleviate the problem of fuzzy entity boundaries, we train our power metering word-vectors, and then we design Neural Cosine Similarity Function for distinguishing similar entities and  ...  In recent years, knowledge graphs are applied to provide knowledge support and data support for power grid monitoring and decision-making.  ...  The mentioned Chinese data is used as the self-definition Dictionary of "jieba" word segmentation tool.  ... 
doi:10.1109/access.2021.3112541 fatcat:3yl5cjtpi5g6naiqswttxrznzu

Research on Domain Term Dictionary Construction Based on Chinese Wikipedia

Yu-wen ZHANG, Bao-an LI, Xue-qiang LV, Ning SUN, Jing-Jing TIAN
2019 DEStech Transactions on Computer Science and Engineering  
Then we use word clustering algorithm and seed word extraction method to construct an original domain dictionary. Moreover, neural network method is applied to extend domain dictionary.  ...  In this paper, we propose a novel method for constructing domain term dictionary based on Chinese Wikipedia web resource and deep learning technology.  ...  The appearance of these words are mainly due to the Chinese word segmentation errors of word segment tools. The accuracy of the new words identified by Bi-directional LSTM is higher than others.  ... 
doi:10.12783/dtcse/ammms2018/27260 fatcat:hyl22pe4pndyhh363firzqyxpe

An Algorithm of Vocabulary Enhanced Intelligent Question Answering Based on FLAT1 [chapter]

Jing Sheng Lei, Shi Chao Ye, Sheng Ying Yang, Wei Song, Guan Mian Liang
2021 Frontiers in Artificial Intelligence and Applications  
In recent years, the lexical enhancement structure of word nodes combined with word nodes has been proved to be an effective method for Chinese named entity recognition.  ...  This method uses a new dictionary that combines the entity information of the knowledge graph, and only uses layer normalization for the removal of residual connection for the shallower network model.  ...  Except for byte points, the added word nodes is added according to the dictionary. The FLAT model uses the word segmentation information of the giga dictionary.  ... 
doi:10.3233/faia210460 fatcat:vdhu3vyqzbhgngiuryagzof2vi

Vietnamese Word Segmentation

Dinh Dien, Hoang Kiem, Nguyen Van Toan
2001 Natural Language Processing Pacific Rim Symposium  
We evaluate the performance by comparing its word segmentation results with the manually annotated corpus and its performance proves to be very good.  ...  This word segmentation system is applied to Text-to-speech of Vietnamese and POS-tagger of Vietnamese.  ...  We apply WFST model for Chinese Word segmentation into our task as follows (Richard Sproat, 1996) : We represent the dictionary D as a Weighted Finite State Transducer.  ... 
dblp:conf/nlprs/DienKT01 fatcat:ghy3j4jwyvftply4ckwh7n6kqu

Shrinking

Arseny Tolmachev, Daisuke Kawahara, Sadao Kurohashi
2019 Proceedings of the 2019 Conference of the North  
For languages without natural word boundaries, like Japanese and Chinese, word segmentation is a prerequisite for downstream analysis.  ...  Morphological analyzers are trained on data hand-annotated with segmentation boundaries and part of speech tags.  ...  A neural model with only the unigram character input can solve word segmentation and POS tagging only if it builds some knowledge about the dictionary internally.  ... 
doi:10.18653/v1/n19-1281 dblp:conf/naacl/TolmachevKK19 fatcat:pv2kqfv3yfaj7kayx7hfrl3thy