
An Efficient Approach to Learning Chinese Judgment Document Similarity Based on Knowledge Summarization [article]

Yinglong Ma, Peng Zhang, Jiangang Ma
2018 arXiv   pre-print
By utilizing domain ontologies for judgment documents, the core semantics of Chinese judgment documents is summarized based on knowledge blocks.  ...  However, current approaches for judgment document similarity computation failed to capture the core semantics of judgment documents and therefore suffer from lower accuracy and higher computation complexity  ...  We use a word segmentation system to segment every Chinese text into a set of Chinese words, and further form the normalized bags of words model for the corpus that is trained to generate a vector for  ... 
arXiv:1808.01843v1 fatcat:abegbxq6kfeoxg7hkuru3amn4u

Named entity recognition for Chinese judgment documents based on BiLSTM and CRF

Wenming Huang, Dengrui Hu, Zhenrong Deng, Jianyun Nie
2020 EURASIP Journal on Image and Video Processing  
For Chinese named entity recognition in judgment documents, we propose the use of a bidirectional long-short-term memory (BiLSTM) model, which uses character vectors and sentence vectors trained by distributed  ...  Abstract: Chinese named entity recognition (CNER) in the judicial domain is an important and fundamental task in the analysis of judgment documents.  ...  It contains all over 260,000 words of various judicial documents obtained from the Net of Chinese Judicial Documents. The documents include criminal cases, civil cases, and administrative cases.  ... 
doi:10.1186/s13640-020-00539-x fatcat:hllcdk6lejfa3osf3pai6lmnsa

Metaphor Analysis in Political Discourse Based on Discourse Dynamics Framework for Metaphor: A Case Study

Chunfang Huang
2022 Theory and Practice in Language Studies  
metaphor data, offering a tool for understanding people, revealing something of speaker's ideas, affective aspects and values.  ...  Assisted by computer software, such as tables of Microsoft Office, to sort metaphors of each segment of this speech, this article investigates metaphors in the speech by progressive process of metaphor  ...  And they become the research tool of conveying the political ideas and attitudes of Chinese government in fighting COVID-19: calling for greater legislation, law enforcement, judicial and law observance  ... 
doi:10.17507/tpls.1201.11 fatcat:qt5t5brzljgapbm5uqrfu3ligu

An Ontology-Based and Deep Learning-Driven Method for Extracting Legal Facts from Chinese Legal Texts

Yong Ren, Jinfeng Han, Yingcheng Lin, Xiujiu Mei, Ling Zhang
2022 Electronics  
and deep learning-driven method for extracting legal facts from Chinese legal texts.  ...  In the information extraction test of judicial datasets composed of Chinese legal texts on theft, the proposed method effectively extracts up to 38 categories of legal facts from legal texts and the number  ...  The CLTO is an improvement of Judicial Case Ontology [12] , which is not suitable for Chinese legal texts.  ... 
doi:10.3390/electronics11121821 fatcat:ou3zrd6nlnhalk22ayh3i3l73y

Recognition of Chinese Legal Elements Based on Transfer Learning and Semantic Relevance

Dian Zhang, Hewei Zhang, Long Wang, Jiamei Cui, Wen Zheng, Yan Huang
2022 Wireless Communications and Mobile Computing  
This research method makes full use of the semantic information of text, which is essential in the judicial field of document processing.  ...  This paper proposes a Chinese legal element identification method based on BERT's contextual relationship capture mechanism to identify the elements by measuring the similarity between legal elements and  ...  The dataset used in this paper is from the 2019 China Law Research Cup Judicial Artificial Intelligence challenge and is selected from legal documents publicly available on the Chinese judicial documents  ... 
doi:10.1155/2022/1783260 fatcat:dg75q5o33vbjdde6bp5pszqbga

Search Between Chinese and Japanese Text Collections

Fredric C. Gey
2007 NTCIR Conference on Evaluation of Information Access Technologies  
While Chinese search without translation against Japanese documents performed credibly well for title only runs, the reverse (Japanese topic search of Chinese documents without translation) was poor.  ...  We performed search experiments to segment and use Chinese search topics directly as if they were Japanese topics and vice versa.  ...  /) to segment the Japanese document collection into words.  ... 
dblp:conf/ntcir/Gey07 fatcat:gh4bixhhznggtjqp7wfixyfkqy

Research on Text Proofreading Method for Judgment Document

XU Yabin, Ji Xuan
2016 International Journal of Security and Its Applications  
The second aspect is to identify as many legal terms and common named entities as possible, then proofread the collocation mistakes between words, phrases and legalese by using a Markov model.  ...  Automatic proofreading of judgment documents can effectively overcome human factors and ensure the quality of proofreading.  ...  Although this can be resolved by Chinese word segmentation technology, the difference between Chinese grammar and English grammar is large, and some of the methods used in English text proofreading are difficult  ... 
doi:10.14257/ijsia.2016.10.7.27 fatcat:3szg4cb5yna3zillbuvm62vjri

How Similar are Chinese and Japanese for Cross-Language Information Retrieval?

Fredric C. Gey
2005 NTCIR Conference on Evaluation of Information Access Technologies  
The best performance of Chinese topic search for Japanese documents was achieved using a hybrid approach which combined MT pivot translation with direct use of Chinese topic expressions.  ...  Our focus was on Chinese topic searches against the Japanese News document collection, and on Japanese topic search against the Chinese News Document Collection.  ...  / to segment the Japanese document collection into words.  ... 
dblp:conf/ntcir/Gey05 fatcat:wwf7fnzqz5bwja4onw4ifp6luq

A Low-Cost Named Entity Recognition Research Based on Active Learning

Han Huang, Hongyu Wang, Dawei Jin
2018 Scientific Programming  
The testing data include Chinese judicial documents and Chinese electronic medical records (EMRs).  ...  Acknowledgments This study was supported by the Innovative Education Program for Graduate Students of Zhongnan University of Economics and Law (no. 201811409).  ...  Since there is no delimiter in Chinese itself, Chinese word segmentation is the basis of the data analysis.  ... 
doi:10.1155/2018/1890683 fatcat:jjmvkcew75b3dalziixdf3iqla

CJRC: A Reliable Human-Annotated Benchmark DataSet for Chinese Judicial Reading Comprehension [chapter]

Xingyi Duan, Baoxin Wang, Ziyue Wang, Wentao Ma, Yiming Cui, Dayong Wu, Shijin Wang, Ting Liu, Tianxiang Huo, Zhen Hu, Heng Wang, Zhiyuan Liu
2019 Lecture Notes in Computer Science  
We present a Chinese judicial reading comprehension (CJRC) dataset which contains approximately 10K documents and almost 50K questions with answers.  ...  The experimental results show that there is enough space for improvement compared to human annotators.  ...  Conclusion In this paper, we construct a benchmark dataset named CJRC (Chinese Judicial Reading Comprehension).  ... 
doi:10.1007/978-3-030-32381-3_36 fatcat:jsyu36ricnggzlkddermwnqwru

Convolutional-neural-network-based Multilabel Text Classification for Automatic Discrimination of Legal Documents

Ming Qiu, Yiru Zhang, Tianqi Ma, Qingfeng Wu, Fanzhu Jin
2020 Sensors and materials  
Then, we use Jieba, a word segmentation tool of Chinese letters, and TensorFlow VocabularyProcessor to generate vocabularies.  ...  Then, the case description after segmenting each word is mapped into a word index in the vocabularies. We use a word index vector as an input to the MLTCNN.  ...  Chinese words are segmented to Chinese character sequences. We used Jieba for Chinese word segmentation. There are three Chinese word segmentation patterns supported by Jieba.  ... 
doi:10.18494/sam.2020.2794 fatcat:jxqetl45mfeffn6fxyl52n5b4u

A summarization system for Chinese news from multiple sources

Hsin-Hsi Chen, June-Jei Kuo, Sheng-Jie Huang, Chuan-Jie Lin, Hung-Chia Wung
2003 Journal of the American Society for Information Science and Technology  
This article proposes a summarization system for multiple documents.  ...  To reduce information loss during summarization, informative words in a document are introduced.  ...  For example, a Chinese sentence is composed of characters without word boundary. Word segmentation is indispensable for Chinese.  ... 
doi:10.1002/asi.10315 fatcat:zdnrhmkax5cnxk2bo6pzp3u2zu

Analysis of the Semantic Scope of Two Korean Terms Equivalent to English Court

Emilia Wojtasik-Dziekan
2020 International Journal for the Semiotics of Law  
The article aims to analyze the semantic fields of two Korean terms in the field of a specialized judicial terminology, i.e. court and tribunal, which are usually reflected in English by one hypernym term  ...  In case of North Korea, word 법원 beobweon is not in official use for court names. Conclusions The language of Korean law is particularly rich in the terminology of Chinese linguistic origin.  ...  Although this term, written in Korean, has Chinese origins, which is documented by possible double way of writing, i.e. with the use of Chinese characters, 7 this borrowing did not appear in the Korean  ... 
doi:10.1007/s11196-020-09693-x fatcat:3fk5muuyj5eolgjztahw2nrx2a

Legal Text Recognition Using LSTM-CRF Deep Learning Model

Hesheng Xu, Bin Hu, Suneet Kumar Gupta
2022 Computational Intelligence and Neuroscience  
Therefore, the Bi-LSTM-CRF model using word segmentation is more suitable for recognizing extended entities.  ...  For the two types of entities, place names and organization names, the F1 values obtained by the Bi-LSTM-CRF model using word segmentation are 67.60% and 89.45%, respectively, higher than the F1 values  ...  tools and chooses a word segmentation tool suitable for the Chinese word segmentation system. This word segmentation tool can significantly impact Chinese legal texts' word segmentation and affinity. The person's  ... 
doi:10.1155/2022/9933929 pmid:35341203 pmcid:PMC8947905 fatcat:kxbmbxui4vakzdrja3btbt4hhi

A Latent Dirichlet Allocation and Fuzzy Clustering Based Machine Learning Model for Text Thesaurus

Jia Luo, Dongwen Yu, Zong Dai
2020 International Journal of Computers Communications & Control  
The topic keywords will be used as a seed dictionary for new word discovery.  ...  In order to verify the efficiency of machine learning in new word discovery, algorithms based on association rules, N-Gram, PMI, and Word2vec were used for comparative testing of new word discovery.  ...  The word segmentation stage is to perform Chinese word segmentation, part-of-speech tagging for each text, and output its results in a prescribed format.  ... 
doi:10.15837/ijccc.2020.2.3811 fatcat:zjh25fqlh5gd5f6lqsnrbnt6ii