A new character-based indexing method using frequency data for Japanese documents
1995
Proceedings of the 18th annual international ACM SIGIR conference on Research and development in information retrieval - SIGIR '95
A character based indexing is preferable for Japanese IR systems since Japanese words are not segmented. ...
Since frequency data is used to determine hashed entries for character pairs and to establish a special string index, both search speed and precision are improved. ...
New Character Index Method As for index organization, we propose a new hashing scheme for character pairs and an effective index whose entries are character strings with more than 2 characters. ...
doi:10.1145/215206.215347
dblp:conf/sigir/OgawaI95
fatcat:gnxviy4ryrhzlmnmbydlwrc7c4
Experiments in Japanese text retrieval and routing using the NEAT system
1998
Proceedings of the 21st annual international ACM SIGIR conference on Research and development in information retrieval - SIGIR '98
The study includes a comparison of different indexing strategies for documents and queries, investigation of term weighting strategies principally derived for use with English texts, and the application ...
Indexing using dictionary based morphological analysis and character strings are both shown to be individually effective, but marginally better in combination. ...
Data Fusion For data fusion we combined the ranked document lists produced independently by morphologically segmented and character-based indexes in response to a query. ...
doi:10.1145/290941.290992
dblp:conf/sigir/JonesSKS98
fatcat:rgvygdng6bfidlirmagy6sllxe
Asian Language Parsing Evaluated by Hummingbird SearchServer™ at NTCIR-3
2002
NTCIR Conference on Evaluation of Information Access Technologies
SearchServer 5.3's segmenter for Asian text, compared to an overlapping n-gram approach, was found to modestly increase precision scores for Japanese, to have a neutral impact for Chinese, and to be detrimental ...
Newline suppression was found to be of only minor benefit for n-gram parsing. Normalizing Han characters to Hangul had almost no effect on the Korean test collection. ...
Search Techniques The NTCIR organizers created several "topics": 50 for Chinese and Japanese (which were translations of each other) and 30 for Korean (the Korean news articles covered a different time ...
dblp:conf/ntcir/Tomlinson02
fatcat:a4a5bxbn65cntnqlbif6elebiu
Use of the Japio Technical Field Dictionaries and Commercial Rule-based Engine for NTCIR-PatentMT
2013
NTCIR Conference on Evaluation of Information Access Technologies
Japio applied the Japio Technical Field Dictionaries to a commercial machine translation engine for the NTCIR9-PatentMT (JE and EJ subtasks). ...
The Japio Technical Field Dictionaries (technical-field-oriented machine translation dictionaries) are created from the Japio Terminology Database based on each entry's frequency in the bilingual patent ...
ACKNOWLEDGMENTS We would like to thank the NTCIR-10 organizers for giving us a precious opportunity to get objective and comparative evaluations of our machine translation system outputs. ...
dblp:conf/ntcir/OshioMK13
fatcat:iddxmoyjdbc3lha4etna7sninu
Applications of multilingual text retrieval
1996
Proceedings of HICSS-29: 29th Hawaii International Conference on System Sciences
We describe our experience with a range of projects involving text retrieval in Spanish, Japanese and Chinese. ...
The Center for Intelligent Information Retrieval (CIIR) at the University of Massachusetts is involved in a variety of industrial, government, and digital library applications which have a need for multilingual ...
This work was supported in part by the NSF Center for Intelligent Information Retrieval at the University of Massachusetts. ...
doi:10.1109/hicss.1996.495303
dblp:conf/hicss/CroftBF96
fatcat:knqz6vs645g7teq6l4jl4kwime
Monolingual Experiments with Far-East Languages in NTCIR-6
2007
NTCIR Conference on Evaluation of Information Access Technologies
of a combined "unigram & bigram" indexing scheme combined with an automatic word-segmenting approach for Chinese and Japanese languages; and 3) evaluate the relative performance of the various data fusion ...
strategies used to combine separate result lists in order to enhance retrieval effectiveness. ...
Acknowledgments The authors would like to thank the NTCIR-6 task organizers for their efforts in developing various test collections. ...
dblp:conf/ntcir/AbdouS07
fatcat:mc64mvj7fbh3bavshcijql4ix4
Automated Japanese essay scoring system: jess
2004
Proceedings. 15th International Workshop on Database and Expert Systems Applications, 2004.
The final evaluated score is calculated by deducting from a perfect score assigned by a learning process using editorials and columns from the Mainichi Daily News newspaper. ...
A diagnosis for the essay is also given. Our system does not need any essays graded by human experts. ...
visit for us during our survey of the e-rater system. ...
doi:10.1109/dexa.2004.1333440
dblp:conf/dexaw/IshiokaK04
fatcat:7dkiwszwufhkrcfxucz6dmq5wm
Overview of the NTCIR-5 WEB Navigational Retrieval Subtask 2 (Navi-2)
2005
NTCIR Conference on Evaluation of Information Access Technologies
document data and 400 topics were distributed to the participants and, in turn, 35 run results were submitted by 4 participants and 28 by the organizers. ...
In the Subtask, we attempted to assess the retrieval effectiveness of web search systems from a viewpoint of "Known Item Search" using a common data set, and built a re-usable test collection. 1.36TB web ...
We also appreciate the helpful advice of Professors Jun Adachi and Noriko Kando, National Institute of Informatics, and the intensive work on document data preparation by Mr. Shin Kato and Mr. ...
dblp:conf/ntcir/OyamaTIAY05
fatcat:ft3yc42u7zfxdeeopn7bi6cyla
Regression Model and Query Expansion for NTCIR-2 Ad Hoc Retrieval Task
2001
NTCIR Conference on Evaluation of Information Access Technologies
First, we discuss a simplified logistic regression model, which enables us to adjust the regression model to work well in each of various document databases. ...
To do automatically the adjustment, a method for estimating parameters in the regression model, is developed based on a kind of classical discriminant analysis. ...
A. Chen at the UC Berkeley for teaching the Berkeley's formula of logistic regression when the author was a visiting scholar at the UC Berkeley. ...
dblp:conf/ntcir/Kishida01
fatcat:m54xknyhxvg43cd5omzglvj36e
Japanese-Chinese Cross-Language Information Retrieval: An Interlingua Approach
2000
International Journal of Computational Linguistics and Chinese Language Processing
In this paper, we propose a Han Character (Kanji) oriented Interlingua model of indexing and retrieving Japanese and Chinese information. ...
Similar indexing approaches for multiple European languages through term association (e.g., latent semantic indexing) or through conceptual mapping (using lexical ontology such as, WordNet) are being intensively ...
Their tools helped us to speed up our research. Thanks to Dr. Akira Maeda for allowing us to use his correlation calculation tool and Dr. Michael Berry for the LSI++ and SVDPACK packages. ...
dblp:journals/ijclclp/HasanM00
fatcat:vjj645dn3bdcvicuayagxv2ybm
NTCIR-6 CLQA Question Answering Experiments at the Tokyo Institute of Technology
2007
NTCIR Conference on Evaluation of Information Access Technologies
We describe our language independent, data-driven approach to Japanese language question answering and our new document retrieval and answer projection method which resulted in a small performance gain ...
Using this method, we achieve a formal run score of 0.17 for the top answer with document support for subtask 2b. ...
Acknowledgments This research was supported in part by JSPS and the Japanese government 21st century COE programme. ...
dblp:conf/ntcir/NovakWHIF07
fatcat:wbawtcndkvhlznbkyjbzi5ymqi
Report on CLIR Task for the NTCIR-4 Evaluation Campaign
2004
NTCIR Conference on Evaluation of Information Access Technologies
separate result lists extracted from a corpus written in English, Chinese, Japanese or Korean. ...
freely available translation tools used to translate English-language topics into Chinese, Japanese or Korean; and 3) to evaluate the relative performance of the various merging strategies used to combine ...
Finally, we could use our new Z-score (see Section 1.4 and Equation 2) to define a comparable document score across collections. ...
dblp:conf/ntcir/Savoy04
fatcat:hbbre7bk6fd25d6unvylyapq6a
Comparative study of monolingual and multilingual search models for use with Asian languages
2005
ACM Transactions on Asian Language Information Processing
Finally, we address basic problems related to multilingual searches, in which queries written in English are used to search documents written in the English, Chinese, Japanese, and Korean languages. ...
the Chinese, Japanese, Korean, and English languages. ...
The Chinese language data-fusion experiments also included the Okapi and "Lnu-ltc" models based on character indexing. ...
doi:10.1145/1105696.1105701
fatcat:zpiscx4znzgyvfyzqqsxxckniq
Automated Japanese essay scoring system based on articles written by experts
2006
Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the ACL - ACL '06
The final evaluation score is calculated by deducting from a perfect score assigned by a learning process using editorials and columns from the Mainichi Daily News newspaper. ...
A diagnosis for the essay is also given. ...
visit for us during our survey of the e-rater system. ...
doi:10.3115/1220175.1220205
dblp:conf/acl/IshiokaK06
fatcat:bbyqqhic5feaphwimywlyi6jyi
NTCIR-4 CLIR Experiments at Oki
2004
NTCIR Conference on Evaluation of Information Access Technologies
We adopted the pivot language approach for C-J and J-C search using English as a pivot language. ...
Our IR system can handle queries and documents in Chinese, English and Japanese. ...
Acknowledgements We used CEDICT [3], ChaSen [8], EDICT [1], GETA [5] and Japanese-English News Article Alignment Data [13] in the research. ...
dblp:conf/ntcir/NakagawaK04
fatcat:3dnn4mqwmngppgunsoom22cmnu