About compression of vocabulary in computer oriented languages
[article]
2003
arXiv
pre-print
The author uses the entropy of the ideal Bose-Einstein gas to minimize losses in computer-oriented languages. ...
This is a Pidgin language, which can serve as a computer oriented language and whose vocabulary is significantly simplified (the number of words is essentially decreased). ...
etc.), the problem of "compressing" and economical coding of texts arises first of all. ...
arXiv:cs/0303002v2
fatcat:o2ncsq2pubh3lg6c4cy3n5afsu
Simple, Fast, and Efficient Natural Language Adaptive Compression
[chapter]
2004
Lecture Notes in Computer Science
One of the most successful natural language compression methods is word-based Huffman. ...
A one-pass adaptive variant of Huffman exists, but it is character-oriented and rather complex. ...
The word-based Huffman byte oriented codes proposed in [7] obtain compression ratios on natural language close to 30% by coding with bytes instead of bits (in comparison to the bit oriented approach ...
doi:10.1007/978-3-540-30213-1_34
fatcat:6zyp5soayjcglf7b6nr7jjquvq
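The word-based Huffman idea mentioned in this entry can be illustrated with a minimal sketch. It assumes whitespace tokenization and a plain bit-oriented code, and the function name `word_huffman_codes` is hypothetical; it is not the byte-oriented or adaptive variant the chapter itself proposes.

```python
import heapq
from collections import Counter
from itertools import count


def word_huffman_codes(text):
    """Return {word: bitstring} for a Huffman code whose symbols are words."""
    freqs = Counter(text.split())  # assumption: words are whitespace-separated
    if not freqs:
        return {}
    if len(freqs) == 1:
        # A single distinct word still needs a one-bit codeword.
        return {next(iter(freqs)): "0"}
    tie = count()  # unique tie-breaker so the dicts are never compared
    heap = [(f, next(tie), {w: ""}) for w, f in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        # Merge the two least frequent subtrees, prefixing their codes.
        f1, _, left = heapq.heappop(heap)
        f2, _, right = heapq.heappop(heap)
        merged = {w: "0" + c for w, c in left.items()}
        merged.update({w: "1" + c for w, c in right.items()})
        heapq.heappush(heap, (f1 + f2, next(tie), merged))
    return heap[0][2]


text = "the cat and the dog and the bird"
codes = word_huffman_codes(text)
total_bits = sum(len(codes[w]) for w in text.split())
```

Frequent words such as "the" receive codewords no longer than those of rare words, which is where the compression comes from when the symbol alphabet is the vocabulary rather than the character set.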
The Roumanian spelling checker ROMSP: the project overview
1995
Computer Science Journal of Moldova
Problems of user interface engineering support by object oriented methods are of special interest. ...
Aspects of the Roumanian spelling checker ROMSP are presented: effective vocabulary representation, similar words detection algorithms, automatic word inflection, the user interface, supporting tools, ...
Of course, something intermediate is of interest. In our case, about 220,000 roots, 144 endings, and about 2,000 ending sets from 2^144 were sufficient. ...
doaj:f51d0ee9c11245b780c92578cd5ec0fd
fatcat:ppfrn4ra5ndftltrhckdtw3qgi
A Two-Level Structure for Compressing Aligned Bitexts
[chapter]
2009
Lecture Notes in Computer Science
A bitext, or bilingual parallel corpus, consists of two texts, each one in a different language, that are mutual translations. ...
Our strategy is based on a two-level structure for the vocabularies, and on the use of biwords, a pair of associated words, one from each language, as basic symbols to be encoded with an ETDC [2] compressor ...
Moreover, to evaluate the effect of our strategy of using a biword-oriented model, we also implemented ETDC compression over the bitext using two different word-oriented models. ...
doi:10.1007/978-3-642-03784-9_11
fatcat:w22wobqwxncm3ap7lydw5xfj5e
Efficiently decodable and searchable natural language adaptive compression
2005
Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval - SIGIR '05
We address the problem of adaptive compression of natural language text, focusing on the case where low bandwidth is available and the receiver has little processing power, as in mobile applications. ...
Moreover, we show that our technique can be adapted to avoid decompression at all in cases where the receiver only wants to detect the presence of some keywords in the document, which is useful in scenarios ...
For example, a language classification system might look for a small set of common words of each language and use it to classify the incoming compressed text, forwarding it to a specific directory or computer ...
doi:10.1145/1076034.1076076
dblp:conf/sigir/BrisaboaFNP05
fatcat:fw7diimi6fht3kpjrievlthtyy
New adaptive compressors for natural language text
2008
Software, Practice & Experience
Semistatic byte-oriented word-based compression codes have been shown to be an attractive alternative to compress natural language text databases, because of the combination of speed, effectiveness, and ...
In particular, our recently proposed family of dense compression codes has been shown to be superior to the more traditional byte-oriented word-based Huffman codes in most aspects. ...
A first pass over the text gathers global statistical information about the vocabulary (list of source symbols) in order to obtain a model of the text. ...
doi:10.1002/spe.882
fatcat:oltpsdjtezf53jqx2krvv3ex2e
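The semistatic, two-pass scheme described in this entry can be sketched roughly as follows. The snippet does not say which dense code is meant, so this sketch assumes an End-Tagged Dense Code style byte code, where the last byte of each codeword has its high bit set. Whitespace tokenization and the names `etdc_codeword`, `compress_two_pass`, and `decompress` are also assumptions made for illustration.

```python
from collections import Counter


def etdc_codeword(rank):
    """Byte codeword for a 0-based frequency rank (assumed ETDC-style layout)."""
    out = [128 + rank % 128]        # final byte: values 128..255 mark the end
    rank = rank // 128 - 1
    while rank >= 0:
        out.append(rank % 128)      # earlier bytes: values 0..127
        rank = rank // 128 - 1
    return bytes(reversed(out))


def compress_two_pass(text):
    words = text.split()
    # Pass 1: gather vocabulary statistics and rank words by frequency.
    freqs = Counter(words)
    vocab = sorted(freqs, key=lambda w: (-freqs[w], w))
    code = {w: etdc_codeword(i) for i, w in enumerate(vocab)}
    # Pass 2: replace every word with the codeword of its rank.
    return vocab, b"".join(code[w] for w in words)


def decompress(vocab, data):
    words, value = [], 0
    for b in data:
        if b < 128:                 # continuation byte
            value = value * 128 + b + 1
        else:                       # end-tagged byte closes the codeword
            words.append(vocab[value * 128 + (b - 128)])
            value = 0
    return " ".join(words)


vocab, data = compress_two_pass("to be or not to be")
restored = decompress(vocab, data)
```

Because encoding needs the full frequency ranking, the second pass cannot begin until the first has finished; adaptive compressors aim to avoid exactly that delay.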
A fast dynamic compression scheme for natural language texts
2010
Computers and Mathematics with Applications
The aim of designing a dynamic version of WBTC is to adapt it for real-time transmission. ...
The problem in the semi-static technique is to perform two passes over the source text, and therefore encoding cannot start before the whole first pass has been completed. ...
In decompression, Dynamic WBTC is about 23%-80% faster than MLZW. Comparing DyWBTC with classical compressors, our DyWBTC outperforms in compression time. ...
doi:10.1016/j.camwa.2010.10.019
fatcat:4vvx463w7nbopjyf3slijyk7mq
Fast searching on compressed text allowing errors
1998
Proceedings of the 21st annual international ACM SIGIR conference on Research and development in information retrieval - SIGIR '98
We compress typical English texts to about 30% of their original size, against 40% and 35% for Compress and Gzip, respectively. ...
The algorithm is based on a word-oriented shift-or algorithm and a fast Boyer-Moore-type filter. It concomitantly uses the vocabulary of the text available as part of the Huffman coding data. ...
Araújo, who helped particularly with the algorithms for approximate searching in the text vocabulary. ...
doi:10.1145/290941.291013
dblp:conf/sigir/MouraNZB98
fatcat:lwmninhsczezxgkir3txt3fh6a
Measuring Perceptual and Linguistic Complexity in Multilingual Grounded Language Data
2021
Proceedings of the ... International Florida Artificial Intelligence Research Society Conference
The success of grounded language acquisition using perceptual data (e.g., in robotics) is affected by the complexity of both the perceptual concepts being learned and the language describing those concepts ...
Our work illuminates core, quantifiable statistical differences in how language is used to describe different traits of objects, and the visual representation of those objects. ...
To measure shape complexity, we compute the compression loss of detected edges. ...
doi:10.32473/flairs.v34i1.128450
fatcat:ep45qo7pwnfhjc7ppocklkqofm
(S,C)-Dense Coding: An Optimized Compression Code for Natural Language Text Databases
[chapter]
2003
Lecture Notes in Computer Science
This work presents (s, c)-Dense Code, a new method for compressing natural language texts. ...
We formally describe the (s, c)-Dense Code and show how to compute the parameters s and c that optimize the compression for a specific corpus. ...
Therefore, the vocabulary will be slightly smaller than in the case of the Huffman code, where some information about the shape of the tree must be stored (even when a canonical Huffman tree is used). ...
doi:10.1007/978-3-540-39984-1_10
fatcat:edztgxtibzcjtj7dcc66ewfv7u
Instructional Design and Practice of Specialized English, an Application-oriented Graduate Course for Mechanical Engineering
2017
DEStech Transactions on Social Science Education and Human Science
The features of Specialized English, a graduate course for mechanical engineering for applied universities, are analyzed; The goal, contents, and corresponding application-oriented instructional strategy ...
of the course are designed and determined based on application investigation; Explorational instruction practice with multiple methods have been carried out. ...
Acknowledgement This research was financially supported by Innovation Project of Guangxi Graduate Education, 2015. ...
doi:10.12783/dtssehs/esem2017/15083
fatcat:adyuridocjb4la6ivpgk3sisv4
Fast and flexible word searching on compressed text
2000
ACM Transactions on Information Systems
We compress typical English texts to about 30% of their original size, against 40% and 35% for Compress and Gzip, respectively. ...
We present a fast compression and decompression technique for natural language texts. ...
Araújo, who helped particularly with the algorithms for approximate searching in the text vocabulary. We also thank the many comments of the referees that helped us to improve this work. ...
doi:10.1145/348751.348754
fatcat:gtwbmlqconbn5jz3le3ptj4fmm
TRENDS IN THE DEVELOPMENT OF THE AVIATION VOCABULARY
2018
Vìsnik Nacìonalʹnogo Avìacìjnogo Unìversitetu
The general trend of the economy of linguistic means for the modern period is manifested in the aviation terminology in an effort to compress the form of the term, and the form of the nomenclature terms ...
Conclusions: Aviation vocabulary develops along the lines of general trends in scientific and technical terminology. ...
Anthropocentrism in lexicology takes a leading position and is understood as a way out of the boundaries of the language in the field of knowledge about the surrounding reality, about the person (language ...
doi:10.18372/2306-1472.77.13503
fatcat:f4f2d3d4qfcxnj4jlq7rungwqe
Page 5636 of Mathematical Reviews Vol. , Issue 2004g
[page]
2004
Mathematical Reviews
In particular, we propose a logic language with a clear and declarative semantics to specify the structural features of object oriented databases and authorizations associated with complex data objects ...
A direct advantage of this approach is that we can formally specify and reason about authorizations on data objects without losing inheritance and abstraction features of object oriented databases. ...
Speech-based Interaction
2016
Proceedings of the 2016 CHI Conference Extended Abstracts on Human Factors in Computing Systems - CHI EA '16
-Goal orientation (system-initiated prompts or mixed initiative) -Informing the user of their progress toward achieving the goal
Human-Computer Interaction and ASR • HCI needs to be aware of ASR's capabilities ...
• All these employ not only ASR, but significantly more Natural Language Processing, and a good amount of Human-Computer Interaction -not all are dedicated to speech-based input! ...
-Codecs (lossy formats, compression, non-linear representation) • Use lossless compression (e.g. flac codec or zip) if low bandwidth • Ideally use only uncompressed formats (wav or raw)! ...
doi:10.1145/2851581.2856689
dblp:conf/chi/MunteanuP16
fatcat:c2qfv4eukjhyti7brvvzuyv6je