
Statistical models for unformatted text

Christopher Landauer
1981 Proceedings of the 4th annual international ACM SIGIR conference on Information storage and retrieval theoretical issues in information retrieval - SIGIR '81  
It is expected that more precise knowledge of solutions for these problems will simplify the design and improve the effectiveness of statistical information retrieval systems.  ...  Both kinds of model are based on a stochastic process, but there is a different filter for the realization.  ...  4. Summary In this note, we have described some current problems with statistical methods for information retrieval from unformatted English text.  ... 
doi:10.1145/511754.511765 dblp:conf/sigir/Landauer81 fatcat:rirxrpmtwzfdxcy4z6i43agjd4

Exploring fine-grained sentiment values in online product reviews

Phoey Lee Teh, Irina Pak, Paul Rayson, Scott Piao
2015 2015 IEEE Conference on Open Systems (ICOS)  
For consumers, reading reviews helps them make better purchase decisions but we show there is also value to be gained in a finer-grained sentiment analysis for future commercial website platforms.  ...  We show that this is possible for online comments about ten different categories of products.  ...  Plain unformatted text refers to the texts that are written without any capital little, repetition words, emoticons or any strengthening words.  ... 
doi:10.1109/icos.2015.7377288 fatcat:l6bdplsysvfeznggntw34yvisq

The Gender Differential Effects of a Procedural Plan For Solving Mathematical Word Problems

Ron Zambo, Robert K. Hess
1996 School Science and Mathematics  
Two versions of the test were given: the formatted form, containing the seven steps of a problem-solving plan on each of the pages, with space for student work; and the unformatted form, containing only  ...  Three classes of students (n=73) were administered the formatted test first, followed by the unformatted version.  ...  The students who took the unformatted test first and the formatted test second showed statistically significant improvement in test scores.  ... 
doi:10.1111/j.1949-8594.1996.tb15854.x fatcat:afiniyptwbefbcrz7xjwqqqqrm

The HSIC Bottleneck: Deep Learning without Back-Propagation [article]

Wan-Duo Kurt Ma, J.P. Lewis, W. Bastiaan Kleijn
2019 arXiv   pre-print
There is no requirement for symmetric feedback or update locking.  ...  We introduce the HSIC (Hilbert-Schmidt independence criterion) bottleneck for training deep neural networks.  ...  Acknowledgments We thank David Balduzzi and Marcus Frean for discussions.  ... 
arXiv:1908.01580v3 fatcat:6yjb3azvrzaqffpsgcuyah3oye

ARCLIN: Automated API Mention Resolution for Unformatted Texts [article]

Yintong Huo, Yuxin Su, Hongming Zhang, Michael R. Lyu
2022 arXiv   pre-print
However, unlike official documentation written by experts, discussions in open forums are made by regular developers who write in short and informal texts, including spelling errors or abbreviations.  ...  ., StackOverflow) are popular platforms for developers to discuss technical problems such as how to use specific Application Programming Interface (API), how to solve the programming tasks, or how to fix  ...  Table 1: Three main challenges for API mining in unformatted texts. Blue words refer to API mentions and red words refer to common words.  ... 
arXiv:2201.01459v1 fatcat:qbymrmxgdbbqhmje2td36svzbq

Research on building temporal-spatial data warehouse of marine environmental data products

Jian Liu, Xin Zhang, Tianhe Chi
2010 2010 2nd International Conference on Advanced Computer Control  
Present study consists the system framework and logical model of marine environment spatial-temporal data warehouse (MESTDW) for comprehensive analysis of marine environment data.  ...  Multidimensional data model of comprehensive analysis subject based on star-shaped model group is formed by the data warehouse construction methods of "metadata driven -metadata sharing" and subject structure  ...  volume unformatted binary files, text files, NetCdf files Surface meteorological conventional statistical products sea water temperature; dew point temperature; relative humidity; temperature  ... 
doi:10.1109/icacc.2010.5487046 fatcat:jy2b7rwatjgxfikoe5hr7a3h5i

Detecting Cyberbullying "Hotspots" on Twitter: A Predictive Analytics Approach

Shuyuan Mary Ho, Dayu Kao, Ming-Jung Chiu-Huang, Wenyi Li, Chung-Jui Lai
2020 Forensic Science International: Digital Investigation  
This study attempts to develop a prediction model for identifying cyberbullying "hotspots" by analyzing the manifestation of charged language on Twitter.  ...  The contribution is significant for mediation agencies, such as school counseling and law enforcement agencies.  ...  Acknowledgements The authors wish to thank Florida Center for Cybersecurity (FC2) for the grant 3910-1007-00-B, and the Executive Yuan of the Republic of China for the grants from the Digital Infrastructure  ... 
doi:10.1016/j.fsidi.2020.300906 fatcat:bo3wevdsezgr3in6anwxjl3lge

You have e-mail, what happens next? Tracking the eyes for genre

Malcolm Clark, Ian Ruthven, Patrik O'Brian Holt, Dawei Song, Stuart Watt
2014 Information Processing & Management  
In addition, the ocular strategies of scanning and skimming, they employed for the processing of the texts by block, genre and representation were evaluated.  ...  The researchers focused on eight different types of e-mail, such as calls for papers, newsletters and spam, which were chosen to represent different genres.  ...  Acknowledgements This experimental research was conducted within the framework of the project, Automatic Adaptation of Knowledge Structures for Assisted Information Seeking (AutoAdapt), funded by the EPSRC  ... 
doi:10.1016/j.ipm.2013.08.005 fatcat:wkeobe72ijhznpkslqrligwbx4

Evasion Attack on Text Classified Training Datasets

2019 International Journal of Engineering and Advanced Technology  
To present this paper, to do an evasion attack on collected text documents using extraction keyword and find mean words using Naive Bayes models.  ...  Security against Evasion attack The security against training dataset is a challenge for the researcher, because text dataset are unformatted.  ...  Statistical properties of different PDF files generated in the statistical based malware. The Hidden Markov Models (HMM) detect gradually changed benign files [11] and similarity index [12].  ... 
doi:10.35940/ijeat.f1009.0886s19 fatcat:p6wqi6rbaffjrptw3mq3s6etw4

Text Segmentation for Language Identification in Greek Forums

Pavlina Fragkou
2014 Procedia - Social and Behavioral Sciences  
Greeklish can be defined as the use of Latin alphabet for rendering Greek words with Latin characters.  ...  The evaluation using two well known text segmentation algorithms leads to the conclusion that, despite the difficulty of the problem examined, text segmentation seems to be a promising solution.  ...  For statistical language identification, a set of character level language models is prepared from training data as a first step.  ... 
doi:10.1016/j.sbspro.2014.07.140 fatcat:r2iu3iiuvnh7jk7clvizcdnn3a

Fake News Detection on Social Media using Machine Learning Techniques

Our point is to locate a dependable and right model that arranges a given article as fake or genuine. For identification of fake articles we use machine learning algorithms.  ...  In this paper we examine different systems for recognizing counterfeit information via internet based networking medium.  ...  The best results were obtained by Random forest and XGBoost classifiers, statistically attached with 0.85 (0.007) and 0.86 (0.006) for AUC, respectively.  ... 
doi:10.35940/ijitee.g5428.059720 fatcat:tckuwomyd5cyjpsaapcgapiwqa

PSC volume 27 issue 2 Cover and Front matter

1994 PS: Political Science and Politics  
:1 and 1:k(McFaddens choice model).  ...  [ ] Are the references in alphabetical order and do they provide complete source information for the in-text citations?  ... 
doi:10.1017/s1049096500040361 fatcat:stsnnilatrbsxagwr4qxmctpai

Open Science data without curation. Is it useful? An American Astronomical Society Publishing perspective

Greg Schwarz
2021 Zenodo  
In short, the quality is often lacking which means significant challenges for the end user.  ...  The reasons for poor data products is due to lack of author training in curation and laziness.  ...  The "submit and forget" data model Early animation issues • Videos were available only for download. Reader had to figure out how to display the movie.  ... 
doi:10.5281/zenodo.4884917 fatcat:7l5snobkobeatbc4rjhokmchja

ANALYZE Rulebase [chapter]

Harvey J. Greenberg
1988 Mathematical Models for Decision Support  
Such setups would be done in batch for large models, as in industrial applications. Then, the complete LP file was written to a packed (unformatted) file now read into memory.  ...  Choosing a problem to analyze (WOODNET) 1 A reasonable place to become familiar with a model is to look at statistics, which is obtained from the SUMMARY command in Figure 3.  ... 
doi:10.1007/978-3-642-83555-1_14 fatcat:qwjy4b7ptnazxgucfndkc5qz3i

Sentence Segmentation from Unformatted Text using Language Modeling and Sequence Labeling Approaches

Ievgen Iosifov, Olena Iosifova, Volodymyr Sokolov
2020 2020 IEEE International Conference on Problems of Infocommunications. Science and Technology (PIC S&T)  
Antonevych Active-HDL 13 Kozhevnikov Nadia Pasieka, Liliana Khimchuk, Research of dynamic mathematical models of adaptation of members of Using Deep Structured Semantic Model to Analysis Text Documents  ...  Models of searching for Using neuromodels for evaluating and determining Analysis of Models for Selection of Investment Strategies 127 Pasko, Pavel Paderno, Evgeniy ergonomic reserves to increase efficiency  ... 
doi:10.1109/picst51311.2020.9468084 fatcat:udezp6cgpff4fgo27l6jv53jku