Large-Scale Experiments for Mathematical Document Classification [chapter]

Simon Barthel, Sascha Tönnies, Wolf-Tilo Balke
2013 Lecture Notes in Computer Science  
We therefore conducted a large scale feasibility study on a real world data set from one of the biggest mathematical digital libraries, i.e.  ...  This quality can largely be attributed to the great effort invested into semantic enrichment of the provided documents e.g. by annotating their documents with respect to a domain-specific taxonomy.  ...  Moreover, we would like to express our appreciation to the Zentralblatt MATH for the provision of the corpus used in this study.  ... 
doi:10.1007/978-3-319-03599-4_10 fatcat:wydbdpnjkjhmhhk4uj42v32j2u

Handwritten Mathematical Expressions Recognition

Yassine Chajri, Belaid Bouikhalene
2013 International Journal of Signal Processing, Image Processing and Pattern Recognition  
This paper also describes all the details concerning the necessary steps of our approach for handwritten mathematical expressions recognition.  ...  In this paper, we will present a dataset of handwritten mathematical expressions outcome of several mathematical disciplines (logic, analysis, algebra and probability).  ...  There are two large sets in the field of document recognition: text analysis and graphical components analysis.  ... 
doi:10.14257/ijsip.2016.9.5.07 fatcat:n6zz2f72j5f7leo5f5rlrkzmp4

Incorporating Word Embeddings into Open Directory Project based Large-scale Classification [article]

Kang-Min Kim, Aliyeva Dinara, Byung-Ju Choi, SangKeun Lee
2018 arXiv   pre-print
In this paper, we incorporate word embeddings into the ODP-based large-scale classification.  ...  The evaluation results clearly show the efficacy of our methodology in large-scale text classification.  ...  Related Work For the large-scale text classification, many approaches have been developed to handle data sparsity on a knowledge base.  ... 
arXiv:1804.00828v1 fatcat:gjri2upk3vftrafo2gcgeplthe

Notes and references on early automatic classification work

Karen Sparck Jones
1991 SIGIR Forum  
Automatic classification research in general Research on automatic classification got going before 1960, in direct response to the opportunities offered by computers for handling large-scale and/or complex  ...  the early research, there are still substantial challenges in operating on a large scale.  ... 
doi:10.1145/122642.122644 fatcat:nxt435js6rb4jd7zrwoieqcyz4

Towards Explaining STEM Document Classification using Mathematical Entity Linking [article]

Philipp Scharpf, Moritz Schubotz, Bela Gipp
2021 arXiv   pre-print
Document subject classification is essential for structuring (digital) libraries and allowing readers to search within a specific field.  ...  In this paper, we present first advances towards STEM document classification explainability using classical and mathematical Entity Linking.  ...  In 2020, Scharpf et al. presented large-scale experiments for classification and clustering of arXiv documents, sections, and abstracts comparing encodings of natural and mathematical language [1].  ... 
arXiv:2109.00954v1 fatcat:io7lma2birbdzfugybd47m7whm

Automatic Processing of Historical Japanese Mathematics (Wasan) Documents

Yago Diez, Toya Suzuki, Marius Vila, Katsushi Waki
2021 Applied Sciences  
These documents represent a unique type of mathematics and amalgamate the mathematical knowledge of a time and place where major advances were reached.  ...  Furthermore, we study the performance of five well-known deep learning networks and obtain 99.75% classification accuracy for modern kanji and 90.4% for classical kanji.  ...  Experiment 3: Classification of the "ima" kanji In this final experiment, we studied the performance of our whole pipeline for the problem of detecting and classifying the "ima" kanji.  ... 
doi:10.3390/app11178050 fatcat:v35hp7dco5fxhjlix5cxrvkkdq

Mathematical Formula Image Screening Based on Feature Correlation Enhancement

Hongyuan Liu, Fang Yang, Xue Wang, Jianhui Si
2022 Electronics  
To screen and collect images containing mathematical formulas for others to study or for further research, a model for screening images of mathematical formulas based on feature correlation enhancement  ...  There are mathematical formula images or other images in scientific and technical documents or on web pages, and mathematical formula images are classified as either containing only mathematical formulas  ...  At present, there are a large number of images of mathematical formulas with research value in web pages or scientific and technological documents.  ... 
doi:10.3390/electronics11050799 fatcat:4hculscmdreadfhvrcbjkqv7ra

Automatic Classification of Spatial Relationships among Mathematical Symbols Using Geometric Features

Walaa ALY, Seiichi UCHIDA, Masakazu SUZUKI
2009 IEICE transactions on information and systems  
Machine recognition of mathematical expressions on printed documents is not trivial even when all the individual characters and symbols in an expression can be recognized correctly.  ...  Experimental results on very large databases showed that this classification worked well with an accuracy of 99.525% by using distribution maps which are defined by two geometric features, relative size  ...  a large-scale database, which was not available in the past.  ... 
doi:10.1587/transinf.e92.d.2235 fatcat:xfdtvnevlnchhgnhedshps6y4i

Text mining and natural language processing

Anne Kao, Steve Poteet
2005 SIGKDD Explorations  
ACKNOWLEDGMENTS The authors wish to thank our colleagues in Boeing Phantom Works, Mathematics and Computing Technology for providing reviews and comments for the paper submissions: William Ferng, Dragos  ...  They represent a wide range of expertise, including machine learning, data mining, statistics, database and mathematics, as a complement to the authors' background in Text Mining and NLP.  ...  This is consistent with our experience in running large scale classification applications in Boeing (with from 500 classes to 60,000 classes).  ... 
doi:10.1145/1089815.1089816 fatcat:febssnb43ja3ffcrcjdarzlrh4

A Descriptive Approach to Classification [chapter]

Miguel Martinez-Alvarez, Thomas Roelleke
2011 Lecture Notes in Computer Science  
Moreover, the automatic translation from PDatalog to mathematical formulation is discussed. Secondly, quality and efficiency results prove the approach feasibility for real-scale collections.  ...  This paper investigates the application of descriptive approaches for modelling classification.  ...  Experiment set-up Two traditional text classification collections have been used for the experiments: 20newsgroups and Reuters-21578. 20 Newsgroups 3 is a collection of approximately 20,000 newsgroup documents  ... 
doi:10.1007/978-3-642-23318-0_27 fatcat:zyfn364dszdj5dzy7f5fydlhq4

Towards Multi Label Text Classification through Label Propagation

Shweta C, Maya Ingle, Parag Kulkarni
2012 International Journal of Advanced Computer Science and Applications  
We are using semi supervised learning technique for effective utilization of labeled and unlabeled data for classification.  ...  Classifying text data has been an active area of research for a long time. Text document is multifaceted object and often inherently ambiguous by nature.  ...  ESTA Powerful representation of input documents using NMF and also works for large scale datasets Parameter selection is crucial.  ... 
doi:10.14569/ijacsa.2012.030607 fatcat:75ae7hgvybdfroyx2gak4ndzci

Application of an Improved Genetic Algorithm in Network Information Filtering

Min Ren, Baoya Song, Jirong Jiang
2013 Sensors & Transducers  
text classification and information filtering.  ...  Genetic Algorithm in solving such complicated problems as premature convergence, low search efficiency during late period, and so forth, the paper puts forward a Fuzzy Genetic Algorithm that can be used for  ...  The training and test documents set are selected from the SOGUO large-scaled corpus, and each classification is made up of 1000 articles selected, of which 700 articles are used for training, and 300 articles  ... 
doaj:55d8a083628d4941863aeb443246d08c fatcat:o5bjjnzsizcw5kohdi5mi4gdq4

Page 502 of Computational Linguistics Vol. 31, Issue 4 [page]

2005 Computational Linguistics  
All the operations necessary for the classification of a sentence pair (filter, word alignment computation, and feature extraction) can be implemented efficiently and scaled up to very large amounts  ...  The mathematics of machine translation: Parameter estimation. Computational Linguistics, 19(2):263-311. Darroch, J. N. and D. Ratcliff. 1974. Generalized iterative scaling for log-linear models.  ... 

Mining Social Media Data for Understanding Students' Learning Experiences

Xin Chen, Mihaela Vorvoreanu, Krishna P.C. Madhavan
2014 IEEE Transactions on Learning Technologies  
In this paper, a work-flow is developed which combines both qualitative investigation and large-scale data mining scheme.  ...  Students' digital footprints provide vast amount of implicit knowledge and a whole new perspective for educational researchers and practitioners to understand students' experiences outside the controlled  ...  Then apply the algorithm to another large-scale and unexplored dataset, so that the physical method is improved.  ... 
doi:10.1109/tlt.2013.2296520 fatcat:zp552le4bvh5bba5vocbhr4ofy

Labeled Anchors and a Scalable, Transparent, and Interactive Classifier

Jeffrey Lund, Stephen Cowley, Wilson Fearn, Emily Hales, Kevin Seppi
2018 Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing  
., 2014) in that it extends the vector-space representation of words to include document labels.  ...  We run a small user study that demonstrates that untrained users can interactively update topics in order to improve classification accuracy.  ...  Thanks to Piper Armstrong, Naomi Johnson, Connor Cook and Nozomu Okuda for their invaluable help with this work.  ... 
doi:10.18653/v1/d18-1095 dblp:conf/emnlp/LundCFHS18 fatcat:oooavjrd6rebzbbzqz5q3dw5dy