A Modular Metadata Extraction System for Born-Digital Articles

Dominika Tkaczyk, Lukasz Bolikowski, Artur Czeczko, Krzysztof Rusek
2012 2012 10th IAPR International Workshop on Document Analysis Systems  
We present a comprehensive system for extracting metadata from scholarly articles.  ...  The evaluation tests we have performed showed good results of the individual implementations and the entire metadata extraction process.  ...  We would also like to thank the anonymous reviewers for their insightful comments.  ... 
doi:10.1109/das.2012.4 dblp:conf/das/TkaczykBCR12 fatcat:fly4braehnafhjk6xz6z5ulffy

Multimodal Approach for Metadata Extraction from German Scientific Publications [article]

Azeddine Bouabdallah, Jorge Gavilan, Jennifer Gerbl, Prayuth Patumcharoenpol
2021 arXiv   pre-print
Our model for this approach was trained on a dataset consisting of around 8800 documents and is able to obtain an overall F1-score of 0.923.  ...  This model aims to increase the overall accuracy of metadata extraction compared to other state-of-the-art approaches.  ...  [17], numerous previous old studies on metadata extraction relied on text and layout rules, addressing the issue using context-based classifiers such as Hidden Markov Models (HMMs) and similar approaches  ... 
arXiv:2111.05736v1 fatcat:rz5xwg5nkrhz7hwn54by2eqcpa

KEYRY: A Keyword-Based Search Engine over Relational Databases Based on a Hidden Markov Model [chapter]

Sonia Bergamaschi, Francesco Guerra, Silvia Rota, Yannis Velegrakis
2011 Lecture Notes in Computer Science  
In KEYRY the search process is modeled as a Hidden Markov Model and the List Viterbi algorithm is applied to computing the top-k queries that better represent the intended meaning of a user keyword query  ...  This work was partially supported by project "Searching for a needle in mountains of data"  ...  The tool we demonstrate here is instead based on a Hidden Markov Model. A detailed description of the methodology can be found on our respective research paper [2].  ... 
doi:10.1007/978-3-642-24574-9_42 fatcat:fa7npu7g5vhhtibfp6xoe5mk34

Sequence Labeling using Conditional Random Fields

Romansha Chopra, Nivedita Singh, Yang Zhenning, N.Ch.S.N. Iyengar
2017 International Journal of u- and e- Service, Science and Technology  
Conditional random fields (CRFs) are a scheme for building probabilistic models to divide and tag sequence data.  ...  Given a labeled set of data, a baseline set of features will be created and the accuracy of the CRF suite model created using those features will be measured.  ...  It represents the large literature body and it also studies the particular class of Hidden Markov Model. It is an undirected graphical model which is conditionally trained, repeated over sequence.  ... 
doi:10.14257/ijunesst.2017.10.9.10 fatcat:pofo2sdqazgfjgt3bvnsn5ibmi

Detection of Mobile Phone Fraud Using Possibilistic Fuzzy C-Means Clustering and Hidden Markov Model

Sharmila Subudhi, Suvasini Panigrahi, Tanmay Kumar Behera
2016 International Journal of Synthetic Emotions  
This paper presents a novel approach for fraud detection in mobile phone networks by using a combination of Possibilistic Fuzzy C-Means clustering and Hidden Markov Model (HMM).  ...  The trained HMM model is then applied for detecting fraudulent activities on incoming call sequences.  ...  Markov Model (HMM).  ... 
doi:10.4018/ijse.2016070102 fatcat:xjjuucmiz5dynnc4tlwyjnbd5u

A Rule-Based Information Extraction Approach for Extracting Metadata from PDF Books

Abrar Alamoudi, Amal Alomari, Sarah Alwarthan, Atta-ur-Rahman
2021 Innovative Computing Information and Control Express Letters, Part B: Applications  
In this work, an intelligent rule-based approach is proposed for extracting the logical metadata from PDF books accurately.  ...  The experimental results indicate that the proposed approach is capable of extracting the metadata from PDF books successfully with an overall accuracy of 94.62% and 90.27% for both training and testing  ...  Finally, the Hidden Markov Model (HMM) is applied for extracting the authors and the titles.  ... 
doi:10.24507/icicelb.12.02.121 fatcat:4kjs4omuxvhrnhujft5ad7maca

Analyzing the Dynamics of Research by Extracting Key Aspects of Scientific Papers

Sonal Gupta, Christopher D. Manning
2011 International Joint Conference on Natural Language Processing  
We extract these characteristics by matching semantic extraction patterns, learned using bootstrapping, to the dependency trees of sentences in an article's abstract.  ...  For instance, we show that part-of-speech tagging and parsing have increasingly been adopted as tools for solving problems in other domains.  ...  papers mainly due to techniques like expectation maximization and hidden Markov models.  ... 
dblp:conf/ijcnlp/GuptaM11 fatcat:d56yguykize7jdxmndcs6b6n3y

Research on the Application of User Behavior Auditing Based on Hidden Markov Model in Cloud Environment

Kejun Zhang, Chen Jiang, Yunsong Yang, Yu Wang, Guoliang Zhang
2017 DEStech Transactions on Materials Science and Engineering  
A user behavior modeling method based on Hidden Markov model is proposed in this paper; the user behavior model is used to identify the validity of the user's operation, and to ensure the security of the  ...  In the process of audit data analysis and processing, a feature vector method is proposed to extract valuable information from audit data.  ...  Hidden Markov model is actually a classification method in data mining, using the hidden Markov model to model the normal behavior of cloud users.  ... 
doi:10.12783/dtmse/icmsme2016/7516 fatcat:voumabvizfhhpkwsgwahklfrea

Prediction is very hard, especially about conversion. Predicting user purchases from clickstream data in fashion e-commerce [article]

Luca Bigon, Giovanni Cassani, Ciro Greco, Lucas Lacasa, Mattia Pavoni, Andrea Polonioli, Jacopo Tagliabue
2019 arXiv   pre-print
from the literature; finally, we propose a new discriminative neural model that outperforms neural architectures recently proposed at Rakuten labs.  ...  Knowing if a user is a buyer vs window shopper solely based on clickstream data is of crucial importance for ecommerce platforms seeking to implement real-time accurate NBA (next best action) policies.  ...  The authors also wish to thank Tooso Inc. for providing the computational infrastructure and funding for the project.  ... 
arXiv:1907.00400v1 fatcat:qjs2gqh6fff73pme2bqdlty7ty

Unsupervised Metadata Extraction in Scientific Digital Libraries Using A-Priori Domain-Specific Knowledge

Alexander Ivanyukovich, Maurizio Marchese
2006 Semantic Web Applications and Perspectives  
More specifically, we focus on quality improvements of metadata extraction from scientific papers (mainly in computer science domain) collected from various sources over the Internet.  ...  We propose and present a novel approach focusing on the improvement in the metadata extraction quality without involving external information sources (oracles, manually prepared databases, etc.), but relying  ...  Application of statistical models like Hidden Markov Model (HMM) [13] and Dual and Variable-length output Hidden Markov Model (DVHMM) [14] are reported to have nearly 90% accuracy; however, the training  ... 
dblp:conf/swap/IvanyukovichM06 fatcat:h7ktkv3cyjdbfm24jjywlt7vki

Text Mining to Facilitate Domain Knowledge Discovery [chapter]

Chengbin Wang, Xiaogang Ma
2019 Text Mining - Analysis, Programming and Application [Working Title]  
The research includes three major parts: (1) structuralization of geological literature, (2) information extraction and visualization for geological literature, and (3) geological text mining to assist  ...  For these data, traditional research methods have limited functions for integrating and mining them to make knowledge discovery.  ...  The statistically based methods include machine learning and deep learning methods, such as hidden Markov model (HMM), maximum entropy Markov model (MEMM), conditional random fields (CRF), and long short-term  ... 
doi:10.5772/intechopen.85362 fatcat:opipq2cyhrfmjdnw2rewku7spy

Turning hamburgers into a cow - An introductory comparison of PDF metadata extraction using two reference management systems

Jeremy Lee McLaughlin
2016 Figshare  
The sections of this paper examine the literature related to attempts to extract PDF metadata using automated harvesting technologies and the role of reference management systems in PDF organization for  ...  ., (2011) notes that even though PDF is the format of choice for an overwhelming majority of downloaded scholarly content, attempts to extract meaning from PDFs as they are currently used is similar to  ...  Markov models; CRF - conditional random fields).  ... 
doi:10.6084/m9.figshare.3807183 fatcat:hromvcfbkbfydao3wlurmhcwoi

Challenging Aspects for Facial Feature Extraction and Age Estimation

A. Deepa, T. Sasipraba
2016 Indian Journal of Science and Technology  
The face identification, tools for extraction, feature normalization, and features to be extracted are all explained.  ...  In this paper we have discussed the steps for facial age estimation and a comparative study of various methodologies in each step has been briefed.  ...  DCT coefficients of image block are used as observation vectors of an embedded HMM (Hidden Markov Model).  ... 
doi:10.17485/ijst/2016/v9i4/72315 fatcat:7p4wv4lhhzdkdhfc2gye7d5cnu

MexPub: Deep Transfer Learning for Metadata Extraction from German Publications [article]

Zeyd Boukhers, Nada Beili, Timo Hartmann, Prantik Goswami, Muhammad Arslan Zafar
2021 arXiv   pre-print
In this paper, we present a method that extracts metadata from PDF documents with different layouts and styles by viewing the document as an image.  ...  Our method achieved an average accuracy of around 90% which validates its capability to accurately extract metadata from a variety of PDF documents with challenging templates.  ...  Therefore, most of the earlier works addressed the problem of classifying segment strings in scientific documents using context-based classifiers such as Hidden Markov Models (HMMs) [26] and Conditional  ... 
arXiv:2106.07359v1 fatcat:k5446qbqvzhonj32urjqvaagpq

Reference metadata extraction using a hierarchical knowledge representation framework

Min-Yuh Day, Richard Tzong-Han Tsai, Cheng-Lung Sung, Chiu-Chen Hsieh, Cheng-Wei Lee, Shih-Hung Wu, Kun-Pin Wu, Chorng-Shyong Ong, Wen-Lian Hsu
2007 Decision Support Systems  
Accurate reference metadata extraction from such publications is essential for the integration of metadata from heterogeneous reference sources.  ...  In this paper, we propose a hierarchical template-based reference metadata extraction method for scholarly publications.  ...  Acknowledgements We would like to thank the anonymous reviewers for their valuable comments, which have greatly improved the presentation of this paper.  ... 
doi:10.1016/j.dss.2006.08.006 fatcat:6fumdfsmdnaxjnqcyxayeqfpma