Company Mention Detection for Large Scale Text Mining

Rebecca J. Passonneau, Tifara Ramelson, Boyi Xie
2014 Proceedings of the International Conference on Knowledge Discovery and Information Retrieval  
This paper presents an initial investigation of the impact of improved company mention detection for financial analytics. Coverage of company mention detection improves dramatically.  ...  Text mining on a large scale that addresses actionable prediction needs to contend with noisy information in documents, and with interdependencies between the kinds of NLP techniques applied and the data  ...  In our view, rich semantic and pragmatic data mining for large scale text mining should aim for information that supports more informed decision making, or in other words, is actionable.  ... 
doi:10.5220/0005174405120520 dblp:conf/ic3k/PassonneauRX14 fatcat:aks2su6mtrdg5gs7nnfoimbi5m

Large-scale Information Extraction for Assisted Curation of the Biomedical Literature

Fabio Rinaldi, Lenz Furrer, Simon Clematide
2015 International Conference of the Italian Association for Artificial Intelligence  
We present an approach towards large-scale processing of biomedical literature in order to extract domain entities and semantic relationships among them.  ...  PubMed, the main literature repository for the life sciences, contains more than 23 million publication references. On average, nearly two publications per minute are added.  ...  As part of a project funded by a large pharmaceutical company, the OntoGene group recently adapted their text mining, with the goal of detecting evidence for specific protein interactions described in  ... 
dblp:conf/aiia/RinaldiFC15 fatcat:evfhhhvqmneytda7hyflrlae6q

A Natural Language Processing Approach to Social License Management

Robert G. Boutilier, Kyle Bahr
2020 Sustainability  
To validate the program, we compared it to human coding of interview texts from a Bolivian mining project from 2009 to 2018.  ...  The program's estimation of the annual average SL was significantly correlated with rating scale measures.  ...  The five most mentioned bags were labeled as: • "Community administration and projects" (2547 mentions) • "Regional satisfaction with relations with the mining company" (2316 mentions) • "Community benefits  ... 
doi:10.3390/su12208441 fatcat:5w64xhzetfe27nbvouldwzqrku

Opinion Mining on Non-English Short Text [article]

Esra Akbas
2017 arXiv   pre-print
We detect the mixture of positive and negative sentiments on a multi-variant scale.  ...  In this paper, we investigate the problem of mining opinions on the collection of informal short texts. Both positive and negative sentiment strength of texts are detected.  ...  acceptor/rejector, for too small/large values of P(c).  ... 
arXiv:1704.00016v2 fatcat:u6d3igwzanewdioonhysf7hvvm

Spark NLP: Natural Language Understanding at Scale [article]

Veysel Kocaman, David Talby
2021 arXiv   pre-print
It provides simple, performant and accurate NLP annotations for machine learning pipelines that can scale easily in a distributed environment.  ...  of text preprocessing at large scale and connecting the dots between various steps of solving a data science problem with NLP.  ...  records is unstructured making it largely inaccessible for statistical analysis [5].  ... 
arXiv:2101.10848v1 fatcat:niua3vh3ujcwtge3m47e5entva

Intelligent Financial Fraud Detection Practices: An Investigation [chapter]

Jarrod West, Maumita Bhattacharya, Rafiqul Islam
2015 Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering  
This paper presents a comprehensive investigation on financial fraud detection practices using such data mining methods, with a particular focus on computational intelligence-based techniques.  ...  Financial fraud is an issue with far reaching consequences in the finance industry, government, corporate sectors, and for ordinary consumers.  ...  US companies Text mining with singular validation decomposition vector 95.65% [5] Financial statement fraud with managerial statements for US companies Text mining Text mining and support  ... 
doi:10.1007/978-3-319-23802-9_16 fatcat:5vghiqbtfnaj3p6zaam3l3icta

Intelligent Financial Fraud Detection Practices: An Investigation [article]

J. West, Maumita Bhattacharya, R. Islam
2015 arXiv   pre-print
This paper presents a comprehensive investigation on financial fraud detection practices using such data mining methods, with a particular focus on computational intelligence-based techniques.  ...  Financial fraud is an issue with far reaching consequences in the finance industry, government, corporate sectors, and for ordinary consumers.  ...  Finally, further research into the differences between each type of financial fraud could lead to a generic framework which would greatly enhance the scope of intelligent detection methods for this problem  ... 
arXiv:1510.07165v1 fatcat:jgvqfrowwnf27adzulock66sve

Sentiment Analysis: A Contrastive Study of its Techniques

2020 International journal for research in engineering application & management  
Text Mining, Information and Coding The concept of text mining involves the process of mining the data and extracting the required data from it.  ...  The corpus-based approach requires a large dataset to detect the polarity (positive, negative or neutral) and sentiment analysis of the text whereas a dictionary-based approach is used to measure the feeling  ... 
doi:10.35291/2454-9150.2020.0325 fatcat:yjkh5wlwfrbx3gq7elp72bys7i

Artificial Intelligence and Big Data in Fraud Detection

2021 EURAS Journal of Engineering and Applied Sciences  
On the other hand, artificial intelligence methods are used in fraud detection for increasing the efficiency of corporations.  ...  There are ten artificial intelligence methods explained which are used for fraud detection. Each method has its unique bases and it cannot be said that there is only one optimal method.  ...  Financial statement fraud with managerial statements for US companies: Text mining 45.08-75.41%; Text mining and support vector machine hybrid 50.00-81.97%. Financial statement Text mining and decision  ... 
doi:10.17932/ejeas.2021.024/ejeas_v01i2001 fatcat:l7zdy6bfujfllkdb4pcrmpzbou

Longitudinal Analytics on Web Archive Data: It's About Time!

Gerhard Weikum, Nikos Ntarmos, Marc Spaniol, Peter Triantafillou, András A. Benczúr, Scott Kirkpatrick, Philippe Rigaux, Mark Williamson
2011 Conference on Innovative Data Systems Research  
For example, tracking and analyzing a politician's public appearances over a decade is much harder than mining frequently used query words or frequently clicked URLs for the last month.  ...  The timestamp annotations and the sheer volume of multi-modal content constitute a gold mine for analysts of all sorts, across different application areas, from political analysts and marketing agencies  ...  ., by company acquisitions or mergers, this is a grand challenge for large-scale longitudinal analytics.  ... 
dblp:conf/cidr/WeikumNSTBKRW11 fatcat:ycvlhbkqdjffhcv6k4nb2gi63i

Sentiment Analysis of Microtakaful Industry: Comparison between Indonesia and Malaysia

Aam Slamet Rusydiana, Irman Firmansyah, Lina Marlina
2019 International Journal of Nusantara Islam  
Data were analyzed using the software Semantria as an analytical tool in the form of text.  ...  Media Applications Online Text mining is being used by large media companies, such as the Tribune company, to eliminate ambiguous information and to provide the reader with a better search experience,  ...  Mentioned that the increasingly widespread use of social networks like Twitter makes social networking such as very large data.  ... 
doi:10.15575/ijni.v6i1.3004 fatcat:x5m5cfzozbgrxmiqtpkzgozdye

SocialSpamGuard: A Data Mining-Based Spam Detection System for Social Media Networks

Xin Jin, Cindy Xide Lin, Jiebo Luo, Jiawei Han
2011 Proceedings of the VLDB Endowment  
We employ our GAD clustering algorithm for large scale clustering and integrate it with the designed active learning algorithm to deal with the scalability and real-time detection challenges.  ...  In this demo, we propose SocialSpamGuard, a scalable and online social media spam detection system based on data mining for social network security.  ...  However, one of the major challenges of spam detection in social media is that the spams are usually in the form of photos and text, and in the context of large scale dynamic social network.  ... 
dblp:journals/pvldb/JinLLH11 fatcat:syec47lac5guzmsjp7akcf5bpa

An Advanced Press Review System Combining Deep News Analysis and Machine Learning Algorithms

Danuta Ploch, Andreas Lommatzsch, Florian Schultze
2016 Proceedings of ACL-2016 System Demonstrations  
The system enables us to demonstrate the live analyses of news and social media streams as well as the strengths of advanced text mining algorithms for creating a comprehensive media analysis.  ...  In this demo we present a system that combines advanced text mining and machine learning approaches in an extensible press review system.  ...  An exemplary application for large scale news analysis is LYDIA. LYDIA focuses on named entity detection.  ... 
doi:10.18653/v1/p16-4019 dblp:conf/acl/PlochLS16 fatcat:3blisgl2ifd4xnnwfemk3manvu

Adapting text mining tools to noisy text

Diana Maynard
2019 Zenodo  
Invited talk given at Text Mining for Science Studies Workshop, Berlin  ...  Acknowledgements This work is supported by the European Union/EU under the Information and Communication Technologies (ICT) theme of the 7th Framework and H2020 Programmes for R&D:  ...  • GATE: a framework for text engineering that, at last count, comes with 87 different plugins • GCP: The GATECloud Paralleliser for large-scale multi-threaded processing • Mímir: a GATE based indexing  ... 
doi:10.5281/zenodo.3609869 fatcat:iiiph45jvnakraet6bk6ovew34

Mining Structures from Massive Text Data: A Data-Driven Approach

Jiawei Han
2017 Symposium on Information Management and Big Data  
The real-world big data are largely unstructured, interconnected, and in the form of natural language text.  ...  We propose a text mining approach that requires only distant supervision or minimal supervision but relies on massive data.  ...  Government is authorized to reproduce and distribute reprints for Government purposes notwithstanding any copyright notation hereon.  ... 
dblp:conf/simbig/Han17 fatcat:pn573jbtavavjimlmqxx43gmra