Towards High-Precision and Reusable Entity Resolution Algorithms over Sparse Financial Datasets

Douglas Burdick, Lucian Popa, Rajasekar Krishnamurthy
2016 Proceedings of the Second International Workshop on Data Science for Macro-Modeling - DSMM'16  
We describe our approach to the FEIII Data Challenge, which requires matching entities across multiple financial datasets (FFIEC, SEC and LEI).  ...  The high-level specification is reusable, in the sense that the same HIL specification (modulo changing the attribute names) is uniformly applicable not only between FFIEC and SEC, but also between FFIEC  ...  Regarding the entity resolution fragment of HIL, which we use in this data challenge, an important feature towards obtaining both high recall and high precision is the ability to define and compose multiple  ... 
doi:10.1145/2951894.2951909 dblp:conf/cikm/BurdickPK16 fatcat:7ee5mpjwurhkldt6dnynyhwnga

Sentiment analysis model for Twitter data in Polish language [article]

Karol Chlasta
2019 arXiv   pre-print
The Naive Bayes and Maximum Entropy algorithms achieved the best accuracy of respectively 71.76% and 77.32%. All implementation tasks were completed using R programming language.  ...  The score was calculated using the number of positive and/or negative emoticons and Polish words in each document.  ...  However, when trained on the matrix with a parameter sparse = 0.995 it provided almost 71% accuracy and high precision of 87.77%.  ... 
arXiv:1911.00985v1 fatcat:6ymavvqegfcpfean2ddirvh4we

Machine Learning for Microcontroller-Class Hardware – A Review [article]

Swapnil Sayan Saha, Sandeep Singh Sandha, Mani Srivastava
2022 arXiv   pre-print
Conventional machine learning deployment has high memory and compute footprint hindering their direct deployment on ultra resource-constrained microcontrollers.  ...  Finally, we identify the open research challenges and unsolved questions demanding careful considerations moving forward.  ...  • We illustrate a coherent and closed-loop ML model development and deployment workflow for microcontrollers. We delineate each block in the workflow, providing both  ... 
arXiv:2205.14550v2 fatcat:nuamusn3fzctlcrkolazkgq2ia

Augmented Understanding and Automated Adaptation of Curation Rules [article]

Alireza Tabebordbar
2020 arXiv   pre-print
Over the past years, there have been many efforts to curate and increase the added value of the raw data.  ...  To address these challenges, in this dissertation, we present techniques, algorithms and systems for augmenting analysts in curation tasks.  ...  is a time-consuming task and requires high-performance computing resources on very large datasets such as Twitter.  ... 
arXiv:2007.08710v1 fatcat:cw4ka6pzw5ev3hlfidpfllv5sy

A Surface Ocean CO2 Reference Network, SOCONET and Associated Marine Boundary Layer CO2 Measurements

Rik Wanninkhof, Penelope A. Pickers, Abdirahman M. Omar, Adrienne Sutton, Akihiko Murata, Are Olsen, Britton B. Stephens, Bronte Tilbrook, David Munro, Denis Pierrot, Gregor Rehder, J. Magdalena Santana-Casiano (+19 others)
2019 Frontiers in Marine Science  
These products and other derivatives using surface ocean and MBL  ...  and atmospheric MBLs.  ...  The detail and complexity of interpolation schemes differ significantly between the various approaches but they all aim to create pCO2 fields at high resolution from relatively sparse data (Figure 4  ... 
doi:10.3389/fmars.2019.00400 fatcat:domfkhempzhqfkzpgx7gjw2ezu

A Contemporary Review on Utilizing Semantic Web Technologies in Healthcare, Virtual Communities, and Ontology-Based Information Processing Systems

Senthil Kumar Narayanasamy, Kathiravan Srinivasan, Yuh-Chung Hu, Satish Kumar Masilamani, Kuo-Yi Huang
2022 Electronics  
As the world is heading towards the fourth industrial revolution, the implicit utilization of artificial-intelligence-enabled semantic web technologies paves the way for many real-time application developments  ...  gathered from heterogeneous sources receive a common nomenclature, and it paves the way for disambiguating the duplicates very easily.  ...  Furthermore, the semantic web and knowledge management can complement each other for resolving the ambiguities persisting in the text documents and addressing the challenges with a high precision rate.  ... 
doi:10.3390/electronics11030453 fatcat:rslqynwogrdtfnd52ydwwpckxe

Generating enhanced natural environments and terrain for interactive combat simulations (GENETICS)

William D. Wells
2005 Proceedings of the ACM symposium on Virtual reality software and technology - VRST '05  
the data needed, and completing and reviewing the collection of information.  ...  Reports, 1215 Jefferson Davis Highway, Suite 1204, Arlington, VA 22202-4302, and to the Office of Management and Budget, Paperwork Reduction Project (0704-0188) Washington DC 20503.  ...  core of the tree and high precision rendering of leaves on the outer edge.  ... 
doi:10.1145/1101616.1101655 dblp:conf/vrst/Wells05 fatcat:7zz2nynrpfchjba27dr7yfmbvu

Crowdsourced Data Management: Industry and Academic Perspectives

Adam Marcus, Aditya Parameswaran
2015 Foundations and Trends in Databases  
resolution, and video/audio/image processing.  ...  On the academic side, we summarize the state of the art in crowd-powered algorithms and system design tailored to large-scale data processing.  ...  Entity resolution or matching We now move onto entity resolution or matching.  ... 
doi:10.1561/1900000044 fatcat:rva7dinutnbnlj2hvm4j2hmhge

Recent advances in methods of lexical semantic relatedness – a survey

2012 Natural Language Engineering  
Resolving ambiguity concerns recognising the true referent entity of a name reference, essentially a further named entity 'recognition' step and often a compulsory pro-VI  ...  It is recognised that a fundamental task in Information Extraction is Named Entity Recognition, the goals of which are identifying references of named entities in unstructured documents, and classifying  ...  This light-weight process generates a sense-tagged corpus that is biased towards high-precision but low recall.  ... 
doi:10.1017/s1351324912000125 fatcat:b62qbqwrqfaf3gytw22yktc5ae

Statistical source expansion for question answering

Nico Schlaefer, Jennifer Chu-Carroll, Eric Nyberg, James Fan, Wlodek Zadrozny, David Ferrucci
2011 Proceedings of the 20th ACM international conference on Information and knowledge management - CIKM '11  
We evaluated the impact of source expansion on search performance and end-to-end accuracy using Watson and the OpenEphyra QA system, and datasets comprising over 6,500 questions from the Jeopardy!  ...  on rankings generated by a web search engine, and 75% when using a multi-document summarization algorithm.  ...  Some nuggets have high topicality scores just because they repeat topical terms over and over, but they provide little relevant information.  ... 
doi:10.1145/2063576.2063632 dblp:conf/cikm/SchlaeferCNFZF11 fatcat:whoy62klazctbdo4p57wbevkdu

Excavating the mother lode of human-generated text: A systematic review of research that uses the wikipedia corpus

Mohamad Mehdi, Chitu Okoli, Mostafa Mesgari, Finn Årup Nielsen, Arto Lanamäki
2017 Information Processing & Management  
Chu [18] proposed new approaches for handling sparse relational datasets, specifically data extracted from unstructured documents.  ...  Researchers have examined Wikipedia's evolution over the years in terms of content and community.  ... 
doi:10.1016/j.ipm.2016.07.003 fatcat:qgjeatizfzbyjkbo4rsuxea76y

Introduction to Linked Data and Its Lifecycle on the Web [chapter]

Axel-Cyrille Ngonga Ngomo, Sören Auer, Jens Lehmann, Amrapali Zaveri
2014 Lecture Notes in Computer Science  
With Linked Data, a very pragmatic approach towards achieving the vision of the Semantic Web has gained some traction in the last years.  ...  We conclude the chapter with a discussion of issues, limitations and further research and development challenges of Linked Data.  ...  This work was supported by a grant from the European Union's 7th Framework Programme provided for the projects LOD2 (GA no. 257943), GeoKnow (GA no. 318159) and the Eureka project SCMS.  ... 
doi:10.1007/978-3-319-10587-1_1 fatcat:2vba6rydjvehro3cwcakafnn5e

Driver Lane Change Intention Inference for Intelligent Vehicles: Framework, Survey, and Challenges

Yang Xing, Chen Lv, Huaji Wang, Hong Wang, Yunfeng Ai, Dongpu Cao, Efstathios Velenis, Fei-Yue Wang
2019 IEEE Transactions on Vehicular Technology  
Then, the lane change intention inference (LCII) system is reviewed from the perspective of input signals, algorithms, and evaluation.  ...  ., and with Vehicle ). E. Velenis is with the Advanced Vehicle Engineering  ...  The virtual facial images and videos can be generated based on the high-resolution 3D scans as used in [133].  ... 
doi:10.1109/tvt.2019.2903299 fatcat:a62io5i4cbf3nkhm6k5f2lecje

Principles and Practices of Robust, Photography-based Digital Imaging Techniques for Museums [article]

Mark Mudge, Carla Schroer, Graeme Earl, Kirk Martinez, Hembo Pagi, Corey Toler-Franklin, Szymon Rusinkiewicz, Gianpaolo Palma, Melvin Wachowiak, Michael Ashley, Neffra Matthews, Tommy Noble (+1 others)
2010 Eurographics Workshop on Graphics and Cultural Heritage  
The tutorial will present many examples of existing and cutting-edge uses of photography-based imaging including Reflectance Transformation Imaging (RTI), Algorithmic Rendering (AR), camera calibration  ...  The information is extracted from the photographic sequences by computer algorithms.  ...  The output is a high resolution color texture and normal map. Capturing RGBN Datasets There are several methods for acquiring RGBN datasets.  ... 
doi:10.2312/pe/vast/vast10s/111-137 dblp:conf/vast/MudgeSEMPTRPWAM10 fatcat:2kllism42veaxcwbgfxrijahre

Tackling Climate Change with Machine Learning [article]

David Rolnick, Priya L. Donti, Lynn H. Kaack, Kelly Kochanski, Alexandre Lacoste, Kris Sankaran, Andrew Slavin Ross, Nikola Milojevic-Dupont, Natasha Jaques, Anna Waldman-Brown, Alexandra Luccioni, Tegan Maharaj, Evan D. Sherwin, S. Karthik Mukkavilli (+6 others)
2019 arXiv   pre-print
From smart grids to disaster management, we identify high impact problems where existing gaps can be filled by machine learning, in collaboration with other fields.  ...  Climate change is one of the greatest challenges facing humanity, and we, as machine learning experts, may wonder how we can help.  ...  Foundation and Carnegie Mellon University (SES-00949710), US Department of Energy contract DE-FG02-97ER25308, the Natural Sciences and Engineering Research Council of Canada, and the MIT Media Lab Consortium  ... 
arXiv:1906.05433v2 fatcat:ykmqsivkbfcazaz3wl5f7srula