Pattern recognition and machine learning

1993 ChoiceReviews  
The software frameworks Hadoop, GridGain and Oracle Coherence are reviewed and evaluated with respect to their suitability to fit into the context of RapidMiner.  ...  However, it has only limited support for parallelization and it lacks functionality to spread long-running computations over multiple machines.  ...  In cases like this Hadoop probably performs better since it is thought to do performant disk reads in large scale data extensive applications [45].  ...
doi:10.5860/choice.30-3308 fatcat:qunga7zxtfhjze2rmcvkhetp44

Pattern Recognition and Machine Learning

2007 Journal of Electronic Imaging (JEI)  
The software frameworks Hadoop, GridGain and Oracle Coherence are reviewed and evaluated with respect to their suitability to fit into the context of RapidMiner.  ...  However, it has only limited support for parallelization and it lacks functionality to spread long-running computations over multiple machines.  ...  In cases like this Hadoop probably performs better since it is thought to do performant disk reads in large scale data extensive applications [45].  ...
doi:10.1117/1.2819119 fatcat:5cp7aa2hyrg6hjyddqvvvws7di

Pattern recognition and machine learning

2007 ChoiceReviews  
The software frameworks Hadoop, GridGain and Oracle Coherence are reviewed and evaluated with respect to their suitability to fit into the context of RapidMiner.  ...  However, it has only limited support for parallelization and it lacks functionality to spread long-running computations over multiple machines.  ...  In cases like this Hadoop probably performs better since it is thought to do performant disk reads in large scale data extensive applications [45].  ...
doi:10.5860/choice.44-5091 fatcat:sz2zpdbqznee5dr45vrxk37nbi

The Enron Corpus: Where the Email Bodies are Buried? [article]

David Noever
2020 arXiv   pre-print
Where possible, we compare accuracy against execution times for 51 algorithms and report human-interpretable business rules that can scale to vast datasets.  ...  First, we identify persons of interest (POI), using financial records and email, and report a peak accuracy of 95.7%.  ...  Acknowledgements The authors would like to thank the PeopleTec Technical Fellows program for encouragement and project assistance.  ... 
arXiv:2001.10374v1 fatcat:6tttlzlqfvfypjbzy2mcachup4

List of Contributors [chapter]

Shibakali Gupta, Indradip Banerjee, Siddhartha Bhattacharyya
2019 Big Data Security  
KU is a collaborative initiative designed to make high quality books Open Access. More information about the initiative can be found at www.knowledgeunlatched.org  ...  An electronic version of this book is freely available, thanks to the support of libraries working with Knowledge Unlatched.  ...  With this information, a naive hyperlink in a spam e-mail can be twisted to disclose not only his individual minutiae but can also be enticed into providing corporate authorizations and thus providing  ... 
doi:10.1515/9783110606058-202 fatcat:3jtqdtgsavas7n3vxrtbrkdbdy

Data Compression Techniques on Text Files: A Comparison Study

Haroon Altarawneh, Mohammad Altarawneh
2011 International Journal of Computer Applications  
There is a real need to save allocated space for this content as well as allowing more efficient usage, searching, and retrieving information operations on this content.  ...  Acknowledgements: Contributors to the research other than authors credited should be mentioned under acknowledgement.  ...
doi:10.5120/3097-4249 fatcat:obfx7im63zgapg5pro7ol3erxa

A comprehensive survey on machine learning for networking: evolution, applications and research opportunities

Raouf Boutaba, Mohammad A. Salahuddin, Noura Limam, Sara Ayoubi, Nashid Shahriar, Felipe Estrada-Solano, Oscar M. Caicedo
2018 Journal of Internet Services and Applications  
This survey is original, since it jointly presents the application of diverse ML techniques in various key areas of networking across different network technologies.  ...  In this way, readers will benefit from a comprehensive discussion on the different learning paradigms and ML techniques applied to fundamental problems in networking, including traffic prediction, routing  ...  Acknowledgments We thank the anonymous reviewers for their insightful comments and suggestions that helped us improve the quality of the paper.  ... 
doi:10.1186/s13174-018-0087-2 fatcat:jvwpewceevev3n4keoswqlcacu

Comparative Analysis of Different Machine Learning Classifiers for the Prediction of Chronic Diseases [chapter]

Rajesh Singh, Anita Gehlot, Dharam Buddhi
2022 Comparative Analysis of Different Machine Learning Classifiers for the Prediction of Chronic Diseases  
Precise diagnosis of these diseases on time is very significant for maintaining a healthy life.  ...  This paper forms the basis of understanding the difficulty of the domain and the amount of efficiency achieved by the various methods recently.  ...  However, still further research is needed to overcome the challenges and make the Stirling engine commercially viable and cost effective for multiple applications.  ...
doi:10.13052/rp-9788770227667 fatcat:da47mjbbyzfwnbpde7rgbrlppe

Hybrid machine learning architecture for phishing email classification [article]

Konstantinos Koutroumpouchos, Christos Xenakis, University of Piraeus
2020
as input for the classification process and (b) A hybrid "stacked" architecture, which has two classifiers: the first one classifies the email using text-based features and the second one (which outputs  ...  After developing an algorithm for testing every combination of classifier, their different hyperparameter values and the different architectures, it was found that, compared to the classification using  ...  A typical type of data is uninterpreted binary data or other data which are to be processed by a mail-based application. Other uses include spreadsheets and data for mail-based scheduling systems.  ... 
doi:10.26267/unipi_dione/36 fatcat:6znkifhskncnpnx5efddt2sj4m

The Third International Conference on Data Analytics

Fritz Laux, Panos Pardalos, Alain Crolotte, Fritz Laux, Lina Yao, Eiko Yoneki, Takuya Yoshihiro, Wakayama University, Japan, Sergio Ilarri, Prabhat Mahanti, Dominique Laurent (+98 others)
Forward The Third International Conference on Data Analytics (DATA ANALYTICS 2014)   unpublished
Processing of terabytes to petabytes of data, or incorporating non-structural data and multi-structured data sources and types require advanced analytics and data science mechanisms for both raw and partially-processed  ...  We hope the DATA ANALYTICS 2014 was a successful international forum for the exchange of ideas and results between academia and industry and to promote further progress in data analytics.  ...  Many methods have been proposed for exact search, but they all suffer from the curse of dimensionality and are, thus, not applicable to high dimensional spaces.  ... 
fatcat:ofbiffnczjaqzioc67koijiis4

Consortial certification processes – The Goportis digital archive. A case study [article]

Yvonne Friese, Thomas Gerdes, Franziska Schwab, Thomas Bähr, University, My, University, My
2019
The Goportis Consortium successfully applied for the Data Seal of Approval (DSA) [1] and is currently working on the application for the nestor Seal [2].  ...  This way it could serve as best-practice example for other institutions interested in consortial certification.  ...  We aim for this to include a working Archivematica instance configured using the test dataset and exports from the service, for example provided as a bootable drive.  ... 
doi:10.34657/605 fatcat:q4wskqmipvbn7bz7dgazxvo4o4

Special Issue: Grid, Cloud and Sky Applications for Knowledge-based Industries and Businesses

Vlado Stankovski, Dana Petcu, Anton Železnikar, Matjaž Gams, Jožef Stefan, Drago Torkar, Jožef Stefan, Juan Carlos, Augusto, Argentina, Costin Badica, Romania (+11 others)
2013 unpublished
In the final part of the paper, a case study on fingerprint recognition in the cloud and its integration into the e-learning environment Moodle is presented.  ...  data distribution and parallel processing capabilities.  ...  Acknowledgement The authors are grateful to the Associate Editor Maria Ganzha and the reviewer's valuable comments that improved the manuscript.  ... 
fatcat:hprkilpn3zaybgxrp4frkdwhme

An empirical evaluation of misconfiguration in Internet services [article]

Tobias Fiebig, Technische Universität Berlin, Technische Universität Berlin, Anja Feldmann
2017
In fact, there is a constant stream of new, complex techniques to ensure the confidentiality, integrity, and availability of data and systems.  ...  and authorization being configured even though it would have been available.  ...  distribution moved to using (compromised) mail servers, and sending spam emails via compromised mail accounts of legitimate users (Alazab and Broadhurst, 2017; Hu et al., 2016).  ...
doi:10.14279/depositonce-6140 fatcat:lvw4geuxrrgfhi3ms3t7m6pkl4

CLOUD COMPUTING 2011 Editors

Massimo Villari, Yong Lee, Wolf Zimmermann
2011 The Second International Conference on Cloud Computing, GRIDs, and Virtualization   unpublished
The copyright release is a transfer of publication rights, which allows IARIA and its partners to drive the dissemination of the published material.  ...  This allows IARIA to give articles increased visibility via distribution, inclusion in libraries, and arrangements for submission to indexes.  ...  And a record-and-replay scheme for this DCSM is designed.  ...
fatcat:ckpewilt4vcutmzrx2jtagih4q

Traffic microstructures and network anomaly detection [article]

Henry Clausen, University Of Edinburgh, David Aspinall, Gordon Ross
2022
The goal of DetGen is to provide researchers with extensive ground truth information and enable the generation of customisable datasets that provide realistic structural diversity.  ...  Many methods rely on features on a microscopic level such as packet sizes or interarrival times to identify reoccurring patterns and detect deviations from them.  ...  The dataset contains labelled traffic from three real attacks, corresponding to IP scanning and a spam mail campaign.  ...
doi:10.7488/era/2060 fatcat:lukha7hndnfbfe3gmq2stldkva