Database Integrated Analytics Using R: Initial Experiences with SQL-Server + R

Josep Ll. Berral, Nicolas Poggi
2016 2016 IEEE 16th International Conference on Data Mining Workshops (ICDMW)  
Most data scientists use nowadays functional or semi-functional languages like SQL, Scala or R to treat data, obtained directly from databases.  ...  Here we show a first taste of such technology by testing the portability of our ALOJA-ML analytics framework, coded in R, to Microsoft SQL-Server 2016, one of the SQL+R solutions released recently.  ...  Here we describe our first approach to integrate our R-based analytics engine into a SQL+R platform, the Microsoft SQL-Server 2016, primarily looking at the user experience, and discussing bout cases of  ... 
doi:10.1109/icdmw.2016.0009 dblp:conf/icdm/BerralP16 fatcat:izibngwkhnf2xanzxcrcs3kbwi

MCDB-R

Subi Arumugam, Fei Xu, Ravi Jampani, Christopher Jermaine, Luis L. Perez, Peter J. Haas
2010 Proceedings of the VLDB Endowment  
literature to a database setting.  ...  In this paper, we extend the Monte Carlo Database System to efficiently obtain a set of samples from the tail of a query-result distribution by adapting recent "Gibbs cloning" ideas from the simulation  ...  Using the TPC-H database (scalefactor = 10) on an 8-core server, we ran the former query using Each TS-seed is used to produce 1000 random values initially.  ... 
doi:10.14778/1920841.1920941 fatcat:2b5yjhb44fcgvp47heto47dc2u

WaterML R package for managing ecological experiment data on a CUAHSI HydroServer

Jiří Kadlec, Bryn StClair, Daniel P. Ames, Richard A. Gill
2015 Ecological Informatics  
The resulting system allows research scientists to use a familiar statistical computation environment, R, together with the open source HydroServer software (for data archival and sharing).  ...  The system is tested in the context of data collected as part of a large ecological manipulation experiment.  ...  Because it is integrated directly into the R statistical software, our package can be installed and used on any operating system with an internet connection.  ... 
doi:10.1016/j.ecoinf.2015.05.002 fatcat:5kryx2n3enhwxhztfzx3ly2xty

Profiling R on a contemporary processor

Shriram Sridharan, Jignesh M. Patel
2014 Proceedings of the VLDB Endowment  
Addressing these issues should allow R programs to run faster than they do today, and allow R to be used for analyzing even larger datasets.  ...  All data and code that is used in this paper (which includes the R programs, and changes to the R source code for instrumentation) can be found at:  ...  Oracle R Enterprise (ORE) [4] integrates R with the Oracle database to provide in-database analytic capabilities for R and Oracle users.  ... 
doi:10.14778/2735471.2735478 fatcat:3ekitklp2rhlniavmmx4zd3gsq

Integration of R Statistical Environment into ICT Infrastructure of GMP and GENASIS [chapter]

Richard Hůlek, Jiří Kalina, Ladislav Dušek, Jiří Jarkovský
2013 IFIP Advances in Information and Communication Technology  
In order to evaluate a dataset on POPs concentrations from the initial GMP campaign, it was essential to use advanced statistical methods which are not incorporated in commonly used database languages.  ...  statistical software R.  ...  Methods of Integration R can be effectively embedded with other programming languages and environments.  ... 
doi:10.1007/978-3-642-41151-9_23 fatcat:n6thp754jvcstod3xzu5vsfk5a

Enhanced Mobile caching in Edge using Signal R and AES

Dr. T. Jaya, R. S. Aashmi
2019 Zenodo  
In our system, Signal R technique is used, where the cache data is shared among co-operative users. The hub connection among the co-operative users initiated by creating a server.  ...  Advanced Encryption Standard (AES) is used to encrypt and decrypt the data amid the co-operative users.  ...  SignalR applications can measure out to thousands of clients with Service Bus, SQL Server or Redis. SignalR method together with AES encryption and decryption algorithm.  ... 
doi:10.5281/zenodo.3598660 fatcat:dara76a24bhgzh255oldq6lzva

Building an R&D chemical registration system

Elyette Martin, Aurélien Monge, Jacques-Antoine Duret, Federico Gualandi, Manuel C Peitsch, Pavel Pospisil
2012 Journal of Cheminformatics  
Here, we present the concept and methodology we used to build the system that we call the Unique Compound Database (UCD).  ...  In order to store and manage thousands of chemical compounds in such an environment, we have built a state-of-the-art master chemical database with unique structure identifiers.  ...  Acknowledgements The authors express their gratitude to Peter Hliva for developing a Java component for Pipeline Pilot which uses the Java API of Accelrys Cheshire and to Lynda Conroy for editing the manuscript  ... 
doi:10.1186/1758-2946-4-11 pmid:22650418 pmcid:PMC3430593 fatcat:f4wwfya6tjdsznzmogchhcv7ha

Upgrading the Business Intelligence System by Implementing the Decision Tree Model in the R Software Package

Jordan ATANASIJEVIC, Danijela MILOSEVIC
2020 Studies in Informatics and Control  
Decision makers will be able to use the proposed solution to make decisions with confidence even if they don't possess the pertinent IT knowledge.  ...  Certain changes have taken place in the world of research in recent years, and open-source software packages are now most commonly used for statistical surveys.  ...  The code used for importing library to Shiny application and creating Open Database Connectivity (ODBC) from R to SQL Server Management Studio is presented below.  ... 
doi:10.24846/v29i2y202009 fatcat:f2r4gmr5jzbedj7ifbmmp4dgha

Accelerating Relational Databases by Leveraging Remote Memory and RDMA

Feng Li, Sudipto Das, Manoj Syamala, Vivek R. Narasayya
2016 Proceedings of the 2016 International Conference on Management of Data - SIGMOD '16  
In all our experiments, we use a high-performance enterprise-grade disk subsystem with a hardware RAID-0 controller and up to 20 disks.  ...  We conduct extensive experiments using a commodity RDMA-enabled cluster of ten servers, and using a variety of configurations, targeted micro-benchmarks and industry-standard TPC benchmarks (Section 5).  ...  Acknowledgements We would like to thank Miguel Castro, Aleksandar Dragojević, and Dushyanth Narayanan for providing the code to implement RDMA transfers using NDSPI which we use in our custom implementation  ... 
doi:10.1145/2882903.2882949 dblp:conf/sigmod/LiDSN16 fatcat:czojcfylinej7dwklezw6lwnty

When Database Systems Meet the Grid [article]

Maria A. Nieto-Santisteban, Alexander S. Szalay, Aniruddha R. Thakar, William J. O'Mullane, Jim Gray, James Annis
2005 arXiv pre-print
Using a cluster of SQL servers, we reimplemented an existing Grid application that finds galaxy clusters in a large astronomical database.  ...  The SQL implementation runs an order of magnitude faster than the earlier Tcl-C-file-based implementation. We discuss why and how Grid applications can take advantage of database systems.  ...  In current Grid projects, databases and database systems are typically used only to access and integrate data, but not to perform analytic or computational tasks.  ... 
arXiv:cs/0502018v1 fatcat:52xi4nfh3rf2ri33ubutnnosey

Integration of Cassandra and Spark in Computer Aided Drug Design

Nitha V R
2021 International Journal of Scientific Research in Computer Science Engineering and Information Technology  
A data analytics tool "spark" can be efficiently used in mining and managing huge data stored in the database.  ...  The Apache Cassandra database is a big data management tool which can be used to store huge amount of data in different file formats.  ...  It also supports SQL queries, Streaming data, Machine learning (ML), and Graph algorithms. Spark SQL includes a server mode with industry standard JDBC and ODBC connectivity.  ... 
doi:10.32628/cseit217112 fatcat:eb2iwav64fa5hjakupk2defpca

Towards A Data Warehouse Testing Framework

Bharath Kumar R, Nachiyappan.S
2017 Zenodo  
This Data warehouse is a non-standard data collection with administrative support in object-oriented, integrated, time-variant and decision making process.  ...  Here we use the ETL function. Advanced testing methods considered to be the best practices are the effects of this process. The test process results in the passing phase of the main database.  ...  Initially, the data stored in data sources (DS) is taken, converted and loaded in the database warehouse (DW).  ... 
doi:10.5281/zenodo.1050418 fatcat:j52w7mjrd5bp7hohijvhawo66e

The COMET Sleep Research Platform

Deborah A. Nichols, Steven DeSalvo, Ric Miller, Darrell Jonsson, Kara S. Griffin, Pamela R. Hyde, James K. Walsh, Clete A. Kushida
2014 eGEMs  
The platform also provides medical researchers the ability to visualize and interpret data using business intelligence (BI) tools.  ...  SQL Server Agent is a scheduling tool included with Microsoft SQL Server that allows for the scheduling of tasks and scripts against the SQL Server databases.  ... 
doi:10.13063/2327-9214.1059 pmid:25848590 pmcid:PMC4371444 fatcat:w2i44vnjs5gfthal6lr4qofmqe

DoT(Database for IoT): Requirements and Selection Criteria

Trupti Gurav, R. A.
2017 International Journal of Computer Applications  
However there is a little work done on data management as well as the type of database systems (DoT: Database for IoT) that must be used in IoT.  ...  DoT requirements are different than the traditional database requirements.  ...  The ability to execute analytics on #1 3. The ability to integrate analytics on #1 with analytics on previously known data #1 is handled well by NoSQL DBMSs. Or RDBMS via DDL.  ... 
doi:10.5120/ijca2017913021 fatcat:q3t3tojr7ffwhhiwznttmhkpvi

The Longhorn Array Database (LAD): an open-source, MIAME compliant implementation of the Stanford Microarray Database (SMD)

Patrick J Killion, Gavin Sherlock, Vishwanath R Iyer
2003 BMC Bioinformatics  
The Longhorn Array Database (LAD) is a MIAME compliant microarray database that operates on PostgreSQL and Linux.  ...  It is a fully open source version of the Stanford Microarray Database (SMD), one of the largest microarray databases.  ...  This work was supported by an NIH INIA Program (Integrative Neuroscience Initiative on Alcoholism) grant AA13518. P.J.K. was supported in part by a pre-doctoral NIAAA-Alcohol Training Grant.  ... 
doi:10.1186/1471-2105-4-32 pmid:12930545 pmcid:PMC194174 fatcat:gijrnimgw5dwrlrejrnzg6jxtm