Table servers protect confidentiality in tabular data releases

Alan F. Karr, Adrian Dobra, Ashish P. Sanil
2003 Communications of the ACM  
Here we describe table servers being developed by the National Institute of Statistical Sciences (NISS) that disseminate tabular summaries of statistical data in response to user queries for marginal sub-tables  ...  The query history database, with tables for users, queries and the time trajectories of RF (t) and UF (t), is maintained in a MySQL database server.  ... 
doi:10.1145/602421.602451 fatcat:hasvzptuqba73my6aw4npeijty

A Taxonomy of Inaccurate Summaries and Their Management in OLAP Systems [chapter]

John Horner, Il-Yeol Song
2005 Lecture Notes in Computer Science  
In this paper, we present a taxonomy of inaccurate summary factors and practical rules for handling them.  ...  We consolidate relevant terms and concepts in statistical databases with those in OLAP systems and explore factors that are important for measuring the impact of erroneous summaries.  ...  Table 4. Managing Data Problems (columns: Type, Queries Impacted, Management). Biased: Any exploratory query that aggregates biased measures; Comparison queries where the  ... 
doi:10.1007/11568322_28 fatcat:oijetfqwmza75k5pdzzhaqqvza

SOFTWARE SYSTEMS FOR TABULAR DATA RELEASES

ADRIAN DOBRA, ALAN F. KARR, ASHISH P. SANIL, STEPHEN E. FIENBERG
2002 International Journal of Uncertainty Fuzziness and Knowledge-Based Systems  
We describe two classes of software systems that release tabular summaries of an underlying database.  ...  Optimal tabular releases are static releases of sets of sub-tables that are characterized by maximizing the amount of information released, as given by a measure of data utility, subject to a constraint  ...  Table Servers Servers disseminate tabular summaries of statistical data in response to user queries for marginal sub-tables of a large (e.g., 40 dimensions with 4 categories each) contingency table containing  ... 
doi:10.1142/s0218488502001624 fatcat:vuobpgz4dna2nlzig6x4cksime

GeneMesh: a web-based microarray analysis tool for relating differentially expressed genes to MeSH terms

Saurin D Jani, Gary L Argraves, Jeremy L Barth, W Scott Argraves
2010 BMC Bioinformatics  
Conclusions: GeneMesh is a versatile web-based tool for testing and developing new hypotheses through relating genes in a query set (e.g., differentially expressed genes from a DNA microarray experiment  ...  Gene summaries, gene ontologies, intermolecular interactions, overlays of genes onto KEGG pathway diagrams and heatmaps of expression intensity values.  ...  We also thank the MUSC Computational Biology Resource Center http://cbrc.musc.edu/homepage/CBRC_1_index.html for providing access to computational infrastructure used to operate GeneMesh.  ... 
doi:10.1186/1471-2105-11-166 pmid:20359363 pmcid:PMC3212930 fatcat:avxd67fwt5hxvjknbub3vlm7fq

NCBI's Conserved Domain Database and Tools for Protein Domain Analysis

Mingzhang Yang, Myra K. Derbyshire, Roxanne A. Yamashita, Aron Marchler‐Bauer
2019 Current Protocols in Bioinformatics  
The CDD curation effort increases coverage and provides finer-grained classifications of common and widely distributed protein domain families, for which a wealth of functional and structural data have  ...  The CDD maintains both live search capabilities and an archive of pre-computed domain annotations for a selected subset of sequences tracked by the NCBI's Entrez protein database.  ...  Lanczycki, Shennan Lu, Jiyao Wang, and Dachuan Zhang; and Renata Geer for composing the comprehensive online CDD Help documentation.  ... 
doi:10.1002/cpbi.90 pmid:31851420 pmcid:PMC7378889 fatcat:lr3pkrp5qjfu5f76wiqs5tof5e

Data Preparation [chapter]

Tom Pollard, Franck Dernoncourt, Samuel Finlayson, Adrian Velasquez
2016 Secondary Analysis of Electronic Health Records  
relational databases and plain text data files. • Understand the key concepts of reproducible research. • Get practical experience in querying a medical database.  ...  Learning Objectives • Become familiar with common categories of medical data. • Appreciate the importance of collaboration between caregivers and data analysts. • Learn common terminology associated with  ...  Each hospital brings its own biases to the data too.  ... 
doi:10.1007/978-3-319-43742-2_11 fatcat:ec3exvq7szgydmtbxyn4ty2v3y

A Study of Snippet Length and Informativeness

David Maxwell, Leif Azzopardi, Yashar Moshfeghi
2017 Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval - SIGIR '17  
Acknowledgments Our thanks to Alastair Maxwell and Stuart Mackie for their comments, the 53 participants of this study, and the anonymous reviewers for their feedback. The lead author is financially supported  ...  were therefore: T0 where only the title for each result summary were presented; T1 where for each result summary, a title and one query-biased snippet fragment were presented; T2 where a title and two  ...  Work initially focused upon how these summaries should be generated [30, 31, 39, 48, 51]. These early works proposed the idea of summarising documents with respect to the query (query-biased summaries)  ... 
doi:10.1145/3077136.3080824 dblp:conf/sigir/MaxwellAM17 fatcat:kcuxs7js3vcktoeyyzakgcvzxi

Hillview

Mihai Budiu, Parikshit Gopalan, Lalith Suresh, Udi Wieder, Han Kruiger, Marcos K. Aguilera
2019 Proceedings of the VLDB Endowment  
Vizketches combine algorithmic techniques for data summarization with computer graphics principles for efficient rendering.  ...  Hillview is a distributed spreadsheet for browsing very large datasets that cannot be handled by a single machine.  ...  Hillview introduces a new query execution engine specialized to render tabular views and charts for a spreadsheet.  ... 
doi:10.14778/3342263.3342279 fatcat:ouj3b26gpzckjfti6jrlmguytq

Exploring Visualization of Data Transforms

Larry Xu
2016 Proceedings of the 2016 International Conference on Management of Data - SIGMOD '16  
In the context of data exploration, users often interact with relational database systems in an interactive query session to form useful insights.  ...  Each query a user executes can potentially transform a resultset in complex ways.  ...  A query Q transforms resultset R_old into a new resultset R_new where Q is a standard SQL query and each resultset is tabular data visually represented as a standard two-dimensional grid.  ... 
doi:10.1145/2882903.2914837 dblp:conf/sigmod/Xu16 fatcat:b3gfntjnmjam7cihyqp5owlzwm

Tabular

Andrew D. Gordon, Thore Graepel, Nicolas Rolland, Claudio Russo, Johannes Borgstrom, John Guiver
2014 Proceedings of the 41st ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages - POPL '14  
The ability to query for missing values provides a uniform interface to a wide variety of tasks, including classification, clustering, recommendation, and ranking.  ...  We describe a detailed design of our language, Tabular, complete with formal semantics and type system. A rich series of examples illustrates the expressiveness of Tabular.  ...  We would like to thank John Rust and Michal Kosinski from the Cambridge Psychometrics Centre as well as Pearson Assessments for providing the IQ dataset for research purposes.  ... 
doi:10.1145/2535838.2535850 dblp:conf/popl/GordonGRRBG14 fatcat:kt5jab5eqngqjk3pbbxlaoh24q

Attention Augmented Convolutional Transformer for Tabular Time-series [article]

Sharath M Shankaranarayana, Davor Runje
2021 arXiv   pre-print
In this work, we propose a novel scalable architecture for learning representations from tabular time-series data and subsequently performing downstream tasks such as time-series classification.  ...  Time-series classification is one of the most frequently performed tasks in industrial data science, and one of the most widely used data representation in the industrial setting is tabular representation  ...  In summary, the main contributions of our paper are as follows: • We propose a novel BERT framework employing attention augmented convolutions for time-series tabular data TabAConvBERT. • We propose a  ... 
arXiv:2110.01825v1 fatcat:vmo5zf5evzb6njahzcrzwa777y

Tabular Transformers for Modeling Multivariate Time Series [article]

Inkit Padhi, Yair Schiff, Igor Melnyk, Mattia Rigotti, Youssef Mroueh, Pierre Dognin, Jerret Ross, Ravi Nair, Erik Altman
2021 arXiv   pre-print
to GPT and can be used for generation of realistic synthetic tabular sequences.  ...  This results in two architectures for tabular time series: one for learning representations that is analogous to BERT and can be pre-trained end-to-end and used in downstream tasks, and one that is akin  ...  Acknowledgements Authors acknowledge the MIT-IBM Watson AI lab and Wells Fargo's Sherif Boutros and Vanio Markov for fruitful discussions and feedback on this work.  ... 
arXiv:2011.01843v2 fatcat:h2fmupnluvdfdbzaifgmevo7tm

COSMIC 2005

S Forbes, J Clements, E Dawson, S Bamford, T Webb, A Dogan, A Flanagan, J Teague, R Wooster, P A Futreal, M R Stratton
2006 British Journal of Cancer  
The COSMIC web site has been expanded to give more views and summaries of the data and provide faster query routes and downloads.  ...  COSMIC has been expanded and now holds data previously reported in the scientific literature for 28 known cancer genes.  ...  ACKNOWLEDGEMENTS We thank Francis Martin, Andrew King and Joan Green in the Sanger Institute library for their continued support and The Wellcome Trust for funding this work.  ... 
doi:10.1038/sj.bjc.6602928 pmid:16421597 pmcid:PMC2361125 fatcat:b357kb5mbbggfmi422oqtkw2gy

PhosPhAt: a database of phosphorylation sites in Arabidopsis thaliana and a plant-specific phosphorylation site predictor

J. L. Heazlewood, P. Durek, J. Hummel, J. Selbig, W. Weckwerth, D. Walther, W. X. Schulze
2007 Nucleic Acids Research  
ACKNOWLEDGEMENTS We would like to thank Wolfgang Engelsberger for providing feedback regarding the usability and design of the database.  ...  The authors would also like to thank Robert Schmidt for the rapid implementation of changes in the database website after review of the manuscript.  ...  ', and (ii) displaying a summary of phosphorylation site prediction of one locus with a concurrent display of experimental sites via the tab 'Query Prediction Data'.  ... 
doi:10.1093/nar/gkm812 pmid:17984086 pmcid:PMC2238998 fatcat:yorbz27hvjad5pvcetdm2nd67a

The MRC IEU OpenGWAS data infrastructure [article]

Benjamin L Elsworth, Matthew S Lyon, Tessa Alexander, Yi Liu, Peter Matthews, Jon Hallett, Phil Bates, Tom Palmer, Valeriia Haberland, George Davey Smith, Jie Zheng, Philip Haycock (+2 others)
2020 bioRxiv   pre-print
for the scientific community.  ...  Users can access the data via a website, an application programming interface, R and Python packages, and also as downloadable files that can be rapidly queried in high performance computing environments  ...  Each GWAS summary dataset is available for download in GWAS-VCF format [23], designed to optimise data fidelity and query speed relative to standard tabular text files.  ... 
doi:10.1101/2020.08.10.244293 fatcat:54woielze5cbtfj4rk2ucd4udq