
Efficiently Enumerating Hitting Sets of Hypergraphs Arising in Data Profiling [article]

Thomas Bläsius, Tobias Friedrich, Julius Lischeid, Kitty Meeks, Martin Schirneck
2021 arXiv   pre-print
The transversal hypergraph problem is the task of enumerating the minimal hitting sets of a hypergraph. It is a long-standing open question whether this can be done in output-polynomial time.  ...  We apply our enumeration method to the discovery problem of minimal unique column combinations from data profiling.  ...  The authors would like to thank Felix Naumann and Thorsten Papenbrock for the many fruitful discussions about data profiling, and Erik Kohlros for conducting additional experiments.  ... 
arXiv:1805.01310v3 fatcat:zm626hj75javxjnay5it4ligse

Efficiently Enumerating Hitting Sets of Hypergraphs Arising in Data Profiling [chapter]

Thomas Bläsius, Tobias Friedrich, Julius Lischeid, Kitty Meeks, Martin Schirneck
2019 2019 Proceedings of the Twenty-First Workshop on Algorithm Engineering and Experiments (ALENEX)  
Despite these lower bounds, we provide empirical evidence showing that the enumeration outperforms the theoretical worst-case guarantee on hypergraphs arising in the profiling of relational databases,  ...  We devise an enumeration method for inclusion-wise minimal hitting sets in hypergraphs. It has delay O(m^(k*+1) · n^2) and uses linear space.  ...  On the other hand, |X| is usually small in instances arising in data profiling.  ... 
doi:10.1137/1.9781611975499.11 dblp:conf/alenex/Blasius0LMS19 fatcat:etvgphwm4ne3domx67nddmu3hq
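
To make the notion of an inclusion-wise minimal hitting set in the entry above concrete, here is a minimal brute-force sketch in Python. It is not the enumeration method of the cited paper and carries none of its delay or space guarantees; the function name `minimal_hitting_sets` and the toy hypergraph are assumptions made purely for illustration.

```python
from itertools import combinations


def minimal_hitting_sets(edges):
    """Brute-force enumeration of inclusion-wise minimal hitting sets.

    Illustrative only: runs in time exponential in the number of vertices.
    """
    edges = [frozenset(e) for e in edges]
    vertices = sorted(set().union(*edges)) if edges else []
    found = []
    # Candidates are tried by increasing size, so every minimal hitting set
    # is discovered before any of its proper supersets is examined.
    for size in range(len(vertices) + 1):
        for cand in combinations(vertices, size):
            s = frozenset(cand)
            if any(f <= s for f in found):
                continue  # s contains a smaller hitting set, so it is not minimal
            if all(s & e for e in edges):
                found.append(s)  # s intersects every edge
    return found


# Hypothetical toy hypergraph with edges {1, 2} and {2, 3}.
result = minimal_hitting_sets([{1, 2}, {2, 3}])
print(result)
```

On the toy input the minimal hitting sets are {2} and {1, 3}; {1, 2} also hits both edges but is discarded because it contains {2}.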

Algorithmic Enumeration: Output-sensitive, Input-Sensitive, Parameterized, Approximative (Dagstuhl Seminar 18421)

Henning Fernau, Petr A. Golovach, Marie-France Sagot, Michael Wagner
2019 Dagstuhl Reports  
Enumeration problems arise in a natural way in various fields of Computer Science, as, e.g., Artificial Intelligence and Data Mining, in Natural Sciences Engineering, Social Sciences, and Biology.  ...  Enumeration problems require to list all wanted objects of the input as, e.g., particular subsets of the vertex or edge set of a given graph or particular satisfying assignments of logical expressions.  ...  Despite the hardness results, we show that a careful implementation of the extension oracle can help avoiding the worst case on hypergraphs arising in the profiling of real-world databases, leading to  ... 
doi:10.4230/dagrep.8.10.63 dblp:journals/dagstuhl-reports/FernauGS18 fatcat:dwb3bv2onzanvgh6ucxh6atswm

A client‐side Web agent for document categorization

Daniel Boley, Maria Gini, Kyle Hastings, Bamshad Mobasher, Jerry Moore
1998 Internet Research  
In this paper, we describe the overall architecture of this agent and discuss the details of the algorithms within its key components.  ...  The principal novel components in this agent that make it possible are (i) a scalable hierarchical clustering algorithm and (ii) a taxonomic label generator.  ...  If the document already exists in the profile, the agent simply increments that document's number of hits by one and sets the start time to the current time.  ... 
doi:10.1108/10662249810241257 fatcat:6l6q4jxekncgrmry3h4oejgd5m

Efficiently Leveraging Multi-level User Intent for Session-based Recommendation via Atten-Mixer Network [article]

Peiyan Zhang, Jiayan Guo, Chaozhuo Li, Yueqi Xie, Jaeboum Kim, Yan Zhang, Xing Xie, Haohan Wang, Sunghun Kim
2022 arXiv   pre-print
Experiments on three benchmarks demonstrate the effectiveness and efficiency of our proposal.  ...  of more complicated models is the panacea for improving the empirical performance.  ...  In view of this phenomenon, a meaningful question naturally arises: Are those GNN-based models under- or over-complicated for SBR?  ... 
arXiv:2206.12781v1 fatcat:pdfxscsc4jhbhktohjctuy2jca

An Introduction to Metabolic Networks and Their Structural Analysis

V. Lacroix, L. Cottret, P. Thebault, M.-F. Sagot
2008 IEEE/ACM Transactions on Computational Biology & Bioinformatics  
There has been a renewed interest for metabolism in the computational biology community, leading to an avalanche of papers coming from methodological network analysis as well as experimental and theoretical  ...  This paper is focused on the structural aspects of metabolism only.  ...  Set 2: A + B ↔ C, C ↔ D. (b) Bipartite graph for the set of reactions: A + B ↔ C, C ↔ D. (c) Undirected hypergraph corresponding to the network: A + B ↔ C, C ↔ D.  ... 
doi:10.1109/tcbb.2008.79 pmid:18989046 fatcat:iomp7dhnvndgtlwhpeyc4dnvoa

In Silico Constraint-Based Strain Optimization Methods: the Quest for Optimal Cell Factories

Paulo Maia, Miguel Rocha, Isabel Rocha
2015 Microbiology and Molecular Biology Reviews  
The development of efficient cell factories that allow for competitive production yields is of paramount importance for this leap to happen.  ...  In this work, a thorough analysis of the main in silico constraint-based strain design strategies and algorithms is presented, their application in real-world case studies is analyzed, and a path for the  ...  An MCS "hitting" all target modes is termed a minimal hitting set, and the computation of the minimal hitting sets from the set of target modes can be performed using the Berge algorithm (129).  ... 
doi:10.1128/mmbr.00014-15 pmid:26609052 pmcid:PMC4711187 fatcat:f6w6ny2dgvdifobydgenzowkmu
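
The snippet above mentions computing minimal hitting sets with the Berge algorithm. As a hedged sketch of the sequential scheme usually attributed to Berge (not the implementation used in the cited review), the Python below processes one edge at a time and keeps only the minimal transversals of the edges seen so far. The function name `berge_minimal_hitting_sets` and the example input are hypothetical.

```python
def berge_minimal_hitting_sets(edges):
    """Sequential (Berge-style) computation of minimal hitting sets.

    Sketch under stated assumptions: edges are iterables of hashable
    vertices; no optimisations such as edge ordering are applied.
    """
    # With no edges processed, the empty set is the only minimal transversal.
    transversals = {frozenset()}
    for edge in edges:
        edge = frozenset(edge)
        candidates = set()
        for t in transversals:
            if t & edge:
                candidates.add(t)  # t already hits the new edge
            else:
                # Extend t by each vertex of the new edge in turn.
                candidates.update(t | {v} for v in edge)
        # Discard candidates that properly contain another candidate.
        transversals = {
            c for c in candidates if not any(o < c for o in candidates)
        }
    return transversals


# Hypothetical example: three "target modes" over reactions r1..r4.
modes = [{"r1", "r2"}, {"r2", "r3"}, {"r3", "r4"}]
print(berge_minimal_hitting_sets(modes))
```

The intermediate set of transversals can grow large before it shrinks again, which is one reason other enumeration methods are studied alongside this one.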

Estimating entity importance via counting set covers

Aristides Gionis, Theodoros Lappas, Evimaria Terzi
2012 Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data mining - KDD '12  
The data-mining literature is rich in problems asking to assess the importance of entities in a given dataset.  ...  In a user study and an experimental evaluation on real data, we demonstrate that our framework is efficient and provides useful and intuitive results.  ...  Aristides Gionis was partially supported by the Torres Quevedo Program of the Spanish Ministry of Science and Innovation, co-funded by the European Social Fund, and by the Spanish Centre for the Development  ... 
doi:10.1145/2339530.2339640 dblp:conf/kdd/GionisLT12 fatcat:7jyb6c76anfj5p7usgmdnk3nhu

An approach for pipelining nested collections in scientific workflows

Timothy M. McPhillips, Shawn Bowers
2005 SIGMOD record  
When you see these people, please thank them personally for their role in achieving quick reviews of submitted papers.  ...  We are grateful to Scott Grafton of the Dartmouth Brain Imaging Center, and to Jens Voeckler, Doug Scheftner, Ewa Deelman, Carl Kesselman, and the entire Virtual Data System team for discussion, guidance  ...  The enumerated classes are represented using <owl:oneOf rdf:parseType="Collection"> construct in case of enumerated classes, and using <owl:oneOf> and <rdf:List> constructs in case of enumerated data types  ... 
doi:10.1145/1084805.1084809 fatcat:sgtpcat7vzc3veb4dx2jgskpte

Efficient query processing in distributed search engines

Simon Jonassen
2012 SIGIR Forum  
However, the implications and applicability of these techniques in practice need further evaluation in real-life settings.  ...  In this context, any improvement in query processing efficiency can reduce the operational costs and improve user satisfaction, hence improve the overall benefit.  ...  Any opinions, findings and conclusions or recommendations expressed in this material are those of the authors, and do not necessarily reflect the views of the funding agencies.  ... 
doi:10.1145/2492189.2492201 fatcat:uwasxhngrfgntemkhawyv3te64

Recommender systems

Linyuan Lü, Matúš Medo, Chi Ho Yeung, Yi-Cheng Zhang, Zi-Ke Zhang, Tao Zhou
2012 Physics reports  
In this article, we review recent developments in recommender systems and discuss the major challenges.  ...  In addition to algorithms, physical aspects are described to illustrate macroscopic behavior of recommender systems. Potential impacts and future directions are discussed.  ...  Acknowledgments This work was partially supported by the EU FET-Open Grant 231200 (project QLectives) and National Natural Science Foundation of China (Grant Nos. 11075031, 11105024, 61103109 and 60973069  ... 
doi:10.1016/j.physrep.2012.02.006 fatcat:cywkeyu2wjdzhdqkg2545v5f7a

Formal Analysis of Message Passing [chapter]

Stephen F. Siegel, Ganesh Gopalakrishnan
2011 Lecture Notes in Computer Science  
Since these issues are harbingers of those being faced in multicore programming, the time is ripe to build a critical mass of researchers working in this area. Structure of an MPI program.  ...  This paper summarizes research being done in our groups in support of this area, specifically with respect to the Message Passing Interface.  ...  on larger data sets will be different from those used for executing on smaller data sets.  ... 
doi:10.1007/978-3-642-18275-4_2 fatcat:vmr66hc24zgtjps5eo2n7qlk2a

29th International Conference on Data Engineering [book of abstracts]

2013 2013 IEEE 29th International Conference on Data Engineering Workshops (ICDEW)  
Challenging applications, requiring efficient and scalable access to massive data, arise every day.  ...  More importantly, we exploit the interaction of the heterogeneous constraints by encoding them in a conflict hypergraph.  ...  These volunteers welcome participants, give directions, help in the sessions and on the registration desk, and generally make sure the conference is running smoothly.  ... 
doi:10.1109/icdew.2013.6547409 fatcat:wadzpuh3b5htli4mgb4jreoika

AME Blockchain: An Architecture Design for Closed-Loop Fluid Economy Token System [article]

Lanny Z.N. Yuan, Huaibing Jian, Peng Liu, Pengxin Zhu, ShanYang Fu
2018 arXiv   pre-print
We introduce all major technologies adopted in our system, including blockchain, distributed storage, P2P network, service application framework, and data encryption.  ...  In this white paper, we propose a blockchain-based system, named AME, which is a decentralized infrastructure and application platform with enhanced security and self-management properties.  ...  Collectively across an entire user population available to the system, a global profile can be computed. The set of computed feature profiles will be used to construct hypergraphs for graph analysis.  ... 
arXiv:1812.08017v1 fatcat:m2tgekvu2jcnnbhxssob4xgh7i

An overview of graph databases and their applications in the biomedical domain

Santiago Timón-Reina, Mariano Rincón, Rafael Martínez-Tomás
2021 Database: The Journal of Biological Databases and Curation  
Because of the interconnected nature of its data, the biomedical domain has been one of the early adopters of graph databases, enabling more natural representation models and better data integration workflows  ...  In this work, we survey the literature to explore the evolution, performance and how the most recent graph database solutions are applied in the biomedical domain, compiling a great variety of use cases  ...  One example is the fragment-based drug discovery (FBDD) (97), in which the validation stage of a project involves testing sensible close analogs of a fragment hit.  ... 
doi:10.1093/database/baab026 pmid:34003247 pmcid:PMC8130509 fatcat:xku5npedwzgs3ayzsuvz6iattq