
Better Data Labelling with EMBLEM (and how that Impacts Defect Prediction) [article]

Huy Tu and Zhe Yu and Tim Menzies
2020 arXiv   pre-print
When a new domain is encountered, EMBLEM can learn better ways to label which comments refer to real problems.  ...  For the data sets explored here, EMBLEM better labelling methods significantly improved P_opt20 and G-scores performance in nearly all the projects studied here.  ...  Step 2 Bug-fixing Labelling (Data Labelling): categorize a commit as bug-fixing or not based on the textual content of the commit log.  ... 
arXiv:1905.01719v3 fatcat:uf5gnpl7rfhrjk3heqdjoetfs4

Identification of microservices from monolithic applications through topic modelling

Miguel Brito, Jácome Cunha, João Saraiva
2021 ACM Symposium on Applied Computing  
This process is slow and, depending on the project's complexity, it may take months or even years to complete.  ...  Microservices emerged as one of the most popular architectural patterns in the recent years given the increased need to scale, grow and flexibilize software projects accompanied by the growth in cloud  ...  That problem was observed initially on projects more complex and bigger in number of components (classes in our case -Java).  ... 
doi:10.1145/3412841.3442016 dblp:conf/sac/BritoCS21 fatcat:osk6bi5cq5gedaoxjlomizu3he

Building Corpora of Technical Texts: Approaches and Tools

Petr Sojka, Martin Líska, Michal Ruzicka
2011 Recent Advances in Slavonic Natural Languages Processing  
In particular, there is no widely accepted format to represent and handle math.  ...  We present an approach based on multiple representations of mathematical formulae that has been used for math retrieval, similarity and clustering of mathematical corpus.  ...  Addition of one more keyword Euclid reduces the number of results to only 97 -all of them contain this textual term.  ... 
dblp:conf/raslan/SojkaLR11 fatcat:mfpchgk2mfe65jeg6w7hjaqgxy

Maintaining dynamic channel profiles on the web

Haggai Roitman, David Carmel, Elad Yom-Tov
2008 Proceedings of the VLDB Endowment  
This work addresses a novel problem of maintaining channel profiles on the Web.  ...  The monitoring scheme is further extended to consider the content that is published on the channels.  ...  Learning a threshold adaptively for each channel can result in even better quality-budget trade-off performance.  ... 
doi:10.14778/1453856.1453878 fatcat:aipxa7icendr7bhqmkluihv37u

Image Exchange: IHE and the Evolution of Image Sharing

David S. Mendelson, Peter R. G. Bak, Elliot Menschik, Eliot Siegel
2008 Radiographics  
The research community is concomitantly developing solutions that solve image exchange issues that are specific to research (eg, the sharing of deidentified data) but that might also be encountered in  ...  Image sharing has evolved from film to transportable media (eg, compact disks) to direct electronic exchange over the Internet.  ...  Acknowledgment: The authors would like to acknowledge and thank Nancy Knight, PhD, Department of Radiology and Nuclear Medicine, University of Maryland School of Medicine, for her assistance in the preparation  ... 
doi:10.1148/rg.287085174 pmid:18772272 fatcat:ajir7bcjxfhjjpbuqq56s5rj3y

MLGO: a Machine Learning Guided Compiler Optimizations Framework [article]

Mircea Trofin
2021 arXiv   pre-print
To the best of our knowledge, this work is the first full integration of ML in a complex compiler pass in a real-world setting. It is available in the main LLVM repository.  ...  The same model, trained on one corpus, generalizes well to a diversity of real-world targets, as well as to the same set of targets after months of active development.  ...  Their evolution (due to training) does not diff well, so the compactness of a binary format is more economical for the project repository. 10 See the buildbot setup script available at  ... 
arXiv:2101.04808v1 fatcat:jl7owbq5xvf5xmo3qhpksrqtrq

Efficient storage and fast querying of source code

Oleksandr Panchenko, Hasso Plattner, Alexander B. Zeier
2010 Information Systems Frontiers  
Many of these data structures work with tree-based or graph-based representations of source code.  ...  Enabling fast and detailed insights over large portions of source code is an important task in a global development ecosystem.  ...  Acknowledgements This project has been done in cooperation with SAP AG. In particular, we would like to thank Jan Karstens, Heinz Ulrich Roggenkemper, Wolfgang Stephan, Cafer Tosun, Xiwei Zhou.  ... 
doi:10.1007/s10796-010-9285-6 fatcat:hsbow2egxzavplnupykch4txze


Changhu Wang, Zhiwei Li, Lei Zhang
2010 Proceedings of the 19th international conference on World wide web - WWW '10  
By considering the image database as a huge repository, MindFinder is able to help users present and refine their initial thoughts in their mind, and finally turn thoughts to a beautiful image(s).  ...  the picture in users' mind.  ...  query panel and the textual description of image I, in which cosine similarity is used. β is a trade-off parameter to balance the textual query and visual query.  ... 
doi:10.1145/1772690.1772909 dblp:conf/www/WangLZ10 fatcat:r2cz2bfdbbhhnki2v6cuswovre

Approximate Query Answering for a Heterogeneous XML Document Base [chapter]

Federica Mandreoli, Riccardo Martoglia, Paolo Tiberio
2004 Lecture Notes in Computer Science  
In this paper, we deal with the problem of effective search and query answering in heterogeneous web document bases containing documents in XML format of which the schemas are available.  ...  schemas and to use them in the query processing phase, when a query written on a source schema is automatically rewritten in order to be compatible with the other useful XML documents.  ...  Heterogeneous collections of various types of documents, such as actual text documents or metadata on textual and/or multimedia documents, are more and more widespread on the web.  ... 
doi:10.1007/978-3-540-30480-7_35 fatcat:uaz5e2ajovdkjaajmty7iqfkcq

DeepDiary: Automatically Captioning Lifelogging Image Streams [chapter]

Chenyou Fan, David J. Crandall
2016 Lecture Notes in Computer Science  
more compact and less noisy.  ...  In this paper, we propose to use automatic image captioning algorithms to generate textual representations of these collections.  ...  This work was supported in part by the National Science Foundation (IIS-1253549 and CNS-1408730) and Google, and used compute facilities provided by NVidia, the Lilly Endowment through support of the IU  ... 
doi:10.1007/978-3-319-46604-0_33 fatcat:rg7a6mw6l5cnnny53uqfykreyu

Adaptive relevance feedback for large-scale image retrieval

Nicolae Suditu, François Fleuret
2015 Multimedia tools and applications  
We have used the ImageNet dataset as it was released in 2010 for most of our evaluations in §4-6.  ...  Then, we sampled uniformly a small collection of 33,000 images (i.e. 3% of the large collection), and another one of 60,000 images (i.e. 6% of the large collection).  ...  Our results give motivation for further investigations on other heuristics or finding more principled ways of trade-off.  ... 
doi:10.1007/s11042-015-2610-9 fatcat:v3gmot2r3rbmdnqc5lmdyu6w5y

Overview of the MPEG Reconfigurable Video Coding Framework

Shuvra S. Bhattacharyya, Johan Eker, Jörn W. Janneck, Christophe Lucarz, Marco Mattavelli, Mickaël Raulet
2009 Journal of Signal Processing Systems  
So far the specification of such standards, and of the algorithms that build them, has been done case by case providing monolithic textual and reference software specifications in different forms and programming  ...  Video coding technology in the last 20 years has evolved producing a variety of different and complex algorithms and coding standards.  ...  enabling to achieve specific design or performance trade-offs and thus fill, case by case, the requirements of specific applications.  ... 
doi:10.1007/s11265-009-0399-3 fatcat:5dhub7pkxvapfmepkmdvrem6y4

Using a multimedia semantic graph for web document visualization and summarization

Antonio M. Rinaldi, Cristiano Russo
2020 Multimedia tools and applications  
In this paper we present a document summarization and visualization technique based on both statistical and semantic analysis of textual and visual contents.  ...  Existing methods for tag-clouds generations are mostly based on text content of documents, others also consider statistical or semantic information to enrich the document summary, while precious information  ...  Moreover, the algorithm includes a trade-off factor which mitigates the problem of favoring more generic concepts in topic identification.  ... 
doi:10.1007/s11042-020-09761-1 fatcat:dqv7une7ejc4tlxd2p2ilatuqa

A Scalable Approach to Exact Model and Commonality Counting for Extended Feature Models

David Fernandez-Amoros, Ruben Heradio, Jose A. Cerrada, Carlos Cerrada
2014 IEEE Transactions on Software Engineering  
One of those statistics is an upper approximation of total number of products modeled by a FM, which does not take into account textual constraints. 2) Researchers in the field of automated analysis of  ...  Section IV reviews in detail several approaches to the product and commonality counting problem. Section V presents our algorithm.  ...  As Cleaveland points out in [14], this determination is not a scientific process of discovery but one of design and engineering, and it involves trade-offs among many objectives.  ... 
doi:10.1109/tse.2014.2331073 fatcat:seqn7fvcwjbgzoecld5su4siay

Selective Integration of Background Knowledge in TCBR Systems [chapter]

Anil Patelia, Sutanu Chakraborti, Nirmalie Wiratunga
2011 Lecture Notes in Computer Science  
This paper explores how background knowledge from freely available web resources can be utilised for Textual Case Based Reasoning.  ...  The work reported here extends the existing Explicit Semantic Analysis approach to representation, where textual content is represented using concepts with correspondence to Wikipedia articles.  ...  This corresponds, for example, to a TCBR system where both problem and solution components are textual.  ... 
doi:10.1007/978-3-642-23291-6_16 fatcat:64pia4dc2bcpdb5aoi5yi5erzu