Better Data Labelling with EMBLEM (and how that Impacts Defect Prediction)
[article]
2020
arXiv
pre-print
When a new domain is encountered, EMBLEM can learn better ways to label which comments refer to real problems. ...
For the data sets explored here, EMBLEM's better labelling methods significantly improved P_opt20 and G-score performance in nearly all the projects studied. ...
Step 2 Bug-fixing Labelling (Data Labelling): categorize a commit as bug-fixing or not based on the textual content of the commit log. ...
arXiv:1905.01719v3
fatcat:uf5gnpl7rfhrjk3heqdjoetfs4
Identification of microservices from monolithic applications through topic modelling
2021
ACM Symposium on Applied Computing
This process is slow and, depending on the project's complexity, it may take months or even years to complete. ...
Microservices emerged as one of the most popular architectural patterns in the recent years given the increased need to scale, grow and flexibilize software projects accompanied by the growth in cloud ...
That problem was observed initially on projects that are more complex and have more components (classes in our case, Java). ...
doi:10.1145/3412841.3442016
dblp:conf/sac/BritoCS21
fatcat:osk6bi5cq5gedaoxjlomizu3he
Building Corpora of Technical Texts: Approaches and Tools
2011
Recent Advances in Slavonic Natural Languages Processing
In particular, there is no widely accepted format to represent and handle math. ...
We present an approach based on multiple representations of mathematical formulae that has been used for math retrieval, similarity and clustering of mathematical corpus. ...
Adding one more keyword, Euclid, reduces the number of results to only 97, all of which contain this textual term. ...
dblp:conf/raslan/SojkaLR11
fatcat:mfpchgk2mfe65jeg6w7hjaqgxy
Maintaining dynamic channel profiles on the web
2008
Proceedings of the VLDB Endowment
This work addresses a novel problem of maintaining channel profiles on the Web. ...
The monitoring scheme is further extended to consider the content that is published on the channels. ...
Learning a threshold adaptively for each channel can result in even better quality-budget trade-off performance. ...
doi:10.14778/1453856.1453878
fatcat:aipxa7icendr7bhqmkluihv37u
Image Exchange: IHE and the Evolution of Image Sharing
2008
Radiographics
The research community is concomitantly developing solutions that solve image exchange issues that are specific to research (eg, the sharing of deidentified data) but that might also be encountered in ...
Image sharing has evolved from film to transportable media (eg, compact disks) to direct electronic exchange over the Internet. ...
Acknowledgment: The authors would like to acknowledge and thank Nancy Knight, PhD, Department of Radiology and Nuclear Medicine, University of Maryland School of Medicine, for her assistance in the preparation ...
doi:10.1148/rg.287085174
pmid:18772272
fatcat:ajir7bcjxfhjjpbuqq56s5rj3y
MLGO: a Machine Learning Guided Compiler Optimizations Framework
[article]
2021
arXiv
pre-print
To the best of our knowledge, this work is the first full integration of ML in a complex compiler pass in a real-world setting. It is available in the main LLVM repository. ...
The same model, trained on one corpus, generalizes well to a diversity of real-world targets, as well as to the same set of targets after months of active development. ...
Their evolution (due to training) does not diff well, so the compactness of a binary format is more economical for the project repository. See the buildbot setup script available at https://github.com ...
arXiv:2101.04808v1
fatcat:jl7owbq5xvf5xmo3qhpksrqtrq
Efficient storage and fast querying of source code
2010
Information Systems Frontiers
Many of these data structures work with tree-based or graph-based representations of source code. ...
Enabling fast and detailed insights over large portions of source code is an important task in a global development ecosystem. ...
Acknowledgements This project has been done in cooperation with SAP AG. In particular, we would like to thank Jan Karstens, Heinz Ulrich Roggenkemper, Wolfgang Stephan, Cafer Tosun, Xiwei Zhou. ...
doi:10.1007/s10796-010-9285-6
fatcat:hsbow2egxzavplnupykch4txze
By considering the image database as a huge repository, MindFinder is able to help users present and refine their initial thoughts in their mind, and finally turn thoughts to a beautiful image(s). ...
the picture in users' mind. ...
query panel and the textual description of image I, in which cosine similarity is used. β is a trade-off parameter to balance the textual query and visual query. ...
doi:10.1145/1772690.1772909
dblp:conf/www/WangLZ10
fatcat:r2cz2bfdbbhhnki2v6cuswovre
Approximate Query Answering for a Heterogeneous XML Document Base
[chapter]
2004
Lecture Notes in Computer Science
In this paper, we deal with the problem of effective search and query answering in heterogeneous web document bases containing documents in XML format of which the schemas are available. ...
schemas and to use them in the query processing phase, when a query written on a source schema is automatically rewritten in order to be compatible with the other useful XML documents. ...
Heterogeneous collections of various types of documents, such as actual text documents or metadata on textual and/or multimedia documents, are more and more widespread on the web. ...
doi:10.1007/978-3-540-30480-7_35
fatcat:uaz5e2ajovdkjaajmty7iqfkcq
DeepDiary: Automatically Captioning Lifelogging Image Streams
[chapter]
2016
Lecture Notes in Computer Science
more compact and less noisy. ...
In this paper, we propose to use automatic image captioning algorithms to generate textual representations of these collections. ...
This work was supported in part by the National Science Foundation (IIS-1253549 and CNS-1408730) and Google, and used compute facilities provided by NVidia, the Lilly Endowment through support of the IU ...
doi:10.1007/978-3-319-46604-0_33
fatcat:rg7a6mw6l5cnnny53uqfykreyu
Adaptive relevance feedback for large-scale image retrieval
2015
Multimedia tools and applications
We have used the ImageNet dataset as it was released in 2010 for most of our evaluations in §4-6. ...
Then, we sampled uniformly a small collection of 33,000 images (i.e. 3% of the large collection), and another one of 60,000 images (i.e. 6% of the large collection). ...
Our results give motivation for further investigations on other heuristics or finding more principled ways of trade-off. ...
doi:10.1007/s11042-015-2610-9
fatcat:v3gmot2r3rbmdnqc5lmdyu6w5y
Overview of the MPEG Reconfigurable Video Coding Framework
2009
Journal of Signal Processing Systems
So far the specification of such standards, and of the algorithms that build them, has been done case by case providing monolithic textual and reference software specifications in different forms and programming ...
Video coding technology in the last 20 years has evolved producing a variety of different and complex algorithms and coding standards. ...
enabling to achieve specific design or performance trade-offs and thus fill, case by case, the requirements of specific applications. ...
doi:10.1007/s11265-009-0399-3
fatcat:5dhub7pkxvapfmepkmdvrem6y4
Using a multimedia semantic graph for web document visualization and summarization
2020
Multimedia tools and applications
In this paper we present a document summarization and visualization technique based on both statistical and semantic analysis of textual and visual contents. ...
Existing methods for tag-clouds generations are mostly based on text content of documents, others also consider statistical or semantic information to enrich the document summary, while precious information ...
Moreover, the algorithm includes a trade-off factor which mitigates the problem of favoring more generic concepts in topic identification. ...
doi:10.1007/s11042-020-09761-1
fatcat:dqv7une7ejc4tlxd2p2ilatuqa
A Scalable Approach to Exact Model and Commonality Counting for Extended Feature Models
2014
IEEE Transactions on Software Engineering
One of those statistics is an upper approximation of total number of products modeled by a FM, which does not take into account textual constraints. 2) Researchers in the field of automated analysis of ...
Section IV reviews in detail several approaches to the product and commonality counting problem. Section V presents our algorithm. ...
As Cleaveland points out in [14], this determination is not a scientific process of discovery but one of design and engineering, and it involves trade-offs among many objectives. ...
doi:10.1109/tse.2014.2331073
fatcat:seqn7fvcwjbgzoecld5su4siay
Selective Integration of Background Knowledge in TCBR Systems
[chapter]
2011
Lecture Notes in Computer Science
This paper explores how background knowledge from freely available web resources can be utilised for Textual Case Based Reasoning. ...
The work reported here extends the existing Explicit Semantic Analysis approach to representation, where textual content is represented using concepts with correspondence to Wikipedia articles. ...
This corresponds, for example, to a TCBR system where both problem and solution components are textual. ...
doi:10.1007/978-3-642-23291-6_16
fatcat:64pia4dc2bcpdb5aoi5yi5erzu