5,834 Hits in 14.7 sec

Automatic labeling of software components and their evolution using log-likelihood ratio of word frequencies in source code

Adrian Kuhn
2009 2009 6th IEEE International Working Conference on Mining Software Repositories  
In this paper we present a lexical approach that uses the log-likelihood ratios of word frequencies to automatically provide labels for software components.  ...  As more and more open-source software components become available on the internet we need automatic ways to label and compare them.  ...  We thank Dominique Matter for his help with the parameters of log-likelihood ratios.  ... 
doi:10.1109/msr.2009.5069499 dblp:conf/msr/Kuhn09 fatcat:jqhojunm5nhfpfeibura7gnu2q
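
The snippet above names the technique but not its mechanics. As a rough illustration only, the following Python sketch scores each word in a component by a log-likelihood ratio against a reference corpus and keeps the top-scoring over-represented words as labels. The two-term G² formula is the form commonly used for corpus comparison; whether the paper uses exactly this variant, and the function names `log_likelihood_ratio` and `label_component`, are assumptions made for this example.

```python
import math
from collections import Counter


def log_likelihood_ratio(a, b, c, d):
    """Two-term G^2 score for one word (illustrative, not the paper's code).

    a: occurrences of the word in the component
    b: occurrences of the word in the reference corpus
    c: total tokens in the component
    d: total tokens in the reference corpus
    """
    # Expected counts if the word were equally frequent in both corpora.
    e1 = c * (a + b) / (c + d)
    e2 = d * (a + b) / (c + d)
    g2 = 0.0
    if a > 0:
        g2 += a * math.log(a / e1)
    if b > 0:
        g2 += b * math.log(b / e2)
    return 2.0 * g2


def label_component(component_tokens, reference_tokens, top_n=3):
    """Return the top_n words that are most over-represented in the component."""
    comp = Counter(component_tokens)
    ref = Counter(reference_tokens)
    c = sum(comp.values())
    d = sum(ref.values())
    if c == 0 or d == 0:
        return []
    scored = []
    for word, a in comp.items():
        b = ref.get(word, 0)
        # Keep only words whose relative frequency is higher in the component.
        if a / c > b / d:
            scored.append((log_likelihood_ratio(a, b, c, d), word))
    # Highest score first; ties broken alphabetically for determinism.
    scored.sort(key=lambda t: (-t[0], t[1]))
    return [word for _, word in scored[:top_n]]


component = ["parser", "parser", "token", "token", "get"]
reference = ["get", "set", "get", "set", "widget", "get"]
labels = label_component(component, reference, top_n=2)
```

In this toy input, `get` is more frequent in the reference than in the component, so it is filtered out before ranking.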

Detailed author index

2009 2009 6th IEEE International Working Conference on Mining Software Repositories  
of Developers 175 Automatic Labeling of Software Components and Their Evolution Using Log-Likelihood Ratio of Word Frequencies in Source Code  ...  P Pinzger, Martin 151 Using Association Rules to Study the Co-Evolution of Production & Test Code Pollock, Lori 71 Mining Source Code to Automatically Split Identifiers for Software Analysis  ... 
doi:10.1109/msr.2009.5069464 fatcat:hbptjwwpvng4hebf6c7ni72siu

A survey on the use of topic models when mining software repositories

Tse-Hsun Chen, Stephen W. Thomas, Ahmed E. Hassan
2015 Empirical Software Engineering  
fully exploring their underlying assumptions and parameter values.  ...  Researchers in software engineering have attempted to improve software development by mining and analyzing software repositories.  ...  , using the statistical properties of its word frequencies.  ... 
doi:10.1007/s10664-015-9402-8 fatcat:dwuxhrdbsvdznorfwfcow3prn4

A Survey on Mining Software Repositories

Woosung JUNG, Eunjoo LEE, Chisu WU
2012 IEICE transactions on information and systems  
This paper presents fundamental concepts, overall process and recent research issues of Mining Software Repositories.  ...  The data sources such as source control systems, bug tracking systems or archived communications, data types and techniques used for general MSR problems are also presented.  ...  Kuhn suggested the lexical approach to automatically label the SW component by using log-likelihood ratio of word frequencies and applied it to detect the evolution trends of SW system [104].  ... 
doi:10.1587/transinf.e95.d.1384 fatcat:kfje3mzcufchzdj7qyt5smaaum

Summarization of Software Artifacts : A Review

Som Gupta, Gupta S.K
2017 International Journal of Computer Science & Information Technology (IJCSIT)  
The paper also reviews the evaluation techniques used for summarizing software artifacts. The paper discusses the open problems and challenges in this field of research.  ...  Summarization of software artifacts is an ongoing field of research among the software engineering community due to the benefits that summarization provides like saving of time and efforts in various software  ...  Supervised Learning Approach: Here the likelihood of label is suggested by using labeled training dataset [12].  ... 
doi:10.5121/ijcsit.2017.9512 fatcat:4unydg54tbh6ji7hstvixyfrsm

Studying and detecting log-related issues

Mehran Hassani, Weiyi Shang, Emad Shihab, Nikolaos Tsantalis
2018 Empirical Software Engineering  
In this paper, we first perform an empirical study on log-related issues in two large-scale, open-source software systems.  ...  The rich knowledge conveyed in logs is leveraged by researchers and practitioners in performing various tasks, both in software development and its operation.  ...  Acknowledgments First and foremost, I want to express my deepest gratitude to my supervisors, Dr. Weiyi Shang and  ... 
doi:10.1007/s10664-018-9603-z fatcat:sqajfkewl5do5hum27oz7ve7ui

A Survey on Automated Log Analysis for Reliability Engineering [article]

Shilin He, Pinjia He, Zhuangbin Chen, Tianyi Yang, Yuxin Su, Michael R. Lyu
2021 arXiv   pre-print
Logs are semi-structured text generated by logging statements in software source code.  ...  To enable effective and efficient usage of modern software logs in reliability engineering, a number of studies have been conducted on automated log analysis.  ...  in source code [2] .  ... 
arXiv:2009.07237v2 fatcat:thbtfboglnglld5rr6s2gqhizi

Mining Unstructured Software Repositories [chapter]

Stephen W. Thomas, Ahmed E. Hassan, Dorothea Blostein
2013 Evolving Software Systems  
MINING SOFTWARE REPOSITORIES, which is the process of analyzing the data related to software development practices, is an emerging field which aims to aid development teams in their day to day tasks.  ...  Information Retrieval (IR) techniques, which were developed specifically to handle unstructured data, have recently been used by researchers to mine and analyze the unstructured data in software repositories  ...  monitor and analyze their source code evolution.  ... 
doi:10.1007/978-3-642-45398-4_5 fatcat:n5ivswn6brfupjbsuomwddokd4

Effective assignment and assistance to software developers and reviewers

Motahareh Bahrami Zanjani
2016 Proceedings of the 2016 24th ACM SIGSOFT International Symposium on Foundations of Software Engineering - FSE 2016  
His knowledge and expertise in the area of Software evolution are admirable. Without his guidance this dissertation would not be possible. I would also like to thank Dr.  ...  I would like to extend my special gratitude to them for supporting me in every step of the way. ABSTRACT The conducted research is within the realm of software maintenance and evolution.  ...  Number of Assigned Reviewers *** *** *** Number of Contributing Reviewers *** *** *** Assigned Reviewer's Reputation  ... 
doi:10.1145/2950290.2983960 dblp:conf/sigsoft/Zanjani16 fatcat:yxc343ox3jgihevgzzcb5scuay

Design and Evaluation of an Ultra Low-power Human-quality Speech Recognition System

Dennis Pinto, Jose-María Arnau, Antonio González
2020 ACM Transactions on Architecture and Code Optimization (TACO)  
The software is based on the so-called hybrid approach with a vocabulary of 200K words and RNN-based language model re-scoring, whereas the hardware consists of a commercially available low-power processor  ...  Automatic Speech Recognition (ASR) has experienced a dramatic evolution since pioneer development of Bell Labs' single-digit recognizer more than 50 years ago.  ...  (Although the costs represent probabilities, additions are used instead of multiplications, since cost is computed as the negative log-likelihood.)  ... 
doi:10.1145/3425604 fatcat:73qovytiebbddjvjrfce5igvny
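
The parenthetical remark in the snippet above relies on the identity -log(p1 * p2) = -log(p1) - log(p2). The short Python sketch below shows why summing negative log-likelihood costs is equivalent to multiplying the underlying probabilities. The probability values and the name `path_cost` are made up for illustration and do not come from the paper.

```python
import math


def path_cost(probabilities):
    """Sum of negative log-likelihoods for a sequence of step probabilities."""
    total = 0.0
    for p in probabilities:
        if p <= 0.0:
            raise ValueError("probabilities must be positive")
        total += -math.log(p)
    return total


# Hypothetical per-step probabilities along one decoding path.
step_probs = [0.5, 0.25, 0.1]

cost = path_cost(step_probs)

# Multiplying the probabilities directly gives the same information,
# but very long products of small numbers underflow; sums of logs do not.
product = 1.0
for p in step_probs:
    product *= p

recovered = math.exp(-cost)
```

A lower cost corresponds to a higher path probability, so a search that minimizes the summed cost finds the most likely path.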

Leveraging Team Dynamics to Predict Open-source Software Projects' Susceptibility to Social Engineering Attacks [article]

Luiz Giovanini, Daniela Oliveira, Huascar Sanchez, Deborah Shands
2021 arXiv   pre-print
Open-source software (OSS) is a critical part of the software supply chain.  ...  The attackers have exploited interactions among development team members and the social dynamics of team behavior to enable their attacks.  ...  Later, developers of other open- or closed-source software make use of the compromised OSS as a component or a dependency.  ... 
arXiv:2106.16067v3 fatcat:vvvaoz3gyzcozaa735klggrn2m

Finding Structure in Text, Genome and Other Symbolic Sequences [article]

Ted Dunning
2012 arXiv   pre-print
Generically, these methods allow detection of a difference in the frequency of a single feature, the detection of a difference between the frequencies of an ensemble of features and the attribution of  ...  the source of a text.  ...  CHAPTER 13 Glossary The terms in this glossary were selected automatically using a log-likelihood ratio test.  ... 
arXiv:1207.1847v1 fatcat:atx6naydzjbzrcrfrbnyn4uqcu

Anti-pattern Mutations and Fault-proneness

Fehmi Jaafar, Foutse Khomh, Yann-Gael Gueheneuc, Mohammad Zulkernine
2014 2014 14th International Conference on Quality Software  
This paper presents results from an empirical study aimed at understanding the evolution of anti-patterns in 27 releases of three open-source software systems: ArgoUML, Mylyn, and Rhino.  ...  Software evolution and development are continuous activities that have a never-ending cycle.  ...  collection of source code.  ... 
doi:10.1109/qsic.2014.45 dblp:conf/qsic/JaafarKGZ14 fatcat:xig5s7mprfbx7psl7ltas6mxb4

Literature Review on Automatic Speech Recognition

Wiqas Ghai, Navdeep Singh
2012 International Journal of Computer Applications  
Achieving higher Recognition accuracy, low Word error rate, developing speech corpus depending upon the nature of language and addressing the issues of sources of variability through approaches like Missing  ...  In this paper, an effort has been made to highlight the progress made so far for ASRs of different languages and the technological perspective of automatic speech recognition in countries like China, Russian  ...  This mixture contained tokens in the ratio 1:2, unique words in the ratio 5:2 and phoneme occurrence in the ratio 5:9.  ... 
doi:10.5120/5565-7646 fatcat:wu46s3bhjbejhmskufomiodxma

TIDIER: an identifier splitting approach using speech recognition techniques

Latifa Guerrouj, Massimiliano Di Penta, Giuliano Antoniol, Yann-Gaël Guéhéneuc
2011 Journal of Software: Evolution and Process  
This paper proposes a novel approach to recognize words composing source code identifiers. The approach is based on an adaptation of Dynamic Time Warping used to recognize words in continuous speech.  ...  Indeed, identifiers are developers' main source of information and guide their cognitive processes during program comprehension when high-level documentation is scarce or outdated and when source code  ...  This research was partially supported by the Natural Sciences and Engineering Research Council of Canada (Research Chairs in Software Evolution and in Software Patterns and Patterns of Software) and by  ... 
doi:10.1002/smr.539 fatcat:iu7cfsgalzf4npavl3cgsnvuaq