Import2vec - Learning Embeddings for Software Libraries [article]

Bart Theeten, Frederik Vandeputte, Tom Van Cutsem
2019 arXiv   pre-print
We consider the problem of developing suitable learning representations (embeddings) for library packages that capture semantic similarity among libraries. ... We apply word embedding techniques from natural language processing (NLP) to train embeddings for library packages ("library vectors"). ... We describe how Mikolov et al.'s skip-gram model [2], which is used to learn embeddings for words ("word vectors"), can be adapted to learn embeddings for libraries ("library vectors") based on their ...
arXiv:1904.03990v1 fatcat:woy7fqcksjd6hfqxiaijaczy64

Representation of Developer Expertise in Open Source Software [article]

Tapajit Dey, Andrey Karnauch, Audris Mockus
2021 arXiv   pre-print
for example, a project tries to find new maintainers and looks for developers with relevant skills. ... Method: we use the World of Code infrastructure to extract the complete set of APIs in the files changed by open source developers and, based on that data, employ Doc2Vec embeddings for vector representations ... However, they also present the shortcomings of using basic GitHub profile features for machine learning classifiers to predict expertise in software libraries. ...
arXiv:2005.10176v3 fatcat:qboalsojzrgqtk7olgs73arxta

Antipatterns in Software Classification Taxonomies [article]

Cezar Sas, Andrea Capiluppi
2022 arXiv   pre-print
Our contributions show that existing, and very likely even new, classification attempts are deemed to fail for one or more issues, that we named as the 'antipatterns' of software classification tasks.  ...  The second is to perform a case study showing how to create a classification of software types using a curated set of software systems.  ...  for a library.  ... 
arXiv:2204.08880v1 fatcat:cv232o6gzfea5nezzxqepcjsyi

On the validity of pre-trained transformers for natural language processing in the software engineering domain [article]

Julian von der Mosel, Alexander Trautsch, Steffen Herbold
2022 arXiv   pre-print
Our results show that for tasks that require understanding of the software engineering context, pre-training with software engineering data is valuable, while general domain models are sufficient for general ... language understanding, also within the software engineering domain. ...
arXiv:2109.04738v2 fatcat:kjgg3abyvvf4jjjsgdmi47ynm4

External Factors in Sustainability of Open Source Software

Marat Valiev
For example, since established users of software mostly report bugs and new adopters mostly ask questions, we can estimate a project's lifecycle stage and user base structure using already existing issue ... Modern software development is heavily reliant on Open Source. It saves time and money, but, like any other non-commercial software, it comes on an as-is basis. ... Related Work on Embeddings in Software Engineering Library embeddings. The first work on library embeddings, Import2vec, was published in 2019 [146]. ...
doi:10.1184/r1/14512605 fatcat:3qk6q42nzvhvhisxc4apjdksj4