Shellcode_IA32: A Dataset for Automatic Shellcode Generation [article]

Pietro Liguori, Erfan Al-Hossami, Domenico Cotroneo, Roberto Natella, Bojan Cukic, Samira Shaikh
2021 arXiv   pre-print
We take the first step to address the task of automatically generating shellcodes, i.e., small pieces of code used as a payload in the exploitation of a software vulnerability, starting from natural language  ...  We assemble and release a novel dataset (Shellcode_IA32), consisting of challenging but common assembly instructions with their natural language descriptions.  ...  Shellcode_IA32 represents a first step towards the ambitious goal of automatically generating shellcodes from natural language.  ... 
arXiv:2104.13100v3 fatcat:thhla6bcjzfjhoxuwadd36th3u

Can We Generate Shellcodes via Natural Language? An Empirical Study [article]

Pietro Liguori, Erfan Al-Hossami, Domenico Cotroneo, Roberto Natella, Bojan Cukic, Samira Shaikh
2022 arXiv   pre-print
We then present an empirical study using a novel dataset (Shellcode_IA32), which consists of 3,200 assembly code snippets of real Linux/x86 shellcodes from public databases, annotated using natural language  ...  In this work, we address the task of automatically generating shellcodes, starting purely from descriptions in natural language, by proposing an approach based on Neural Machine Translation (NMT).  ...  Dataset: We curated and released a dataset, Shellcode_IA32 [50], specific to shellcode generation.  ... 
arXiv:2202.03755v1 fatcat:34g6mtgqwvh4vkm5mrzqsjt5si

Can NMT Understand Me? Towards Perturbation-based Evaluation of NMT Models for Code Generation [article]

Pietro Liguori, Cristina Improta, Simona De Vivo, Roberto Natella, Bojan Cukic, Domenico Cotroneo
2022 arXiv   pre-print
However, when dealing with the specific task of code generation (i.e., the generation of code starting from a description in natural language), no approach has yet been defined to validate the  ...  In this work, we address the problem by identifying a set of perturbations and metrics tailored for the robustness assessment of such models.  ...  For example, in the Shellcode_IA32 dataset [24, 25] used for the generation of assembly code from natural language, the intent, i.e., the natural language description, "Push the contents of eax onto  ... 
arXiv:2203.15319v1 fatcat:hskzlz6tajbojeywzedzv5zc3e