Contextual Weisfeiler-Lehman Graph Kernel For Malware Detection [article]

Annamalai Narayanan, Guozhu Meng, Liu Yang, Jinliang Liu, Lihui Chen
2016 arXiv   pre-print
We observe that state-of-the-art graph kernels, such as Weisfeiler-Lehman kernel (WLK) capture the structural information well but fail to capture contextual information.  ...  To address this, we develop the Contextual Weisfeiler-Lehman kernel (CWLK) which is capable of capturing both these types of information.  ...  We apply this feature-enrichment idea on a state-of-the-art graph kernel, namely, Weisfeiler-Lehman kernel (WLK) [11] to obtain the Contextual Weisfeiler-Lehman kernel (CWLK).  ... 
arXiv:1606.06369v1 fatcat:kmhrihgbxrch5mthmejdgoyerm

Context-aware, Adaptive and Scalable Android Malware Detection through Online Learning (extended version) [article]

Annamalai Narayanan, Mahinthan Chandramohan, Lihui Chen, Yang Liu
2017 arXiv   pre-print
In order to perform accurate detection, a novel graph kernel that facilitates capturing apps' security-sensitive behaviors along with their context information from dependency graphs is proposed.  ...  It is well-known that Android malware constantly evolves so as to evade detection. This causes the entire malware population to be non-stationary.  ...  Weisfeiler-Lehman Kernel.  ... 
arXiv:1706.00947v2 fatcat:sduouh6iovhkxficotkgpwaycq

A Multi-view Context-aware Approach to Android Malware Detection and Malicious Code Localization [article]

Annamalai Narayanan, Mahinthan Chandramohan, Lihui Chen, Yang Liu
2017 arXiv   pre-print
MKLDroid uses a graph kernel to capture structural and contextual information from apps' dependency graphs and identify malice code patterns in each view.  ...  Addressing this limitation, we propose MKLDroid, a unified framework that systematically integrates multiple views of apps for performing comprehensive malware detection and malicious code localisation  ...  To address C1, we leverage on our previous work [20] and use the Contextual Weisfeiler-Lehman Kernel (CWLK) that is specifically designed to perform accurate malware detection by capturing both structural  ... 
arXiv:1704.01759v2 fatcat:msutizn5uneovma7mb36vaeqzm

subgraph2vec: Learning Distributed Representations of Rooted Sub-graphs from Large Graphs [article]

Annamalai Narayanan, Mahinthan Chandramohan, Lihui Chen, Yang Liu and Santhoshkumar Saminathan
2016 arXiv   pre-print
Also, we show that the subgraph vectors could be used for building a deep learning variant of Weisfeiler-Lehman graph kernel.  ...  Specifically, on two real-world program analysis tasks, namely, code clone and malware detection, subgraph2vec outperforms state-of-the-art kernels by more than 17% and 4%, respectively.  ...  To illustrate this, let's consider the Weisfeiler-Lehman (WL) kernel [6] which decomposes graphs into rooted subgraphs.  ... 
arXiv:1606.08928v1 fatcat:ltirl3etv5e5xhffj5m4k4ahum

Android-COCO: Android Malware Detection with Graph Neural Network for Byte- and Native-Code [article]

Peng Xu
2022 arXiv   pre-print
Recently, various approaches have been introduced to detect Android malware, the majority of these are either based on the Manifest File features or the structural information, such as control flow graph  ...  After that, we design an ensemble algorithm to get the final result of malware detection system.  ...  graph kernel, namely, Weisfeiler-Lehman Kernel (WLK) to obtain the Contextual Weisfeiler-Lehman Kernel (CWLK).  ... 
arXiv:2112.10038v2 fatcat:5wbiq52wp5hsfo2jlcaxawcpjq

Adaptive and Scalable Android Malware Detection through Online Learning [article]

Annamalai Narayanan, Liu Yang, Lihui Chen, Liu Jinliang
2016 arXiv   pre-print
In order to perform accurate detection, security-sensitive behaviors are captured from apps in the form of inter-procedural control-flow sub-graph features using a state-of-the-art graph kernel.  ...  Our experimental findings strongly indicate that online learning based approaches are highly suitable for real-world malware detection.  ...  ACKNOWLEDGMENT We thank the authors of [4] and [5], for their suggestions and discussions that helped us re-implement their methods. We thank Kevin Allix for sharing the dataset used in [23].  ... 
arXiv:1606.07150v2 fatcat:klf5uaukzreidojvdx53egs57e

Algorithm selection for software validation based on graph kernels

Cedric Richter, Eyke Hüllermeier, Marie-Christine Jakobs, Heike Wehrheim
2020 Automated Software Engineering : An International Journal  
Our kernel operates on a graph representation of source code mixing elements of control-flow and program-dependence graphs with abstract syntax trees.  ...  The evaluation, which is based on data sets from the annual software verification competition SV-COMP, demonstrates our kernel to generalize well and to achieve rather high prediction accuracy, both for  ...  Weisfeiler-Lehman subtree kernels (on CFGs only) are also employed for malware detection in Android apps (Wagner et al. 2009; Sahs and Khan 2012).  ... 
doi:10.1007/s10515-020-00270-x fatcat:4b62vpw6izg3xce26ysdhovtp4

Predicting Rankings of Software Verification Competitions [article]

Mike Czech, Eyke Hüllermeier, Marie-Christine Jakobs, Heike Wehrheim
2017 arXiv   pre-print
Our kernels employ a graph representation for software source code that mixes elements of control flow and program dependence graphs with abstract syntax trees.  ...  The method builds upon so-called label ranking algorithms, which we complement with appropriate kernels providing a similarity measure for verification tasks.  ...  ., types for program variables [16] or malware in Android apps [17]). Just like our approach, the latter also uses Weisfeiler-Lehman subtree kernels (on CFGs only).  ... 
arXiv:1703.00757v1 fatcat:rk4hob6lhjdmtfzdjf4ce4bdxe

Order Matters: Semantic-Aware Neural Networks for Binary Code Similarity Detection

Zeping Yu, Rui Cao, Qiyi Tang, Sen Nie, Junzhou Huang, Shi Wu
Moreover, we find that the order of the CFG's nodes is important for graph similarity detection, so we adopt convolutional neural network (CNN) on adjacency matrices to extract the order information.  ...  Binary code similarity detection, whose goal is to detect similar binary functions without having access to the source code, is an essential task in computer security.  ...  To capture the semantic feature, we propose BERT pre-training for the blocks of CFGs with two original tasks MLM & ANP, and two additional graph-level tasks BIG & GC.  ... 
doi:10.1609/aaai.v34i01.5466 fatcat:yqvu4sondrb5zo4avshuk4fvky

UNICORN: Runtime Provenance-Based Detector for Advanced Persistent Threats [article]

Xueyuan Han, Thomas Pasquier, Adam Bates, James Mickens, Margo Seltzer
2020 arXiv   pre-print
From modeling to detection, UNICORN tailors its design specifically for the unique characteristics of APTs.  ...  Through extensive yet time-efficient graph analysis, UNICORN explores provenance graphs that provide rich contextual and historical information to identify stealthy anomalous activities without pre-defined  ...  We adapt a linear-time, fast Weisfeiler-Lehman (WL) subtree graph kernel algorithm based on one dimensional WL test of isomorphism [126] .  ... 
arXiv:2001.01525v1 fatcat:cljlsnrtsfamhlhtnzptydwd6i

Bayesian Deep Learning for Graphs [article]

Federico Errica
2022 arXiv   pre-print
Two real-world applications demonstrate the efficacy of deep learning for graphs.  ...  In this thesis, we take a different route and develop a Bayesian Deep Learning framework for graph learning.  ...  such as the 1-dim Weisfeiler-Lehman (WL) test [34] .  ... 
arXiv:2202.12348v1 fatcat:ayrl5zr6q5dfjhqspecg4umsxm

Complex Data: Learning Trustworthily, Automatically, and with Guarantees

Luca Oneto, Nicolò Navarin, Battista Biggio, Federico Errica, Alessio Micheli, Franco Scarselli, Monica Bianchini, Alessandro Sperduti
2021 ESANN 2021 proceedings   unpublished
This demands for improving both ML technical aspects (e.g., design and automation) and human-related metrics (e.g., fairness, robustness, privacy, and explainability), with performance guarantees at both  ...  The aforementioned scenario posed three main challenges: (i) Learning from Complex Data (i.e., sequence, tree, and graph data), (ii) Learning Trustworthily, and (iii) Learning Automatically with Guarantees  ...  Another research direction focuses on networks for graphs extending two existing theories (i.e., Weisfeiler-Lehman test and unfolding equivalence), which separately provide only few suggestions about the  ... 
doi:10.14428/esann/2021.es2021-6 fatcat:pahz7bwkqzcllnpy5ckl427kg4

A Systematic Survey on Deep Generative Models for Graph Generation [article]

Xiaojie Guo, Liang Zhao
2020 arXiv   pre-print
Recent advances in deep generative models for graph generation is an important step towards improving the fidelity of generated graphs and paves the way for new kinds of applications.  ...  This article provides an extensive overview of the literature in the field of deep generative models for the graph generation.  ...  Hamming and Ipsen-Mikhailov distances (HIM) [61]; (3) spectral entropies of the density matrices; (4) eigenvector centrality distance [12]; (5) closeness centrality distance [37]; (6) Weisfeiler Lehman  ... 
arXiv:2007.06686v2 fatcat:xox7apwdvbfhlgnsgrr3w3rv5m

Learning to Find Bugs in Programs and their Documentation

Andrew Habib
Second, we hope that our work will open the door for more research on automatically utilizing natural language in software development.  ...  First, we provide developers with novel bug detection techniques that complement traditional ones.  ...  Given two graphs and their WL sequences, we compute the graph kernel as follows: Definition 3.4.1 (Weisfeiler-Lehman kernel) The graph kernel of g and g' is k(g, g') = k_sub(g_0, g'_0) + k_sub(g  ... 
doi:10.26083/tuprints-00017377 fatcat:o47olqxg5rhrzoqg4j5tazo3ni

Graph mining on static, multiplex and attributed networks [article]

Benedek Rózemberczki, University Of Edinburgh, Rik Sarkar, He Sun
Graph structured data is pervasive and generated by online human interactions at an unprecedented velocity.  ...  Relational data poses challenges for information extraction and knowledge discovery due to its web scale size, extreme sparsity, multimodality, the presence of spatial autocorrelation and heterogeneity  ...  We extracted the Weisfeiler-Lehman features which appeared in at least 5 graphs in the datasets.  ... 
doi:10.7488/era/1498 fatcat:kmqs4lnzmzam7a7q53aic7nf3y