IA Scholar Query: Towards Efficient Distributed Subgraph Enumeration Via Backtracking-Based Framework.
https://scholar.archive.org/
Internet Archive Scholar query results feed (language: en; contact: info@archive.org; generated Thu, 21 Jul 2022 by fatcat-scholar; docs: https://scholar.archive.org/help; TTL 1440)

Subgraph Matching via Query-Conditioned Subgraph Matching Neural Networks and Bi-Level Tree Search
https://scholar.archive.org/work/tthgluj3znc2tmp5lwndke54zi
Recent advances have shown the success of using reinforcement learning and search to solve NP-hard graph-related tasks such as the Traveling Salesman Problem and Graph Edit Distance computation. However, it remains unclear how one can efficiently and accurately detect the occurrences of a small query graph in a large target graph, a core operation in graph database search, biomedical analysis, social group finding, etc. This task, called Subgraph Matching, essentially performs a subgraph isomorphism check between a query graph and a large target graph. One promising approach to this classical problem is the "learning-to-search" paradigm, where a reinforcement learning (RL) agent is designed with a learned policy to guide a search algorithm to quickly find the solution without any solved instances for supervision. However, for the specific task of Subgraph Matching, although the query graph given by the user as input is usually small, the target graph is often orders of magnitude larger. This poses challenges to the neural network design and can lead to solution and reward sparsity. In this paper, we propose N-BLS with two innovations to tackle these challenges: (1) a novel encoder-decoder neural network architecture to dynamically compute the matching information between the query and the target graphs at each search state; (2) a Monte Carlo Tree Search enhanced bi-level search framework for training the policy and value networks. Experiments on five large real-world target graphs show that N-BLS can significantly improve subgraph matching performance.
Authors: Yunsheng Bai, Derek Xu, Yizhou Sun, Wei Wang (work_tthgluj3znc2tmp5lwndke54zi)
Published: Thu, 21 Jul 2022

The complexity of finding and enumerating optimal subgraphs to represent spatial correlation
https://scholar.archive.org/work/low5grit4jf7vnp2jjkkjw6nem
Understanding spatial correlation is vital in many fields, including epidemiology and social science. Lee, Meeks and Pettersson (Stat. Comput. 2021) recently demonstrated that improved inference for areal unit count data can be achieved by modifying a graph representing spatial correlations; specifically, they delete edges of the planar graph derived from border-sharing between geographic regions in order to maximise a specific objective function. In this paper we address the computational complexity of the associated graph optimisation problem. We demonstrate that this problem cannot be solved in polynomial time unless P = NP; we further show intractability for two simpler variants of the problem. We follow these results with two parameterised algorithms that exactly solve the problem. Both solve not only the decision problem but also enumerate all solutions, with polynomial-time precalculation, delay, and postcalculation time in respective restricted settings. For this problem, efficient enumeration allows the uncertainty in the spatial correlation to be utilised in the modelling. The first enumeration algorithm uses dynamic programming on a tree decomposition and has polynomial-time precalculation and linear delay if both the treewidth and maximum degree are bounded. The second algorithm is restricted to problem instances with maximum degree three, as may arise from triangulations of planar surfaces, but can output all solutions with FPT precalculation time and linear delay when the maximum number of edges that can be removed is taken as the parameter.
Authors: Jessica Enright, Duncan Lee, Kitty Meeks, William Pettersson, John Sylvester (work_low5grit4jf7vnp2jjkkjw6nem)
Published: Thu, 14 Jul 2022

Tackling the veracity and variety of big data
https://scholar.archive.org/work/gha5bjvg2bci3o3faf4xlnru2u
This thesis tackles the veracity and variety challenges of big data, focusing especially on graphs and relational data. We start by proposing a class of graph association rules (GARs) to specify regularities between entities in graphs, which capture both missing links and inconsistencies. A GAR is a combination of a graph pattern and a dependency; it may take as predicates machine learning classifiers for link prediction. We formalize association deduction with GARs in terms of the chase, and prove its Church-Rosser property. We show that the satisfiability, implication and association deduction problems for GARs are coNP-complete, NP-complete and NP-complete, respectively. The incremental deduction problem is DP-complete for GARs. In addition, we provide parallel algorithms for association deduction and incremental deduction. We next develop a parallel algorithm to discover GARs, which applies an application-driven strategy to cut back rules and data that are irrelevant to users' interest, by training a machine learning model to identify data pertaining to a given application. Moreover, we introduce a sampling method to reduce a big graph G to a set H of small sample graphs. Given expected support and recall bounds, this method is able to deduce samples in H and mine rules from H that satisfy the bounds in the entire G. We then propose a class of temporal association rules (TACOs) for event prediction in temporal graphs. TACOs are defined on temporal graphs in terms of change patterns and (temporal) conditions, and may carry machine learning predicates for temporal event prediction. We settle the complexity of reasoning about TACOs, including their satisfiability, implication and prediction problems. We develop a system that discovers TACOs by iteratively training a rule creator based on generative models in a creator-critic framework, and predicts events by applying the discovered TACOs in parallel.
Finally, we propose an approach to querying relations D and graphs G taken together in SQL. The key idea is that if [...]
Authors: Ruochun Jin, University Of Edinburgh, Wenfei Fan, Leonid Libkin (work_gha5bjvg2bci3o3faf4xlnru2u)
Published: Tue, 21 Jun 2022

αNAS: Neural Architecture Search using Property Guided Synthesis
https://scholar.archive.org/work/vqmqgylit5b4dn6waqtgq6uloe
In the past few years, neural architecture search (NAS) has become an increasingly important tool within the deep learning community. Despite the many recent successes of NAS, however, most existing approaches operate within highly structured design spaces, and hence explore only a small fraction of the full search space of neural architectures while also requiring significant manual effort from domain experts. In this work, we develop techniques that enable efficient NAS in a significantly larger design space. To accomplish this, we propose to perform NAS in an abstract search space of program properties. Our key insights are as follows: (1) the abstract search space is significantly smaller than the original search space, and (2) architectures with similar program properties also have similar performance; thus, we can search more efficiently in the abstract search space. To enable this approach, we also propose a novel efficient synthesis procedure, which accepts a set of promising program properties and returns a satisfying neural architecture. We implement our approach, αNAS, within an evolutionary framework, where the mutations are guided by the program properties. Starting with a ResNet-34 model, αNAS produces a model with slightly improved accuracy on CIFAR-10 but 96 [...] Vision Transformer (30 [...] FLOPS, 14 [...]) without any degradation in accuracy.
Authors: Charles Jin, Phitchaya Mangpo Phothilimthana, Sudip Roy (work_vqmqgylit5b4dn6waqtgq6uloe)
Published: Thu, 02 Jun 2022

RLFlow: Optimising Neural Network Subgraph Transformation with World Models
https://scholar.archive.org/work/bz3jygj7djh5pi4ptmpadeuxzq
Training deep learning models takes an extremely long time and consumes large amounts of computing resources; at the same time, recent research has proposed systems and compilers that are expected to decrease model runtime. An effective optimisation methodology in data processing is desirable, and reducing the compute requirements of deep learning models is the focus of extensive research. In this paper, we address neural network sub-graph transformation by exploring reinforcement learning (RL) agents to achieve performance improvement. Our proposed approach, RLFlow, can learn to perform neural network subgraph transformations without the need for expertly designed heuristics to achieve a high level of performance. Recent work has aimed at applying RL to computer systems with some success, especially using model-free RL techniques. Model-based reinforcement learning methods have seen an increased focus in research, as they can be used to learn the transition dynamics of the environment; this can be leveraged to train an agent in a hallucinated environment such as a World Model (WM), thereby increasing sample efficiency compared to model-free approaches. A WM uses variational auto-encoders to build a model of the system that can be explored inexpensively. In RLFlow, we propose a design for a model-based agent with a WM which learns to optimise the architecture of neural networks by performing a sequence of sub-graph transformations to reduce model runtime. We show that our approach can match state-of-the-art performance on common convolutional networks and outperforms those based on transformer-style architectures by up to 5%.
Authors: Sean Parker, Sami Alabed, Eiko Yoneki (work_bz3jygj7djh5pi4ptmpadeuxzq)
Published: Mon, 09 May 2022

We're Not Gonna Break It! Consistency-Preserving Operators for Efficient Product Line Configuration
https://scholar.archive.org/work/2pcqkvpkpnbz5aetrn3ldyndo4
When configuring a software product line, finding a good trade-off between multiple orthogonal quality concerns is a challenging multi-objective optimisation problem. State-of-the-art solutions based on search-based techniques create invalid configurations in intermediate steps, requiring additional repair actions that reduce the efficiency of the search. In this work, we introduce consistency-preserving configuration operators (CPCOs): genetic operators that maintain valid configurations throughout the entire search. CPCOs bundle coherent sets of changes: the activation or deactivation of a particular feature together with other (de)activations that are needed to preserve validity. In our evaluation, our instantiation of the IBEA algorithm with CPCOs outperforms two state-of-the-art tools for optimal product line configuration in terms of both speed and solution quality. The improvements are especially pronounced in large product lines with thousands of features.
Authors: Jose-Miguel Horcas, Daniel Strüber, Alexandru Burdusel, Jabier Martinez, Steffen Zschaler (work_2pcqkvpkpnbz5aetrn3ldyndo4)
Published: Wed, 27 Apr 2022

A review of knowledge graph application scenarios in cyber security
https://scholar.archive.org/work/f4cy6gcxqzdkjo4ahvq3pkn3jy
Facing dynamic and complex cyber environments, internal and external cyber threat intelligence, and the increasing risk of cyber-attack, knowledge graphs show great application potential in the cyber security area because of their capabilities in knowledge aggregation, representation, management, and reasoning. However, while most research has focused on how to develop a complete knowledge graph, it remains unclear how to apply the knowledge graph to solve real industrial challenges in cyber-attack and defense scenarios. In this review, we provide a brief overview of the basic concepts, schema, and construction approaches for the cyber security knowledge graph. To facilitate future research on cyber security knowledge graphs, we also present a curated collection of datasets and open-source libraries for knowledge construction and information extraction tasks. In the major part of this article, we conduct a comparative review of the different works that elaborate on the recent progress in the application scenarios of the cyber security knowledge graph. Furthermore, a novel comprehensive classification framework is created to describe the connected works in nine primary categories and eighteen subcategories. Finally, we offer a thorough outlook on several promising research directions based on a discussion of the flaws in existing research.
Authors: Kai Liu, Fei Wang, Zhaoyun Ding, Sheng Liang, Zhengfei Yu, Yun Zhou (work_f4cy6gcxqzdkjo4ahvq3pkn3jy)
Published: Sun, 10 Apr 2022

Molecular similarity and diversity analysis of bioactive small molecules using chemoinformatics approaches
https://scholar.archive.org/work/7nz35zi3kze5fcqkydgi2s7ot4
The search for pharmaceutically interesting compounds using computational methods is the core idea of chemoinformatics. With the advent of combinatorial synthesis and high-throughput screening (HTS), researchers and drug companies are currently able to screen millions of compounds each day. However, improvements in screening capabilities have failed to yield a proportionate increase in novel chemotypes. Given the magnitude of compounds in one of the most popular chemistry databases, PubChem, it is impractical to experimentally screen all compounds against a potential target. This thesis aims to study the property space occupied by therapeutic compounds of economic importance obtained from public datasets, using chemoinformatics tools and computational technologies. With this objective in mind, a comprehensive review of current chemoinformatics research, with a particular emphasis on drug discovery, was carried out. In addition, the most commonly used, freely available small-molecule databases and algorithms for small-molecule analysis were also reviewed. Further, recent developments in computational library design techniques were summarized in a separate review article. For web-based analysis and visualization of small molecules, I have developed the chemoinformatics analysis module for the Customary Medicinal Knowledgebase (CMKb; http://www.biolinfo.org/cmkb), which has served as a prototype for integrating the use of medicinal plants among Australian Aboriginals with bioactives, for identifying potential lead compounds. In order to examine the similarity of current drug molecules to human metabolites and toxins, a preliminary comparative study based on several computed physicochemical properties and functional groups was carried out. We established that searching against complete datasets gave results comparable to those obtained from clustered data.
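Similarity searching of the kind described above is commonly driven by fingerprint comparison. As an illustration (not taken from the thesis), here is a minimal sketch of the Tanimoto coefficient over fingerprints represented as sets of "on" bit positions; the bit positions for the two molecules below are entirely hypothetical:

```python
def tanimoto(fp_a: set, fp_b: set) -> float:
    """Tanimoto (Jaccard) similarity between two fingerprints,
    each represented as the set of 'on' bit positions."""
    if not fp_a and not fp_b:
        return 0.0
    inter = len(fp_a & fp_b)
    return inter / (len(fp_a) + len(fp_b) - inter)

# Hypothetical structural-key fingerprints for two molecules:
mol_a = {1, 4, 9, 12, 17}
mol_b = {1, 4, 12, 20, 23, 31}

# 3 shared bits, 8 bits in the union -> 3/8
print(round(tanimoto(mol_a, mol_b), 3))  # -> 0.375
```

In practice one would compute such fingerprints with a chemoinformatics toolkit rather than by hand; the comparison step, however, is exactly this simple set arithmetic.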
We then used a multi-criteria approach to analyse physicochemical properties, scaffold architecture and fragment occurrence among large public datasets of biological interest, viz. [...]
Authors: Varun Khanna (work_7nz35zi3kze5fcqkydgi2s7ot4)
Published: Mon, 28 Mar 2022

Uniform Object Rearrangement: From Complete Monotone Primitives to Efficient Non-Monotone Informed Search
https://scholar.archive.org/work/p2fypxwwkrbalm2rknsmbluojy
Object rearrangement is a widely applicable and challenging task for robots. Geometric constraints must be carefully examined to avoid collisions, and combinatorial issues arise as the number of objects increases. This work studies the algorithmic structure of rearranging uniform objects, where robot-object collisions do not occur but object-object collisions have to be avoided. The objective is minimizing the number of object transfers under the assumption that the robot can manipulate one object at a time. An efficiently computable decomposition of the configuration space is used to create a "region graph", which classifies all continuous paths of equivalent collision possibilities. Based on this compact but rich representation, a complete dynamic programming primitive, DFSDP, performs a recursive depth-first search to solve monotone problems quickly, i.e., those instances that do not require objects to be moved first to an intermediate buffer. DFSDP is extended to solve single-buffer, non-monotone instances, given a choice of an object and a buffer. This work utilizes these primitives as local planners in an informed search framework for more general, non-monotone instances. The search utilizes partial solutions from the primitives to identify the most promising choice of objects and buffers. Experiments demonstrate that the proposed solution returns near-optimal paths with a higher success rate, even for challenging non-monotone instances, than other leading alternatives.
Authors: Rui Wang, Kai Gao, Daniel Nakhimovich, Jingjin Yu, Kostas E. Bekris (work_p2fypxwwkrbalm2rknsmbluojy)
Published: Fri, 18 Mar 2022

Hex-Mesh Generation and Processing: a Survey
https://scholar.archive.org/work/q3u3er7db5ha3d5mweiqazcsni
In this article, we provide a detailed survey of techniques for hexahedral mesh generation. We cover the whole spectrum of alternative approaches to mesh generation, as well as post-processing algorithms for connectivity editing and mesh optimization. For each technique, we highlight capabilities and limitations, also pointing out the associated unsolved challenges. Recent relaxed approaches, aiming to generate not pure-hex but hex-dominant meshes, are also discussed. The required background, pertaining to geometrical as well as combinatorial aspects, is introduced along the way.
Authors: Nico Pietroni, Marcel Campen, Alla Sheffer, Gianmarco Cherchi, David Bommes, Xifeng Gao, Riccardo Scateni, Franck Ledoux, Jean-Francois Remacle, Marco Livesu (work_q3u3er7db5ha3d5mweiqazcsni)
Published: Fri, 25 Feb 2022

Recent Advances in Positive-Instance Driven Graph Searching
https://scholar.archive.org/work/rmv3k2gc35fq3hxd6kdqq3i4cu
Research on how similar a graph is to a tree, a measure called the treewidth of the graph, has seen an enormous rise within the last decade, but a practically fast algorithm for this task was discovered only recently by Tamaki (ESA 2017). It is based on dynamic programming and makes use of the fact that the number of positive subinstances is typically substantially smaller than the number of all subinstances. Algorithms producing only such subinstances are called positive-instance driven (PID). The parameter treedepth has a similar story: it was popularized through the graph sparsity project and is theoretically well understood, but the first practical algorithm was discovered only recently by Trimble (IPEC 2020) and is based on the same paradigm. We give an alternative and unifying view on such algorithms from the perspective of the corresponding configuration graphs in certain two-player games. This results in a single algorithm that can compute a wide range of important graph parameters such as treewidth, pathwidth, and treedepth. We complement this algorithm with a novel randomized data structure that accelerates the enumeration of subproblems in positive-instance driven algorithms.
Authors: Max Bannach, Sebastian Berndt (work_rmv3k2gc35fq3hxd6kdqq3i4cu)
Published: Thu, 27 Jan 2022

FPT Algorithms for Finding Near-Cliques in c-Closed Graphs
https://scholar.archive.org/work/a2rfbxa52jcoln4i3j2f7o42om
Finding large cliques, or cliques missing a few edges, is a fundamental algorithmic task in the study of real-world graphs, with applications in community detection, pattern recognition, and clustering. A number of effective backtracking-based heuristics for these problems have emerged from recent empirical work in social network analysis. Given the NP-hardness of variants of clique counting, these results raise a challenge for beyond-worst-case analysis of these problems. Inspired by the triadic closure of real-world graphs, Fox et al. (SICOMP 2020) introduced the notion of c-closed graphs and proved that maximal clique enumeration is fixed-parameter tractable with respect to c. In practice, due to noise in data, one wishes to actually discover "near-cliques", which can be characterized as cliques with a sparse subgraph removed. In this work, we prove that many different kinds of maximal near-cliques can be enumerated in polynomial time (and FPT in c) for c-closed graphs. We study various established notions of such substructures, including k-plexes and complements of bounded-degeneracy and bounded-treewidth graphs. Interestingly, our algorithms follow relatively simple backtracking procedures, analogous to what is done in practice. Our results underscore the significance of the c-closed graph class for the theoretical understanding of social network analysis.
Authors: Balaram Behera, Edin Husić, Shweta Jain, Tim Roughgarden, C. Seshadhri, Mark Braverman (work_a2rfbxa52jcoln4i3j2f7o42om)
Published: Tue, 25 Jan 2022

Complex Networks: Structure and Inference
https://scholar.archive.org/work/3z2u6qwehjcpdl5kqz6frpc4cu
From the spread of disease across a population to the dispersion of vehicular traffic in cities, many real-world processes are driven by numerous small components that interact in simple ways at small scales to produce nontrivial large-scale effects. Probing the fundamental mechanisms that govern such systems, broadly called "complex systems", is crucial for control, design, and intervention relevant to these processes. Networks, mathematical objects composed of nodes attached in pairs by edges, provide a very useful representation of such systems, and thus modeling networks is of critical importance for understanding real-world complex systems. In this thesis, I examine two different aspects of network modeling: (1) characterizing structure in networks with metadata, and (2) developing scalable, accurate, and interpretable inference techniques for real-world network data. I approach the problem of characterizing structure in networks with metadata from two different perspectives. First, I discuss new measures for characterizing the structure of signed networks, with positive and negative edge signs representing amity and enmity respectively. Signed networks are hypothesized to display structural regularity (balance) as a result of certain configurations of edge signs being more common than others; for instance, the friend of my enemy should be my enemy. I show that we can develop intuitive measures of balance in signed networks that capture long-range correlations, demonstrating that real networks are indeed significantly balanced according to these measures, and that these measures can be used to impute missing data. Second, I move on to explore how we can measure diversity at multiple scales in networks with node metadata that take the form of distributions. I detail a general information-theoretic framework for this task, illustrating the new insights it can give us through example applications involving demographic data across spatially contiguous regions.
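The balance measures developed in the thesis capture long-range correlations, but the classical triangle-level notion they generalise is easy to state: a triangle is balanced when the product of its edge signs is positive. A minimal sketch on a hypothetical toy network (not data from the thesis):

```python
from itertools import combinations

def triangle_balance(signs: dict) -> float:
    """Fraction of triangles whose edge-sign product is positive:
    the classical, shortest-cycle notion of structural balance.
    `signs` maps frozenset({u, v}) edges to +1 or -1."""
    nodes = set()
    for edge in signs:
        nodes |= edge
    balanced = total = 0
    for u, v, w in combinations(sorted(nodes), 3):
        e1, e2, e3 = frozenset({u, v}), frozenset({v, w}), frozenset({u, w})
        if e1 in signs and e2 in signs and e3 in signs:
            total += 1
            if signs[e1] * signs[e2] * signs[e3] > 0:
                balanced += 1
    return balanced / total if total else 1.0

# Toy signed network: "the enemy of my enemy is my friend" holds in the
# triangle a-b-c (+,-,-) but fails in b-c-d (-,+,+).
signs = {
    frozenset({'a', 'b'}): +1, frozenset({'b', 'c'}): -1,
    frozenset({'a', 'c'}): -1, frozenset({'c', 'd'}): +1,
    frozenset({'b', 'd'}): +1,
}
print(triangle_balance(signs))  # -> 0.5
```

Long-range balance measures replace triangles with longer cycles or walks; the sign-product criterion stays the same.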
With regards to inference, I first describe a new message passing [...]
Authors: Alec Kirkley, University, My (work_3z2u6qwehjcpdl5kqz6frpc4cu)
Published: Wed, 19 Jan 2022

Theory and methods for stochastic, accelerated, and distributed optimization
https://scholar.archive.org/work/tdp4oy73cvgfrngt4jvsy24dim
This thesis consists of two parts. Part I (Chapters 1-3) concerns momentum-based first-order optimization algorithms for stochastic optimization, where we only have access to stochastic (noisy) estimates of the gradient of the objective. This setting arises frequently in several key problems in supervised learning, such as risk minimization for classification or regression, or saddle-point problems for distributionally robust learning. When gradients are deterministic and do not contain any noise, it is well known that momentum-based optimization algorithms such as Nesterov's accelerated gradient (AG) method or Polyak's heavy-ball (HB) method have improved convergence rates compared to gradient descent methods. However, in the presence of persistent stochastic gradient errors, momentum-based algorithms amplify the noise in the gradients and are less robust to gradient errors unless the stepsize and the momentum parameters are very carefully tuned to the problem at hand. This motivates the study of the distribution of the iterates of momentum algorithms as a function of the stepsize and momentum parameters, where there is a lack of principled strategies to ensure the existence of a stationary distribution or to control the probability that the suboptimality exceeds a certain threshold. In particular, existing results for momentum methods provide only limited guarantees in expected suboptimality, but do not typically characterize deviations from the expected suboptimality. In Chapter 1, we show that many momentum algorithms such as AG, HB and their variants for constrained strongly convex optimization converge to their equilibrium at the accelerated rate under some conditions on the parameters and on the noise structure. These results shed further light on the effect of the parameters and on how much noise momentum algorithms can tolerate before becoming divergent.
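The heavy-ball recursion discussed above, x_{k+1} = x_k - a*grad(x_k) + b*(x_k - x_{k-1}), is easy to simulate on a toy one-dimensional quadratic. A minimal sketch; the step size, momentum value, and noise level are arbitrary illustrations, not parameters from the thesis:

```python
import random

def heavy_ball(grad, x0, alpha, beta, steps, noise=0.0, seed=0):
    """Polyak heavy-ball: x_{k+1} = x_k - alpha*(grad(x_k) + xi_k) + beta*(x_k - x_{k-1}),
    where xi_k is Gaussian gradient noise with standard deviation `noise`."""
    rng = random.Random(seed)
    x_prev = x = x0
    for _ in range(steps):
        g = grad(x) + rng.gauss(0.0, noise)
        x_prev, x = x, x - alpha * g + beta * (x - x_prev)
    return x

# f(x) = 0.5 * x**2, so grad(x) = x and the minimiser is x = 0.
grad = lambda x: x
exact = heavy_ball(grad, x0=5.0, alpha=0.1, beta=0.9, steps=500)
noisy = heavy_ball(grad, x0=5.0, alpha=0.1, beta=0.9, steps=500, noise=0.1)
# With persistent gradient noise the iterates keep fluctuating around the
# optimum instead of converging: the noise-amplification issue described above.
print(abs(exact) < 1e-3)  # -> True
```

With deterministic gradients these parameter choices are inside the heavy-ball stability region for this quadratic, so the iterates spiral into the optimum; the noisy run settles into a stationary distribution around it instead.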
In Chapter 2, we consider the general class of momentum methods (GMM) subject to stochastic gradient noise, which includes AG and HB as special cases. Under [...]
Authors: Bugra Can (work_tdp4oy73cvgfrngt4jvsy24dim)

Scalable Graph Algorithms using Practically Efficient Data Reductions
https://scholar.archive.org/work/2h3oorsxdnhmrpyhldt6gro2v4
This dissertation presents both heuristic and exact approaches for the NP-hard optimization problems of finding maximum cardinality and weight independent sets, as well as finding maximum cardinality cuts. An independent set of a graph is a subset of vertices such that no pair of vertices in this set is adjacent. A maximum cardinality independent set is an independent set of maximal cardinality among all possible independent sets. If one is additionally given a vertex weighting function for this graph, a maximum weight independent set of this graph is an independent set of maximal weight. Finally, a maximum cardinality cut in a graph is a bipartition of the vertices such that the number of edges running across the partitions is maximal. All three of these problems are important for a variety of real-world applications. For example, maximum or high-quality cardinality and weight independent sets are used in map labeling [Klu+19; GNN13], modeling protein-protein interactions [GWA00], or vehicle routing [Don+22]. Examples for the usage of maximum or large cuts include social network modeling [Har59], statistical physics [Bar82], or VLSI design [Bar+88; Chi+07]. In this work, we discuss and further the usage of reduction rules for all these problems. Reduction rules are graph transformations that are able to generally reduce the size of a given input while also maintaining optimality, i.e., an optimal solution of the reduced instance can be extended to an optimal solution of the original input. In this dissertation, we also present inexact reduction rules that remove vertices that are likely to be a part of (or excluded from) a solution. We show that these types of reduction rules can drastically improve the performance of heuristic algorithms while still leading to high-quality solutions. Finally, we present graph transformations that maintain optimality but also temporarily increase the graph size. 
Counterintuitively, this can lead to new, easier-to-reduce structures and subsequently an overall reduction in size in the long run. We propose multiple algorithms that incorporate these concepts into a wide spectrum of techniques. Our work on the maximum cardinality independent set problem includes an evolutionary algorithm that uses a combination of exact and inexact reduction rules to gradually shrink the graph size. We also propose an advanced local search algorithm that improves an existing state-of-the-art algorithm with reduction rules to very quickly compute high-quality independent sets. Next, we present a portfolio algorithm that won the PACE Challenge 2019 by using multiple existing approaches for different closely related problems. We then develop and evaluate multiple advanced branching rules for a state-of-the-art branch-and-reduce algorithm. For maximum weight independent sets, we present multiple new reduction rules and graph transformations that we then use in our newly developed branch-and-reduce algorithm. Finally, for maximum cardinality cuts, we also propose new reduction rules that are used to build an efficient preprocessing algorithm to boost the performance of state-of-the-art approaches, both heuristic and exact. We evaluated all our algorithms on a large set of instances stemming from multiple domains and applications for the corresponding problems. In general, our experiments show that our algorithms are able to significantly increase both the scale and speed at which instances can be processed in practice (by up to orders of magnitude). Furthermore, we show that our preprocessing algorithms and reductions can easily be integrated into other algorithms to improve their performance. Our contributions are available either as standalone libraries or as part of the libraries KaMIS, WeGotYouCovered (maximum cardinality and weight independent sets), and DMAX (maximum cardinality cuts).
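As a flavour of what such a reduction rule looks like, here is a textbook example (not necessarily one used in the dissertation): a degree-1 vertex can always be placed in a maximum independent set and its neighbour deleted, without losing optimality. A minimal sketch that applies this rule exhaustively:

```python
def pendant_reduction(adj: dict):
    """Exhaustively apply the degree-1 rule for maximum independent set:
    a vertex v with exactly one neighbour u can always be taken into the
    solution and u deleted, preserving optimality. Returns the chosen
    vertices and the remaining (reduced) graph."""
    adj = {v: set(ns) for v, ns in adj.items()}   # defensive copy
    chosen = set()
    changed = True
    while changed:
        changed = False
        for v in list(adj):
            if v in adj and len(adj[v]) == 1:
                (u,) = adj[v]
                chosen.add(v)
                for x in (u, v):                  # delete u and v from the graph
                    for w in adj.pop(x):
                        adj[w].discard(x)
                changed = True
    return chosen, adj

# Path 0-1-2-3: repeated pendant reductions solve the instance outright.
adj = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
chosen, rest = pendant_reduction(adj)
print(sorted(chosen), rest)  # -> [0, 2] {}
```

Real solvers combine dozens of such rules (degree-2 folding, domination, unconfined vertices, ...) and apply them until a fixed point before branching.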
The last years leading to this dissertation presented me with a plethora of opportunities that helped me grow, both academically and personally. Thus, I would like to briefly express my gratitude to all the people directly or indirectly involved in this journey. First and foremost, I want to give my sincerest thanks to my supervisor Peter Sanders for letting me be a part of his amazing research group, as well as for providing me with guidance and the freedom to pursue my research interests. I would also like to thank Jin-Kao Hao for being a part of my dissertation committee as a reviewer for this dissertation. Thanks go to Monika Henzinger for inviting me to her research group in Vienna and providing me with valuable feedback on my work. I want to give special thanks to Christian Schulz and Darren Strash, who were close colleagues and friends for the longest part of my academic journey. Without you two, I probably would not have ventured into academia and arrived at this point.
Authors: Sebastian Emanuel Lamm, Peter Sanders, Jin-Kao Hao (work_2h3oorsxdnhmrpyhldt6gro2v4)

2021 Index IEEE Transactions on Parallel and Distributed Systems Vol. 32
https://scholar.archive.org/work/u7zsjrigtfgrbesog7p3u4sy7y
(work_u7zsjrigtfgrbesog7p3u4sy7y)

Analysis and Mitigation of Remote Side-Channel and Fault Attacks on the Electrical Level
https://scholar.archive.org/work/6kpe5ay4ufgtna44m37supe3a4
In the ongoing miniaturisation of integrated circuits, physical limits are being reached; single-atom transistors, for example, represent a possible lower bound on feature sizes. Moreover, manufacturing the latest generations of microchips is nowadays financially feasible only for large multinational companies. As a consequence of this development, miniaturisation is no longer the driving force behind further increases in the performance of electronic components. Instead, classical computer architectures with generic processors are evolving into heterogeneous systems with high parallelism and specialised accelerators. In these heterogeneous systems, however, protecting private data against attackers is also becoming increasingly difficult. New kinds of hardware components, new kinds of applications, and a generally increased complexity are some of the factors that make security in such systems a challenge. Cryptographic algorithms are often truly secure only under certain assumptions about the attacker. For example, it is often assumed that the attacker can access only the inputs and outputs of a module, while internal signals and intermediate values remain hidden. In real implementations, however, side-channel and fault attacks demonstrate the limits of this so-called black-box model. In side-channel attacks, the attacker exploits data-dependent measurable quantities such as power consumption or electromagnetic radiation, whereas in fault attacks the computation is actively tampered with and the faulty output values are used to recover the secret data. This kind of attack on implementations was originally considered only in the context of a local attacker with access to the target device.
However, attacks based on measuring the time of certain memory accesses have already shown that the threat also extends to attackers with only remote access. This thesis addresses the threat posed by [...]
Jonas Krautter, Mehdi B. Tahoori, Thomas Eisenbarth work_6kpe5ay4ufgtna44m37supe3a4
Efficient Cryptanalysis Techniques for Privacy-Preserving Record Linkage
https://scholar.archive.org/work/vbrnbsmilrcyhl37xs2vsg4o7m
The linking of records across databases has seen increasing interest over the last few decades, in domains ranging from national census and healthcare to crime and fraud detection. This is due to the ability of record linkage (RL) to improve data quality and facilitate advanced data mining. In the absence of unique entity identifiers across the databases to be linked, RL is generally based on quasi-identifying (QID) attribute values of entities, such as their names, addresses, and dates of birth. However, the use of such personal identifying information often leads to individual, ethical, and legal concerns associated with privacy and confidentiality. Privacy-preserving record linkage (PPRL) seeks to develop techniques that allow the linkage of databases without compromising the privacy of the entities whose records are being linked. In general, PPRL techniques encode and/or encrypt QID values in sensitive databases in order to protect the privacy of the entities while allowing accurate linkage of records using the encoded and/or encrypted values. However, certain PPRL techniques, such as the popular Bloom filter encoding, have been shown to be susceptible to privacy attacks, and a number of such attacks on PPRL techniques have been proposed over the years. While these attacks reveal different weaknesses in PPRL techniques, they also have limitations, including the requirement that an adversary know specific parameters, and significant memory and time consumption. Therefore, further research into both analysing existing privacy attacks and exploring novel attack methods is vital to better understand the risks associated with real-world PPRL projects. In this thesis we present a comprehensive research study of privacy attacks on PPRL. We start by proposing a taxonomy of attacks on PPRL in which existing attacks are categorised along twelve dimensions. Our taxonomy can be used to analyse the characteristics of privacy attacks and identify their limitations.
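The Bloom filter encoding mentioned above can be sketched as follows. This is a minimal illustration of the commonly described q-gram plus double-hashing scheme; the parameter choices (64-bit filter, 2 hash functions, bigrams) and helper names are assumptions for illustration, not taken from the thesis:

```python
import hashlib

def qgrams(value: str, q: int = 2) -> set[str]:
    """Split a QID value into padded character q-grams (here: bigrams)."""
    padded = f"_{value.lower()}_"            # pad so word edges are represented
    return {padded[i:i + q] for i in range(len(padded) - q + 1)}

def bloom_encode(value: str, l: int = 64, k: int = 2) -> int:
    """Return an l-bit Bloom filter (as an int) encoding the value's q-grams."""
    bf = 0
    for g in qgrams(value):
        h1 = int(hashlib.sha1(g.encode()).hexdigest(), 16)
        h2 = int(hashlib.md5(g.encode()).hexdigest(), 16)
        for i in range(k):                   # double hashing: h1 + i*h2 mod l
            bf |= 1 << ((h1 + i * h2) % l)
    return bf

def dice(bf1: int, bf2: int) -> float:
    """Dice similarity on bit vectors, the usual PPRL matching score."""
    inter = bin(bf1 & bf2).count("1")
    return 2 * inter / (bin(bf1).count("1") + bin(bf2).count("1"))

# Similar names share q-grams and therefore bit positions; that overlap
# enables approximate matching, but the same frequency structure is what
# the privacy attacks studied in this thesis exploit.
sim = dice(bloom_encode("peter"), bloom_encode("pete"))
dif = dice(bloom_encode("peter"), bloom_encode("maria"))
print(sim > dif)
```

Because identical q-grams always set identical bit positions, frequent names produce frequent bit patterns, which is the leverage point for frequency-based cryptanalysis.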
Next, we propose a framework to quantify the vulner [...]
Anushka Vidanage, The Australian National University work_vbrnbsmilrcyhl37xs2vsg4o7m Sat, 04 Dec 2021 00:00:00 GMT
Faster algorithms for Steiner tree and related problems
https://scholar.archive.org/work/sstifp5etffy3l7srnw6d5oijy
The Steiner tree problem in graphs (SPG) is one of the most studied problems in combinatorial optimization. Part of its theoretical appeal might be attributed to the fact that the SPG generalizes two other classic optimization problems: shortest paths and minimum spanning trees. On the practical side, many applications can be modeled as SPG or closely related problems. The SPG has seen impressive theoretical advancements in the last decade. However, the state of the art in (practical) exact SPG solution, set in a series of milestone papers by Polzin and Vahdati Daneshmand, has remained largely unchallenged for almost 20 years. While the DIMACS Challenge 2014 and the PACE Challenge 2018 brought renewed interest into the exact solution of SPGs, even the best new solvers fall far short of reaching the state of the art. This thesis seeks to once again advance exact SPG solution. Since many practical applications are not modeled as pure SPGs, but rather as closely related problems, this thesis also aims to combine SPG advancements with improvements in the exact solution of such related problems. Initially, we establish a broad theoretical basis to guide the subsequent algorithmic developments. In this way, we provide various new theoretical results for the SPG and well-known relatives such as the maximum-weight connected subgraph problem. These results include the strength of linear programming relaxations, polyhedral descriptions, and complexity results. We go on to introduce many algorithmic components such as reduction techniques, cutting planes, graph transformations, and heuristics, both for the SPG and related problems. Many of these methods and techniques are provably stronger than previous results from the literature. For example, we introduce a new reduction concept that is strictly stronger than the well-known and widely used bottleneck Steiner distance. We also provide theoretical analyses (e.g. concerning complexity) of the new algorithms.
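The reduction tests built on bottleneck distances can be illustrated with the classical notion the thesis strengthens (not its new concept): the bottleneck distance between two vertices is the minimum over all connecting paths of the largest edge weight on the path. A minimal Dijkstra-style sketch, with illustrative names and data:

```python
import heapq

def bottleneck_dist(graph: dict, src, dst) -> float:
    """Minimax path weight between src and dst.
    graph: {node: [(neighbor, weight), ...]} undirected adjacency lists."""
    best = {src: 0}                          # best known bottleneck per node
    heap = [(0, src)]
    while heap:
        b, u = heapq.heappop(heap)
        if u == dst:
            return b
        if b > best.get(u, float("inf")):
            continue                         # stale heap entry
        for v, w in graph[u]:
            nb = max(b, w)                   # bottleneck of the extended path
            if nb < best.get(v, float("inf")):
                best[v] = nb
                heapq.heappush(heap, (nb, v))
    return float("inf")

g = {"a": [("b", 5), ("c", 1)],
     "b": [("a", 5), ("c", 2)],
     "c": [("a", 1), ("b", 2)]}
print(bottleneck_dist(g, "a", "b"))  # → 2 (via c), not the direct edge of 5
```

In bottleneck-based reduction tests, an edge whose weight exceeds such a bottleneck distance between its endpoints can be discarded; here the edge (a, b) of weight 5 exceeds the bottleneck distance 2 and could never be in an optimal solution.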
The individual components are combined in an exact branch-and-cut algorithm. [...]
Daniel Markus Rehfeldt, Technische Universität Berlin, Thorsten Koch work_sstifp5etffy3l7srnw6d5oijy Tue, 23 Nov 2021 00:00:00 GMT
FPT Algorithms for Finding Near-Cliques in c-Closed Graphs
https://scholar.archive.org/work/zojae272tnh3vcirqgli6r47oa
Finding large cliques or cliques missing a few edges is a fundamental algorithmic task in the study of real-world graphs, with applications in community detection, pattern recognition, and clustering. A number of effective backtracking-based heuristics for these problems have emerged from recent empirical work in social network analysis. Given the NP-hardness of variants of clique counting, these results raise a challenge for beyond worst-case analysis of these problems. Inspired by the triadic closure of real-world graphs, Fox et al. (SICOMP 2020) introduced the notion of c-closed graphs and proved that maximal clique enumeration is fixed-parameter tractable with respect to c. In practice, due to noise in data, one wishes to actually discover "near-cliques", which can be characterized as cliques with a sparse subgraph removed. In this work, we prove that many different kinds of maximal near-cliques can be enumerated in polynomial time (and FPT in c) for c-closed graphs. We study various established notions of such substructures, including k-plexes, complements of bounded-degeneracy and bounded-treewidth graphs. Interestingly, our algorithms follow relatively simple backtracking procedures, analogous to what is done in practice. Our results underscore the significance of the c-closed graph class for theoretical understanding of social network analysis.
Balaram Behera, Edin Husić, Shweta Jain, Tim Roughgarden, C. Seshadhri work_zojae272tnh3vcirqgli6r47oa Fri, 19 Nov 2021 00:00:00 GMT
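The two ingredients of this line of work can be sketched concretely: a graph is c-closed if every pair of non-adjacent vertices has fewer than c common neighbors, and the enumeration algorithms follow simple backtracking (here the classic Bron-Kerbosch procedure). This is an illustrative sketch of these standard notions, not the paper's FPT algorithms:

```python
from itertools import combinations

def closure_number(adj: dict) -> int:
    """Smallest c for which the graph is c-closed: one more than the
    maximum number of common neighbors over non-adjacent vertex pairs.
    adj: {vertex: set of neighbors}."""
    c = 0
    for u, v in combinations(adj, 2):
        if v not in adj[u]:
            c = max(c, len(adj[u] & adj[v]))
    return c + 1

def maximal_cliques(adj: dict):
    """Backtracking (Bron-Kerbosch) enumeration of maximal cliques:
    grow clique r from candidates p, using x to rule out non-maximal ones."""
    def bk(r, p, x):
        if not p and not x:
            yield frozenset(r)
            return
        for v in list(p):
            yield from bk(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}
    yield from bk(set(), set(adj), set())

# A triangle {1,2,3} plus a pendant vertex 4 attached to 3:
adj = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}
print(sorted(sorted(c) for c in maximal_cliques(adj)))  # → [[1, 2, 3], [3, 4]]
print(closure_number(adj))  # → 2: non-adjacent pairs share at most 1 neighbor
```

The FPT results bound the work of such backtracking searches in terms of the closure number c, which triadic closure keeps small in real-world social networks.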