Potential Energy to Improve Link Prediction With Relational Graph Neural Networks

Simone Colombo, Dimitrios Alivanistos, Michael Cochez
2022 AAAI Spring Symposia  
Potential Energy (PE) between two bodies with mass refers to the relative gravitational pull between them. Analogously, in the context of a graph, nodes can be thought of as objects where a) the product of the degrees of nodes acts as a proxy for mass, b) the clustering coefficients of common neighbours act as a proxy for gravitational acceleration, and c) the inverse of the shortest distance between nodes acts as a proxy for distance in space, which allows for PE calculation as introduced in prior work. In this work, we investigate the effects of incorporating PE in Link Prediction (LP) with Relational Graph Convolutional Networks (R-GCN). Specifically, we explore the benefits of including PE calculation as an informative prior to the LP task and, in a follow-up experiment, as a learnable feature to predict. We performed several experiments and show that considering PE in the LP process has certain advantages, and we find that the information PE provides was not captured by the embeddings produced by the R-GCN.
dblp:conf/aaaiss/ColomboAC22 fatcat:6d7zl5i75jgidbnrs5vi3eqckq
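
The graph analogy in the abstract above can be made concrete with a small sketch. The abstract does not give the exact formula (it defers to prior work), so everything below is an assumption for illustration: the function names are hypothetical, the clustering coefficients of the common neighbours are simply summed, and PE is taken as mass times acceleration times the inverse shortest distance. It should not be read as the authors' implementation.

```python
from collections import deque


def clustering(adj, n):
    """Local clustering coefficient of node n in an undirected graph."""
    nbrs = adj[n]
    k = len(nbrs)
    if k < 2:
        return 0.0
    # Count edges among the neighbours of n (each unordered pair once).
    links = sum(1 for a in nbrs for b in nbrs if a < b and b in adj[a])
    return 2.0 * links / (k * (k - 1))


def shortest_distance(adj, source, target):
    """Breadth-first search hop distance; None if target is unreachable."""
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        for nb in adj[node]:
            if nb == target:
                return dist + 1
            if nb not in seen:
                seen.add(nb)
                queue.append((nb, dist + 1))
    return None


def potential_energy(adj, u, v):
    """Assumed PE score: (deg(u) * deg(v)) * g * (1 / shortest distance)."""
    dist = shortest_distance(adj, u, v)
    if not dist:  # unreachable (None) or identical nodes (0)
        return 0.0
    mass = len(adj[u]) * len(adj[v])  # proxy for mass
    # Assumption: aggregate the common neighbours' coefficients by summing.
    g = sum(clustering(adj, w) for w in adj[u] & adj[v])
    return mass * g / dist


# Toy undirected graph as a dict of neighbour sets (illustrative data only).
adj = {
    "a": {"b", "c"},
    "b": {"a", "c", "d"},
    "c": {"a", "b"},
    "d": {"b"},
}
print(potential_energy(adj, "a", "d"))
```

A score of this kind could be fed to a link predictor as a fixed pair feature, which loosely corresponds to the "informative prior" setting the abstract mentions; how the authors actually integrate it with the R-GCN is not stated here.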

Approximate Knowledge Graph Query Answering: From Ranking to Binary Classification [chapter]

Ruud van Bakel, Teodor Aleksiev, Daniel Daza, Dimitrios Alivanistos, Michael Cochez
2021 Lecture Notes in Computer Science  
Large, heterogeneous datasets are characterized by missing or even erroneous information. This is more evident when they are the product of community effort or automatic fact extraction methods from external sources, such as text. A special case of the aforementioned phenomenon can be seen in knowledge graphs, where this mostly appears in the form of missing or incorrect edges and nodes. Structured querying on such incomplete graphs will result in incomplete sets of answers, even if the correct entities exist in the graph, since one or more edges needed to match the pattern are missing. To overcome this problem, several algorithms for approximate structured query answering have been proposed. Inspired by modern Information Retrieval metrics, these algorithms produce a ranking of all entities in the graph, and their performance is further evaluated based on how high in this ranking the correct answers appear. In this work we take a critical look at this way of evaluation. We argue that performing a ranking-based evaluation is not sufficient to assess methods for complex query answering. To solve this, we introduce Message Passing Query Boxes (MPQB), which takes binary classification metrics back into use and shows the effect this has on the recently proposed query embedding method MPQE.
doi:10.1007/978-3-030-72308-8_8 fatcat:4r2tnxh4kfalvekgfmcjwfgsem

Revealing Spatio-temporal Patterns and Influencing Factors of Dockless Bike Sharing Demand

Pengfei Lin, Jiancheng Weng, Song Hu, Dimitrios Alivanistos, Xin Li, Baocai Yin
2020 IEEE Access  
Dockless bike sharing plays an important role in complementing urban transportation systems and promoting the sustainable development of cities worldwide. To improve system operational efficiency, it is critical to study the spatiotemporal patterns of dockless bike sharing demand as well as the factors influencing these patterns. Based on bicycle trip data from Mobike, Point of Interest (POI) data and smart card data in Beijing, we built a spatially embedded network and implemented the Infomap algorithm, a community detection method, to uncover the usage patterns. Then, the Gradient Boosting Decision Tree (GBDT) model was adopted to investigate the effect of the built environment and public transit services by controlling the temporal variables. The spatiotemporal distribution shows imbalanced characteristics. About half of the total trips occur in the morning/evening rush hours and at noon. The community detection results further reveal a polycentric pattern of trip demand distribution and 120 sub-regions with a significant difference in connection strength and scale. The result of the GBDT model indicates that factors including subway ridership, bus ridership, hour, residence density, and office density have considerable impacts on trip demand, contributing about 62.6% of the total influence. The factors also exhibit complex nonlinear relationships with dockless bike sharing usage. The effect ranges of each factor were identified, indicating that rebalancing schemes could be changed according to spatial location. These findings may help planners and policymakers to determine the reasonable scale of bike deployment and improve the efficiency of redistribution in local regions while reducing rebalance costs. Index Terms: Dockless bike sharing system, spatiotemporal patterns, built environment, community detection, gradient boosting decision tree.
doi:10.1109/access.2020.2985329 fatcat:mulsviuj4vaijirfs3ygzozm5y

Prompting as Probing: Using Language Models for Knowledge Base Construction [article]

Dimitrios Alivanistos, Selene Báez Santamaría, Michael Cochez, Jan-Christoph Kalo, Emile van Krieken, Thiviyan Thanapalasingam
2022 arXiv   pre-print
Language Models (LMs) have proven to be useful in various downstream applications, such as summarisation, translation, question answering and text classification. LMs are becoming increasingly important tools in Artificial Intelligence, because of the vast quantity of information they can store. In this work, we present ProP (Prompting as Probing), which utilizes GPT-3, a large Language Model originally proposed by OpenAI in 2020, to perform the task of Knowledge Base Construction (KBC). ProP implements a multi-step approach that combines a variety of prompting techniques to achieve this. Our results show that manual prompt curation is essential, that the LM must be encouraged to give answer sets of variable lengths, in particular including empty answer sets, that true/false questions are a useful device to increase precision on suggestions generated by the LM, that the size of the LM is a crucial factor, and that a dictionary of entity aliases improves the LM score. Our evaluation study indicates that these proposed techniques can substantially enhance the quality of the final predictions: ProP won track 2 of the LM-KBC competition, outperforming the baseline by 36.4 percentage points. Our implementation is available on
arXiv:2208.11057v2 fatcat:jna2nnrbhvgitaah44hiskped4

Query Embedding on Hyper-relational Knowledge Graphs [article]

Dimitrios Alivanistos, Max Berrendorf, Michael Cochez, Mikhail Galkin
2022 arXiv   pre-print
Multi-hop logical reasoning is an established problem in the field of representation learning on knowledge graphs (KGs). It subsumes both one-hop link prediction as well as other more complex types of logical queries. Existing algorithms operate only on classical, triple-based graphs, whereas modern KGs often employ a hyper-relational modeling paradigm. In this paradigm, typed edges may have several key-value pairs known as qualifiers that provide fine-grained context for facts. In queries, this context modifies the meaning of relations, and usually reduces the answer set. Hyper-relational queries are often observed in real-world KG applications, and existing approaches for approximate query answering cannot make use of qualifier pairs. In this work, we bridge this gap and extend the multi-hop reasoning problem to hyper-relational KGs, allowing us to tackle this new type of complex query. Building upon recent advancements in Graph Neural Networks and query embedding techniques, we study how to embed and answer hyper-relational conjunctive queries. Besides that, we propose a method to answer such queries and demonstrate in our experiments that qualifiers improve query answering on a diverse set of query patterns.
arXiv:2106.08166v3 fatcat:xaedvxacwnflphts4nw66wvvri

Identifying and Segmenting Commuting Behavior Patterns Based on Smart Card Data and Travel Survey Data

Pengfei Lin, Jiancheng Weng, Dimitrios Alivanistos, Siyong Ma, Baocai Yin
2020 Sustainability  
Understanding commuting patterns could provide effective support for the planning and operation of public transport systems. One-month smart card data and travel behavior survey data in Beijing were integrated to complement the socioeconomic attributes of cardholders. The light gradient boosting machine (LightGBM) was introduced to identify the commuting patterns considering the spatiotemporal regularity of travel behavior. Commuters were further divided into fine-grained clusters according to their departure time using the latent Dirichlet allocation model. To enhance the interpretation of the behavior patterns in each cluster, we investigated the relationship between the socioeconomic characteristics of the residence locations and commuter cluster distributions. Approximately 3.1 million cardholders were identified as commuters, accounting for 67.39% of daily passenger volume. Their commuting routes indicated the existence of job–house imbalance and excess commuting in Beijing. We further segmented commuters into six clusters with different temporal patterns, including two-peak, staggered shifts, flexible departure time, and single-peak. The residences of commuters are mainly concentrated in areas with low housing prices and high or medium population density; subway facilities encourage people to commute using public transport. This study will help stakeholders optimize public transport networks, scheduling schemes, and policy accordingly, thus ameliorating commuting within cities.
doi:10.3390/su12125010 fatcat:lbyqiyjtbncnvgloqqwfr677me