Filters








312 Hits in 1.6 sec

Equivariant Graph Mechanics Networks with Constraints [article]

Wenbing Huang, Jiaqi Han, Yu Rong, Tingyang Xu, Fuchun Sun, Junzhou Huang
2022 arXiv   pre-print
Learning to reason about relations and dynamics over multiple interacting objects is a challenging topic in machine learning. The challenges mainly stem from that the interacting systems are exponentially-compositional, symmetrical, and commonly geometrically-constrained. Current methods, particularly the ones based on equivariant Graph Neural Networks (GNNs), have targeted on the first two challenges but remain immature for constrained systems. In this paper, we propose Graph Mechanics Network
more » ... (GMN) which is combinatorially efficient, equivariant and constraint-aware. The core of GMN is that it represents, by generalized coordinates, the forward kinematics information (positions and velocities) of a structural object. In this manner, the geometrical constraints are implicitly and naturally encoded in the forward kinematics. Moreover, to allow equivariant message passing in GMN, we have developed a general form of orthogonality-equivariant functions, given that the dynamics of constrained systems are more complicated than the unconstrained counterparts. Theoretically, the proposed equivariant formulation is proved to be universally expressive under certain conditions. Extensive experiments support the advantages of GMN compared to the state-of-the-art GNNs in terms of prediction accuracy, constraint satisfaction and data efficiency on the simulated systems consisting of particles, sticks and hinges, as well as two real-world datasets for molecular dynamics prediction and human motion capture.
arXiv:2203.06442v1 fatcat:ueyqmfnapjgxxovuf3ieohxsb4

Unsupervised Adversarial Graph Alignment with Graph Embedding [article]

Chaoqi Chen, Weiping Xie, Tingyang Xu, Yu Rong, Wenbing Huang, Xinghao Ding, Yue Huang, Junzhou Huang
2019 arXiv   pre-print
Graph alignment, also known as network alignment, is a fundamental task in social network analysis. Many recent works have relied on partially labeled cross-graph node correspondences, i.e., anchor links. However, due to the privacy and security issue, the manual labeling of anchor links for diverse scenarios may be prohibitive. Aligning two graphs without any anchor links is a crucial and challenging task. In this paper, we propose an Unsupervised Adversarial Graph Alignment (UAGA) framework
more » ... learn a cross-graph alignment between two embedding spaces of different graphs in a fully unsupervised fashion (i.e., no existing anchor links and no users' personal profile or attribute information is available). The proposed framework learns the embedding spaces of each graph, and then attempts to align the two spaces via adversarial training, followed by a refinement procedure. We further extend our UAGA method to incremental UAGA (iUAGA) that iteratively reveals the unobserved user links based on the pseudo anchor links. This can be used to further improve both the embedding quality and the alignment accuracy. Moreover, the proposed methods will benefit some real-world applications, e.g., link prediction in social networks. Comprehensive experiments on real-world data demonstrate the effectiveness of our proposed approaches UAGA and iUAGA for unsupervised graph alignment.
arXiv:1907.00544v1 fatcat:s4zgpykm2zf3vhklr4kfyk6fly

Frustratingly Easy Transferability Estimation [article]

Long-Kai Huang, Ying Wei, Yu Rong, Qiang Yang, Junzhou Huang
2022 arXiv   pre-print
Transferability estimation has been an essential tool in selecting a pre-trained model and the layers in it for transfer learning, to transfer, so as to maximize the performance on a target task and prevent negative transfer. Existing estimation algorithms either require intensive training on target tasks or have difficulties in evaluating the transferability between layers. To this end, we propose a simple, efficient, and effective transferability measure named TransRate. Through a single pass
more » ... over examples of a target task, TransRate measures the transferability as the mutual information between features of target examples extracted by a pre-trained model and labels of them. We overcome the challenge of efficient mutual information estimation by resorting to coding rate that serves as an effective alternative to entropy. From the perspective of feature representation, the resulting TransRate evaluates both completeness (whether features contain sufficient information of a target task) and compactness (whether features of each class are compact enough for good generalization) of pre-trained features. Theoretically, we have analyzed the close connection of TransRate to the performance after transfer learning. Despite its extraordinary simplicity in 10 lines of codes, TransRate performs remarkably well in extensive evaluations on 26 pre-trained models and 16 downstream tasks.
arXiv:2106.09362v2 fatcat:uhgti3ezejg35k6kiowbsmkady

Adaptive Graph Convolutional Neural Networks [article]

Ruoyu Li, Sheng Wang, Feiyun Zhu, Junzhou Huang
2018 arXiv   pre-print
Graph Convolutional Neural Networks (Graph CNNs) are generalizations of classical CNNs to handle graph data such as molecular data, point could and social networks. Current filters in graph CNNs are built for fixed and shared graph structure. However, for most real data, the graph structures varies in both size and connectivity. The paper proposes a generalized and flexible graph CNN taking data of arbitrary graph structure as input. In that way a task-driven adaptive graph is learned for each
more » ... raph data while training. To efficiently learn the graph, a distance metric learning is proposed. Extensive experiments on nine graph-structured datasets have demonstrated the superior performance improvement on both convergence speed and predictive accuracy.
arXiv:1801.03226v1 fatcat:2l46c3eqdbde5j7es6ngicn56e

Semi-Supervised Graph Classification: A Hierarchical Graph Perspective [article]

Jia Li, Yu Rong, Hong Cheng, Helen Meng, Wenbing Huang, Junzhou Huang
2019 arXiv   pre-print
Node classification and graph classification are two graph learning problems that predict the class label of a node and the class label of a graph respectively. A node of a graph usually represents a real-world entity, e.g., a user in a social network, or a protein in a protein-protein interaction network. In this work, we consider a more challenging but practically useful setting, in which a node itself is a graph instance. This leads to a hierarchical graph perspective which arises in many
more » ... ains such as social network, biological network and document collection. For example, in a social network, a group of people with shared interests forms a user group, whereas a number of user groups are interconnected via interactions or common members. We study the node classification problem in the hierarchical graph where a 'node' is a graph instance, e.g., a user group in the above example. As labels are usually limited in real-world data, we design two novel semi-supervised solutions named SEmi-supervised grAph cLassification via Cautious/Active Iteration (or SEAL-C/AI in short). SEAL-C/AI adopt an iterative framework that takes turns to build or update two classifiers, one working at the graph instance level and the other at the hierarchical graph level. To simplify the representation of the hierarchical graph, we propose a novel supervised, self-attentive graph embedding method called SAGE, which embeds graph instances of arbitrary size into fixed-length vectors. Through experiments on synthetic data and Tencent QQ group data, we demonstrate that SEAL-C/AI not only outperform competing methods by a significant margin in terms of accuracy/Macro-F1, but also generate meaningful interpretations of the learned representations.
arXiv:1904.05003v1 fatcat:g2izleyzcvdhza73ceofpnxfxe

Weakly Supervised Dense Event Captioning in Videos [article]

Xuguang Duan, Wenbing Huang, Chuang Gan, Jingdong Wang, Wenwu Zhu and Junzhou Huang
2018 arXiv   pre-print
Dense event captioning aims to detect and describe all events of interest contained in a video. Despite the advanced development in this area, existing methods tackle this task by making use of dense temporal annotations, which is dramatically source-consuming. This paper formulates a new problem: weakly supervised dense event captioning, which does not require temporal segment annotations for model training. Our solution is based on the one-to-one correspondence assumption, each caption
more » ... es one temporal segment, and each temporal segment has one caption, which holds in current benchmark datasets and most real-world cases. We decompose the problem into a pair of dual problems: event captioning and sentence localization and present a cycle system to train our model. Extensive experimental results are provided to demonstrate the ability of our model on both dense event captioning and sentence localization in videos.
arXiv:1812.03849v1 fatcat:pwjxl25jfrdopa7wijsyn7wrfi

Graph Convolutional Networks for Temporal Action Localization [article]

Runhao Zeng, Wenbing Huang, Mingkui Tan, Yu Rong, Peilin Zhao, Junzhou Huang, Chuang Gan
2019 arXiv   pre-print
Most state-of-the-art action localization systems process each action proposal individually, without explicitly exploiting their relations during learning. However, the relations between proposals actually play an important role in action localization, since a meaningful action always consists of multiple proposals in a video. In this paper, we propose to exploit the proposal-proposal relations using Graph Convolutional Networks (GCNs). First, we construct an action proposal graph, where each
more » ... oposal is represented as a node and their relations between two proposals as an edge. Here, we use two types of relations, one for capturing the context information for each proposal and the other one for characterizing the correlations between distinct actions. Then we apply the GCNs over the graph to model the relations among different proposals and learn powerful representations for the action classification and localization. Experimental results show that our approach significantly outperforms the state-of-the-art on THUMOS14 (49.1% versus 42.8%). Moreover, augmentation experiments on ActivityNet also verify the efficacy of modeling action proposal relationships. Codes are available at https://github.com/Alvin-Zeng/PGCN.
arXiv:1909.03252v1 fatcat:h3ovcv2dobcufn6lmmcbz3mgfm

Learning with Structured Sparsity [article]

Junzhou Huang, Tong Zhang, Dimitris Metaxas
2009 arXiv   pre-print
This paper investigates a new learning formulation called structured sparsity, which is a natural extension of the standard sparsity concept in statistical learning and compressive sensing. By allowing arbitrary structures on the feature set, this concept generalizes the group sparsity idea that has become popular in recent years. A general theory is developed for learning with structured sparsity, based on the notion of coding complexity associated with the structure. It is shown that if the
more » ... ding complexity of the target signal is small, then one can achieve improved performance by using coding complexity regularization methods, which generalize the standard sparse regularization. Moreover, a structured greedy algorithm is proposed to efficiently solve the structured sparsity problem. It is shown that the greedy algorithm approximately solves the coding complexity optimization problem under appropriate conditions. Experiments are included to demonstrate the advantage of structured sparsity over standard sparsity on some real applications.
arXiv:0903.3002v2 fatcat:zjz622hf7vgbvpptiyw5lwswti

RaFM: Rank-Aware Factorization Machines [article]

Xiaoshuang Chen, Yin Zheng, Jiaxing Wang, Wenye Ma, Junzhou Huang
2019 arXiv   pre-print
Factorization machines (FM) are a popular model class to learn pairwise interactions by a low-rank approximation. Different from existing FM-based approaches which use a fixed rank for all features, this paper proposes a Rank-Aware FM (RaFM) model which adopts pairwise interactions from embeddings with different ranks. The proposed model achieves a better performance on real-world datasets where different features have significantly varying frequencies of occurrences. Moreover, we prove that
more » ... RaFM model can be stored, evaluated, and trained as efficiently as one single FM, and under some reasonable conditions it can be even significantly more efficient than FM. RaFM improves the performance of FMs in both regression tasks and classification tasks while incurring less computational burden, therefore also has attractive potential in industrial applications.
arXiv:1905.07570v1 fatcat:kpz7nvwydzftxmdfhijj7p47pu

Graph Representation Learning via Graphical Mutual Information Maximization [article]

Zhen Peng, Wenbing Huang, Minnan Luo, Qinghua Zheng, Yu Rong, Tingyang Xu, Junzhou Huang
2020 arXiv   pre-print
The richness in the content of various information networks such as social networks and communication networks provides the unprecedented potential for learning high-quality expressive representations without external supervision. This paper investigates how to preserve and extract the abundant information from graph-structured data into embedding space in an unsupervised manner. To this end, we propose a novel concept, Graphical Mutual Information (GMI), to measure the correlation between
more » ... graphs and high-level hidden representations. GMI generalizes the idea of conventional mutual information computations from vector space to the graph domain where measuring mutual information from two aspects of node features and topological structure is indispensable. GMI exhibits several benefits: First, it is invariant to the isomorphic transformation of input graphs---an inevitable constraint in many existing graph representation learning algorithms; Besides, it can be efficiently estimated and maximized by current mutual information estimation methods such as MINE; Finally, our theoretical analysis confirms its correctness and rationality. With the aid of GMI, we develop an unsupervised learning model trained by maximizing GMI between the input and output of a graph neural encoder. Considerable experiments on transductive as well as inductive node classification and link prediction demonstrate that our method outperforms state-of-the-art unsupervised counterparts, and even sometimes exceeds the performance of supervised ones.
arXiv:2002.01169v1 fatcat:lburokr27jbdfowwgv2opovrd4

Track Facial Points in Unconstrained Videos [article]

Xi Peng, Qiong Hu, Junzhou Huang, Dimitris N. Metaxas
2016 arXiv   pre-print
Tracking Facial Points in unconstrained videos is challenging due to the non-rigid deformation that changes over time. In this paper, we propose to exploit incremental learning for person-specific alignment in wild conditions. Our approach takes advantage of part-based representation and cascade regression for robust and efficient alignment on each frame. Unlike existing methods that usually rely on models trained offline, we incrementally update the representation subspace and the cascade of
more » ... gressors in a unified framework to achieve personalized modeling on the fly. To alleviate the drifting issue, the fitting results are evaluated using a deep neural network, where well-aligned faces are picked out to incrementally update the representation and fitting models. Both image and video datasets are employed to valid the proposed method. The results demonstrate the superior performance of our approach compared with existing approaches in terms of fitting accuracy and efficiency.
arXiv:1609.02825v1 fatcat:bkmhkkdx55cihjnep6hex7jbji

The Benefit of Group Sparsity [article]

Junzhou Huang, Tong Zhang
2009 arXiv   pre-print
This paper develops a theory for group Lasso using a concept called strong group sparsity. Our result shows that group Lasso is superior to standard Lasso for strongly group-sparse signals. This provides a convincing theoretical justification for using group sparse regularization when the underlying group structure is consistent with the data. Moreover, the theory predicts some limitations of the group Lasso formulation that are confirmed by simulation studies.
arXiv:0901.2962v2 fatcat:a44ckchfzvahbccvex4bjayi6y

Hierarchically Structured Meta-learning [article]

Huaxiu Yao, Ying Wei, Junzhou Huang, Zhenhui Li
2019 arXiv   pre-print
In order to learn quickly with few samples, meta-learning utilizes prior knowledge learned from previous tasks. However, a critical challenge in meta-learning is task uncertainty and heterogeneity, which can not be handled via globally sharing knowledge among tasks. In this paper, based on gradient-based meta-learning, we propose a hierarchically structured meta-learning (HSML) algorithm that explicitly tailors the transferable knowledge to different clusters of tasks. Inspired by the way human
more » ... beings organize knowledge, we resort to a hierarchical task clustering structure to cluster tasks. As a result, the proposed approach not only addresses the challenge via the knowledge customization to different clusters of tasks, but also preserves knowledge generalization among a cluster of similar tasks. To tackle the changing of task relationship, in addition, we extend the hierarchical structure to a continual learning environment. The experimental results show that our approach can achieve state-of-the-art performance in both toy-regression and few-shot image classification problems.
arXiv:1905.05301v2 fatcat:6vj3r22p4vanxkbrql7gcznria

Graph Convolutional Module for Temporal Action Localization in Videos [article]

Runhao Zeng, Wenbing Huang, Mingkui Tan, Yu Rong, Peilin Zhao, Junzhou Huang, Chuang Gan
2021 arXiv   pre-print
Huang et al. decoupled the localization and classification in a one-stage scheme.  ...  Paradigm tIoU 0.1 0.2 0.3 0.4 0.5 Yeung et al. [52] - - 36.0 26.4 17.1 Lin et al. [26] - - 43.0 35.0 24.6 One-Stage Buch et al. [2] - - 45.7 - 29.2 Huang et al. [22] 66.4 64.7 59.8 53.4 43.2 Huang et al  ... 
arXiv:2112.00302v1 fatcat:pmvdmtkqvve57ozkdu3gdoiyou

End-to-End Learning of Motion Representation for Video Understanding [article]

Lijie Fan, Wenbing Huang, Chuang Gan, Stefano Ermon, Boqing Gong, Junzhou Huang
2018 arXiv   pre-print
Despite the recent success of end-to-end learned representations, hand-crafted optical flow features are still widely used in video analysis tasks. To fill this gap, we propose TVNet, a novel end-to-end trainable neural network, to learn optical-flow-like features from data. TVNet subsumes a specific optical flow solver, the TV-L1 method, and is initialized by unfolding its optimization iterations as neural layers. TVNet can therefore be used directly without any extra learning. Moreover, it
more » ... be naturally concatenated with other task-specific networks to formulate an end-to-end architecture, thus making our method more efficient than current multi-stage approaches by avoiding the need to pre-compute and store features on disk. Finally, the parameters of the TVNet can be further fine-tuned by end-to-end training. This enables TVNet to learn richer and task-specific patterns beyond exact optical flow. Extensive experiments on two action recognition benchmarks verify the effectiveness of the proposed approach. Our TVNet achieves better accuracies than all compared methods, while being competitive with the fastest counterpart in terms of features extraction time.
arXiv:1804.00413v1 fatcat:twtlfqa2tbgazn45mfz2g3gv7i
« Previous Showing results 1 — 15 out of 312 results