105 Hits in 0.94 sec

Improved Crowding Distance for NSGA-II [article]

Xiangxiang Chu, Xinjie Yu
2018 arXiv   pre-print
Non-dominated sorting genetic algorithm II (NSGA-II) does well in dealing with multi-objective problems. When evaluating the validity of an algorithm for multi-objective problems, two kinds of indices are often considered simultaneously, i.e., the convergence to the Pareto front and the distribution characteristic. The crowding distance in the standard NSGA-II has the property that solutions within the same cuboid have the same crowding distance, which contributes nothing to the convergence of the algorithm. Actually, the closer a solution is to the Pareto front, the higher priority it should have. In this paper, the crowding distance is redefined while keeping almost all the advantages of the original one. Moreover, the speed of convergence to the Pareto front is faster. Finally, the improvement is shown to be effective by applying it to nine benchmark problems.
arXiv:1811.12667v1 fatcat:ilnmnbjz4raatmsmsdl3ihdl2y
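The standard crowding distance that this paper sets out to improve can be sketched in a few lines of plain Python. The function below implements the original NSGA-II definition (not the paper's redefined variant, which is behind the arXiv link); the function name is mine:

```python
def crowding_distance(objectives):
    """Standard NSGA-II crowding distance for a front of objective vectors.

    Boundary solutions on each objective get infinite distance; interior
    solutions accumulate the normalized gap between their two neighbors.
    """
    n = len(objectives)
    if n == 0:
        return []
    m = len(objectives[0])
    distance = [0.0] * n
    for k in range(m):
        # Sort the front by the k-th objective.
        order = sorted(range(n), key=lambda i: objectives[i][k])
        fmin, fmax = objectives[order[0]][k], objectives[order[-1]][k]
        distance[order[0]] = distance[order[-1]] = float("inf")
        if fmax == fmin:
            continue  # degenerate objective: all values equal
        for j in range(1, n - 1):
            i = order[j]
            distance[i] += (objectives[order[j + 1]][k] -
                            objectives[order[j - 1]][k]) / (fmax - fmin)
    return distance
```

Note how boundary solutions are always kept (infinite distance), while interior solutions in the same cuboid of neighbors end up with identical scores, which is exactly the property the paper criticizes.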

CCTrans: Simplifying and Improving Crowd Counting with Transformer [article]

Ye Tian, Xiangxiang Chu, Hongpeng Wang
2021 arXiv   pre-print
... (Chu et al. 2021b; Zheng et al. 2021). ... We adopt a pyramid transformer (Chu et al. 2021a) to capture global context through various downsampling stages. ...
arXiv:2109.14483v1 fatcat:25znv2jzgvbkthybkyxjozdrfu

Noisy Differentiable Architecture Search [article]

Xiangxiang Chu, Bo Zhang
2021 arXiv   pre-print
Simplicity is the ultimate sophistication. Differentiable Architecture Search (DARTS) has become one of the mainstream paradigms of neural architecture search. However, it largely suffers from the well-known performance collapse issue due to the aggregation of skip connections, which are thought to benefit overly from the residual structure that accelerates information flow. To weaken this impact, we propose to inject unbiased random noise to impede the flow. We name this novel approach NoisyDARTS. In effect, the network optimizer should perceive this difficulty at each training step and refrain from overshooting, especially on skip connections. In the long run, since we add no bias to the gradient in terms of expectation, it is still likely to converge to the right solution area. We also prove that the injected noise plays a role in smoothing the loss landscape, which makes the optimization easier. Our method features extreme simplicity and acts as a new strong baseline. We perform extensive experiments across various search spaces, datasets, and tasks, where we robustly achieve state-of-the-art results. Our code is available at https://github.com/xiaomi-automl/NoisyDARTS.
arXiv:2005.03566v3 fatcat:soqlordgljbpxgjqmh423qhkii
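As a rough illustration of the idea in this abstract (zero-mean noise injected into the identity path during search), here is a hypothetical sketch; the function name, signature, and the Gaussian choice are assumptions for illustration, not the paper's exact formulation:

```python
import random

def noisy_skip(x, sigma=0.1, training=True, rng=random):
    """Skip connection with unbiased additive noise (NoisyDARTS-style sketch).

    During search, zero-mean Gaussian noise perturbs the identity path, which
    impedes information flow through the skip connection without biasing the
    gradient in expectation; at evaluation time the path is a plain identity.
    """
    if not training:
        return list(x)
    return [xi + rng.gauss(0.0, sigma) for xi in x]
```

Because the noise has zero mean, averaging over many training steps leaves the expected signal through the skip path unchanged, which is the "no bias in expectation" property the abstract relies on.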

DAAS: Differentiable Architecture and Augmentation Policy Search [article]

Xiaoxing Wang, Xiangxiang Chu, Junchi Yan, Xiaokang Yang
2022 arXiv   pre-print
Neural architecture search (NAS) has been an active direction of automatic machine learning (AutoML), aiming to explore efficient network structures. The searched architecture is evaluated by training on datasets with fixed data augmentation policies. However, recent works on auto-augmentation show that suitable augmentation policies can vary across structures. Therefore, this work considers the possible coupling between neural architectures and data augmentation and proposes an effective algorithm jointly searching for them. Specifically, 1) for the NAS task, we adopt a single-path based differentiable method with a Gumbel-softmax reparameterization strategy due to its memory efficiency; 2) for the auto-augmentation task, we introduce a novel search method based on the policy gradient algorithm, which can significantly reduce the computational complexity. Our approach achieves 97.91% accuracy on CIFAR-10 and 76.6% Top-1 accuracy on the ImageNet dataset, showing the outstanding performance of our search algorithm.
arXiv:2109.15273v2 fatcat:qvet6o3mqjgr7dtgfjakero3iu
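The Gumbel-softmax reparameterization mentioned for the NAS branch can be sketched in a framework-free form as follows; in the paper it would act on logits over candidate operations, and the names here are mine:

```python
import math
import random

def gumbel_softmax(logits, tau=1.0, rng=random):
    """Sample a relaxed one-hot vector from `logits` (Gumbel-softmax).

    Adding Gumbel(0, 1) noise to the logits and applying a temperature-scaled
    softmax gives a differentiable surrogate for discrete sampling; as
    `tau` -> 0 the output approaches a one-hot architecture choice.
    """
    # Gumbel(0, 1) noise via inverse transform; clamp guards against log(0).
    gumbels = [-math.log(-math.log(max(rng.random(), 1e-12))) for _ in logits]
    scores = [(l + g) / tau for l, g in zip(logits, gumbels)]
    m = max(scores)  # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]
```

The temperature `tau` trades off smoothness of gradients against how discrete the sampled architecture looks during search.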

MoGA: Searching Beyond MobileNetV3 [article]

Xiangxiang Chu, Bo Zhang, Ruijun Xu
2020 arXiv   pre-print
Mutation: We use hierarchical mutation and the same hyperparameters as FairNAS (Chu et al. 2019a). ... In the meantime, neural architecture search becomes the new engine to empower future architecture innovation (Tan et al. 2019; Cai, Zhu, and Han 2019; Chu et al. 2019a). ...
arXiv:1908.01314v4 fatcat:jb2qbtfsgnfczio3po4psyvu2u

A Matrix-in-matrix Neural Network for Image Super Resolution [article]

Hailong Ma, Xiangxiang Chu, Bo Zhang, Shaohua Wan, Bo Zhang
2019 arXiv   pre-print
In recent years, deep learning methods have achieved impressive results, with higher peak signal-to-noise ratios in single image super-resolution (SISR) tasks, by utilizing deeper layers. However, their application is quite limited since they require high computing power. In addition, most of the existing methods rarely take full advantage of the intermediate features, which are helpful for restoration. To address these issues, we propose a moderate-size SISR network named matrixed channel attention network (MCAN), built by constructing a matrix ensemble of multi-connected channel attention blocks (MCAB). Several models of different sizes are released to meet various practical requirements. Our extensive benchmark experiments show that the proposed models achieve better performance with far fewer multiply-adds and parameters. Our models will be made publicly available.
arXiv:1903.07949v1 fatcat:3gsyj43v3jcehippnatp5lweie

Modeling Motion with Multi-Modal Features for Text-Based Video Segmentation [article]

Wangbo Zhao, Kai Wang, Xiangxiang Chu, Fuzhao Xue, Xinchao Wang, Yang You
2022 arXiv   pre-print
Text-based video segmentation aims to segment the target object in a video based on a describing sentence. Incorporating motion information from optical flow maps with appearance and linguistic modalities is crucial yet has been largely ignored by previous work. In this paper, we design a method to fuse and align appearance, motion, and linguistic features to achieve accurate segmentation. Specifically, we propose a multi-modal video transformer, which can fuse and aggregate multi-modal and temporal features between frames. Furthermore, we design a language-guided feature fusion module to progressively fuse appearance and motion features at each feature level with guidance from linguistic features. Finally, a multi-modal alignment loss is proposed to alleviate the semantic gap between features from different modalities. Extensive experiments on A2D Sentences and J-HMDB Sentences verify the performance and generalization ability of our method compared to state-of-the-art methods.
arXiv:2204.02547v1 fatcat:5t2grxhb55hxbdvqh3ibmionuy

AutoKWS: Keyword Spotting with Differentiable Architecture Search [article]

Bo Zhang, Wenfeng Li, Qingyuan Li, Weiji Zhuang, Xiangxiang Chu, Yujun Wang
2021 arXiv   pre-print
Smart audio devices are gated by an always-on lightweight keyword spotting program to reduce power consumption. It is, however, challenging to design models that have both high accuracy and low latency for accurate and fast responsiveness. Many efforts have been made to develop end-to-end neural networks, in which depthwise separable convolutions, temporal convolutions, and LSTMs are adopted as building units. Nonetheless, these networks designed with human expertise may not achieve an optimal trade-off in an expansive search space. In this paper, we propose to leverage recent advances in differentiable neural architecture search to discover more efficient networks. Our searched model attains 97.2% top-1 accuracy on Google Speech Command Dataset v1 with nearly 100K parameters.
arXiv:2009.03658v2 fatcat:qlvhtpvkmndcbm6shudaq5l2i4

Neural Architecture Search on Acoustic Scene Classification [article]

Jixiang Li, Chuming Liang, Bo Zhang, Zhao Wang, Fei Xiang, Xiangxiang Chu
2020 arXiv   pre-print
Convolutional neural networks are widely adopted in Acoustic Scene Classification (ASC) tasks, but they generally carry a heavy computational burden. In this work, we propose a lightweight yet high-performing baseline network inspired by MobileNetV2, which replaces square convolutional kernels with unidirectional ones to extract features alternately in the temporal and frequency dimensions. Furthermore, we explore a dynamic architecture space built on the basis of the proposed baseline with the recent Neural Architecture Search (NAS) paradigm, which first trains a supernet that incorporates all candidate networks and then applies the well-known evolutionary algorithm NSGA-II to discover more efficient networks with higher accuracy and lower computational cost. Experimental results demonstrate that our searched network is competent in ASC tasks: it achieves a 90.3% F1-score on the DCASE2018 task 5 evaluation set, marking a new state-of-the-art performance while saving 25% of FLOPs compared to our baseline network.
arXiv:1912.12825v2 fatcat:7uwjkul6bzfnbpace3wzgeubta

Neural Architecture Search on Acoustic Scene Classification

Jixiang Li, Chuming Liang, Bo Zhang, Zhao Wang, Fei Xiang, Xiangxiang Chu
2020 Interspeech 2020  
Convolutional neural networks are widely adopted in Acoustic Scene Classification (ASC) tasks, but they generally carry a heavy computational burden. In this work, we propose a high-performance yet lightweight baseline network inspired by MobileNetV2, which replaces square convolutional kernels with unidirectional ones to extract features alternately in the temporal and frequency dimensions. Furthermore, we explore a dynamic architecture space built on the basis of the proposed baseline with the recent Neural Architecture Search (NAS) paradigm, which first trains a supernet that incorporates all candidate architectures and then applies the well-known evolutionary algorithm NSGA-II to discover more efficient networks with higher accuracy and lower computational cost from the supernet. Experimental results demonstrate that our searched network is competent in ASC tasks: it achieves a 90.3% F1-score on the DCASE2018 task 5 evaluation set, marking a new state-of-the-art performance while saving 25% of FLOPs compared to our baseline network.
doi:10.21437/interspeech.2020-0057 dblp:conf/interspeech/LiL0WXC20 fatcat:rq4klwpedbbclbixjjls6kbqbe

ROME: Robustifying Memory-Efficient NAS via Topology Disentanglement and Gradients Accumulation [article]

Xiaoxing Wang, Xiangxiang Chu, Yuda Fan, Zhexi Zhang, Xiaolin Wei, Junchi Yan, Xiaokang Yang
2020 arXiv   pre-print
Single-path based differentiable neural architecture search has great strengths in its low computational cost and memory-friendly nature. However, we surprisingly discover that it suffers from severe searching instability that has been largely ignored, posing a potential weakness for wider application. In this paper, we delve into its performance collapse issue and propose a new algorithm called RObustifying Memory-Efficient NAS (ROME). Specifically, 1) for consistent topology in the search and evaluation stages, we introduce separate parameters to disentangle the topology from the operations of the architecture; in this way, we can independently sample connections and operations without interference; 2) to discount sampling unfairness and variance, we enforce fair sampling for weight updates and apply a gradient accumulation mechanism for the architecture parameters. Extensive experiments demonstrate that our proposed method has strong performance and robustness, mostly achieving state-of-the-art results on a large number of standard benchmarks.
arXiv:2011.11233v1 fatcat:a7eup6odhjgspbmkeyhnddyide
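The second ingredient above (fair sampling plus gradient accumulation for the architecture parameters) can be illustrated with a toy update step. This is a schematic reading of the abstract only; the function name, signature, and exact update rule are assumptions for illustration:

```python
import random

def fair_accumulated_step(arch_params, grad_fn, lr=0.01, accum_steps=4):
    """One architecture update with fair sampling + gradient accumulation.

    Each accumulation step visits every candidate operation exactly once in a
    random order (fair sampling), sums the per-op gradients, and only then
    applies a single averaged update to the architecture parameters, which
    reduces the variance introduced by single-path sampling.
    """
    n = len(arch_params)
    accum = [0.0] * n
    for _ in range(accum_steps):
        order = random.sample(range(n), n)  # fair: a random permutation
        for i in order:
            accum[i] += grad_fn(i, arch_params[i])
    # One update after all accumulation steps, averaged over them.
    return [p - lr * g / accum_steps for p, g in zip(arch_params, accum)]
```

The design point is that no single sampled path moves the architecture parameters on its own; only the averaged evidence from a full fair pass does.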

DARTS-: Robustly Stepping out of Performance Collapse Without Indicators [article]

Xiangxiang Chu, Xiaoxing Wang, Bo Zhang, Shun Lu, Xiaolin Wei, Junchi Yan
2021 arXiv   pre-print
Inspired by Chu et al. (2020c), our method focuses on calibrating the biased searching process. ... By contrast, directly applying DARTS on this search space only obtains 66.4% (Chu et al., 2020c). ...
arXiv:2009.01027v2 fatcat:fvskev6pybbnjg6qllmdjtp7fi

Parameter Sharing Deep Deterministic Policy Gradient for Cooperative Multi-agent Reinforcement Learning [article]

Xiangxiang Chu, Hangjun Ye
2017 arXiv   pre-print
Correspondence to: Xiangxiang Chu <chuxiangxiang@xiaomi.com>. ...
arXiv:1710.00336v2 fatcat:wsq3yokbpzgwvhntedmymcgg5i

Antibody-drug conjugates for the treatment of lymphoma: clinical advances and latest progress

Yurou Chu, Xiangxiang Zhou, Xin Wang
2021 Journal of Hematology & Oncology  
Antibody-drug conjugates (ADCs) are a promising class of immunotherapies with the potential to specifically target tumor cells and ameliorate the therapeutic index of cytotoxic drugs. ADCs comprise monoclonal antibodies, cytotoxic payloads with inherent antitumor activity, and specialized linkers connecting the two. In recent years, three ADCs, brentuximab vedotin, polatuzumab vedotin, and loncastuximab tesirine, have been approved and are already establishing their place in lymphoma treatment. As the efficacy and safety of ADCs have moved in synchrony with advances in their design, a plethora of novel ADCs have garnered growing interest as treatments. In this review, we provide an overview of the essential elements of ADC strategies in lymphoma and elucidate the up-to-date progress, current challenges, and novel targets of ADCs in this rapidly evolving field.
doi:10.1186/s13045-021-01097-z pmid:34090506 fatcat:wz7zw4t6lndfrfpovryj3ee4ma

Conditional Positional Encodings for Vision Transformers [article]

Xiangxiang Chu, Zhi Tian, Bo Zhang, Xinlong Wang, Xiaolin Wei, Huaxia Xia, Chunhua Shen
2021 arXiv   pre-print
We propose a conditional positional encoding (CPE) scheme for vision Transformers. Unlike previous fixed or learnable positional encodings, which are pre-defined and independent of the input tokens, CPE is dynamically generated and conditioned on the local neighborhood of the input tokens. As a result, CPE can easily generalize to input sequences longer than those the model has seen during training. Besides, CPE can keep the desired translation invariance in the image classification task, resulting in improved classification accuracy. CPE can be effortlessly implemented with a simple Position Encoding Generator (PEG) and seamlessly incorporated into the current Transformer framework. Built on PEG, we present the Conditional Position encoding Vision Transformer (CPVT). We demonstrate that CPVT has visually similar attention maps to those with learned positional encodings. Benefiting from the conditional positional encoding scheme, we obtain state-of-the-art results on the ImageNet classification task compared with vision Transformers to date. Our code will be made available at https://github.com/Meituan-AutoML/CPVT .
arXiv:2102.10882v2 fatcat:uihyzgc44ndmjn3rbbiltf7avm
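The abstract leaves PEG abstract; a commonly cited instantiation is a zero-padded convolution applied to the tokens reshaped to their 2D grid, which is what makes the encoding conditional on each token's neighborhood. A single-channel, pure-Python sketch (the kernel choice and helper name are illustrative assumptions):

```python
def peg(tokens, h, w, kernel):
    """Conditional positional encoding sketch (single-channel PEG).

    Tokens are reshaped to an h x w grid, convolved with a zero-padded 3x3
    kernel, and the result is added back, so each token's encoding depends
    on its local neighborhood rather than on a fixed position index.
    """
    grid = [tokens[r * w:(r + 1) * w] for r in range(h)]
    out = []
    for r in range(h):
        for c in range(w):
            acc = 0.0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < h and 0 <= cc < w:  # zero padding outside
                        acc += kernel[dr + 1][dc + 1] * grid[rr][cc]
            out.append(tokens[r * w + c] + acc)
    return out
```

Because the zero padding gives border tokens asymmetric context, a generator of this form can also expose some absolute-position information, while input sequences of any grid size are handled without retraining.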
Showing results 1 — 15 out of 105 results