Efficient Competitive Self-Play Policy Optimization [article]

Yuanyi Zhong, Yuan Zhou, Jian Peng
2020 arXiv   pre-print
Reinforcement learning from self-play has recently reported many successes. Self-play, where the agents compete with themselves, is often used to generate training data for iterative policy improvement. In previous work, heuristic rules are designed to choose an opponent for the current learner. Typical rules include choosing the latest agent, the best agent, or a random historical agent. However, these rules may be inefficient in practice and sometimes do not guarantee convergence even in the
simplest matrix games. In this paper, we propose a new algorithmic framework for competitive self-play reinforcement learning in two-player zero-sum games. We recognize the fact that the Nash equilibrium coincides with the saddle point of the stochastic payoff function, which motivates us to borrow ideas from classical saddle point optimization literature. Our method trains several agents simultaneously, and intelligently takes each other as opponent based on simple adversarial rules derived from a principled perturbation-based saddle optimization method. We prove theoretically that our algorithm converges to an approximate equilibrium with high probability in convex-concave games under standard assumptions. Beyond the theory, we further show the empirical superiority of our method over baseline methods relying on the aforementioned opponent-selection heuristics in matrix games, grid-world soccer, Gomoku, and simulated robot sumo, with neural net policy function approximators.
arXiv:2009.06086v1 fatcat:f2aoh452wraxhmnbsydvhodd2a

Pixel Contrastive-Consistent Semi-Supervised Semantic Segmentation [article]

Yuanyi Zhong, Bodi Yuan, Hong Wu, Zhiqiang Yuan, Jian Peng, Yu-Xiong Wang
2021 arXiv   pre-print
We present a novel semi-supervised semantic segmentation method which jointly achieves two desiderata of segmentation model regularities: the label-space consistency property between image augmentations and the feature-space contrastive property among different pixels. We leverage the pixel-level L2 loss and the pixel contrastive loss for the two purposes respectively. To address the computational efficiency issue and the false negative noise issue involved in the pixel contrastive loss, we
further introduce and investigate several negative sampling techniques. Extensive experiments demonstrate the state-of-the-art performance of our method (PC2Seg) with the DeepLab-v3+ architecture, in several challenging semi-supervised settings derived from the VOC, Cityscapes, and COCO datasets.
arXiv:2108.09025v1 fatcat:mx3itex7drdwlfjrvo4v5stygu

Boosting Weakly Supervised Object Detection with Progressive Knowledge Transfer [article]

Yuanyi Zhong, Jianfeng Wang, Jian Peng, Lei Zhang
2020 arXiv   pre-print
In this paper, we propose an effective knowledge transfer framework to boost the weakly supervised object detection accuracy with the help of an external fully-annotated source dataset, whose categories may not overlap with the target domain. This setting is of great practical value due to the existence of many off-the-shelf detection datasets. To more effectively utilize the source dataset, we propose to iteratively transfer the knowledge from the source domain by a one-class universal
... and learn the target-domain detector. The box-level pseudo ground truths mined by the target-domain detector in each iteration effectively improve the one-class universal detector. Therefore, the knowledge in the source dataset is more thoroughly exploited and leveraged. Extensive experiments are conducted with Pascal VOC 2007 as the target weakly-annotated dataset and COCO/ImageNet as the source fully-annotated dataset. With the proposed solution, we achieved an mAP of 59.7% detection performance on the VOC test set and an mAP of 60.2% after retraining a fully supervised Faster RCNN with the mined pseudo ground truths. This is significantly better than any previously known results in related literature and sets a new state-of-the-art of weakly supervised object detection under the knowledge transfer setting. Code: .
arXiv:2007.07986v1 fatcat:h7p67bydljcwfbhx5z7zx3i3yy

Disentangling Controllable Object through Video Prediction Improves Visual Reinforcement Learning [article]

Yuanyi Zhong, Alexander Schwing, Jian Peng
2020 arXiv   pre-print
In many vision-based reinforcement learning (RL) problems, the agent controls a movable object in its visual field, e.g., the player's avatar in video games and the robotic arm in visual grasping and manipulation. Leveraging action-conditioned video prediction, we propose an end-to-end learning framework to disentangle the controllable object from the observation signal. The disentangled representation is shown to be useful for RL as additional observation channels to the agent. Experiments on
... set of Atari games with the popular Double DQN algorithm demonstrate improved sample efficiency and game performance (from 222.8% to 261.4% measured in normalized game scores, with prediction bonus reward).
arXiv:2002.09136v1 fatcat:bw3iyw3gnfhtlldylcxxdwcb7e

Rethinking Feature Distribution for Loss Functions in Image Classification [article]

Weitao Wan, Yuanyi Zhong, Tianpeng Li, Jiansheng Chen
2018 arXiv   pre-print
We propose a large-margin Gaussian Mixture (L-GM) loss for deep neural networks in classification tasks. Different from the softmax cross-entropy loss, our proposal is established on the assumption that the deep features of the training set follow a Gaussian Mixture distribution. By involving a classification margin and a likelihood regularization, the L-GM loss facilitates both a high classification performance and an accurate modeling of the training feature distribution. As such, the L-GM
loss is superior to the softmax loss and its major variants in the sense that besides classification, it can be readily used to distinguish abnormal inputs, such as the adversarial examples, based on their features' likelihood to the training feature distribution. Extensive experiments on various recognition benchmarks like MNIST, CIFAR, ImageNet and LFW, as well as on adversarial examples demonstrate the effectiveness of our proposal.
arXiv:1803.02988v1 fatcat:cvrz3xhy4fd7xepoypfniaab4q

Anchor Box Optimization for Object Detection [article]

Yuanyi Zhong, Jianfeng Wang, Jian Peng, Lei Zhang
2020 arXiv   pre-print
In this paper, we propose a general approach to optimize anchor boxes for object detection. Nowadays, anchor boxes are widely adopted in state-of-the-art detection frameworks. However, these frameworks usually pre-define anchor box shapes in heuristic ways and fix the sizes during training. To improve the accuracy and reduce the effort of designing anchor boxes, we propose to dynamically learn the anchor shapes, which allows the anchors to automatically adapt to the data distribution and the
network learning capability. The learning approach can be easily implemented with stochastic gradient descent and can be plugged into any anchor box-based detection framework. The extra training cost is almost negligible and it has no impact on the inference time or memory cost. Exhaustive experiments demonstrate that the proposed anchor optimization method consistently achieves significant improvement (> 1% mAP absolute gain) over the baseline methods on several benchmark datasets including Pascal VOC 07+12, MS COCO and Brainwash. Meanwhile, the robustness is also verified towards different anchor initialization methods and the number of anchor shapes, which greatly simplifies the problem of anchor box design.
arXiv:1812.00469v2 fatcat:b3sfka5jwbf5lemlagsnxlnej4

DAP: Detection-Aware Pre-training with Weak Supervision [article]

Yuanyi Zhong, Jianfeng Wang, Lijuan Wang, Jian Peng, Yu-Xiong Wang, Lei Zhang
2021 arXiv   pre-print
This paper presents a detection-aware pre-training (DAP) approach, which leverages only weakly-labeled classification-style datasets (e.g., ImageNet) for pre-training, but is specifically tailored to benefit object detection tasks. In contrast to the widely used image classification-based pre-training (e.g., on ImageNet), which does not include any location-related training tasks, we transform a classification dataset into a detection dataset through a weakly supervised object localization
method based on Class Activation Maps to directly pre-train a detector, making the pre-trained model location-aware and capable of predicting bounding boxes. We show that DAP can outperform the traditional classification pre-training in terms of both sample efficiency and convergence speed in downstream detection tasks including VOC and COCO. In particular, DAP boosts the detection accuracy by a large margin when the number of examples in the downstream task is small.
arXiv:2103.16651v1 fatcat:javcwclhmfa7ho747hsoaaeawu

Coordinate-wise Control Variates for Deep Policy Gradients [article]

Yuanyi Zhong, Yuan Zhou, Jian Peng
2021 arXiv   pre-print
Correspondence to: Yuanyi Zhong <yuanyiz2@illinois.edu>.  ... 
arXiv:2107.04987v2 fatcat:d6dkth7kbbh4ppzimkramacvlu

Rethinking Feature Distribution for Loss Functions in Image Classification

Weitao Wan, Yuanyi Zhong, Tianpeng Li, Jiansheng Chen
2018 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition  
We propose a large-margin Gaussian Mixture (L-GM) loss for deep neural networks in classification tasks. Different from the softmax cross-entropy loss, our proposal is established on the assumption that the deep features of the training set follow a Gaussian Mixture distribution. By involving a classification margin and a likelihood regularization, the L-GM loss facilitates both a high classification performance and an accurate modeling of the training feature distribution. As such, the L-GM
loss is superior to the softmax loss and its major variants in the sense that besides classification, it can be readily used to distinguish abnormal inputs, such as the adversarial examples, based on their features' likelihood to the training feature distribution. Extensive experiments on various recognition benchmarks like MNIST, CIFAR, ImageNet and LFW, as well as on adversarial examples demonstrate the effectiveness of our proposal.
doi:10.1109/cvpr.2018.00950 dblp:conf/cvpr/WanZLC18 fatcat:hzz5knge7fcuxbohs7zbojaxyq

Toward End-to-End Face Recognition Through Alignment Learning

Yuanyi Zhong, Jiansheng Chen, Bo Huang
2017 IEEE Signal Processing Letters  
Plenty of effective methods have been proposed for face recognition during the past decade. Although these methods differ essentially in many aspects, a common practice of them is to specifically align the facial area based on the prior knowledge of human face structure before feature extraction. In most systems, the face alignment module is implemented independently. This has actually caused difficulties in the designing and training of end-to-end face recognition models. In this paper we
... the possibility of alignment learning in end-to-end face recognition, in which neither prior knowledge on facial landmarks nor artificially defined geometric transformations are required. Specifically, spatial transformer layers are inserted in front of the feature extraction layers in a Convolutional Neural Network (CNN) for face recognition. Only human identity clues are used for driving the neural network to automatically learn the most suitable geometric transformation and the most appropriate facial area for the recognition task. To ensure reproducibility, our model is trained purely on the publicly available CASIA-WebFace dataset, and is tested on the Labeled Faces in the Wild (LFW) dataset. We have achieved a verification accuracy of 99.08%, which is comparable to state-of-the-art single-model-based methods.
doi:10.1109/lsp.2017.2715076 fatcat:rbea5eqebjfh7ih66kt37abkoe

Electro-Deformation of Fused Cells in a Microfluidic Array Device

Yan Liu, Xiaoling Zhang, Mengdi Chen, Danfen Yin, Zhong Yang, Xi Chen, Zhenyu Wang, Jie Xu, Yuanyi Li, Jun Qiu, Ning Hu, Jun Yang
2016 Micromachines  
We present a new method of analyzing the deformability of fused cells in a microfluidic array device. Electrical stresses, generated by applying voltages (4-20 V) across discrete co-planar microelectrodes along the side walls of a microfluidic channel, have been used to electro-deform fused and unfused stem cells. Under an electro-deformation force induced by applying an alternating current (AC) signal, we observed significant electro-deformation phenomena. The experimental results show that the
fused stem cells were stiffer than the unfused stem cells at a relatively low voltage (<16 V). However, at a relatively high voltage, the fused stem cells were more easily deformed than were the unfused stem cells. In addition, the electro-deformation process is modeled based on the Maxwell stress tensor and structural mechanics of cells. The theoretical results show that a positive correlation is found between the deformation of the cell and the applied voltage, which is consistent with the experimental results. Combined with a numerical analysis and experimental study, the results showed that the significant difference of the deformation ratio of the fused and unfused cells is not due to their size difference. This demonstrates that some other properties of cell membranes (such as the membrane structure) were also changed in the electrofusion process, in addition to the size modification of that process.
doi:10.3390/mi7110204 pmid:30404377 pmcid:PMC6189768 fatcat:kfxdkusbqbezdgfkofshkd6ux4

Frequency-Dependent Electroformation of Giant Unilamellar Vesicles in 3D and 2D Microelectrode Systems

Qiong Wang, Xiaoling Zhang, Ting Fan, Zhong Yang, Xi Chen, Zhenyu Wang, Jie Xu, Yuanyi Li, Ning Hu, Jun Yang
2017 Micromachines  
A giant unilamellar vesicle (GUV), with similar properties to cellular membrane, has been widely studied. Electroformation with its simplicity and accessibility has become the most common method for GUV production. In this work, GUV electroformation in devices with traditional 3D and new 2D electrode structures were studied with respect to the applied electric field. An optimal frequency (10 kHz in the 3D and 1 kHz in the 2D systems) was found in each system. A positive correlation was found
between GUV formation and applied voltage in the 3D electrode system from 1 to 10 V. In the 2D electrode system, the yield of the generated GUV increased first but decreased later as voltage increased. These phenomena were further confirmed by numerically calculating the load that the lipid film experienced from the generated electroosmotic flow (EOF). The discrepancy between the experimental and numerical results of the 3D electrode system may be because the parameters that were adopted in the simulations are quite different from those of the lipid film in experiments. The lipid film was not involved in the simulation of the 2D system, and the numerical results matched well with the experiments.
doi:10.3390/mi8010024 fatcat:rkqz43zhkrfr3h3g4iokv2kive

Prediction of Tacrolimus Dose/Weight-Adjusted Trough Concentration in Pediatric Refractory Nephrotic Syndrome: A Machine Learning Approach

Xiaolan Mo, Xiujuan Chen, Xianggui Wang, Xiaoli Zhong, Huiying Liang, Yuanyi Wei, Houliang Deng, Rong Hu, Tao Zhang, Yilu Chen, Xia Gao, Min Huang (+1 others)
2022 Pharmacogenomics and Personalized Medicine  
Tacrolimus (TAC) is a first-line immunosuppressant for patients with refractory nephrotic syndrome (NS). However, there is a high inter-patient variability of TAC pharmacokinetics, thus therapeutic drug monitoring (TDM) is required. In this study, we aimed to employ machine learning algorithms to investigate the impact of clinical and genetic variables on the TAC dose/weight-adjusted trough concentration (C0/D) in Chinese children with refractory NS, and then develop and validate the TAC C0/D
prediction models. The association of 82 clinical variables and 244 single nucleotide polymorphisms (SNPs) with TAC C0/D in the third month since TAC treatment was examined in 171 children with refractory NS. Extremely randomized trees (ET), gradient boosting decision tree (GBDT), random forest (RF), extreme gradient boosting (XGBoost), and Lasso regression were carried out to establish and validate prediction models, respectively. The best prediction models were validated on a cohort of 30 refractory NS patients. GBDT algorithm performed best in the whole group (R2=0.444, MSE=591.032, MAE=20.782, MedAE=18.980) and CYP3A5 nonexpresser group (R2=0.264, MSE=477.948, MAE=18.119, MedAE=18.771), while ET algorithm performed best in the CYP3A5 expresser group (R2=0.380, MSE=1839.459, MAE=31.257, MedAE=19.399). These prediction models included 3 clinical variables (ALB0, AGE0, and gender) and 10 SNPs (ACTN4 rs3745859, ACTN4 rs56113315, ACTN4 rs62121818, CTLA4 rs4553808, CYP3A5 rs776746, IL2RA rs12722489, INF2 rs1128880, MAP3K11 rs7946115, MYH9 rs2239781, and MYH9 rs4821478). The association between the clinical and genetic variables and TAC C0/D was described, and three TAC C0/D prediction models integrating clinical and genetic variables were developed and validated using machine learning, which may support individualized TAC dosing.
doi:10.2147/pgpm.s339318 pmid:35228813 pmcid:PMC8881964 fatcat:47fxzk4ozfcohabryufdptkaua

Prognostic roles of miR-124-3p and its target ANXA7 and their effects on cell migration and invasion in hepatocellular carcinoma

Honghai Wang, Jun Mao, Yuhong Huang, Jun Zhang, Lin Zhong, Ying Wu, He Huang, Jiayu Yang, Yuanyi Wei, Jianwu Tang
2020 International Journal of Clinical and Experimental Pathology  
Recent studies have indicated that ANXA7 promotes progression and metastasis of hepatocellular carcinoma (HCC). In this study we found a significant negative correlation between the levels of miR-124-3p and ANXA7 protein in HCC. Level of miR-124-3p in tumor tissues was negatively correlated, while ANXA7 protein was positively correlated, with TNM stage and tumor metastasis. Furthermore, we confirmed ANXA7 was a target gene of miR-124-3p by a dual luciferase reporter assay. In vitro,
... n of miR-124-3p promotes apoptosis and inhibits migration and invasion of Hca-F. Bcl-2-associated X protein (Bax) level was up-regulated, while ANXA7, B-cell lymphoma-2 (Bcl-2), matrix metalloproteinase-9 (MMP-9) and C-X-C motif chemokine 12 (CXCL12) protein levels were suppressed relative to miR-124-3p over-expression. In vivo, up-regulation of miR-124-3p suppresses lymph node metastasis (LNM) and tumorigenicity of Hca-F cells. The expression of ANXA7, MMP-9, and CXCL12 protein in transplanted tumors was suppressed relative to miR-124-3p overexpression. In addition, we found the levels of Bcl-2, MMP-9, and CXCL12 in Hca-F cells decreased significantly after transfection of shRNA-Anxa7 in vitro. In conclusion, our study revealed miR-124-3p inhibits tumor growth, invasion, and lymphatic metastasis in HCC by down-regulation of ANXA7 gene, thereby reducing the expression of Bcl-2, MMP-9, and CXCL12.
pmid:32269673 pmcid:PMC7137028 fatcat:hn7bggmfsbhnrhpycyqauj7ik4

Association between Neighborhood Built Environment and Body Mass Index among Chinese Adults: Hierarchical Linear Model

Mengqi Zhong, Yuanyi Shen, Yifan Yu (Tongji University, China)
2019 Proceedings of the 55th ISOCARP World Planning Congress   unpublished
The 2016 CLDS survey was conducted in 29 ... Table 1 ... 0.008 -0.01385 0.005312 0.010 Health 0.009358 0.006932 0.177 0.009107 0.006969 0.192 Whether exercise 0.012774 0.032231 0.692 0.017222 0.02876 0.549 Neighborhood-level predictors ...
doi:10.47472/bfwj3902 fatcat:k5gv6yssezh4tlw4g6ii4ibyam