Accurate Scene Text Detection via Scale-Aware Data Augmentation and Shape Similarity Constraint

Pengwen Dai, Yang Li, Hua Zhang, Jingzhi Li, Xiaochun Cao
2021 IEEE transactions on multimedia  
Scene text detection has attracted increasing concerns with the rapid development of deep neural networks in recent years.  ...  However, existing scene text detectors may overfit on the public datasets due to the limited training data, or generate inaccurate localization for arbitrary-shape scene texts.  ...  on the public arbitrary-shape scene text benchmarks.  ... 
doi:10.1109/tmm.2021.3073575 fatcat:r4rgthwbk5cw3boixvko3zbpoi

All You Need is a Second Look: Towards Arbitrary-Shaped Text Detection [article]

Meng Cao, Can Zhang, Dongming Yang, Yuexian Zou
2021 arXiv   pre-print
In this paper, we propose a two-stage segmentation-based detector, termed as NASK (Need A Second looK), for arbitrary-shaped text detection.  ...  Arbitrary-shaped text detection is a challenging task since curved texts in the wild are of the complex geometric layouts.  ...  EAST [18] is another one-stage based detector which directly predicts text instances with arbitrary orientations and quadrilateral shapes in full images. Liao et al.  ... 
arXiv:2106.12720v1 fatcat:3h3dt2ofundwrpenf2uqdgrv6a

Location-Aware Feature Selection Text Detection Network [article]

Zengyuan Guo, Zilin Wang, Zhihui Wang, Wanli Ouyang, Haojie Li, Wen Gao
2020 arXiv   pre-print
However, they are behind in accuracy comparing with recent segmentation-based text detectors.  ...  To address this issue, we propose a novel Location-Aware feature Selection text detection Network (LASNet).  ...  It realizes text detection of arbitrary shape.  ... 
arXiv:2004.10999v2 fatcat:27rnkktwzfbc7cbxuq5v4lljr4

Cluttered TextSpotter: An End-to-End Trainable Lightweight Scene Text Spotter for Cluttered Environment

Randheer Bagi, Tanima Dutta, Hari Prabhat Gupta
2020 IEEE Access  
It helps to localize in scene images with background clutters, where partially occluded text parts, truncation artifacts, and perspective distortions are present.  ...  Scene text spotting aims at simultaneously localizing and recognizing text instances, symbols, and logos in natural scene images.  ...  In [46], arbitrary shape text is detected by extracting text proposals, which are refined using a recurrent neural network (RNN) and an adaptive number of boundary points.  ... 
doi:10.1109/access.2020.3002808 fatcat:x4kbcajahrc5vgtuxsc6oyyjsa

Learning Pixel Affinity Pyramid for Arbitrary-Shaped Text Detection

Zilong Fu, Hongtao Xie, Shancheng Fang, Yuxin Wang, MengTing Xing, Yongdong Zhang
2022 ACM Transactions on Multimedia Computing, Communications, and Applications (TOMCCAP)  
, denoting its superior capability in detecting arbitrary-shaped texts.  ...  boundary description for separating closely located texts.  ...  in the arbitrary-shaped text detection.  ... 
doi:10.1145/3524617 fatcat:wccolxaysndjxg5nijush7ayte

Axis Learning for Orientated Objects Detection in Aerial Images

Zhifeng Xiao, Linjun Qian, Weiping Shao, Xiaowei Tan, Kai Wang
2020 Remote Sensing  
Besides, a new aspect-ratio-aware orientation centerness method is proposed to better weigh positive pixel points, in order to guide the network to learn discriminative features from a complex background  ...  The method is tested on two common aerial image datasets, achieving better performance compared with most one-stage orientated methods and many two-stage anchor-based methods with a simpler procedure and  ...  achieved great performances with aerial images such as the DOTA dataset or on natural scene-text detection such as MSRA-TD500.  ... 
doi:10.3390/rs12060908 fatcat:ohce7ratdneqpgmvvjtzuozk6q

PoseDet: Fast Multi-Person Pose Estimation Using Pose Embedding [article]

Chenyu Tian, Ran Yu, Xinyuan Zhao, Weihao Xia, Haoqian Wang, Yujiu Yang
2021 arXiv   pre-print
Moreover, we propose the keypoint-aware pose embedding to represent an object in terms of the locations of its keypoints.  ...  Extensive experiments on the CrowdPose benchmark show the robustness in the crowd scenes. Source code is available.  ...  Figure 1: Keypoint-aware pose embedding.  ...  usually detect person instances by using a person detector, then estimate the single-person pose within the detected box.  ... 
arXiv:2107.10466v2 fatcat:v74qs67sajakdm7xemmgifsgru

A Survey of Visual Transformers [article]

Yang Liu, Yao Zhang, Yixin Wang, Feng Hou, Jin Yuan, Jiang Tian, Yang Zhang, Zhongchao Shi, Jianping Fan, Zhiqiang He
2022 arXiv   pre-print
Because of their competitive modeling capabilities, the visual Transformers have achieved impressive performance improvements over multiple benchmarks as compared with modern Convolution Neural Networks  ...  Furthermore, we have revealed a series of essential but unexploited aspects that may empower such visual Transformers to stand out from numerous architectures, e.g., slack high-level semantic embeddings  ...  We use standard learnable 1D position embeddings, since we have not observed significant performance gains from using more advanced 2D-aware position embeddings (Appendix D.3).  ... 
arXiv:2111.06091v3 fatcat:a3fq6lvvzzgglb3qtus5qwrwpe

Quadbox: Quadrilateral Bounding Box Based Scene Text Detection Using Vector Regression

Prateek Keserwani, Ankit Dhankhar, Rajkumar Saini, Partha Pratim Roy
2021 IEEE Access  
Scene text appears with a wide range of sizes and arbitrary orientations.  ...  For detecting such text in the scene image, the quadrilateral bounding boxes provide a much tight bounding box compared to the rotated rectangle.  ...  With proposed properties, the model can detect horizontal, arbitrary oriented, and arbitrary shaped text.  ... 
doi:10.1109/access.2021.3063030 fatcat:2c2hezixrrejhhxt6p2uapqona

Context Encoders: Feature Learning by Inpainting

Deepak Pathak, Philipp Krahenbuhl, Jeff Donahue, Trevor Darrell, Alexei A. Efros
2016 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)  
with removing arbitrary shapes from images, obtained from random masks in the PASCAL VOC 2012 dataset [12].  ...  Figure 5: Comparison with Content-Aware Fill (Photoshop feature based on [2]). Panels: Input Context, Context Encoder, Content-Aware Fill.  ... 
doi:10.1109/cvpr.2016.278 dblp:conf/cvpr/PathakKDDE16 fatcat:dpxpf3ircjgxzenfnmzyv6o6nu

Multimodal Learning with Transformers: A Survey [article]

Peng Xu, Xiatian Zhu, David A. Clifton
2022 arXiv   pre-print
Transformer is a promising neural network learner, and has achieved great success in various machine learning tasks.  ...  CONCLUSION This survey focuses on multimodal machine learning with Transformers. We reviewed the landscape by introducing the Transformer designs and training in the multimodal contexts.  ...  this survey gives a helpful and detailed overview for new researchers and practitioners, provides a convenient reference for relevant experts (e.g., multimodal machine learning researchers, Transformer network  ... 
arXiv:2206.06488v1 fatcat:6aoaczzbtvc43my2kmobo7glvy

2021 Index IEEE Transactions on Image Processing Vol. 30

2021 IEEE Transactions on Image Processing  
., +, TIP 2021 1116-1129 MABAN: Multi-Agent Boundary-Aware Network for Natural Language Moment Retrieval.  ...  Yang, L., +, TIP 2021 39-54 MABAN: Multi-Agent Boundary-Aware Network for Natural Language Moment Retrieval.  ... 
doi:10.1109/tip.2022.3142569 fatcat:z26yhwuecbgrnb2czhwjlf73qu

Learning Neural Textual Representations for Citation Recommendation

Binh Thanh Kieu, Inigo Jauregi Unanue, Son Bao Pham, Hieu Xuan Phan, Massimo Piccardi
2021 2020 25th International Conference on Pattern Recognition (ICPR)  
Qian, Xijun; Liu, Yifan; Yang Yu- Bin 924 PS T4.2 An Accurate Threshold Insensitive Kernel Detector for Arbitrary Shaped Text DAY 4 -Jan 15, 2021 -DAY 2 -Jan 13, 2021 Live Dasgupta, Kinjal;  ...  4.2 An Accurate Threshold Insensitive Kernel Detector for Arbitrary Shaped Text DAY 4 -Jan 15, 2021 -DAY 2 -Jan 13, 2021 Dasgupta, Kinjal; Das, Sudip; Bhattacharya, Ujjwal 955 OS T 4.2 Stratified  ... 
doi:10.1109/icpr48806.2021.9412725 fatcat:3vge2tpd2zf7jcv5btcixnaikm

A Survey of Deep Learning-based Object Detection

Licheng Jiao, Fan Zhang, Fang Liu, Shuyuan Yang, Lingling Li, Zhixi Feng, Rong Qu
2019 IEEE Access  
With the rapid development of deep learning networks for detection tasks, the performance of object detectors has been greatly improved.  ...  Afterwards and primarily, we provide a comprehensive overview of a variety of object detection methods in a systematic manner, covering the one-stage and two-stage detectors.  ...  These methods have difficulties to deal with the large aspect ratios and arbitrary-orientation of scene text.  ... 
doi:10.1109/access.2019.2939201 fatcat:jesz2av2tjbkxfpaqyecptgls4

Oriented Object Detection in Remote Sensing Images with Anchor-Free Oriented Region Proposal Network

Jianxiang Li, Yan Tian, Yiping Xu, Zili Zhang
2022 Remote Sensing  
Currently, mainstream oriented object detectors are based on densely placed predefined anchors.  ...  To address the problem, this paper proposes a novel anchor-free two-stage oriented object detector.  ...  Based on Faster-RCNN [13], RRPN [23] uses Rotation RPN and Rotation RoI pooling for arbitrary-oriented text detection.  ... 
doi:10.3390/rs14051246 fatcat:hsnpxrds45a5dbkgkfgbdrc43u