14,363 Hits in 6.2 sec

Towards Accurate Scene Text Recognition with Semantic Reasoning Networks [article]

Deli Yu, Xuan Li, Chengquan Zhang, Junyu Han, Jingtuo Liu, Errui Ding
2020 arXiv   pre-print
To mitigate these limitations, we propose a novel end-to-end trainable framework named semantic reasoning network (SRN) for accurate scene text recognition, where a global semantic reasoning module (GSRM  ...  Although the previous scene text recognition methods have made great progress over the past few years, the research on mining semantic information to assist text recognition attracts less attention, only  ...  Furthermore, we propose a novel framework named semantic reasoning network (SRN) for accurate scene text recognition, which integrates not only global semantic reasoning module (GSRM) but also parallel  ... 
arXiv:2003.12294v1 fatcat:qdkx5x5jxjgmfobx66ycyw3gtq

Towards Accurate Scene Text Recognition With Semantic Reasoning Networks

Deli Yu, Xuan Li, Chengquan Zhang, Tao Liu, Junyu Han, Jingtuo Liu, Errui Ding
2020 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)  
To mitigate these limitations, we propose a novel end-to-end trainable framework named semantic reasoning network (SRN) for accurate scene text recognition, where a global semantic reasoning module (GSRM  ...  Although the previous scene text recognition methods have made great progress over the past few years, the research on mining semantic information to assist text recognition attracts less attention, only  ...  Furthermore, we propose a novel framework named semantic reasoning network (SRN) for accurate scene text recognition, which integrates not only global semantic reasoning module (GSRM) but also parallel  ... 
doi:10.1109/cvpr42600.2020.01213 dblp:conf/cvpr/YuLZLHLD20 fatcat:uqjnmhzgejadjgvtkrgh6g52yq

From Two to One: A New Scene Text Recognizer with Visual Language Modeling Network [article]

Yuxin Wang, Hongtao Xie, Shancheng Fang, Jing Wang, Shenggao Zhu, Yongdong Zhang
2021 arXiv   pre-print
In this paper, we abandon the dominant complex language model and rethink the linguistic learning process in the scene text recognition.  ...  information to enhance the visual features for accurate recognition.  ...  We regard the proposed VisionLAN as a basic step toward more robust and accurate scene text recognition, and we will further explore its potential in the future.  ... 
arXiv:2108.09661v1 fatcat:qu5y2terk5bencrvfxdmf6izrm

Towards Escaping from Language Bias and OCR Error: Semantics-Centered Text Visual Question Answering [article]

Chengyang Fang, Gangyan Zeng, Yu Zhou, Daiqing Wu, Can Ma, Dayong Hu, Weiping Wang
2022 arXiv   pre-print
Texts in scene images convey critical information for scene understanding and reasoning.  ...  SC-Net surpasses previous works with a noticeable margin and is more reasonable for the TextVQA task.  ...  As a semantically rich entity, text plays a critical role in scene understanding and reasoning.  ... 
arXiv:2203.12929v1 fatcat:kxtbpm4jazhjdeasa5e3r624te

Toward Arbitrary-Shaped Text Spotting Based On End-To-End

Guangcun Wei, Wansheng Rong, Yongquan Liang, Xinguang Xiao, Xiang Liu
2020 IEEE Access  
At present, text spotting in natural scenes has become one of the research hotspots. Among them, curvilinear text and long text are the main difficulties of text spotting in natural scenes.  ...  More importantly, the joint optimization strategy realizes the mutual promotion function of the text detection task and the text recognition task.  ...  TEXT RECOGNITION IN NATURAL SCENE Natural scene text recognition includes two categories: the CTC-based method and the method based on the attention mechanism.  ... 
doi:10.1109/access.2020.3020387 fatcat:tnmd4q5lqvc2zarwcytwhejx5u

Computer Vision for Autonomous Vehicles: Problems, Datasets and State of the Art

Joel Janai, Fatma Güney, Aseem Behl, Andreas Geiger
2020 Foundations and Trends in Computer Graphics and Vision  
, scene understanding, and end-to-end learning for autonomous driving.  ...  As with any rapidly growing field, it becomes increasingly difficult to stay up-to-date or enter the field as a beginner.  ...  Acknowledgements 226 References Full text available at: http://dx.doi.org/10.1561/0600000079  ... 
doi:10.1561/0600000079 fatcat:oisu6zhrtrby7i3q4di23wcxui

A Survey of Content-Aware Video Analysis for Sports

Huang-Chia Shih
2018 IEEE transactions on circuits and systems for video technology (Print)  
Content-aware analysis methods are discussed with respect to object-, event-, and context-oriented groups.  ...  Previous surveys have focused on the methodologies of sports video analysis from the spatiotemporal viewpoint instead of a content-based viewpoint, and few of these studies have considered semantics.  ...  With a combination of motion information, expected appearance modeling, occlusion reasoning, and backtracking, the ball trajectory can be accurately tracked without velocity information. 3) Naming Objects  ... 
doi:10.1109/tcsvt.2017.2655624 fatcat:rwqzu46sgfb7tpkcav4ysmh6ae

Language-Grounded Indoor 3D Semantic Segmentation in the Wild [article]

David Rozenberszki, Or Litany, Angela Dai
2022 arXiv   pre-print
Recent advances in 3D semantic segmentation with deep neural networks have shown remarkable success, with rapid performance increase on available datasets.  ...  Thus, we propose to study a larger vocabulary for 3D semantic segmentation with a new extended benchmark on ScanNet data with 200 class categories, an order of magnitude more than previously studied.  ...  Recent advances in 3D semantic segmentation with deep neural networks have shown remarkable success, with rapid performance increase on available datasets.  ... 
arXiv:2204.07761v2 fatcat:axlrozwpnjejnaarmeqrxyvgla

2021 Index IEEE Transactions on Image Processing Vol. 30

2021 IEEE Transactions on Image Processing  
., +, TIP 2021 2207-2219 SLOAN: Scale-Adaptive Orientation Attention Network for Scene Text Recognition.  ...  ., +, TIP 2021 3764-3777 Towards Efficient Scene Understanding via Squeeze Reasoning. Li, X., +, TIP 2021 7050-7063 Towards Fine-Grained Human Pose Transfer With Detail Replenishing Network.  ... 
doi:10.1109/tip.2022.3142569 fatcat:z26yhwuecbgrnb2czhwjlf73qu

Systematic Review of Computer Vision Semantic Analysis in Socially Assistive Robotics

Antonio Victor Alencar Lundgren, Matheus Albert Oliveira dos Santos, Byron Leite Dantas Bezerra, Carmelo José Albanez Bastos-Filho
2022 AI  
Therefore, we present a systematic review of visual semantics works concerned with assistive robotics. Furthermore, we discuss the trends and possible research gaps in those fields.  ...  The merging of these fields creates demand for more complex and autonomous solutions, often struggling with the lack of contextual understanding of tasks that semantic analysis can provide and hardware  ...  prediction; area (3)-text recognition; and area (4)-scene recognition, divided into (a) semantic segmentation, and (b) scene classification. 4.1.1 RQ1: What Is the Current State of Semantic Analysis  ... 
doi:10.3390/ai3010014 fatcat:fplfxt2kdfbafhvxeye6odzsha

TextScanner: Reading Characters in Order for Robust Scene Text Recognition [article]

Zhaoyi Wan, Minghang He, Haoran Chen, Xiang Bai, Cong Yao
2020 arXiv   pre-print
Driven by deep learning and the large volume of data, scene text recognition has evolved rapidly in recent years.  ...  To tackle these challenges, we propose in this paper an alternative approach, called TextScanner, for scene text recognition.  ...  As scene text provides pivotal and specific information, accurate recognition of text plays crucial roles in various real-world scenarios (Phan et al. 2013).  ... 
arXiv:1912.12422v2 fatcat:wsgdxglopvablobekvdrg3zsde

TextScanner: Reading Characters in Order for Robust Scene Text Recognition

Zhaoyi Wan, Minghang He, Haoran Chen, Xiang Bai, Cong Yao
2020 PROCEEDINGS OF THE THIRTIETH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE AND THE TWENTY-EIGHTH INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE  
Driven by deep learning and a large volume of data, scene text recognition has evolved rapidly in recent years.  ...  To tackle these challenges, we propose in this paper an alternative approach, called TextScanner, for scene text recognition.  ...  As scene text provides pivotal and specific information, accurate recognition of text plays crucial roles in various real-world scenarios (Phan et al. 2013).  ... 
doi:10.1609/aaai.v34i07.6891 fatcat:skt2bspnljbqfnlro7feu2elra

Towards Accurate Scene Text Detection with Bidirectional Feature Pyramid Network

Dongping Cao, Jiachen Dang, Yong Zhong
2021 Symmetry  
In order to enhance the feature representation ability of FCOS for text detection tasks, we apply the Bidirectional Feature Pyramid Network (BiFPN) as the backbone network, enhancing the model learning  ...  Scene text detection, this task of detecting text from real images, is a hot research topic in the machine vision community. Most of the current research is based on an anchor box.  ...  Reference [5] proposed an end-to-end two-stage scene text detection network architecture, named the quadrilateral region proposal network (QRPN), that can accurately locate scene texts with quadrilateral  ... 
doi:10.3390/sym13030486 fatcat:haqr7qo4braw5obsyrhjgvyira

Miniaturized five fundamental issues about visual knowledge

Yun-he Pan
2020 Frontiers of Information Technology & Electronic Engineering  
Therefore, its structure is clear, with perceivable semantics and inferable knowledge. Typical examples include semantic networks and knowledge maps (Zhang NY et al., 2020).  ...  of semantics, which is applicable to the retrieval and reasoning of character information; (2) visual knowledge-the memory component of visual scenes, which is available for spatio-temporal inference  ... 
doi:10.1631/fitee.2040000 fatcat:f7bfiev6src65ii2wxmvutqury

RUArt: A Novel Text-Centered Solution for Text-Based Visual Question Answering [article]

Zan-Xia Jin, Heran Wu, Chun Yang, Fang Zhou, Jingyan Qin, Lei Xiao, Xu-Cheng Yin
2020 arXiv   pre-print
Finally, it answers the related text for the given question through text semantic matching and reasoning.  ...  Taking an image and a question as input, RUArt first reads the image and obtains text and scene objects.  ...  0.2890 +Semantic Reasoning 0.3133 TABLE III: Comparison with participants of ST-VQA on test set with the metric ANLS.  ... 
arXiv:2010.12917v1 fatcat:65hpww2zijcmrfamw44runltwy