Chinese Text in the Wild [article]

Tai-Ling Yuan, Zhe Zhu, Kun Xu, Cheng-Jun Li, Shi-Min Hu
2018 arXiv   pre-print
We introduce Chinese Text in the Wild, a very large dataset of Chinese text in street view images.  ...  In this paper we provide details of a newly created dataset of Chinese text with about 1 million Chinese characters annotated by experts in over 30 thousand street view images.  ...  Chinese Text in the Wild Dataset In this section, we present Chinese Text in the Wild (CTW), a very large dataset of Chinese text in street view images.  ... 
arXiv:1803.00085v1 fatcat:m23eej5gwnds5cqjbjadl4o5b4

Traditional Chinese Synthetic Datasets Verified with Labeled Data for Scene Text Recognition [article]

Yi-Chang Chen, Yu-Chuan Chang, Yen-Cheng Chang, Yi-Ren Yeh
2021 arXiv   pre-print
To the best of our knowledge, public datasets for Traditional Chinese text recognition are lacking.  ...  Training a text recognition model often requires a large amount of labeled data, but data labeling can be difficult, expensive, or time-consuming, especially for Traditional Chinese text recognition.  ...  In addition to the synthetic data engine, we also create a real-world Traditional Chinese scene text dataset for evaluation.  ... 
arXiv:2111.13327v1 fatcat:qtnk5isqmzgtjmh54yhekf75ee

Chinese Street View Text: Large-Scale Chinese Text Reading With Partially Supervised Learning

Yipeng Sun, Jiaming Liu, Wei Liu, Junyu Han, Errui Ding, Jingtuo Liu
2019 2019 IEEE/CVF International Conference on Computer Vision (ICCV)  
To recognize Chinese text in the wild while keeping large-scale datasets labeling cost-effective, we propose to annotate one part of the C-SVT dataset (30,000 images) in locations and text labels as full  ...  To address this issue, we introduce a new large-scale text reading benchmark dataset named Chinese Street View Text (C-SVT) with 430,000 street view images, which is at least 14 times as large as the  ...  C-SVT is at least 14 times as large as the previous Chinese benchmarks [36, 43], making it the largest dataset for reading Chinese text in the wild.  ... 
doi:10.1109/iccv.2019.00918 dblp:conf/iccv/SunLLHDL19 fatcat:cnujswgm3vaxfhj5kcdx3shbwm

ICDAR 2015 Text Reading in the Wild Competition [article]

Xinyu Zhou and Shuchang Zhou and Cong Yao and Zhimin Cao and Qi Yin
2015 arXiv   pre-print
This technical report presents the final results of the ICDAR 2015 Text Reading in the Wild (TRW 2015) competition, which aims at establishing a benchmark for assessing detection and recognition algorithms  ...  devised for both Chinese and English scripts and providing a playground for researchers from the community.  ...  Therefore, we organized the ICDAR 2015 Text Reading in the Wild (TRW 2015) competition, which generates a large-scale text image database, proposes two text detection or recognition tasks and devises  ... 
arXiv:1506.03184v1 fatcat:e3a6vlzy3zfc3lfhey2dw5epqm

Chinese Street View Text: Large-scale Chinese Text Reading with Partially Supervised Learning [article]

Yipeng Sun, Jiaming Liu, Wei Liu, Junyu Han, Errui Ding, Jingtuo Liu
2020 arXiv   pre-print
To recognize Chinese text in the wild while keeping large-scale datasets labeling cost-effective, we propose to annotate one part of the C-SVT dataset (30,000 images) in locations and text labels as full  ...  To address this issue, we introduce a new large-scale text reading benchmark dataset named Chinese Street View Text (C-SVT) with 430,000 street view images, which is at least 14 times as large as the existing  ...  C-SVT is at least 14 times as large as the previous Chinese benchmarks [36, 43], making it the largest dataset for reading Chinese text in the wild.  ... 
arXiv:1909.07808v2 fatcat:dmycik3a2bgztavkvmk2max4mi

ICDAR2017 Competition on Reading Chinese Text in the Wild (RCTW-17) [article]

Baoguang Shi, Cong Yao, Minghui Liao, Mingkun Yang, Pei Xu, Linyan Cui, Serge Belongie, Shijian Lu, Xiang Bai
2018 arXiv   pre-print
This report introduces RCTW, a new competition that focuses on Chinese text reading. The competition features a large-scale dataset with 12,263 annotated images.  ...  Despite the large potential value, datasets and competitions in the past primarily focus on English, which bears very different characteristics than Chinese.  ...  The authors also thank Zhiyong Liu, Yang Yang, Zhiqiang Zhang, Rui Yu and Xuelei Zhang for their efforts in annotating the data.  ... 
arXiv:1708.09585v3 fatcat:fksdh4az3vf4fic2z4obz4lwka

ICDAR2017 Competition on Reading Chinese Text in the Wild (RCTW-17)

Baoguang Shi, Cong Yao, Minghui Liao, Mingkun Yang, Pei Xu, Linyan Cui, Serge Belongie, Shijian Lu, Xiang Bai
2017 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR)  
This report introduces RCTW, a new competition that focuses on Chinese text reading. The competition features a large-scale dataset with over 12,000 annotated images.  ...  Despite the large potential value, datasets and competitions in the past primarily focus on English, which bears very different characteristics than Chinese.  ...  The authors also thank Zhiyong Liu, Yang Yang, Zhiqiang Zhang, Rui Yu and Xuelei Zhang for their efforts in annotating the data.  ... 
doi:10.1109/icdar.2017.233 dblp:conf/icdar/ShiYLYXCBLB17 fatcat:xbb4qjl2era4nc4tchozl3kkja

A CNN Based Scene Chinese Text Recognition Algorithm With Synthetic Data Engine [article]

Xiaohang Ren, Kai Chen, Jun Sun
2016 arXiv   pre-print
In this paper, we propose a CNN based Chinese text recognition algorithm.  ...  To ensure the small size nature character dataset and the large size artificial character dataset are comparable in training, the CNN model are trained progressively.  ...  The CNN model are trained in two steps to ensure the small size nature character dataset and the large size artificial character dataset are comparable in training.  ... 
arXiv:1604.01891v1 fatcat:r7nham4d3zdb7nffvytntdpv3q

Automatic Script Identification in the Wild [article]

Baoguang Shi, Cong Yao, Chengquan Zhang, Xiaowei Guo, Feiyue Huang, Xiang Bai
2015 arXiv   pre-print
A large-scale dataset with a great quantity of natural images and 10 types of widely used languages is constructed and released.  ...  In allusion to the challenges in script identification in real-world scenarios, a deep learning based algorithm is proposed.  ...  We believe the SIW-10 dataset can serve as a standard benchmark for script identification in the wild.  ... 
arXiv:1505.02982v1 fatcat:zm2ehhfafnfpdieua42xh5v3y4

Automatic script identification in the wild

Baoguang Shi, Cong Yao, Chengquan Zhang, Xiaowei Guo, Feiyue Huang, Xiang Bai
2015 2015 13th International Conference on Document Analysis and Recognition (ICDAR)  
A large-scale dataset with a great quantity of natural images and 10 types of widely-used languages is constructed and released.  ...  In allusion to the challenges in script identification in real-world scenarios, a deep learning based algorithm is proposed.  ...  ACKNOWLEDGMENT This work was primarily supported by National Natural Science Foundation of China (NSFC) (No.61222308), and in part by Program for New Century Excellent Talents in University under Grant  ... 
doi:10.1109/icdar.2015.7333818 dblp:conf/icdar/ShiYZGHB15 fatcat:r7l7e633pfezhh6avn5mxzlkji

A Multi-oriented Chinese Keyword Spotter Guided by Text Line Detection [article]

Pei Xu, Shan Huang, Hongzhen Wang, Hao Song, Shen Huang, Qi Ju
2020 arXiv   pre-print
In this way, the text lines and keywords are predicted in parallel. We create two Chinese keyword datasets based on RCTW-17 and ICPR MTWI2018 to verify the effectiveness of our method.  ...  In this paper, we propose a new Chinese keyword spotter for natural images, which is inspired by Mask R-CNN. We propose to predict the keyword masks guided by text line detection.  ...  Chinese text lines naturally have variety in length as there is no limit to the number of characters in a text line. Besides, Chinese text lines have a large variety of sizes and orientation.  ... 
arXiv:2001.00722v2 fatcat:evzj4nw64fecfcs5os6wt6eyza

ICDAR2019 Robust Reading Challenge on Arbitrary-Shaped Text (RRC-ArT) [article]

Chee-Kheng Chng, Yuliang Liu, Yipeng Sun, Chun Chet Ng, Canjie Luo, Zihan Ni, ChuanMing Fang, Shuaitao Zhang, Junyu Han, Errui Ding, Jingtuo Liu, Dimosthenis Karatzas, Chee Seng Chan, Lianwen Jin
2019 arXiv   pre-print
The dataset, the evaluation kit as well as the results are publicly available at https://rrc.cvc.uab.es/?ch=14  ...  Apart from the results, this paper also details the ArT dataset, tasks description, evaluation metrics and participants methods.  ...  INTRODUCTION Text in the wild comes in a variety of shapes.  ... 
arXiv:1909.07145v1 fatcat:itdnm6uogbgmpnc3pzohhnswve

Major Soybean Maturity Gene Haplotypes Revealed by SNPViz Analysis of 72 Sequenced Soybean Genomes

Tiffany Langewisch, Hongxin Zhang, Ryan Vincent, Trupti Joshi, Dong Xu, Kristin Bilyeu, Tianzhen Zhang
2014 PLoS ONE  
For this study, we utilized two available soybean genomic datasets for a total of 72 soybean genotypes encompassing cultivars, landraces, and the wild species Glycine soja.  ...  Analyses of these large-scale datasets can become cumbersome, especially when comparing nucleotide polymorphisms across many samples within a dataset and among different datasets or organisms.  ...  Acknowledgments We thank Perry Cregan for providing the NAM parents' sequencing data before publication. Qijian Song is thanked for his contribution in the  ... 
doi:10.1371/journal.pone.0094150 pmid:24727730 pmcid:PMC3984090 fatcat:s2zlr2wufjchzghtwgdqbgl7fm

Mixed Vertical-and-Horizontal-Text Traffic Sign Detection and Recognition for Street-Level Scene

Jiefeng Guo, Rongxuan You, Lianfen Huang
2020 IEEE Access  
Our proposed method uses the position and structural information of the characters to form the text lines. A dataset of Chinese text-based traffic signs is collected.  ...  To the best of our knowledge, there is nothing in the literature about simultaneous recognition of both horizontal and vertical text in Chinese text-based traffic signs.  ...  To solve this problem, the similarity measurement of the HOG features between a Chinese character on text-based traffic signs and each Chinese character in the dataset is compared.  ... 
doi:10.1109/access.2020.2986500 fatcat:mz7pv5nuezedzjqygisbtjj6xu

ICDAR 2019 Competition on Large-scale Street View Text with Partial Labeling – RRC-LSVT [article]

Yipeng Sun, Zihan Ni, Chee-Kheng Chng, Yuliang Liu, Canjie Luo, Chun Chet Ng, Junyu Han, Errui Ding, Jingtuo Liu, Dimosthenis Karatzas, Chee Seng Chan, Lianwen Jin
2019 arXiv   pre-print
During the competition period, a total of 41 teams participated in the two proposed tasks with 132 valid submissions, i.e., text detection and end-to-end text spotting.  ...  To scale up the amount of training data while keeping the labeling procedure cost-effective, this competition introduces a new challenge on Large-scale Street View Text with Partial Labeling (LSVT), providing  ...  The organizers would like to thank all the participants for their valuable and helpful feedback, which has also contributed to the success of the competitions.  ... 
arXiv:1909.07741v1 fatcat:mtpknsxztzar5ej2ydzbp3tpba