41,441 Hits in 4.8 sec

Godot Reinforcement Learning Agents [article]

Edward Beeching, Jilles Debangoye, Olivier Simonin, Christian Wolf
2021 arXiv   pre-print
The Godot RL Agents interface allows the design, creation and learning of agent behaviors in challenging 2D and 3D environments with various on-policy and off-policy Deep RL algorithms.  ...  We present Godot Reinforcement Learning (RL) Agents, an open-source interface for developing environments and agents in the Godot Game Engine.  ...  Deep Reinforcement Learning Reinforcement Learning approaches provide the ability to learn in sequential decision making problems, where the objective is to maximize accumulated reward.  ... 
arXiv:2112.03636v1 fatcat:ekc7xvtmdzddjaklg5ffumxtde

A Comprehensive Survey of Deep Learning for Image Captioning [article]

Md. Zakir Hossain, Ferdous Sohel, Mohd Fairuz Shiratuddin, Hamid Laga
2018 arXiv   pre-print
We also discuss the datasets and the evaluation metrics popularly used in deep learning based automatic image captioning.  ...  Deep learning-based techniques are capable of handling the complexities and challenges of image captioning.  ...  Therefore, recently, researchers are focusing more on reinforcement learning and unsupervised learning-based techniques for image captioning.  ... 
arXiv:1810.04020v2 fatcat:javmi4oqffbvxn6m4d2hjmzbhi

Computer Vision for Autonomous Vehicles: Problems, Datasets and State of the Art

Joel Janai, Fatma Güney, Aseem Behl, Andreas Geiger
2020 Foundations and Trends in Computer Graphics and Vision  
, scene understanding, and end-to-end learning for autonomous driving.  ...  Recent years have witnessed enormous progress in AI-related fields such as computer vision, machine learning, and autonomous vehicles.  ... 
doi:10.1561/0600000079 fatcat:oisu6zhrtrby7i3q4di23wcxui

Image Captioning Based on Deep Neural Networks

Shuang Liu, Liang Bai, Yanli Hu, Haoran Wang, Yansong Wang
2018 MATEC Web of Conferences  
Image captioning is a representative of this field, which makes the computer learn to use one or more sentences to understand the visual content of an image.  ...  In this paper, we mainly describe three image captioning methods using the deep neural networks: CNN-RNN based, CNN-CNN based and Reinforcement-based framework.  ...  Cross-language text description of images The existing image captioning method based on deep learning or machine learning requires a lot of marked training samples.  ... 
doi:10.1051/matecconf/201823201052 fatcat:a2qt4hcqojahzdr3bopco3dd6y

Special Issue on Advances in Deep Learning

Diego Gragnaniello, Andrea Bottino, Sandro Cumani, Wonjoon Kim
2020 Applied Sciences  
Nowadays, deep learning is the fastest growing research field in machine learning and has a tremendous impact on a plethora of daily life applications, ranging from security and surveillance to autonomous  ...  driving, automatic indexing and retrieval of media content, text analysis, speech recognition, automatic translation, and many others.[...]  ...  Finally, we place on record our gratitude to the editorial team of Applied Sciences and special thanks to Daria Shi, Managing Editor, from MDPI Branch Office, Beijing.  ... 
doi:10.3390/app10093172 fatcat:kdowatxbprdhbkmox62nlqyquq

New Ideas and Trends in Deep Multimodal Content Understanding: A Review [article]

Wei Chen and Weiping Wang and Li Liu and Michael S. Lew
2020 arXiv   pre-print
The focus of this survey is on the analysis of two modalities of multimodal deep learning: image and text.  ...  Unlike classic reviews of deep learning where monomodal image classifiers such as VGG, ResNet and Inception module are central topics, this paper will examine recent multimodal deep models and structures  ...  Adversarial learning focuses on the overall distributions of two different modalities instead of just focusing on each pair.  ... 
arXiv:2010.08189v1 fatcat:2l7molbcn5hf3oyhe3l52tdwra

Tracking e-cigarette warning label compliance on Instagram with deep learning [article]

Chris J. Kennedy, Julia Vassey, Ho-Chun Herbert Chang, Jennifer B. Unger, Emilio Ferrara
2021 arXiv   pre-print
We conclude that deep learning models can effectively identify vaping posts on Instagram and track compliance with FDA warning label requirements.  ...  We sought to develop and evaluate a deep learning system designed to automatically determine if an Instagram post promotes vaping, and if so, if an FDA-compliant warning label was included or if a non-compliant  ...  In recent years, the focus has turned to the use of deep learning to synthesize information and learn feature representations directly from the data. Hu et al. utilize both tags and images (J.  ... 
arXiv:2102.04568v1 fatcat:feztlbweoreblcntvodavgt454

Improving Target-driven Visual Navigation with Attention on 3D Spatial Relationships [article]

Yunlian Lv, Ning Xie, Yimin Shi, Zijiao Wang, Heng Tao Shen
2020 arXiv   pre-print
In this paper, we investigate the target-driven visual navigation using deep reinforcement learning (DRL) in 3D indoor scenes, whose navigation task aims to train an agent that can intelligently make a  ...  On the other hand, TSE module is used to generate sub-targets which allow agent to learn from failures.  ...  Navigation skills using deep reinforcement learning are learned by maximizing accumulated rewards.  ... 
arXiv:2005.02153v1 fatcat:nexy3dw7krgvbp7pt7kfgnwu7a

Deep Learning – A first Meta-Survey of selected Reviews across Scientific Disciplines and their Research Impact [article]

Jan Egger, Antonio Pepe, Christina Gsaxner, Jianning Li
2020 arXiv   pre-print
However, there are several review articles about deep learning, which are focused on specific scientific fields or applications, for example deep learning advances in computer vision or in specific tasks  ...  Mimicking the learning process of humans with their senses, deep learning networks are fed with (sensory) data, like texts, images, videos or sounds.  ...  [20] give a comprehensive overview on latest trends and advances in the field of image super-resolution focusing on deep learning methods.  ... 
arXiv:2011.08184v1 fatcat:7eofypvqordn7i4o7qmtoaaydi

Image captioning Bot

Mukund Upadhyay and Prof. Shallu Bashambu
2020 International journal of modern trends in science and technology  
Image captioning is a representative of this field, which makes the computer learn to use one or more sentences to understand the visual content of an image.  ...  Image captioning means automatically generating a caption for an image. With the development of deep learning, the combination of computer vision and natural language processing has caught great attention  ...  Cross-language text description The existing image captioning method based on deep learning or machine learning requires a lot of marked training samples.  ... 
doi:10.46501/ijmtst061265 fatcat:aldvijmynbhnhjci3flml2pcly

Adaptive Adversarial Attack on Scene Text Recognition [article]

Xiaoyong Yuan, Pan He, Xiaolin Andy Li, Dapeng Oliver Wu
2020 arXiv   pre-print
A unified architecture is developed and evaluated for both non-sequential tasks and sequential ones. To validate the effectiveness, we take the scene text recognition task as a case study.  ...  To our best knowledge, our proposed method is the first attempt to adversarial attack for scene text recognition.  ...  We then focus on the performance of Adaptive Attack for the sequential task. We attack a scene text recognition model as our use case.  ... 
arXiv:1807.03326v3 fatcat:5xiolddubje5fmr5ccbrbdrbti

Understanding in Artificial Intelligence [article]

Stefan Maetschke and David Martinez Iraola and Pieter Barnard and Elaheh ShafieiBavani and Peter Zhong and Ying Xu and Antonio Jimeno Yepes
2021 arXiv   pre-print
Current Artificial Intelligence (AI) methods, most based on deep learning, have facilitated progress in several fields, including computer vision and natural language understanding.  ...  The progress of these AI methods is measured using benchmarks designed to solve challenging tasks, such as visual question answering.  ...  Using reinforcement learning, they demonstrate significantly improved generalization for learning context-free parsers.  ... 
arXiv:2101.06573v1 fatcat:nlp6h5toh5f6lpwjsafn6gulbq

New Ideas and Trends in Deep Multimodal Content Understanding: A Review

Wei Chen, Weiping Wang, Li Liu, Michael S. Lew
2020 Neurocomputing  
The focus of this survey is on the analysis of two modalities of multimodal deep learning: image and text.  ...  Unlike classic reviews of deep learning where monomodal image classifiers such as VGG, ResNet and Inception module are central topics, this paper will examine recent multimodal deep models and structures  ...  His research interest focuses on cross-modal retrieval with deep learning methods.  ... 
doi:10.1016/j.neucom.2020.10.042 fatcat:hyjkj5enozfrvgzxy6avtbmoxu

Image Processing Based Scene-Text Detection and Recognition with Tesseract [article]

Ebin Zacharias, Martin Teuchler, Bénédicte Bernier
2020 arXiv   pre-print
The use case in focus facilitates the possibility to detect the text area in natural scenes with greater accuracy because of the availability of images under constraints.  ...  This project focuses on word detection and recognition in natural images. In comparison to reading text in scanned documents, the targeted problem is significantly more challenging.  ...  Tesseract 5 is used for text recognition which is a deep learning-based model and utilizes LSTM (Long Short Term Memory).  ... 
arXiv:2004.08079v1 fatcat:cya3elyrjvaurkhtwkmiv4iqam

Action Recognition Using Action Sequences Optimization and Two-Stream 3D Dilated Neural Network

Xin Xiong, Weidong Min, Qing Han, Qi Wang, Cheng Zha, Hubert Cecotti
2022 Computational Intelligence and Neuroscience  
The majority of existing methods fail to recognize actions accurately because of interference of background changes when the proportion of high-activity action areas is not reinforced and by using RGB  ...  The method is based on shot segmentation and dynamic weighted sampling, and it reconstructs the video by reinforcing the proportion of high-activity action areas, eliminating redundant intervals, and extracting  ...  In recent years, deep learning [6] has progressed considerably in image-based object and scene classification [7] [8] [9] [10] and recognition [11] [12] [13] [14] .  ... 
doi:10.1155/2022/6608448 pmid:35733557 pmcid:PMC9208928 fatcat:m56ufg2z25goxdqzphmbsxaili