
A task-based evaluation of combined set and network visualization

Peter Rodgers, Gem Stapleton, Bilal Alsallakh, Luana Micallef, Rob Baker, Simon Thompson
2016 Information Sciences  
We conducted a comparative crowdsourced user study to evaluate all five techniques based on tasks that require information from both the network and the sets.  ...  This paper addresses the problem of how best to visualize network data grouped into overlapping sets. We address it by evaluating various existing techniques alongside a new technique.  ...  and EulerView respectively; and Mohamad Tayssir Alkowatly for help with setting up the framework that enabled us to run EulerView.  ... 
doi:10.1016/j.ins.2016.05.045 fatcat:vlwrjnvzxfaizgzyur2wn3xqbi

The effectiveness of visual aids in teaching foreign languages

M Usmonov
2022 Zamonaviy lingvistik tadqiqotlar: xorijiy tajribalar, istiqbolli izlanishlar va tillarni o'qitishning innovatsion usullari (Modern Linguistic Research: Foreign Experiences, Promising Inquiries, and Innovative Methods of Language Teaching)  
The article analyzes the effectiveness of visual aids in the process of teaching foreign languages and suggests ways to use them in the classroom.  ...  and the use of computer-based testing, diagnostic, monitoring and evaluation systems.  ...  These systems are a set of software and hardware tools and equipment that allow you to combine various types of information (text, hand-drawn graphics, slides, music, moving images, sound, video) and  ... 
doi:10.47689/linguistic-research-vol-iss1-pp233-235 fatcat:cc4fzp3wabflbcwn3qgiedczr4

A Cross-Modal Image Fusion Method Guided by Human Visual Characteristics [article]

Aiqing Fang and Xinbo Zhao and Jiaqi Yang and Yanning Zhang
2020 arXiv   pre-print
Research on image fusion theory based on the characteristics of human visual perception is limited.  ...  Inspired by the characteristics of human visual perception, we propose a robust multi-task auxiliary learning optimization image fusion theory.  ...  found a complete and effective evaluation of image quality.  ... 
arXiv:1912.08577v4 fatcat:lajicugsc5cllpjzihqag6tcda

Cross-Modal Image Fusion Theory Guided by Subjective Visual Attention [article]

Aiqing Fang, Xinbo Zhao, Yanning Zhang
2019 arXiv   pre-print
Secondly, based on the human visual attention perception mechanism, we introduce the human visual attention network guided by subjective tasks on the basis of the multi-task auxiliary learning network.  ...  The human visual perception system has very strong robustness and contextual awareness in a variety of image processing tasks.  ...  In view of this problem, although there are some image quality evaluation methods based on generative network [27] and deep learning [28] in the field of image quality evaluation, they are still quite  ... 
arXiv:1912.10718v1 fatcat:4fgrayc4wjdolcey4vspg3miuy

Survey of Recent Advances in Visual Question Answering [article]

Supriya Pandhre, Shagun Sodhani
2017 arXiv   pre-print
Visual Question Answering (VQA) presents a unique challenge, as it requires the ability to understand and encode multi-modal inputs in terms of both image processing and natural language processing.  ...  This paper presents a survey of different approaches proposed to solve the problem of Visual Question Answering. We also describe the current state-of-the-art model in a later part of the paper.  ...  These networks are then combined with other modules and jointly trained to predict the answers. Based on different tasks, 5 types of modules are described.  ... 
arXiv:1709.08203v1 fatcat:sqmhzvf2o5frjjj3gdi7nq74du

Zhejiang University at ImageCLEF 2019 Visual Question Answering in the Medical Domain

Xin Yan, Lin Li, Chulin Xie, Jun Xiao, Lin Gu
2019 Conference and Labs of the Evaluation Forum  
We propose a novel convolutional neural network (CNN) based on the VGG16 network and a Global Average Pooling strategy to extract visual features.  ...  This paper describes the submission of Zhejiang University for the Visual Question Answering task in the medical domain (VQA-Med) of ImageCLEF 2019 [2].  ...  Feature fusion with a co-attention mechanism: the strategy used to combine visual and semantic features plays an important role in improving the performance of the VQA task.  ... 
dblp:conf/clef/YanLXXG19 fatcat:74xc5442yrg2xkdkyux2gdebeq

Visual Encodings for Networks with Multiple Edge Types

Athanasios Vogogias, Daniel Archambault, Benjamin Bach, Jessie Kennedy
2020 Proceedings of the International Conference on Advanced Visual Interfaces  
e) combines position and colour, f) uses size and g) combines size and colour to create a glyph.  ...  The best encodings were integrated into a visual analytics tool for inferring dynamic Bayesian networks and evaluated by computational biologists for additional evidence.  ...  Data Set Used in the Evaluation To create a realistic collection of networks, we ran BANJO on the songbird data [36] to create 17 DBNs.  ... 
doi:10.1145/3399715.3399827 dblp:conf/avi/VogogiasABK20 fatcat:k4iguksr6rfuxnjym2i7vfq2au

V-PROM: A Benchmark for Visual Reasoning Using Visual Progressive Matrices

Damien Teney, Peng Wang, Jiewei Cao, Lingqiao Liu, Chunhua Shen, Anton Van den Hengel
2020 Proceedings of the AAAI Conference on Artificial Intelligence  
We evaluate a range of deep learning architectures and find that existing models, including those popular for vision-and-language tasks, are unable to solve seemingly simple instances.  ...  Such tasks include, for example, image captioning, visual question answering, and visual navigation. Their evaluation is, however, hindered by task-specific confounding factors and dataset biases.  ...  Related work Evaluation of abstract visual reasoning Evaluating reasoning has a long history in the field of AI, but is typically based on pre-defined or easily identifiable symbols.  ... 
doi:10.1609/aaai.v34i07.6885 fatcat:v63l3mjlcbewhpbrr3yabmupwe

Flood Detection via Twitter Streams using Textual and Visual Features [article]

Firoj Alam, Zohaib Hassan, Kashif Ahmad, Asma Gul, Michael Riegler, Nicola Conci, Ala Al-Fuqaha
2020 arXiv   pre-print
For text-based flood event detection, we use a transformer network (i.e., a pretrained Italian BERT model), achieving an F1-score of 0.853.  ...  In total, we proposed four different solutions, including a multi-modal solution combining textual and visual information for the mandatory run, and three single-modal image- and text-based solutions as  ...  Multimodal Model (Run 1) The multimodal network consists of a text network and an image network combined to form a shared representation before the classification layer.  ... 
arXiv:2011.14944v1 fatcat:czr7jr66sfe3vkjortx74guavy

VIREO @ TRECVID 2017: Video-to-Text, Ad-hoc Video Search, and Video hyperlinking

Phuong Anh Nguyen, Qing Li, Zhi-Qi Cheng, Yi-Jie Lu, Hao Zhang, Xiao Wu, Chong-Wah Ngo
2017 TREC Video Retrieval Evaluation  
In this study, we intend to find out whether combining the concept-based system, the captioning system and the text-based search system would help to improve search performance.  ...  One is our concept-based, zero-example video search system, which has proven useful in previous years [4]; one is a video captioning system individually trained on the VTT task; the other is a  ...  Acknowledgment The work described in this paper was supported by two grants from the Research Grants Council of the Hong Kong Special Administrative Region, China (CityU 11250716, 11210514).  ... 
dblp:conf/trecvid/NguyenLCL00N17 fatcat:jjfp3n7qunfg5mvnrmdmf4m5dq

UNITOR @ DANKMEMES: Combining Convolutional Models and Transformer-based architectures for accurate MEME management [chapter]

Claudia Breazzano, Edoardo Rubino, Danilo Croce, Roberto Basili
2020 EVALITA Evaluation of NLP and Speech Tools for Italian - December 17th, 2020  
UNITOR implements a neural model which combines a Deep Convolutional Neural Network to encode the visual information of input images and a Transformer-based architecture to encode the meaning of the attached  ...  UNITOR ranked first in all subtasks, clearly confirming the robustness of the investigated neural architectures and suggesting the beneficial impact of the proposed combination strategy.  ...  Transformer-based Architectures for text classification. A MEME is a combination of visual information and the overlaid caption.  ... 
doi:10.4000/books.aaccademia.7420 fatcat:qvcld66xafd3de7ky4s7jlqxle

Multimodal Feature Fusion for Video Advertisements Tagging Via Stacking Ensemble [article]

Qingsong Zhou, Hai Liang, Zhimin Lin, Kele Xu
2021 arXiv   pre-print
This framework introduces a stacking-based ensembling approach to reduce the influence of varying levels of noise and of conflicts between different modalities.  ...  Specifically, we propose a novel multi-modal feature fusion framework with the goal of combining complementary information from multiple modalities.  ...  We consider different perspectives (visual, text, sound, and other features) to predict the tagging of posts, and train a DNN-based classification model to obtain the taggings.  ... 
arXiv:2108.00679v1 fatcat:o24bdkie65gd5jkdbimglj2wwe

Adopting Semantic Information of Grayscale Radiographs for Image Classification and Retrieval

Obioma Pelka, Felix Nensa, Christoph M. Friedrich
2018 Proceedings of the 11th International Joint Conference on Biomedical Engineering Systems and Technologies  
All images are visually represented using Convolutional Neural Networks (CNN), and the Long Short-Term Memory (LSTM) based Recurrent Neural Network (RNN) Show-and-Tell model is adopted for keyword generation  ...  For the image classification tasks, Random Forest models trained with Bag-of-Keypoints visual representations were adopted.  ...  Keyword Generation For keyword generation, a combination of encoding and decoding using Convolutional Neural Networks (CNN) and Long Short-Term Memory (LSTM) (Hochreiter and Schmidhuber, 1997) based  ... 
doi:10.5220/0006732301790187 dblp:conf/biostec/PelkaNF18 fatcat:aoadhd2inbc6tkx5ahua4uuuz4

Off-the-Shelf Deep Features for Saliency Detection

Aymen Azaza, Mehrez Abdellaoui, Ali Douik
2021 SN Computer Science  
We propose several saliency features computed from different networks and different levels, with the aim of identifying the optimal network and layer for the task of saliency detection.  ...  In this paper, we develop a saliency approach based on the computation of deep whitened features combined with shape features from object proposals.  ...  Related Work Visual Saliency Itti [21] proposed a saliency method based on combining low-level features such as color, orientation and intensity.  ... 
doi:10.1007/s42979-021-00499-7 fatcat:wgu6awptnfdo5kqrnwh7x5fxny

Deep Learning Frameworks Applied For Audio-Visual Scene Classification [article]

Lam Pham, Alexander Schindler, Mina Schütz, Jasmin Lampert, Sven Schlarb, Ross King
2021 arXiv   pre-print
The highest classification accuracy of 93.9%, obtained from an ensemble of audio-based and visual-based frameworks, shows an improvement of 16.5% compared with the DCASE baseline.  ...  In this paper, we present deep learning frameworks for audio-visual scene classification (SC) and indicate how individual visual and audio features, as well as their combination, affect SC performance.  ...  (SC), indicate the individual roles of visual and audio features as well as their combination within the SC task. 2) We then propose an ensemble of audio-based and visual-based frameworks, which helps to achieve  ... 
arXiv:2106.06840v1 fatcat:ihkzmj4wcncuhgg3jyeymxomhq