851 Hits in 12.4 sec

Semantic Annotation and Automated Extraction of Audio-Visual Staging Patterns in Large-Scale Empirical Film Studies

Henning Agt-Rickauer, Christian Hentschel, Harald Sack
2018 International Conference on Semantic Systems  
This work is partially supported by the Federal Ministry of Education and Research under grant number 01UG1632B.  ...  Tool-Supported Empirical Film Studies The systematic empirical study of audio-visual patterns in feature films, documentaries and TV reports requires a digitally supported methodology to produce consistent  ...  The annotation vocabulary for empirical film studies and semantic annotations of audio-visual material based on Linked Open Data principles enables the publication, reuse, retrieval, and visualization  ... 
dblp:conf/i-semantics/Agt-RickauerHS18 fatcat:ktggai5mj5hblajutwrjnnnhmq

Unsupervised Approaches for Textual Semantic Annotation, A Survey

Xiaofeng Liao, Zhiming Zhao
2019 ACM Computing Surveys  
ACKNOWLEDGMENTS The authors thank the anonymous reviewers for their helpful comments, in addition to Cees de Laat, Paul Martin, Jayachander Surbiryala, and ZeShun Shi for useful discussions.  ...  refinement" -image -OR -video -OR -audio -OR -visual -OR -gene -OR -scene Column "Automation" indicates whether that paper investigates the degree of automation of the semantic annotation tools covered  ...  "semantic annotation") AND TITLE: (survey OR review OR study OR examination OR "state of the art" OR analyse) NOT TITLE: (image OR video OR audio OR visual OR gene OR scene OR imagery OR vision OR multimedia  ... 
doi:10.1145/3324473 fatcat:fg5ucwtloze6ljdlh4hqjkqxfe
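
The snippet above quotes the title-based query the survey authors appear to have used to select text-oriented annotation surveys while excluding multimedia- and biology-oriented work. As a purely illustrative sketch (not code from the paper; the helper name and the example titles are invented), the same include/exclude logic can be applied to a list of candidate titles in Python:

# Keyword groups copied from the query string quoted in the snippet above.
SURVEY_TERMS = ["survey", "review", "study", "examination", "state of the art", "analyse"]
EXCLUDED_TERMS = ["image", "video", "audio", "visual", "gene", "scene",
                  "imagery", "vision", "multimedia"]

def matches_query(title: str) -> bool:
    # Simplified substring matching on the title only, mirroring the TITLE:
    # fields of the quoted query; the real search ran against a bibliographic
    # database rather than plain strings.
    t = title.lower()
    return ("semantic annotation" in t
            and any(term in t for term in SURVEY_TERMS)
            and not any(term in t for term in EXCLUDED_TERMS))

# Hypothetical titles for illustration:
print(matches_query("A Survey of Semantic Annotation Platforms"))      # True
print(matches_query("Semantic Annotation of Video Scenes: A Review"))  # False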

Multimedia content analysis for emotional characterization of music video clips

Ashkan Yazdani, Evangelos Skodras, Nikolaos Fakotakis, Touradj Ebrahimi
2013 EURASIP Journal on Image and Video Processing  
In this paper, multimedia content analysis is performed to extract affective audio and visual cues from different music video clips.  ...  Furthermore, several fusion techniques are used to combine the information extracted from the audio and video contents of music video clips.  ...  Quality of Experience in Multimedia Systems and Services -QUALINET, and the NCCR Interactive Multimodal Information Management (IM2).  ... 
doi:10.1186/1687-5281-2013-26 fatcat:gjriub35ovgkbop56lecfphkwu
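
The abstract above says that fusion techniques combine the affective information extracted from the audio and the video of each clip, but the snippet does not say which techniques. A minimal decision-level (late) fusion sketch under assumed inputs (the scores and the weight below are invented for illustration, not values from the paper):

import numpy as np

# Hypothetical per-clip affect scores predicted independently from each modality.
audio_scores = np.array([0.62, 0.15, 0.80])
visual_scores = np.array([0.55, 0.30, 0.70])

# Late fusion as a convex combination of the two modality scores; in practice
# the weight would be tuned on held-out data rather than fixed by hand.
w_audio = 0.6
fused_scores = w_audio * audio_scores + (1.0 - w_audio) * visual_scores
print(fused_scores)  # per-clip fused affect estimates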

Deep Learning for Visual Speech Analysis: A Survey [article]

Changchong Sheng, Gangyao Kuang, Liang Bai, Chenping Hou, Yulan Guo, Xin Xu, Matti Pietikäinen, Li Liu
2022 arXiv   pre-print
Visual speech, referring to the visual domain of speech, has attracted increasing attention due to its wide applications, such as public security, medical treatment, military defense, and film entertainment  ...  We cover different aspects of visual speech, including fundamental problems, challenges, benchmark datasets, a taxonomy of existing methods, and state-of-the-art performance.  ...  LRS3-TED [89] is a large-scale sentence-level audio-visual dataset. Compared to LRS2-TED, it has a larger scale in terms of duration, vocabulary, and number of speakers.  ... 
arXiv:2205.10839v1 fatcat:l5m4ohtcvnevrliaiwawg3phjq

Tools for Searching, Annotation and Analysis of Speech, Music, Film and Video: A Survey

A. Marsden, A. Mackenzie, A. Lindsay, H. Nock, J. Coleman, G. Kochanski
2007 Literary and Linguistic Computing  
This paper examines the actual and potential use of software tools in research in the arts and humanities, focussing on audiovisual materials such as recorded speech, music, video and film.  ...  Many researchers make some kind of transcription of materials, and would value tools to automate this process.  ...  Acknowledgements We gratefully acknowledge the generous amount of time and information given by all of the participants in interviews, and the financial support of the Arts and Humanities Research Council  ... 
doi:10.1093/llc/fqm021 fatcat:oniladdzyjbzbazq5a5qwzgmoa

Access to recorded interviews

Franciska De Jong, Douglas W. Oard, Willemijn Heeren, Roeland Ordelman
2008 ACM Journal on Computing and Cultural Heritage  
A large number of important research issues are identified, and from that set of issues, a coherent research agenda is proposed.  ...  This article reviews the present state of practice for making recorded interviews available and the state-of-the-art for key component technologies.  ...  The diversity in the collection was large both with respect to the topics and the recording dates (spanning half a century from the early years of film onwards).  ... 
doi:10.1145/1367080.1367083 fatcat:2ov64c4q65g3jkkmuazcf7qf7m

Design Patterns for Resource-Constrained Automated Deep-Learning Methods

Lukas Tuggener, Mohammadreza Amirian, Fernando Benites, Pius von Däniken, Prakhar Gupta, Frank-Peter Schilling, Thilo Stadelmann
2020 AI  
We present an extensive evaluation of a wide variety of promising design patterns for automated deep-learning (AutoDL) methods, organized according to the problem categories of the 2019 AutoDL challenges  ...  In particular, we establish (a) that very wide fully connected layers learn meaningful features faster; we illustrate (b) how the lack of pretraining in audio processing can be compensated by architecture  ...  Methods for Audio Data This section describes the unique challenges and opportunities in automated audio processing in the context of the AutoSpeech challenge of AutoDL 2019 and presents empirical results  ... 
doi:10.3390/ai1040031 fatcat:joz2b36rmnclhkbmijhlqrgznm

Multimodal Stereoscopic Movie Summarization Conforming to Narrative Characteristics

Ioannis Mademlis, Anastasios Tefas, Nikos Nikolaidis, Ioannis Pitas
2016 IEEE Transactions on Image Processing  
The European Union is not liable for any use that may be made of the information contained therein.  ...  ACKNOWLEDGMENT The research leading to these results has received funding from the European Union Seventh Framework Programme (FP7/2007-2013) under grant agreement numbers 287674 (3DTVS) and 316564 (IMPART  ...  the production stage of theatrical films or TV series, where the action described in the script is typically filmed using multiple cameras, or the post-production stage of such material, where its semantic  ... 
doi:10.1109/tip.2016.2615289 pmid:28113502 fatcat:a5rw43bakvbibcyqtaif32p5qq

Multi-Modal Music Information Retrieval: Augmenting Audio-Analysis with Visual Computing for Improved Music Video Analysis [article]

Alexander Schindler
2020 arXiv   pre-print
Additionally, new visual features are introduced capturing rhythmic visual patterns. In all of these experiments the audio-based results serve as benchmark for the visual and audio-visual approaches.  ...  A series of comprehensive experiments and evaluations are conducted which are focused on the extraction of visual information and its application in different MIR tasks.  ...  analyzed if identified visual patterns reported in literature (see RQ1) can be verified through automated analysis.  ... 
arXiv:2002.00251v1 fatcat:6cz6rivc3fbg7fahdsnokxfrk4

Multimedia content analysis: using both audio and visual clues

Yao Wang, Zhu Liu, Jin-Cheng Huang
2000 IEEE Signal Processing Magazine  
Such skims have demonstrated significant benefits in performance and user satisfaction compared to simple subsampled skims in empirical studies [54] .  ...  Scene detection and classification depends on the definition of scene types in terms of their audio and visual patterns.  ... 
doi:10.1109/79.888862 fatcat:lxquhqnvxbduthix4zg52v32ca

SceneMaker: Intelligent Multimodal Visualisation of Natural Language Scripts [chapter]

Eva Hanser, Paul Mc Kevitt, Tom Lunney, Joan Condell
2010 Lecture Notes in Computer Science  
Accuracy of content animation, effectiveness of expression and usability of the interface will be evaluated in empirical tests.  ...  During the generation of the story content, special attention will be given to emotional aspects and their reflection in the execution of all types of modalities: fluency and manner of actions and behaviour  ...  Linguistic and visual semantics are connected through the proposed Lexical Visual Semantics Representation (LVSR) which focuses in particular on visual semantics of verbs and suitability for action execution  ... 
doi:10.1007/978-3-642-17080-5_17 fatcat:w23elmcseja3lfycqehk5awdju

Advancing an Interdisciplinary Science of Conversation: Insights from a Large Multimodal Corpus of Human Speech [article]

Andrew Reece, Gus Cooney, Peter Bull, Christine Chung, Bryn Dawson, Casey Fitzpatrick, Tamara Glazer, Dean Knox, Alex Liebscher, Sebastian Marin
2022 arXiv   pre-print
increasingly interested in the study of conversation.  ...  This 7+ million word, 850 hour corpus totals over 1TB of audio, video, and transcripts, with moment-to-moment measures of vocal, facial, and semantic expression, along with an extensive survey of speaker  ...  -Alexi Robichaux and Gabriella Kellerman in particular-for its sponsorship of this research and for BetterUp's willingness to share the data collected for research among the wider scientific community.  ... 
arXiv:2203.00674v1 fatcat:mezjuaeapnf4lkoyt4k6jkn4xu

Media Lifecycle and Content Analysis in Social Media Communities

Lexing Xie, Hari Sundaram
2012 IEEE International Conference on Multimedia and Expo  
We first examine the content production, dissemination and consumption patterns in the recent social media studies literature.  ...  This paper examines the role of content analysis in media-rich online communities.  ...  The case study in Sec. 4.2 makes use of large-scale video remix to infer content interestingness and user influence.  ... 
doi:10.1109/icme.2012.138 dblp:conf/icmcs/XieS12 fatcat:3wqz3b3tgzfoloki2wrupvaz3m

Knowledge Extraction And Representation Learning For Music Recommendation And Classification

Sergio Oramas, Xavier Serra
2017 Zenodo  
In this thesis, we address the problems of classifying and recommending music present in large collections.  ...  Then, we show how modeling semantic information may impact musicological studies and helps to outperform purely text-based approaches in music similarity, classification, and recommendation.  ...  First, blurring the boundaries between the two-stage architecture, which implies fully automated optimization of both stages at once.  ... 
doi:10.5281/zenodo.1048497 doi:10.5281/zenodo.1100973 fatcat:kdh5jhvocbh3riwln6n2f756su fatcat:yfpmc6qxbbakjp6qzvywyoaoci

Showing results 1 — 15 out of 851 results