Semantic Annotation and Automated Extraction of Audio-Visual Staging Patterns in Large-Scale Empirical Film Studies
2018
International Conference on Semantic Systems
This work is partially supported by the Federal Ministry of Education and Research under grant number 01UG1632B. ...
Tool-Supported Empirical Film Studies: The systematic empirical study of audio-visual patterns in feature films, documentaries and TV reports requires a digitally supported methodology to produce consistent ...
The annotation vocabulary for empirical film studies and semantic annotations of audio-visual material based on Linked Open Data principles enables the publication, reuse, retrieval, and visualization ...
dblp:conf/i-semantics/Agt-RickauerHS18
fatcat:ktggai5mj5hblajutwrjnnnhmq
Unsupervised Approaches for Textual Semantic Annotation, A Survey
2019
ACM Computing Surveys
ACKNOWLEDGMENTS: The authors thank the anonymous reviewers for their helpful comments, in addition to Cees de Laat, Paul Martin, Jayachander Surbiryala, and ZeShun Shi for useful discussions. ...
Column "Automation" indicates whether that paper investigates the degree of automation of the semantic annotation tools covered ...
doi:10.1145/3324473
fatcat:fg5ucwtloze6ljdlh4hqjkqxfe
Multimedia content analysis for emotional characterization of music video clips
2013
EURASIP Journal on Image and Video Processing
In this paper, multimedia content analysis is performed to extract affective audio and visual cues from different music video clips. ...
Furthermore, several fusion techniques are used to combine the information extracted from the audio and video contents of music video clips. ...
Quality of Experience in Multimedia Systems and Services -QUALINET, and the NCCR Interactive Multimodal Information Management (IM2). ...
doi:10.1186/1687-5281-2013-26
fatcat:gjriub35ovgkbop56lecfphkwu
Deep Learning for Visual Speech Analysis: A Survey
[article]
2022
arXiv (pre-print)
Visual speech, referring to the visual domain of speech, has attracted increasing attention due to its wide applications, such as public security, medical treatment, military defense, and film entertainment ...
We cover different aspects of visual speech, including fundamental problems, challenges, benchmark datasets, a taxonomy of existing methods, and state-of-the-art performance. ...
LRS3-TED [89] is a large-scale sentence-level audio-visual dataset. Compared to LRS2-TED, it has a larger scale in terms of duration, vocabulary, and number of speakers. ...
arXiv:2205.10839v1
fatcat:l5m4ohtcvnevrliaiwawg3phjq
Tools for Searching, Annotation and Analysis of Speech, Music, Film and Video: A Survey
2007
Literary and Linguistic Computing
This paper examines the actual and potential use of software tools in research in the arts and humanities focussing on audiovisual materials such as recorded speech, music, video and film. ...
Many researchers make some kind of transcription of materials, and would value tools to automate this process. ...
Acknowledgements: We gratefully acknowledge the generous amount of time and information given by all of the participants in interviews, and the financial support of the Arts and Humanities Research Council ...
doi:10.1093/llc/fqm021
fatcat:oniladdzyjbzbazq5a5qwzgmoa
Access to recorded interviews
2008
ACM Journal on Computing and Cultural Heritage
A large number of important research issues are identified, and from that set of issues, a coherent research agenda is proposed. ...
This article reviews the present state of practice for making recorded interviews available and the state-of-the-art for key component technologies. ...
The diversity in the collection was large both with respect to the topics and the recording dates (spanning half a century from the early years of film onwards). ...
doi:10.1145/1367080.1367083
fatcat:2ov64c4q65g3jkkmuazcf7qf7m
Design Patterns for Resource-Constrained Automated Deep-Learning Methods
2020
AI
We present an extensive evaluation of a wide variety of promising design patterns for automated deep-learning (AutoDL) methods, organized according to the problem categories of the 2019 AutoDL challenges ...
In particular, we establish (a) that very wide fully connected layers learn meaningful features faster; we illustrate (b) how the lack of pretraining in audio processing can be compensated by architecture ...
Methods for Audio Data: This section describes the unique challenges and opportunities in automated audio processing in the context of the AutoSpeech challenge of AutoDL 2019 and presents empirical results ...
doi:10.3390/ai1040031
fatcat:joz2b36rmnclhkbmijhlqrgznm
Multimodal Stereoscopic Movie Summarization Conforming to Narrative Characteristics
2016
IEEE Transactions on Image Processing
The European Union is not liable for any use that may be made of the information contained therein. ...
ACKNOWLEDGMENT: The research leading to these results has received funding from the European Union Seventh Framework Programme (FP7/2007-2013) under grant agreement numbers 287674 (3DTVS) and 316564 (IMPART ...
the production stage of theatrical films or TV series, where the action described in the script is typically filmed using multiple cameras, or the post-production stage of such material, where its semantic ...
doi:10.1109/tip.2016.2615289
pmid:28113502
fatcat:a5rw43bakvbibcyqtaif32p5qq
Multi-Modal Music Information Retrieval: Augmenting Audio-Analysis with Visual Computing for Improved Music Video Analysis
[article]
2020
arXiv (pre-print)
Additionally, new visual features are introduced capturing rhythmic visual patterns. In all of these experiments the audio-based results serve as benchmark for the visual and audio-visual approaches. ...
A series of comprehensive experiments and evaluations are conducted which are focused on the extraction of visual information and its application in different MIR tasks. ...
analyzed if identified visual patterns reported in literature (see RQ1) can be verified through automated analysis. ...
arXiv:2002.00251v1
fatcat:6cz6rivc3fbg7fahdsnokxfrk4
Multimedia content analysis: using both audio and visual clues
2000
IEEE Signal Processing Magazine
Such skims have demonstrated significant benefits in performance and user satisfaction compared to simple subsampled skims in empirical studies [54] . ...
Scene detection and classification depends on the definition of scene types in terms of their audio and visual patterns. ...
doi:10.1109/79.888862
fatcat:lxquhqnvxbduthix4zg52v32ca
SceneMaker: Intelligent Multimodal Visualisation of Natural Language Scripts
[chapter]
2010
Lecture Notes in Computer Science
Accuracy of content animation, effectiveness of expression and usability of the interface will be evaluated in empirical tests. ...
During the generation of the story content, special attention will be given to emotional aspects and their reflection in the execution of all types of modalities: fluency and manner of actions and behaviour ...
Linguistic and visual semantics are connected through the proposed Lexical Visual Semantics Representation (LVSR) which focuses in particular on visual semantics of verbs and suitability for action execution ...
doi:10.1007/978-3-642-17080-5_17
fatcat:w23elmcseja3lfycqehk5awdju
Advancing an Interdisciplinary Science of Conversation: Insights from a Large Multimodal Corpus of Human Speech
[article]
2022
arXiv (pre-print)
increasingly interested in the study of conversation. ...
This 7+ million word, 850 hour corpus totals over 1TB of audio, video, and transcripts, with moment-to-moment measures of vocal, facial, and semantic expression, along with an extensive survey of speaker ...
(Alexi Robichaux and Gabriella Kellerman in particular) for its sponsorship of this research and for BetterUp's willingness to share the data collected for research among the wider scientific community. ...
arXiv:2203.00674v1
fatcat:mezjuaeapnf4lkoyt4k6jkn4xu
Media Lifecycle and Content Analysis in Social Media Communities
2012
2012 IEEE International Conference on Multimedia and Expo
We first examine the content production, dissemination and consumption patterns in the recent social media studies literature. ...
This paper examines the role of content analysis in media-rich online communities. ...
The case study in Sec. 4.2 makes use of large-scale video remix to infer content interestingness and user influence. ...
doi:10.1109/icme.2012.138
dblp:conf/icmcs/XieS12
fatcat:3wqz3b3tgzfoloki2wrupvaz3m
Knowledge Extraction And Representation Learning For Music Recommendation And Classification
2017
Zenodo
In this thesis, we address the problems of classifying and recommending music present in large collections. ...
Then, we show how modeling semantic information may impact musicological studies and helps to outperform purely text-based approaches in music similarity, classification, and recommendation. ...
First, blurring the boundaries between the two-stage architecture, which implies fully automated optimization of both stages at once. ...
doi:10.5281/zenodo.1048497
fatcat:kdh5jhvocbh3riwln6n2f756su
Knowledge Extraction And Representation Learning For Music Recommendation And Classification
2017
Zenodo
In this thesis, we address the problems of classifying and recommending music present in large collections. ...
Then, we show how modeling semantic information may impact musicological studies and helps to outperform purely text-based approaches in music similarity, classification, and recommendation. ...
First, blurring the boundaries between the two-stage architecture, which implies fully automated optimization of both stages at once. ...
doi:10.5281/zenodo.1100973
fatcat:yfpmc6qxbbakjp6qzvywyoaoci
Showing results 1 — 15 out of 851 results