Violence detection in hollywood movies by the fusion of visual and mid-level audio cues

Esra Acar, Frank Hopfgartner, Sahin Albayrak
2013 Proceedings of the 21st ACM international conference on Multimedia - MM '13  
(VQ: vector quantization, SIFT: Scale Invariant Features Transform, STIP: Spatial-Temporal Interest Points, VSD: Violent Scenes Detection) [1] Team Features Modality Method APat20 Shanghai-Hongkong  ...  by an SVM-based fusion and manage to be in the top 25% among submissions in the MediaEval Violent Scenes Detection (VSD) task [2] in terms of average precision at 20.  ... 
doi:10.1145/2502081.2502187 dblp:conf/mm/AcarHA13 fatcat:bg4yzmzz75e3dkisokp7dwbrva

Predicting Violence Rating Based on Pairwise Comparison

Ying JI, Yu WANG, Jien KATO, Kensaku MORI
2020 IEICE transactions on information and systems  
Many previous studies only address the problems of violence scene detection and violent action recognition, yet violence rating problem is still not solved.  ...  With the rapid development of multimedia, violent video can be easily accessed in games, movies, websites, and so on.  ...  Different from previous violent scene detection approaches, we do not detect exact actions or scenes in a violent video.  ... 
doi:10.1587/transinf.2020edp7056 fatcat:7jxl753s3jcrjkvt2iuduuixc4

A naive mid-level concept-based fusion approach to violence detection in Hollywood movies

Bogdan Ionescu, Jan Schlüter, Ionut Mironica, Markus Schedl
2013 Proceedings of the 3rd ACM conference on International conference on multimedia retrieval - ICMR '13  
Experimental validation conducted in the context of the Violent Scenes Detection task of the MediaEval 2012 Multimedia Benchmark Evaluation show the potential of this approach that ranked first among 34  ...  Given the high variability in appearance of violent scenes in movies, training a classifier to predict violent frames directly from visual or/and auditory features seems rather difficult.  ...  , Affect task: Violent Scenes Detection [2, 17].  ... 
doi:10.1145/2461466.2461502 dblp:conf/mir/IonescuSMS13 fatcat:nq5sp5g3hbaehfhygm727vseaq

Detecting violent content in Hollywood movies by mid-level audio representations

Esra Acar, Frank Hopfgartner, Sahin Albayrak
2013 2013 11th International Workshop on Content-Based Multimedia Indexing (CBMI)  
ACKNOWLEDGMENT We would like to thank Technicolor (http://www. for providing the ground truth, video shot boundaries and the corresponding keyframes which have been used in this work.  ...  III AVERAGE PRECISION (AP) AT 100 FOR THE BEST RUN OF TEAMS IN THE MEDIAEVAL VSD TASK AND OUR METHODS (VQ: VECTOR QUANTIZATION, SC: SPARSE CODING, SIFT: SCALE INVARIANT FEATURES TRANSFORM, STIP: SPATIAL-TEMPORAL  ...  In our current framework, we only exploit the audio modality of videos to detect violent segments, since sound effects are essential elements which film-makers make use of in order to stimulate people's  ... 
doi:10.1109/cbmi.2013.6576556 dblp:conf/cbmi/AcarHA13 fatcat:p7mnw4v6uzci7iy6g6ne6pnsjq

Breaking down violence detection: Combining divide-et-impera and coarse-to-fine strategies

Esra Acar, Frank Hopfgartner, Sahin Albayrak
2016 Neurocomputing  
The results demonstrate the potential of the proposed approach on the standardized datasets of the latest editions of the MediaEval Affect in Multimedia: Violent Scenes Detection (VSD) task of 2014 and  ...  Traditional approaches to violent scene detection build on audio or visual features to model violence as a single concept in the feature space.  ...  Multi-task LDA to perform multi-view action recognition based on temporal self-similarity matrices.  ... 
doi:10.1016/j.neucom.2016.05.050 fatcat:v27v4axjlneydkhxgiehbwuvhy

A Novel Spatio-Temporal Violence Classification Framework Based on Material Derivative and LSTM Neural Network

Wafa Lejmi, Anouar Ben Khalifa, Mohamed Ali Mahjoub
2020 Traitement du signal  
recognition, based on a preliminary spatio-temporal features extraction using the material derivative which describes the rate of change of a particle while in motion with respect to time.  ...  The classification algorithm is then carried out using a deep learning LSTM technique to classify generated features into eight specified violent and non-violent categories and a prediction value for each  ...  including the existing methods commonly employed in violence detection, we conceptualized a novel prediction framework for violent scenes recognition, based on a preliminary spatio-temporal features extraction  ... 
doi:10.18280/ts.370501 fatcat:wtqbfu3jcvawjlnjvebpol6piq

Crowded Scene Analysis: A Survey

Teng Li, Huan Chang, Meng Wang, Bingbing Ni, Richang Hong, Shuicheng Yan
2015 IEEE transactions on circuits and systems for video technology (Print)  
This paper surveys the state-of-the-art techniques on this topic. We first provide the background knowledge and the available features related to crowded scenes.  ...  In the past few years, an increasing number of works on crowded scene analysis have been reported, covering different aspects including crowd motion pattern learning, crowd behavior and activity analysis  ...  The global behaviors of video clips are modeled based on the distributions of low-level visual features, and multi-agent interactions are modeled based on the distributions of atomic activities.  ... 
doi:10.1109/tcsvt.2014.2358029 fatcat:prgoh37gjfcl7n6dp2u6tsdoda

Vision-based Fight Detection from Surveillance Cameras

Seymanur Akti, Gozde Ayse Tataroglu, Hazim Kemal Ekenel
2019 2019 Ninth International Conference on Image Processing Theory, Tools and Applications (IPTA)  
Vision-based action recognition is one of the most challenging research topics of computer vision and pattern recognition.  ...  A specific application of it, namely, detecting fights from surveillance cameras in public areas, prisons, etc., is desired to quickly get under control these violent incidents.  ...  In this method two CNNs are used, one for spatial feature extraction, which learns the actions from single images and the other one is for the temporal feature extraction, which learns from the optical  ... 
doi:10.1109/ipta.2019.8936070 dblp:conf/ipta/AktiTE19 fatcat:dj2h6vj44jfcjgrnyttxxrlwam

Violence Detection in Videos [article]

Praveen Tirupattur, Christian Schulze, Andreas Dengel
2021 arXiv   pre-print
The performance of the system is evaluated on two classification tasks, Multi-Class classification, and Binary Classification.  ...  Binary SVM classifiers are trained on each of these features to detect violence.  ...  This spatio-temporal dynamic activity feature is based on the amount of dynamic motion that is present in the shot.  ... 
arXiv:2109.08941v1 fatcat:bqcpyvilprftxjvuy7egxwsjw4

Explicit kissing scene detection in cartoon using convolutional long short-term memory

Muhammad Arif Haikal Muhammad Fadzli, Mohd Fadzil Abu Hassan, Norazlin Ibrahim
2022 Bulletin of Electrical Engineering and Informatics  
This paper proposes a deep learning-based classifier to detect the kissing scene in the cartoon by using Darknet-19 for frame-level feature extraction, while the feature aggregation in the temporal domain  ...  Extensive experiments prove that the proposed system provides excellent results of 96.43% accuracy to detect the kissing scene in the cartoon.  ...  The convolutional long short-term memory (conv-LSTM) can capture the localized spatial-temporal features that enable the analysis of local motion taking place in the violent video. Núñez et al.  ... 
doi:10.11591/eei.v11i1.3542 fatcat:be4mjlwwbjdkzp3xosblbom5ge

A review on Video Classification with Methods, Findings, Performance, Challenges, Limitations and Future Work

Md Shofiqul Islam, Mst Sunjida Sultana, Uttam Kumar Roy, Jubayer Al Mahmud
2021 Jurnal Ilmiah Teknik Elektro Komputer dan Informatika  
Lastly, we also present a quick summary table based on selected features.  ...  Study on video classification systems using their tools, benefits, drawbacks, as well as other features to compare the techniques they have used also constitutes a key task of this review.  ...  A nice review on deep learning based on video classification and captioning task [2] .  ... 
doi:10.26555/jiteki.v6i2.18978 fatcat:jbdixy73xvfurpmyezlk755xzu

Mid-level Representation for Visual Recognition [article]

Moin Nabi
2015 arXiv   pre-print
In the case of image understanding, we focus on object detection/recognition task.  ...  We investigate on discovering and learning a set of mid-level patches to be used for representing the images of an object category.  ...  On one hand, feature-based methods adopt the classical strategy of detecting first a set of low-level spatio-temporal features, followed by the definition of the related descriptors.  ... 
arXiv:1512.07314v1 fatcat:knmhkwxqk5aczis7ce6g2sv2wm

Weakly-supervised Visual Instrument-playing Action Detection in Videos [article]

Jen-Yu Liu, Yi-Hsuan Yang, Shyh-Kang Jeng
2018 arXiv   pre-print
Instrument playing is among the most common scenes in music-related videos, which represent nowadays one of the largest sources of online videos.  ...  We evaluate the result of the proposed method temporally and spatially on a small dataset (totally 5,400 frames) that we manually annotated.  ...  on the learned features.  ... 
arXiv:1805.02031v1 fatcat:wa562fonhjc7livby4ovvkcpee

A fusion scheme of visual and auditory modalities for event detection in sports video

Min Xu, Ling-Yu Duan, Chang-Sheng Xu, Qi Tian
2003 2003 International Conference on Multimedia and Expo. ICME '03. Proceedings (Cat. No.03TH8698)  
Correctly detected sports video events will greatly facilitate further structural and temporal analysis, such as sports video skimming, table of content, etc.  ...  In this paper, we propose an effective fusion scheme of visual and auditory modalities to detect events in sports video.  ...  [4] tried to detect violent events and car chases in feature films by performing the analysis of environmental sounds such as gunfire, engines, horns, and explosions. Y. Rui et al.  ... 
doi:10.1109/icme.2003.1220922 dblp:conf/icmcs/XuDXT03 fatcat:6gxfz64ayzb7nfg65445zjtwbm

Multimodal Video Indexing: A Review of the State-of-the-art

Cees G.M. Snoek, Marcel Worring
2005 Multimedia tools and applications  
Efficient and effective handling of video documents depends on the availability of indexes. Manual indexing is unfeasible for large video collections. In this paper we survey several methods aiming  ...  This research is sponsored by the ICES/KIS Multimedia Information Analysis project and TNO Institute of Applied Physics (TPD).  ...  Acknowledgements The authors would like to thank Jeroen Vendrig and Ioannis Patras from the University of Amsterdam for their valuable comments and suggestions.  ... 
doi:10.1023/b:mtap.0000046380.27575.a5 fatcat:cskgpbfx5bgapl5uv72cs352ae