369 Hits in 8.5 sec

Automatic Image and Video Caption Generation with Deep Learning: A Concise Review and Algorithmic Overlap

Soheyla Amirian, Khaled Rasheed, Thiab R. Taha, Hamid R. Arabnia
2020 IEEE Access  
The purpose of the proposed system is to automatically generate a title and also an abstract for a video clip without manual intervention.  ...  architecture; • A novel application (case study) of video captioning, namely, the automatic generation of "titles" for video clips.  ... 
doi:10.1109/access.2020.3042484 fatcat:ssl5awoxlrb5rdxbekvv3af74u

Deep Learning based, a New Model for Video Captioning

Elif Güsta Özer, Ilteber Nur, Sena Basbug, Sümeyye Turan, Anil Utku, M. Ali
2020 International Journal of Advanced Computer Science and Applications  
In this study, a video captioning system has been developed for visually impaired individuals to analyze the events through real-time images and express them in meaningful sentences.  ...  First, all clips have been muted so that the sounds of the clips have not been used in the sentence extraction process.  ...  The video tag is the name of an extraction of a particular object or event in the video. Image (frame) captioning is automatically generating a single sentence or multiple sentences that define an image.  ...
doi:10.14569/ijacsa.2020.0110365 fatcat:qki6jquumrfetcmihhovlf7tti

Video Description: A Survey of Methods, Datasets and Evaluation Metrics [article]

Nayyer Aafaq, Ajmal Mian, Wei Liu, Syed Zulqarnain Gilani, Mubarak Shah
2019 arXiv pre-print
Video description is the automatic generation of natural language sentences that describe the contents of a given video.  ...  Classical video description approaches combined subject, object and verb detection with template based language models to generate sentences.  ...  ACKNOWLEDGEMENTS The authors acknowledge Marcus Rohrbach (Facebook AI Research) for his valuable input. The research was supported by ARC Discovery Grant DP160101458 and DP150102405.  ... 
arXiv:1806.00186v3 fatcat:elxztcpzizhr7clugnbjvvrpte

A semi-automatic approach to home video editing

Andreas Girgensohn, John Boreczky, Patrick Chiu, John Doherty, Jonathan Foote, Gene Golovchinsky, Shingo Uchihashi, Lynn Wilcox
2000 Proceedings of the 13th annual ACM symposium on User interface software and technology - UIST '00  
Combined with standard editing rules, this score is used to identify clips for inclusion in the final video and to select their start and end points.  ...  To create a custom video, the user drags keyframes corresponding to the desired clips into a storyboard. Users can lengthen or shorten the clip without specifying the start and end frames explicitly.  ...  After reviewing the automatically generated video, the user can use the storyboard interface to lengthen or shorten the individual clips in cases where the rules yielded unwanted results.  ... 
doi:10.1145/354401.354415 dblp:conf/uist/GirgensohnBCDFGUW00 fatcat:xz7ff3l34vegfaalsaerz6m6de

Keyframe-based user interfaces for digital video

A. Girgensohn, J. Boreczky, L. Wilcox
2001 Computer  
Interactive browsing of video summaries can shorten the time required to find the desired segment.  ...  The browser displays the piles row by row in the time order of the first clip in each pile, as Figure 4 shows.  ... 
doi:10.1109/2.947093 fatcat:hupaa4khnbey7jqewkk2awyssm

[Invited Paper] Content Analysis for Home Videos

Naoko Nitta, Noboru Babaguchi
2013 ITE Transactions on Media Technology and Applications  
This paper introduces the content analysis techniques, namely, techniques for segmentation, indexing, and static and dynamic representation generation, which have been developed to help viewers watch such  ...  poor-quality videos by considering the characteristics of home videos.  ...  For example, a music-video-like video can be automatically generated by aligning visually similar subshots to repetitive music segments in a music clip such as prelude, interlude, and coda 37).  ...
doi:10.3169/mta.1.91 fatcat:qhjiyeecyvgejkbozzjqj5353e

Teaching Machines to Understand Baseball Games: Large-Scale Baseball Video Database for Multiple Video Understanding Tasks [chapter]

Minho Shim, Young Hwi Kim, Kyungmin Kim, Seon Joo Kim
2018 Lecture Notes in Computer Science  
A major obstacle in teaching machines to understand videos is the lack of training data, as creating temporal annotations for long videos requires a huge amount of human effort.  ...  The new dataset has several major challenging factors compared to other datasets: 1) the dataset contains a large number of visually similar segments with different labels. 2) It can be used for many video  ...  First, we extract optical flows from clips for every 5 frames and normalize them to [0,255], which allows storing the optical flow as an image.  ...
doi:10.1007/978-3-030-01267-0_25 fatcat:3oqweqmqvvacdjnsprxnkec4hm

Rescribe: Authoring and Automatically Editing Audio Descriptions [article]

Amy Pavel, Gabriel Reyes, Jeffrey P. Bigham
2020 arXiv pre-print
An experienced audio description author will produce content that fits narration necessary to understand, enjoy, or experience the video content into the time available.  ...  Audio descriptions make videos accessible to those who cannot see them by describing visual content in audio.  ...  ACKNOWLEDGEMENTS We thank Kyle Murray for his contributions to the design and development of Rescribe.  ... 
arXiv:2010.03667v1 fatcat:efqccyftqfhsnhrpmwtfbb2saa

Video summarization [chapter]

Edward Delp, Cuneyt Taskiran
2004 Computer Engineering Series  
In the Informedia project time-ordered keyframes, known as filmstrips, were used as video abstracts [36].  ...  [30] search the closed-caption text for cue words to generate summaries for talk shows.  ...
doi:10.1201/9780203486788.ch8 fatcat:eelfmpcjxzb4vd3keg5z2gq7gy

Multimedia simplification for optimized MMS synthesis

Wei-Qi Yan, Mohan S. Kankanhalli
2007 ACM Transactions on Multimedia Computing, Communications, and Applications (TOMCCAP)  
Once a request for MMS is received, the MMS server makes use of the simplified media from the gallery. The multimedia data is aligned with respect to the time-line for MMS message synthesis.  ...  The proposed approach aims at reducing the redundancy in the multimedia data captured by multiple types of media sensors. The simplified data is first stored into a gallery for further usage.  ...  , segment and index video for intelligent search and image retrieval with automatically generated metadata and indices for retrieving videos from the library with thousands of hours of video with over two  ...
doi:10.1145/1198302.1198307 fatcat:5kljq3izezetbmkmzkawqgjeba

Transcript to Video: Efficient Clip Sequencing from Texts [article]

Yu Xiong, Fabian Caba Heilbron, Dahua Lin
2021 arXiv pre-print
For fast inference, we introduce an efficient search strategy for real-time video clip sequencing.  ...  To meet the demands for non-experts, we present Transcript-to-Video -- a weakly-supervised framework that uses texts as input to automatically create video sequences from an extensive collection of shots  ...  (ii) Original frame-based encoder in [41] processes a relatively short video clip as a frame stream.  ... 
arXiv:2107.11851v1 fatcat:vfcx7w75kzgg7ppurgswceoi5i

Personalized Abstraction of Broadcasted American Football Video by Highlight Selection

N. Babaguchi, Y. Kawai, T. Ogura, T. Kitahashi
2004 IEEE transactions on multimedia  
We first detect significant events in the video stream by matching textual overlays appearing in an image frame with the descriptions of gamestats in which highlights of the game are described.  ...  Video abstraction is defined as creating shorter video clips or video posters from an original video stream.  ...  In general, such a shot can be lengthy and needs to be shortened in editing process. Therefore, we divide the shot into the first still segment and its subsequent motion segment.  ... 
doi:10.1109/tmm.2004.830811 fatcat:tes4vkuo4rclrmvn4uok53vswy

A Prospective Study of the Use of Fetal Intelligent Navigation Echocardiography (FINE) to Obtain Standard Fetal Echocardiography Views

Paola Veronese, Gianna Bogana, Alessia Cerutti, Lami Yeo, Roberto Romero, Maria Teresa Gervasi
2016 Fetal Diagnosis and Therapy  
Conclusion: FINE applied to STIC volumes can successfully generate nine standard fetal echocardiography views in 96-100% of cases in the 2nd and 3rd trimesters.  ...  In normal cases, FINE was able to generate nine fetal echocardiography views using: (1) diagnostic planes in 76-100% of the cases, (2) VIS-Assistance® in 96-100% of the cases, and (3) a combination of  ...  Disclosure Statement An application for a patent ('Apparatus and Method for Fetal Intelligent Navigation Echocardiography') has been filed with the US Patent and Trademark Office, and the patent is pending  ...
doi:10.1159/000446982 pmid:27309391 pmcid:PMC5164869 fatcat:svxmuwjsbfhqxljuxzyemvfawy

Automatic detection of TV commercials

B. Satterwhite, O. Marques
2004 IEEE potentials  
Characteristics of commercials. The problem of detecting commercials within television broadcasts is related to several, more general, problems in video processing.  ...  Developments in this arena may lead to interesting, additional work in video analysis (e.g., automatic detection of program genre to activate/deactivate commercial skipping features, or automatic techniques  ...
doi:10.1109/mp.2004.1309790 fatcat:qbfif5m3ajc2nhsecu53xb2vcy

A Video Library System Using Scene Detection and Automatic Tagging [chapter]

Lorenzo Baraldi, Costantino Grana, Rita Cucchiara
2017 Communications in Computer and Information Science  
In the proposed system, database videos are automatically decomposed into meaningful and storytelling parts (i.e. scenes) and tagged in an automatic way by leveraging their transcript.  ...  We present a novel video browsing and retrieval system for edited videos, based on scene detection and automatic tagging.  ...  Every time a video is uploaded on the platform, a remote web service is called to perform the automatic decomposition of the video into scenes, and to extract key-frames for visualization.  ...
doi:10.1007/978-3-319-68130-6_5 fatcat:3bb6flzzw5ffbndcfmzaqge47a
Showing results 1 to 15 out of 369 results