MMSys'22 Grand Challenge on AI-based Video Production for Soccer
[article]
2022
arXiv
pre-print
In addition, event detection should be thoroughly enhanced by annotation and classification, proper clipping, generating short descriptions, selecting appropriate thumbnails for highlight clips, and finally ...
In particular, we focus on the enhancement operations that take place after an event has been detected, namely event clipping (Task 1), thumbnail selection (Task 2), and game summarization (Task 3). ...
We also want to acknowledge Norsk Toppfotball (NTF), the Norwegian association for elite soccer, for making videos and metadata available for the challenge. ...
arXiv:2202.01031v1
fatcat:hewdb5t5tfggnp5uewhlvcztge
Comixify: Transform video into a comics
[article]
2018
arXiv
pre-print
In the first stage, we propose a state-of-the-art keyframes extraction algorithm that selects a subset of frames from the video to provide the most comprehensive video context and we filter those frames ...
In this paper, we propose a solution to transform a video into a comics. We approach this task using a neural style algorithm based on Generative Adversarial Networks (GANs). ...
In this stage we propose to use a state-of-the-art keyframe extraction algorithm based on reinforcement learning, which we further extend by combining temporal segmentation method with the image aesthetic ...
arXiv:1812.03473v1
fatcat:whsi2kzia5dhri4kg66nrfszki
Generative Target Update for Adaptive Siamese Tracking
[article]
2022
arXiv
pre-print
In particular, our approach relies on an auto-encoder trained through adversarial learning to detect changes in a target object's appearance and predict a future target template, using a set of target ...
Results indicate that our proposed approach can outperform state-of-art trackers, and its overall robustness allows tracking for a longer time before failure. ...
via reinforcement learning (Duman and Erdem [2019]). ...
arXiv:2202.09938v1
fatcat:f2lr6pr5sfhnbch7i4qjgpivie
Video Summarization through Reinforcement Learning with a 3D Spatio-Temporal U-Net
[article]
2021
arXiv
pre-print
A 3D spatio-temporal U-Net is used to efficiently encode spatio-temporal information of the input videos for downstream reinforcement learning (RL). ...
An RL agent learns from spatio-temporal latent scores and predicts actions for keeping or rejecting a video frame in a video summary. ...
ACKNOWLEDGMENT We thank the volunteers and sonographers from routine fetal screening at St. Thomas' Hospital London. This work was supported by the Wellcome Trust IEH Award ...
arXiv:2106.10528v1
fatcat:6q3wnioqtrdktkfl5w5n3pjthi
Community-Empowered Air Quality Monitoring System
[article]
2018
arXiv
pre-print
The residents lacked the technological fluency to gather and curate diverse scientific data to advocate for regulatory change. ...
Developing information technology to democratize scientific knowledge and support citizen empowerment is a challenging task. ...
ACKNOWLEDGMENTS The Heinz Endowments, Allegheny County Clean Air Now, and all other participants. The authors thank Yen-Chi Chen for the advice in statistical analysis. ...
arXiv:1804.03293v1
fatcat:lqukrqk2eraydmmfnozoxos6xq
Summarizing Videos with Attention
[article]
2019
arXiv
pre-print
In this work we propose a novel method for supervised, keyshots based video summarization by applying a conceptually simple and computationally efficient soft, self-attention mechanism. ...
To that end we propose a simple, self-attention based network for video summarization which performs the entire sequence to sequence transformation in a single feed forward pass and single backward pass ...
[23] propose an adversarial network to summarize the video by minimizing the distance between the video and its summary. ...
arXiv:1812.01969v2
fatcat:5kzfj6ne2fa3tdywbs776dblf4
Visual Summarization of Scholarly Videos Using Word Embeddings and Keyphrase Extraction
[chapter]
2019
Lecture Notes in Computer Science
For this purpose, we exploit video annotations that are automatically generated by speech recognition and video OCR (optical character recognition). ...
Besides the quality of the learning resource's content, it is essential to discover the most relevant and suitable video in order to support the learning process most effectively. ...
As stated in Section 2, this technique models topics and phrases in a single graph, and their mutual reinforcement together with a specific mechanism to select the most important keyphrases are used to generate ...
doi:10.1007/978-3-030-30760-8_28
fatcat:hr4cw4osprfzhar3tgznhblxqu
Artificial Intelligence in the Creative Industries: A Review
[article]
2021
arXiv
pre-print
A brief background of AI, and specifically Machine Learning (ML) algorithms, is provided including Convolutional Neural Network (CNNs), Generative Adversarial Networks (GANs), Recurrent Neural Networks (RNNs) and Deep Reinforcement Learning (DRL). ...
A category sentence generative adversarial network has also been proposed that combines GAN, RNN and reinforcement learning to enlarge training datasets, which improves performance for sentiment classification ...
arXiv:2007.12391v5
fatcat:mn2xqeylyrbabbu5zwln3admtm
Artificial intelligence in the creative industries: a review
2021
Artificial Intelligence Review
A brief background of AI, and specifically machine learning (ML) algorithms, is provided including convolutional neural networks (CNNs), generative adversarial networks (GANs), recurrent neural networks (RNNs) and deep Reinforcement Learning (DRL). ...
A category sentence generative adversarial network has also been proposed that combines GAN, RNN and reinforcement learning to enlarge training datasets, which improves performance for sentiment classification ...
doi:10.1007/s10462-021-10039-7
fatcat:tcctdi7vprfx7mlujvqmpiy3ru
Characterizing Abhorrent, Misinformative, and Mistargeted Content on YouTube
[article]
2021
arXiv
pre-print
., flat earth) than for emerging ones (like COVID-19) and that these recommendations are more common on the search results page than on a user's homepage or the video recommendations section. ...
YouTube has revolutionized the way people discover and consume video. ...
For the detection of disturbing videos, we built a deep learning classifier that analyzes various metadata of a given video, as well as its thumbnail. ...
arXiv:2105.09819v1
fatcat:k2mdclt4orgunnlwb6hnnrka4i
Task-driven Semantic Coding via Reinforcement Learning
[article]
2021
arXiv
pre-print
learning (RL). ...
To solve this challenge, we design semantic maps for different tasks to extract the pixelwise semantic fidelity for videos/images. ...
Task-driven Semantic Coding via Reinforcement Learning. Xin Li, Jun Shi and Zhibo Chen, Senior Member, IEEE. Abstract: Task-driven semantic video/image coding has drawn considerable attention with the development ...
arXiv:2106.03511v1
fatcat:nmdyqgerufc3jj6dj2zfdo3hp4
MERLOT: Multimodal Neural Script Knowledge Models
[article]
2021
arXiv
pre-print
By pretraining with a mix of both frame-level (spatial) and video-level (temporal) objectives, our model not only learns to match images to temporally corresponding words, but also to contextualize what ...
Ablation analyses demonstrate the complementary importance of: 1) training on videos versus static images; 2) scaling the magnitude and diversity of the pretraining video corpus; and 3) using diverse objectives ...
’s thumbnail selection algorithm selects high quality, clear frames. https://ai.googleblog.com/2015/10/improving-youtube-video-thumbnails-with.html http://www.speech.cs.cmu.edu/cgi-bin/cmudict ...
arXiv:2106.02636v3
fatcat:mrj2t3yuanbdzhsujshtky4enq
A Black-Box Attack Model for Visually-Aware Recommender Systems
[article]
2020
arXiv
pre-print
Due to the advances in deep learning, visually-aware recommender systems (RS) have recently attracted increased research interest. ...
Such systems combine collaborative signals with images, usually represented as feature vectors outputted by pre-trained image models. ...
We train a model for each combination of dataset and underlying RS, yielding 4 models in total. ...
arXiv:2011.02701v1
fatcat:nccqgmz5tzhojpk2kar4w4fogq
Exposure: A White-Box Photo Post-Processing Framework
[article]
2018
arXiv
pre-print
To apply the filters in a proper sequence and with suitable parameters, we employ a deep reinforcement learning approach that learns to make decisions on what action to take next, given the current state ...
As it is difficult for users to acquire paired images that reflect their retouching preferences, we present in this paper a deep learning approach that is instead trained on unpaired data, namely a set ...
How to determine the sequence and parameters of these filters for a given input image is learned with a deep reinforcement learning (RL) approach guided by a generative adversarial network (GAN) that models ...
arXiv:1709.09602v2
fatcat:2kx5a34cz5ahbdfbcqdb2o4lwy
Exposure
2018
ACM Transactions on Graphics
How to determine the sequence and parameters of these filters for a given input image is learned with a deep reinforcement learning (RL) approach guided by a generative adversarial network (GAN) that models ...
In addition, multiple operations are learned separately, while in our work operations are optimized elegantly as a whole, guided by the RL and GAN architecture. Reinforcement Learning. ...
doi:10.1145/3181974
fatcat:s3pf7fdi25hqncvga2gjbfjdzu