YouTube UGC Dataset for Video Compression Research [article]

Yilin Wang, Sasi Inguva, Balu Adsumilli
2019 arXiv   pre-print
Non-professional video, commonly known as User Generated Content (UGC), has become very popular in today's video sharing applications. However, traditional metrics used in compression and quality assessment, like BD-Rate and PSNR, are designed for pristine originals. Thus, their accuracy drops significantly when applied to non-pristine originals (the majority of UGC). Understanding the difficulties of compression and quality assessment for UGC is important, but there are few public UGC datasets available for research. This paper introduces a large-scale UGC dataset (1,500 20-second video clips) sampled from millions of YouTube videos. The dataset covers popular categories like Gaming and Sports, and new features like High Dynamic Range (HDR). Besides a novel sampling method based on features extracted from encoding, challenges for UGC compression and quality evaluation are also discussed. Shortcomings of traditional reference-based metrics on UGC are addressed. We demonstrate a promising way to evaluate UGC quality with no-reference objective quality metrics, and evaluate the current dataset with three no-reference metrics (Noise, Banding, and SLEEQ).
arXiv:1904.06457v2 fatcat:suek2m5x3bdm5bj66ipkupcb3y
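
For readers unfamiliar with the reference-based metrics named in this abstract, the following is a minimal sketch of PSNR, assuming 8-bit samples passed as flat sequences. The function name `psnr` and its `peak` parameter are illustrative choices, not taken from the paper. Because the score measures only fidelity to the reference, a high value says little about perceived quality when the reference is itself noisy or already compressed, which is the situation the abstract describes for UGC.

```python
import math


def psnr(reference, distorted, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length sample sequences.

    `peak` is the maximum possible sample value (255 for 8-bit video, assumed here).
    """
    if len(reference) != len(distorted) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error between the reference and the distorted samples.
    mse = sum((r - d) ** 2 for r, d in zip(reference, distorted)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals: PSNR is unbounded
    return 10.0 * math.log10(peak ** 2 / mse)
```

For example, `psnr([10, 20], [11, 19])` has a mean squared error of 1 and evaluates to roughly 48.13 dB.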

A no-reference video quality predictor for compression and scaling artifacts

Deepti Ghadiyaram, Chao Chen, Sasi Inguva, Anil Kokaram
2017 2017 IEEE International Conference on Image Processing (ICIP)  
No-Reference (NR) video quality assessment (VQA) models are gaining popularity as they offer scope for broader applicability to user-uploaded video-centric services such as YouTube and Facebook, where the pristine references are unavailable. However, there are few well-performing NR-VQA models owing to the difficulty of the problem. We propose a novel NR video quality predictor that relies solely on the 'quality-aware' natural statistical models in the space-time domain. The proposed quality predictor, called Self-reference based LEarning-free Evaluator of Quality (SLEEQ), consists of three components: feature extraction in the spatial and temporal domains, motion-based feature fusion, and spatial-temporal feature pooling to derive a single quality score for a given video. SLEEQ achieves a correlation higher than 0.9 with the subjective video quality scores on tested public databases and thus outperforms the existing NR VQA models. Index Terms: perceptual video quality, objective quality assessment, H.264 compression, scaling artifacts.
doi:10.1109/icip.2017.8296922 dblp:conf/icip/GhadiyaramCIK17 fatcat:gz3xfrcmkzbvblrpv4hehrxpu4

Temporal synchronization of multiple audio signals

Julius Kammerl, Neil Birkbeck, Sasi Inguva, Damien Kelly, A. J. Crawford, Hugh Denman, Anil Kokaram, Caroline Pantofaru
2014 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)  
Given the proliferation of consumer media recording devices, events often give rise to a large number of recordings. These recordings are taken from different spatial positions and do not have reliable timestamp information. In this paper, we present two robust graph-based approaches for synchronizing multiple audio signals. The graphs are constructed atop the over-determined system resulting from pairwise signal comparison using cross-correlation of audio features. The first approach uses a Minimum Spanning Tree (MST) technique, while the second uses Belief Propagation (BP) to solve the system. Both approaches can provide excellent solutions and robustness to pairwise outliers; however, the MST approach is much less complex than BP. In addition, an experimental comparison of audio feature-based synchronization shows that spectral flatness outperforms the zero-crossing rate and signal energy.
doi:10.1109/icassp.2014.6854474 dblp:conf/icassp/KammerlBIKCDKP14 fatcat:wfv5h564engx3apbj3zr2ipgou
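
The spanning-tree idea in this abstract can be illustrated with a minimal sketch. It is not the authors' algorithm. It assumes the pairwise offsets have already been estimated (for instance by cross-correlation) and that each pair carries a hypothetical `cost`, where a lower cost means a more trustworthy match. The function name `sync_offsets_mst` and the input layout are likewise assumptions made for illustration. A minimum spanning tree keeps only the most reliable pairwise links, so a single bad pairwise estimate is left out as long as a cheaper path connects the same signals.

```python
import heapq


def sync_offsets_mst(num_signals, pairwise):
    """Resolve per-signal start offsets from noisy pairwise estimates.

    pairwise: dict {(i, j): (offset, cost)} meaning t_j - t_i is about `offset`,
    with `cost` a hypothetical reliability score (lower is better).
    Returns {signal: offset relative to signal 0} using Prim's algorithm.
    """
    # Undirected adjacency list; traversing an edge backwards negates its offset.
    adj = {k: [] for k in range(num_signals)}
    for (i, j), (offset, cost) in pairwise.items():
        adj[i].append((cost, j, offset))
        adj[j].append((cost, i, -offset))

    offsets = {0: 0.0}  # signal 0 is the time reference
    heap = [(cost, 0, dst, off) for cost, dst, off in adj[0]]
    heapq.heapify(heap)
    while heap and len(offsets) < num_signals:
        cost, src, dst, off = heapq.heappop(heap)
        if dst in offsets:
            continue  # already reached through a cheaper edge
        # Accumulate the offset along the chosen tree edge.
        offsets[dst] = offsets[src] + off
        for c, nxt, o in adj[dst]:
            if nxt not in offsets:
                heapq.heappush(heap, (c, dst, nxt, o))
    return offsets
```

With three signals whose pairwise estimates are `{(0, 1): (2.0, 0.1), (1, 2): (3.0, 0.2), (0, 2): (9.0, 0.9)}`, the high-cost direct estimate between signals 0 and 2 is never used, and signal 2 is placed at 5.0 through signal 1.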

Reconstruction of the Pose of Uncalibrated Cameras via User-Generated Videos

Stuart Bennett, Joan Lasenby, Anil Kokaram, Sasi Inguva, Neil Birkbeck
2014 Proceedings of the International Conference on Distributed Smart Cameras - ICDSC '14  
Extraction of 3D geometry from hand-held unsteady uncalibrated cameras faces multiple difficulties: finding usable frames, feature matching, and unknown variable focal length, to name three. We have built a prototype system to allow a user to spatially navigate playback viewpoints of an event of interest, using geometry automatically recovered from casually captured videos. The system, whose workings we present in this paper, necessarily estimates not only scene geometry, but also relative viewpoint position, overcoming the mentioned difficulties in the process. The only inputs required are video sequences from various viewpoints of a common scene, as are readily available online from sporting and music events. Our methods make no assumption of the synchronization of the input and do not require file metadata, instead exploiting the video to self-calibrate. The footage need only contain some camera rotation with little translation, a likely occurrence for hand-held event footage.
doi:10.1145/2659021.2659028 dblp:conf/icdsc/BennettLKIB14 fatcat:bl6n4bvstnevbeu2oa7qpok4ia

A Subjective Study for the Design of Multi-resolution ABR Video Streams with the VP9 Codec

Chao Chen, Sasi Inguva, Andrew Rankin, Anil Kokaram
2016 IS&T International Symposium on Electronic Imaging Science and Technology  
Adaptive bit rate (ABR) streaming is one enabling technology for video streaming over modern throughput-varying communication networks. A widely used ABR streaming method is to adapt the video bit rate to channel throughput by dynamically changing the video resolution. Since videos have different rate-quality performances at different resolutions, such an ABR strategy can achieve a better rate-quality trade-off than single-resolution ABR streaming. The key problem for resolution-switched ABR is to work out the bit rate appropriate at each resolution. In this paper, we investigate optimal strategies to estimate this bit rate using both quantitative and subjective quality assessment. We use the design of bit rates for 2K and 4K resolutions as an example of the performance of this strategy. We introduce strategies for selecting an appropriate corpus for subjective assessment and find that at this high resolution there is good agreement between quantitative and subjective analysis. The optimal switching bit rate between 2K and 4K resolutions is 4 Mbps.
doi:10.2352/issn.2470-1173.2016.2.vipc-235 fatcat:undism5eqvewdjgesulmcamvt4