
A PCA-like Autoencoder [article]

Saïd Ladjal, Alasdair Newson, Chi-Hieu Pham
2019 arXiv   pre-print
An autoencoder is a neural network which projects data to and from a lower dimensional latent space, where this data is easier to understand and model. The autoencoder consists of two sub-networks, the encoder and the decoder, which carry out these transformations. The neural network is trained such that the output is as close to the input as possible, the data having gone through an information bottleneck: the latent space. This tool bears significant resemblance to Principal Component Analysis (PCA), with two main differences. Firstly, the autoencoder is a non-linear transformation, contrary to PCA, which makes the autoencoder more flexible and powerful. Secondly, the axes found by a PCA are orthogonal, and are ordered in terms of the amount of variability which the data presents along these axes. This makes the interpretability of the PCA much greater than that of the autoencoder, which does not have these attributes. Ideally, then, we would like an autoencoder whose latent space consists of independent components, ordered by decreasing importance to the data. In this paper, we propose an algorithm to create such a network. Firstly, we create an iterative algorithm which progressively increases the size of the latent space, learning a new dimension at each step. Secondly, we propose a covariance loss term to add to the standard autoencoder loss function, as well as a normalisation layer just before the latent space, which encourages the latent space components to be statistically independent. We demonstrate the results of this autoencoder on simple geometric shapes, and find that the algorithm indeed finds a meaningful representation in the latent space. This means that subsequent interpolation in the latent space has meaning with respect to the geometric properties of the images.
arXiv:1904.01277v1 fatcat:2qavywqk5fd2tnon7lxyb7f324
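
The covariance loss mentioned in the abstract above can be illustrated with a minimal NumPy sketch. The exact form used in the paper is not given in this listing, so the definition below (sum of squared off-diagonal entries of the batch covariance of the latent codes) is an assumption, and the name `covariance_loss` is hypothetical.

```python
import numpy as np

def covariance_loss(z):
    """Sum of squared off-diagonal entries of the batch covariance.

    z has shape (batch, latent_dim). This particular penalty is an
    assumed, illustrative form of a covariance loss, not the paper's
    confirmed definition.
    """
    z = np.asarray(z, dtype=float)
    zc = z - z.mean(axis=0, keepdims=True)      # centre each latent component
    cov = zc.T @ zc / z.shape[0]                # (latent_dim, latent_dim) covariance
    off_diag = cov - np.diag(np.diag(cov))      # zero out the variances
    return float(np.sum(off_diag ** 2))

rng = np.random.default_rng(0)
independent = rng.normal(size=(5000, 3))        # uncorrelated latent codes
mixing = np.array([[1.0, 0.8, 0.0],
                   [0.0, 1.0, 0.5],
                   [0.0, 0.0, 1.0]])
correlated = independent @ mixing               # linearly mixed, hence correlated

loss_independent = covariance_loss(independent)
loss_correlated = covariance_loss(correlated)
```

Under this assumed definition, the loss is close to zero for uncorrelated codes and grows with the correlation between components. Decorrelation is weaker than statistical independence, so such a term can only encourage independence rather than guarantee it.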

High Resolution Face Age Editing [article]

Xu Yao, Gilles Puy, Alasdair Newson, Yann Gousseau, Pierre Hellier
2020 arXiv   pre-print
Face age editing has become a crucial task in film post-production, and is also becoming popular for general purpose photography. Recently, adversarial training has produced some of the most visually impressive results for image manipulation, including the face aging/de-aging task. In spite of considerable progress, current methods often present visual artifacts and can only deal with low-resolution images. In order to achieve aging/de-aging with the high quality and robustness necessary for
more » ... er use, these problems need to be addressed. This is the goal of the present work. We present an encoder-decoder architecture for face age editing. The core idea of our network is to create both a latent space containing the face identity, and a feature modulation layer corresponding to the age of the individual. We then combine these two elements to produce an output image of the person with a desired target age. Our architecture is greatly simplified with respect to other approaches, and allows for continuous age editing on high resolution images in a single unified model.
arXiv:2005.04410v1 fatcat:qruu4ea2xvdkrc5sgx4suc3fhi

Processsing Simple Geometric Attributes with Autoencoders [article]

Alasdair Newson, Andrés Almansa, Yann Gousseau, Saïd Ladjal
2019 arXiv   pre-print
Image synthesis is a core problem in modern deep learning, and many recent architectures such as autoencoders and Generative Adversarial Networks produce spectacular results on highly complex data, such as images of faces or landscapes. While these results open up a wide range of new, advanced synthesis applications, there is also a severe lack of theoretical understanding of how these networks work. This results in a wide range of practical problems, such as difficulties in training, the tendency to sample images with little or no variability, and generalisation problems. In this paper, we propose to analyse the ability of the simplest generative network, the autoencoder, to encode and decode two simple geometric attributes: size and position. We believe that, in order to understand more complicated tasks, it is necessary to first understand how these networks process simple attributes. For the first property, we analyse the case of images of centred disks with variable radii. We explain how the autoencoder projects these images to and from a latent space of smallest possible dimension, a scalar. In particular, we describe a closed-form solution to the decoding training problem in a network without biases, and show that during training, the network indeed finds this solution. We then investigate the best regularisation approaches which yield networks that generalise well. For the second property, position, we look at the encoding and decoding of Dirac delta functions, also known as 'one-hot' vectors. We describe a hand-crafted filter that achieves encoding perfectly, and show that the network naturally finds this filter during training. We also show experimentally that the decoding can be achieved if the dataset is sampled in an appropriate manner.
arXiv:1904.07099v1 fatcat:oc7fqgdkkfckbahyrzbcqxdtcy

Feature-Style Encoder for Style-Based GAN Inversion [article]

Xu Yao, Alasdair Newson, Yann Gousseau, Pierre Hellier
2022 arXiv   pre-print
We propose a novel architecture for GAN inversion, which we call Feature-Style encoder. The style encoder is key for the manipulation of the obtained latent codes, while the feature encoder is crucial for optimal image reconstruction. Our model achieves accurate inversion of real images from the latent space of a pre-trained style-based GAN model, obtaining better perceptual quality and lower reconstruction error than existing methods. Thanks to its encoder structure, the model allows fast and accurate image editing. Additionally, we demonstrate that the proposed encoder is especially well-suited for inversion and editing on videos. We conduct extensive experiments for several style-based generators pre-trained on different data domains. Our proposed method yields state-of-the-art results for style-based GAN inversion, significantly outperforming competing approaches. Source codes are available at .
arXiv:2202.02183v1 fatcat:7xtfszrk6vhgxavmz7zu2m2pfm

Multi-View Radar Semantic Segmentation [article]

Arthur Ouaknine, Alasdair Newson, Patrick Pérez, Florence Tupin, Julien Rebut
2021 arXiv   pre-print
Understanding the scene around the ego-vehicle is key to assisted and autonomous driving. Nowadays, this is mostly conducted using cameras and laser scanners, despite their reduced performance in adverse weather conditions. Automotive radars are low-cost active sensors that measure properties of surrounding objects, including their relative speed, and have the key advantage of not being impacted by rain, snow or fog. However, they are seldom used for scene understanding due to the size and complexity of radar raw data and the lack of annotated datasets. Fortunately, recent open-sourced datasets have opened up research on classification, object detection and semantic segmentation with raw radar signals using end-to-end trainable models. In this work, we propose several novel architectures, and their associated losses, which analyse multiple "views" of the range-angle-Doppler radar tensor to segment it semantically. Experiments conducted on the recent CARRADA dataset demonstrate that our best model outperforms alternative models, derived either from the semantic segmentation of natural images or from radar scene understanding, while requiring significantly fewer parameters. Both our code and trained models are available at
arXiv:2103.16214v2 fatcat:7dsyn6nfijflnp7hapzgeoilmq

Realistic Film Grain Rendering

Alasdair Newson, Noura Faraj, Bruno Galerne, Julie Delon
2017 Image Processing On Line  
...  Newson et al. [13] employ an inhomogeneous Boolean model [4] to achieve this goal.  ... 
doi:10.5201/ipol.2017.192 fatcat:vzilm2hfrjcm7gsvvzk6qjr2xq

Patch-Based Stochastic Attention for Image Editing [article]

Nicolas Cherel, Andrés Almansa, Yann Gousseau, Alasdair Newson
2022 arXiv   pre-print
Attention mechanisms have become of crucial importance in deep learning in recent years. These non-local operations, which are similar to traditional patch-based methods in image processing, complement local convolutions. However, computing the full attention matrix is an expensive step with a heavy memory and computational load. These limitations curb network architectures and performance, in particular for the case of high resolution images. We propose an efficient attention layer based on the stochastic algorithm PatchMatch, which is used for determining approximate nearest neighbors. We refer to our proposed layer as a "Patch-based Stochastic Attention Layer" (PSAL). Furthermore, we propose different approaches, based on patch aggregation, to ensure the differentiability of PSAL, thus allowing end-to-end training of any network containing our layer. PSAL has a small memory footprint and can therefore scale to high resolution images. It maintains this footprint without sacrificing spatial precision and globality of the nearest neighbours, which means that it can be easily inserted in any level of a deep architecture, even in shallower levels. We demonstrate the usefulness of PSAL on several image editing tasks, such as image inpainting and image colorization.
arXiv:2202.03163v2 fatcat:76ncey4su5epnewb6bpe55powy
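
The memory load of the full attention matrix mentioned in the abstract above can be made concrete with a short back-of-the-envelope sketch. The function names, the float32 storage assumption, and the idea of keeping only `k` neighbours per position are illustrative assumptions, not details taken from the paper.

```python
def full_attention_bytes(height, width, bytes_per_entry=4):
    """Memory for a dense attention matrix over all pairs of positions."""
    n = height * width                 # number of spatial positions
    return n * n * bytes_per_entry     # one entry per (query, key) pair

def knn_attention_bytes(height, width, k, bytes_per_entry=4):
    """Memory when only k neighbours are stored per position (hypothetical layout)."""
    n = height * width
    return n * k * bytes_per_entry

dense = full_attention_bytes(256, 256)       # dense matrix for a 256 x 256 feature map
sparse = knn_attention_bytes(256, 256, k=3)  # three neighbours per position
```

For a 256 × 256 feature map stored in float32, the dense matrix needs 16 GiB, whereas storing three neighbours per position needs under 1 MiB. This quadratic versus linear growth is why dense attention is hard to apply at high resolution.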

Non-Local Patch-Based Image Inpainting

Alasdair Newson, Andrés Almansa, Yann Gousseau, Patrick Pérez
2017 Image Processing On Line  
concerning the correct inpainting of textures with patch-based methods, and modify the patch distance to address this problem, in a similar manner to that of Liu and Caselles [13] in image inpainting and Newson  ... 
doi:10.5201/ipol.2017.189 fatcat:jzu5s2wivzaahfvedgmh53i32m

Low-Rank Spatio-Temporal Video Segmentation

Alasdair Newson, Mariano Tepper, Guillermo Sapiro
2015 Proceedings of the British Machine Vision Conference 2015  
Recently, a great deal of interest has been generated by the technique known as Robust Principal Component Analysis (RPCA) of Candès et al. [1], which addresses the problem of separating a matrix into a low-rank and a sparse component. This very general formulation can be used for tasks such as background estimation in videos and face recognition. In the case of background estimation, the low-rank matrix models the background, and the sparse matrix corresponds to the foreground. A considerable drawback of this approach is its poor robustness to local lighting conditions. If lighting conditions vary locally, one of two things may happen. Either the method incorporates the lighting variation into the foreground, which is clearly undesirable, or the rank of the background model is allowed to increase. Unfortunately, this second option means that the true foreground is likely to become included in the background, especially for objects which are static for a short while. Here, we propose to model the background as a piece-wise low-rank matrix. In this manner, it will be possible to extract several localised models which correspond to coherent lighting conditions. However, for this we need to segment the input video into such coherent regions. We refer to this problem as a low-rank spatio-temporal video segmentation. We present an algorithm to address this segmentation problem, based on region merging and spectral clustering techniques. We show that by carrying out a local RPCA in each region, the results of foreground/background separation are greatly improved, in comparison with both the standard RPCA and several other well-known background estimation techniques.
Let $X \in \mathbb{R}^{m \times n}$ represent an input video, in matrix form. Each frame contains $m$ pixels, and there are a total of $n$ frames in our video. The goal of RPCA is to decompose $X$ as $X \approx L + S$, where $L$ is the low-rank matrix and $S$ is the sparse matrix. Unfortunately, the rank of a matrix is a non-convex function, so a surrogate function, the nuclear norm, is used. Thus, the background/foreground separation problem may be formulated as follows:
$$\min_{L,S} \; \|X - L - S\|_F^2 + \lambda_* \|L\|_* + \lambda \|S\|_1 \qquad (1)$$
where $\|L\|_* = \sum_i \sigma_i(L)$ is the nuclear norm of $L$ and $\sigma_i(L)$ is the $i$-th singular value of $L$. The scalars $\lambda_*$ and $\lambda$ are optimisation parameters, $\|\cdot\|_F$ is the Frobenius matrix norm and $\|\cdot\|_1$ is the $\ell_1$ matrix norm, which induces sparsity in the foreground matrix.
To segment $X$ into different regions where the low-rank requirement is respected, we start by creating a regular 3D grid, which we denote with $\Omega$, on the video domain. Each $\Omega_i$ corresponds to a rectangular cuboid of video information. We then create an undirected, weighted graph where each node represents a region $\Omega_i$, and a node is connected with a 6-connectivity to the regions around it. Our goal will be to cluster this graph using spectral clustering techniques. The main challenge here is to design a cost function which shows how "coherent" two regions are in terms of their low-rank background representation. More formally, consider two regions to merge, $\Omega_i$ and $\Omega_j$. We wish to see whether it is better to decompose the regions separately or jointly. The decomposition of $\Omega_i$ will be $\Omega_i \approx L_i + S_i$, and similarly for $\Omega_j$. Our first observation is that it is easier to compare the coherence of the decompositions resulting from a rank-constrained version of Equation (1):
$$\min_{L,S} \; \|X - L - S\|_F^2 + \lambda \|S\|_1 \quad \text{subject to } \operatorname{rank}(L) \le r.$$
The comparisons are made clearer because the $\lambda_*$ parameter is removed and replaced with one which is more easily interpretable, the maximum rank of each local model, $r$. Once the decompositions of $\Omega_i$, $\Omega_j$ and $\Omega_{i \cup j}$ are obtained in this manner, we can calculate the cost of merging the two regions. Let $e_i = \|X_i - L_i - S_i\|_F^2$ be the quadratic error of the decomposition of $\Omega_i$, and similarly for $\Omega_j$ and $\Omega_{i \cup j}$. Our cost function is: where $\phi_{i \cup j}$ is a scalar. Once we have established the cost of merging two regions, we convert it into a similarity cost, and cluster the resulting graph using robust spectral clustering techniques [5].
[Figure 1: Illustration and results of the algorithm. Panels: two frames from a video with locally varying lighting; foreground detection using standard RPCA (foreground in green); create graph; cluster graph; segmentation into regions with locally low-rank background; foreground detection using (proposed) local RPCA.]
Figure 1 illustrates the problems caused by locally varying lighting conditions: either the foreground is merged into the background (second row, left), or the global (standard) RPCA is not able to represent local lighting changes (second row, right). This is corrected by segmenting the video, and carrying out a local RPCA in each region. We compare our algorithm qualitatively and quantitatively with respect to several algorithms of the literature [2, 3, 4] and find greatly improved performance in challenging situations.
doi:10.5244/c.29.103 dblp:conf/bmvc/NewsonTS15 fatcat:r2omuf5fmvcqlgkck36dat225u
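
The rank-constrained decomposition described in the abstract above can be sketched in a few lines of NumPy. The solver is not specified in this listing, so the simple alternating minimisation below is an assumption: it minimises ||X - L - S||_F^2 + lam * ||S||_1 subject to rank(L) <= r, using a truncated SVD for L and soft thresholding for S. All function names and the synthetic data are hypothetical.

```python
import numpy as np

def soft_threshold(a, t):
    """Element-wise soft thresholding, the proximal operator of t * |.|."""
    return np.sign(a) * np.maximum(np.abs(a) - t, 0.0)

def objective(X, L, S, lam):
    """Assumed objective: squared Frobenius data term plus l1 penalty on S."""
    return np.linalg.norm(X - L - S, "fro") ** 2 + lam * np.abs(S).sum()

def rank_constrained_rpca(X, r, lam, n_iter=50):
    """Alternating minimisation sketch (not necessarily the authors' solver)."""
    L = np.zeros_like(X)
    S = np.zeros_like(X)
    history = []
    for _ in range(n_iter):
        # L-step: best rank-r approximation of X - S (truncated SVD).
        U, s, Vt = np.linalg.svd(X - S, full_matrices=False)
        L = (U[:, :r] * s[:r]) @ Vt[:r]
        # S-step: exact minimiser of ||R - S||_F^2 + lam * ||S||_1 with R = X - L.
        S = soft_threshold(X - L, lam / 2.0)
        history.append(objective(X, L, S, lam))
    return L, S, history

# Synthetic "video": 200 pixels, 30 frames, static background, sparse foreground.
rng = np.random.default_rng(0)
background = np.outer(rng.normal(size=200), np.ones(30))   # rank-1 background
foreground = np.zeros((200, 30))
foreground[rng.random((200, 30)) < 0.05] = 5.0             # sparse bright pixels
X = background + foreground

L, S, history = rank_constrained_rpca(X, r=1, lam=1.0)
```

Each step exactly minimises the objective over one variable with the other fixed, so the recorded objective values never increase. The maximum rank `r` plays the role of the interpretable parameter mentioned in the abstract, replacing the nuclear-norm weight.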

PCAAE: Principal Component Analysis Autoencoder for organising the latent space of generative networks [article]

Chi-Hieu Pham and Saïd Ladjal and Alasdair Newson
2020 arXiv   pre-print
., Newson, A., Pham, C.H.: A PCA−like autoencoder. arXiv preprint arXiv: 8. 4 4 Results of the proposed methods for synthetic data We show the evaluations of the compared methods: those of AE, VAE [42  ... 
arXiv:2006.07827v1 fatcat:qxw7tmm45redvmnv2xoxdfpp3m

Robust Automatic Line Scratch Detection in Films

Alasdair Newson, Andres Almansa, Yann Gousseau, Patrick Perez
2014 IEEE Transactions on Image Processing  
Line scratch detection in old films is a particularly challenging problem due to the variable spatio-temporal characteristics of this defect. Some of the main problems include sensitivity to noise and texture, and false detections due to thin vertical structures belonging to the scene. We propose a robust and automatic algorithm for frame-by-frame line scratch detection in old films, as well as a temporal algorithm for the filtering of false detections. In the frame-by-frame algorithm, we relax
some of the hypotheses used in previous algorithms in order to detect a wider variety of scratches. This step's robustness and lack of external parameters is ensured by the combined use of an a contrario methodology and local statistical estimation. In this manner, over-detection in textured or cluttered areas is greatly reduced. The temporal filtering algorithm eliminates false detections due to thin vertical structures by exploiting the coherence of their motion with that of the underlying scene. Experiments demonstrate the ability of the resulting detection procedure to deal with difficult situations, in particular in the presence of noise, texture and slanted or partial scratches. Comparisons show significant advantages over previous work.
doi:10.1109/tip.2014.2300824 pmid:24723525 fatcat:hrdubon4dbe5hhclulsq5x56he

Multi-temporal foreground detection in videos

Mariano Tepper, Alasdair Newson, Pablo Sprechmann, Guillermo Sapiro
2015 2015 IEEE International Conference on Image Processing (ICIP)  
A common task in video processing is the binary separation of a video's content into either background or moving foreground. However, many situations require a foreground analysis with a finer temporal granularity, in particular for objects or people which remain immobile for a certain period of time. We propose an efficient method which detects foreground at different timescales, by exploiting the desirable theoretical and practical properties of Robust Principal Component Analysis. Our algorithm can be used in a variety of scenarios such as detecting people who have fallen in a video, or analysing the fluidity of road traffic, while avoiding costly computations needed for nearest neighbours searches or optical flow analysis. Finally, our algorithm has the useful ability to perform motion analysis without explicitly requiring computationally expensive motion estimation.
doi:10.1109/icip.2015.7351678 dblp:conf/icip/TepperNSS15 fatcat:qr7ei245dbglbcan22yhpgp7sm

A Latent Transformer for Disentangled Face Editing in Images and Videos [article]

Xu Yao, Alasdair Newson, Yann Gousseau, Pierre Hellier
2021 arXiv   pre-print
High quality facial image editing is a challenging problem in the movie post-production industry, requiring a high degree of control and identity preservation. Previous works that attempt to tackle this problem may suffer from the entanglement of facial attributes and the loss of the person's identity. Furthermore, many algorithms are limited to a certain task. To tackle these limitations, we propose to edit facial attributes via the latent space of a StyleGAN generator, by training a dedicated
latent transformation network and incorporating explicit disentanglement and identity preservation terms in the loss function. We further introduce a pipeline to generalize our face editing to videos. Our model achieves a disentangled, controllable, and identity-preserving facial attribute editing, even in the challenging case of real (i.e., non-synthetic) images and videos. We conduct extensive experiments on image and video datasets and show that our model outperforms other state-of-the-art methods in visual quality and quantitative evaluation. Source codes are available at
arXiv:2106.11895v2 fatcat:czi7onsp75e3bm4zontcks7or4

Towards fast, generic video inpainting

Alasdair Newson, Andrés Almansa, Matthieu Fradet, Yann Gousseau, Patrick Pérez
2013 Proceedings of the 10th European Conference on Visual Media Production - CVMP '13  
Achieving globally coherent video inpainting results in reasonable time and in an automated manner is still an open problem. In this paper, we build on the seminal work by Wexler et al. to propose an automatic video inpainting algorithm yielding convincing results in greatly reduced computational times. We extend the PatchMatch algorithm to the spatio-temporal case in order to accelerate the search for approximate nearest neighbours in the patch space. We also provide a simple and fast solution
to the well known over-smoothing problem resulting from the averaging of patches. Furthermore, we show that results similar to those of a supervised state-of-the-art method may be obtained on high resolution videos without any manual intervention. Our results indicate that globally coherent patch-based algorithms are feasible and an attractive solution to the difficult problem of video inpainting.
doi:10.1145/2534008.2534019 dblp:conf/cvmp/NewsonAFGP13 fatcat:mcgd6cyqgjggleanh3aqcclhn4

Temporal filtering of line scratch detections in degraded films

Alasdair Newson, Andres Almansa, Yann Gousseau, Patrick Perez
2013 2013 IEEE International Conference on Image Processing  
Recently Newson et al. [8] presented an algorithm which uses a contrario methods [9] for the robust detection of line scratches.  ... 
doi:10.1109/icip.2013.6738842 dblp:conf/icip/NewsonAGP13 fatcat:gj5t75emo5g4df5w65ciypg6yy