35 Hits in 2.5 sec

ObamaNet: Photo-realistic lip-sync from text [article]

Rithesh Kumar, Jose Sotelo, Kundan Kumar, Alexandre de Brebisson, Yoshua Bengio
2017 arXiv   pre-print
Finally, we apply PCA to de-correlate the 20 normalized keypoints (40-D vector). We noticed that the first 5 PCA-coefficients capture >98% of the variability in the data.  ... 
arXiv:1801.01442v1 fatcat:kczda6izyvfpfdhanrljh4m5xi
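
The snippet above describes de-correlating 20 normalized mouth keypoints (a 40-D vector per frame) with PCA and keeping the leading coefficients. A minimal sketch of that step, assuming a scikit-learn pipeline and a placeholder keypoint array; none of this is the authors' code.

```python
# Sketch of the PCA step quoted above: de-correlate 20 normalized 2-D mouth
# keypoints (a 40-D vector per frame) and keep the top components.
import numpy as np
from sklearn.decomposition import PCA

keypoints = np.random.rand(1000, 40)   # placeholder: 1000 frames x (20 keypoints * 2 coords)

pca = PCA(n_components=5)              # the paper reports 5 coefficients capture >98% variance
coeffs = pca.fit_transform(keypoints)  # (1000, 5) de-correlated representation

print(pca.explained_variance_ratio_.sum())     # check the captured variability
reconstructed = pca.inverse_transform(coeffs)  # back to 40-D keypoints
```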

Deep Neural Networks for Anatomical Brain Segmentation [article]

Alexandre de Brebisson, Giovanni Montana
2015 arXiv   pre-print
We present a novel approach to automatically segment magnetic resonance (MR) images of the human brain into anatomical regions. Our methodology is based on a deep artificial neural network that assigns each voxel in an MR image of the brain to its corresponding anatomical region. The inputs of the network capture information at different scales around the voxel of interest: 3D and orthogonal 2D intensity patches capture the local spatial context while large, compressed 2D orthogonal patches and distances to the regional centroids enforce global spatial consistency. Contrary to commonly used segmentation methods, our technique does not require any non-linear registration of the MR images. To benchmark our model, we used the dataset provided for the MICCAI 2012 challenge on multi-atlas labelling, which consists of 35 manually segmented MR images of the brain. We obtained competitive results (mean dice coefficient 0.725, error rate 0.163) showing the potential of our approach. To our knowledge, our technique is the first to tackle the anatomical segmentation of the whole brain using deep neural networks.
arXiv:1502.02445v2 fatcat:oor4ilmwhnfudph65zxqcruaty
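
The multi-scale inputs the abstract lists (a local 3D patch, three orthogonal 2D patches, and large compressed patches plus centroid distances) can be sketched as plain array slicing. Patch sizes and the subsampling factor below are assumptions, not the paper's configuration.

```python
# Illustrative sketch of the multi-scale inputs described in the abstract:
# a small 3-D patch, three orthogonal 2-D patches, and a large 2-D patch
# compressed by subsampling, all centred on the voxel of interest.
import numpy as np

def extract_inputs(volume, x, y, z, small=13, large=51, scale=3):
    h = small // 2
    patch_3d = volume[x-h:x+h+1, y-h:y+h+1, z-h:z+h+1]
    # orthogonal 2-D patches (axial, coronal, sagittal planes)
    ax = volume[x-h:x+h+1, y-h:y+h+1, z]
    co = volume[x-h:x+h+1, y, z-h:z+h+1]
    sa = volume[x, y-h:y+h+1, z-h:z+h+1]
    # large patch, compressed by subsampling to provide global context
    H = large // 2
    big = volume[x-H:x+H+1:scale, y-H:y+H+1:scale, z]
    return patch_3d, (ax, co, sa), big

vol = np.random.rand(128, 128, 128)   # placeholder MR volume
p3, p2s, big = extract_inputs(vol, 64, 64, 64)
```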

A Cheap Linear Attention Mechanism with Fast Lookups and Fixed-Size Representations [article]

Alexandre de Brébisson, Pascal Vincent
2016 arXiv   pre-print
The softmax content-based attention mechanism has proven to be very beneficial in many applications of recurrent neural networks. Nevertheless, it suffers from two major computational limitations. First, its computations for an attention lookup scale linearly in the size of the attended sequence. Second, it does not encode the sequence into a fixed-size representation but instead requires memorizing all the hidden states. These two limitations restrict the use of the softmax attention mechanism to relatively small-scale applications with short sequences and few lookups per sequence. In this work we introduce a family of linear attention mechanisms designed to overcome the two limitations listed above. We show that removing the softmax non-linearity from the traditional attention formulation yields constant-time attention lookups and fixed-size representations of the attended sequences. These properties make these linear attention mechanisms particularly suitable for large-scale applications with extreme query loads, real-time requirements and memory constraints. Early experiments on a question answering task show that these linear mechanisms yield significantly better accuracy results than no attention, but obviously worse than their softmax alternative.
arXiv:1609.05866v1 fatcat:ftsmfswxw5gu3da2ij6cctkj24
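
The claim that dropping the softmax yields fixed-size representations and constant-time lookups follows from a simple identity: without the normalizing non-linearity, sum_t (q·k_t) v_t = (sum_t v_t k_t^T) q, so the whole attended sequence collapses into one d_v × d_k matrix. A small numerical check with illustrative dimensions:

```python
# Sketch of the core identity behind softmax-free (linear) attention:
# the sequence collapses into a fixed-size matrix S, so each lookup costs
# O(dv*dk) regardless of the sequence length T.
import numpy as np

T, dk, dv = 100, 64, 64
K = np.random.randn(T, dk)   # keys, one per hidden state
V = np.random.randn(T, dv)   # values
q = np.random.randn(dk)      # query

# softmax-free attention, naive O(T) form: sum_t (q . k_t) v_t
out_naive = (K @ q) @ V

# fixed-size representation: accumulate once, then constant-time lookups
S = V.T @ K                  # (dv, dk), independent of T; can be built online
out_fast = S @ q

assert np.allclose(out_naive, out_fast)
```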

Deep neural networks for anatomical brain segmentation

Alexandre de Brebisson, Giovanni Montana
2015 2015 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)  
We present a novel approach to automatically segment magnetic resonance (MR) images of the human brain into anatomical regions. Our methodology is based on a deep artificial neural network that assigns each voxel in an MR image of the brain to its corresponding anatomical region. The inputs of the network capture information at different scales around the voxel of interest: 3D and orthogonal 2D intensity patches capture a local spatial context while downscaled large 2D orthogonal patches and distances to the regional centroids enforce global spatial consistency. Contrary to commonly used segmentation methods, our technique does not require any non-linear registration of the MR images. To benchmark our model, we used the dataset provided for the MICCAI 2012 challenge on multi-atlas labelling, which consists of 35 manually segmented MR images of the brain. We obtained competitive results (mean dice coefficient 0.725, error rate 0.163) showing the potential of our approach. To our knowledge, our technique is the first to tackle the anatomical segmentation of the whole brain using deep neural networks.
doi:10.1109/cvprw.2015.7301312 dblp:conf/cvpr/BrebissonM15 fatcat:ikemsocucvfnhhzzagto2k2vqq
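
Both versions of this paper report a mean Dice coefficient of 0.725; for reference, the metric is the overlap 2|A∩B| / (|A| + |B|) between the predicted and manual label masks. A minimal sketch:

```python
# Minimal sketch of the Dice overlap behind the 0.725 result reported above.
import numpy as np

def dice_coefficient(pred, truth):
    pred, truth = pred.astype(bool), truth.astype(bool)
    intersection = np.logical_and(pred, truth).sum()
    return 2.0 * intersection / (pred.sum() + truth.sum())

a = np.zeros((10, 10), dtype=int); a[2:7, 2:7] = 1
b = np.zeros((10, 10), dtype=int); b[3:8, 3:8] = 1
print(dice_coefficient(a, b))  # overlap of two shifted squares: 0.64
```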

Exact gradient updates in time independent of output size for the spherical loss family [article]

Pascal Vincent, Alexandre de Brébisson, Xavier Bouthillier
2016 arXiv   pre-print
An important class of problems involves training deep neural networks with sparse prediction targets of very high dimension D. These occur naturally in e.g. neural language models or the learning of word-embeddings, often posed as predicting the probability of next words among a vocabulary of size D (e.g. 200,000). Computing the equally large, but typically non-sparse D-dimensional output vector from a last hidden layer of reasonable dimension d (e.g. 500) incurs a prohibitive O(Dd) computational cost for each example, as does updating the D × d output weight matrix and computing the gradient needed for backpropagation to previous layers. While efficient handling of large sparse network inputs is trivial, the case of large sparse targets is not, and has thus so far been sidestepped with approximate alternatives such as hierarchical softmax or sampling-based approximations during training. In this work we develop an original algorithmic approach which, for a family of loss functions that includes squared error and spherical softmax, can compute the exact loss, gradient update for the output weights, and gradient for backpropagation, all in O(d^2) per example instead of O(Dd), remarkably without ever computing the D-dimensional output. The proposed algorithm yields a speedup of up to D/4d, i.e. two orders of magnitude for typical sizes, for that critical part of the computations that often dominates the training time in this kind of network architecture.
arXiv:1606.08061v1 fatcat:cjmshlm3lzbe3pl6jexawyacfu
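
For the squared-error member of the family, the trick the abstract describes rests on caching Q = W^T W (d × d): the loss ||Wh − y||² = h^T Q h − 2 y^T W h + y^T y and the backpropagated gradient 2(Qh − W^T y) then touch only the K non-zero rows of W, never the D-dimensional output. A hedged numerical check, with sizes smaller than the paper's examples:

```python
# Sketch of the O(d^2) identity for the squared-error member of the family.
# Q = W^T W is cached (the full algorithm also maintains it across updates);
# the D-dimensional output Wh is never formed.
import numpy as np

D, d, K = 20_000, 100, 3            # smaller than the paper's D=200,000, d=500
rng = np.random.default_rng(0)
W = rng.normal(0, 0.01, (D, d))     # output weight matrix
h = rng.normal(size=d)              # last hidden layer
idx = rng.choice(D, K, replace=False)
y_vals = rng.random(K)              # K-sparse target

Q = W.T @ W                         # d x d, cached

Wy = W[idx].T @ y_vals              # W^T y in O(Kd): only K rows of W
loss = h @ Q @ h - 2 * (y_vals @ (W[idx] @ h)) + y_vals @ y_vals
grad_h = 2 * (Q @ h - Wy)           # gradient for backpropagation, O(d^2)

# check against the naive O(Dd) computation
y = np.zeros(D); y[idx] = y_vals
assert np.isclose(loss, np.sum((W @ h - y) ** 2))
assert np.allclose(grad_h, 2 * W.T @ (W @ h - y))
```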

Artificial Neural Networks Applied to Taxi Destination Prediction [article]

Alexandre de Brébisson, Étienne Simon, Alex Auvolat, Pascal Vincent, Yoshua Bengio
2015 arXiv   pre-print
We describe our first-place solution to the ECML/PKDD discovery challenge on taxi destination prediction. The task consisted of predicting the destination of a taxi based on the beginning of its trajectory, represented as a variable-length sequence of GPS points, and diverse associated meta-information, such as the departure time, the driver id and client information. Contrary to most published competitor approaches, we used an almost fully automated approach based on neural networks and we ranked first out of 381 teams. The architectures we tried use multi-layer perceptrons, bidirectional recurrent neural networks and models inspired by recently introduced memory networks. Our approach could easily be adapted to other applications in which the goal is to predict a fixed-length output from a variable-length sequence.
arXiv:1508.00021v2 fatcat:5bwmzxu6gjf2rpewztefp3ptky
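
A hedged sketch of the simplest architecture family mentioned (multi-layer perceptrons): the variable-length trajectory is reduced to its first and last k points, concatenated with embedded meta-information, and mapped to destination coordinates. All sizes, the two categorical features, and the plain regression output are illustrative assumptions, not the winning model.

```python
# Illustrative MLP for variable-length GPS prefixes: first/last k points plus
# embedded meta-information -> (lat, lon). Sizes and names are assumptions.
import numpy as np

k, emb = 5, 10
rng = np.random.default_rng(0)

def forward(prefix, meta_ids, params):
    W1, b1, W2, b2, E = params
    gps = np.concatenate([prefix[:k].ravel(), prefix[-k:].ravel()])  # 4k values
    x = np.concatenate([gps, E[meta_ids].ravel()])                   # + embeddings
    hid = np.tanh(W1 @ x + b1)
    return W2 @ hid + b2                                             # (lat, lon)

n_in = 4 * k + 2 * emb
params = (rng.normal(0, 0.1, (64, n_in)), np.zeros(64),
          rng.normal(0, 0.1, (2, 64)), np.zeros(2),
          rng.normal(0, 0.1, (50, emb)))     # e.g. 50 categorical ids, 2 used per trip

trajectory = rng.random((30, 2))             # variable-length GPS prefix
print(forward(trajectory, np.array([3, 7]), params))
```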

A Deep Reinforcement Learning Chatbot [article]

Iulian V. Serban, Chinnadhurai Sankar, Mathieu Germain, Saizheng Zhang, Zhouhan Lin, Sandeep Subramanian, Taesup Kim, Michael Pieper, Sarath Chandar, Nan Rosemary Ke, Sai Rajeshwar, Alexandre de Brebisson (+5 others)
2017 arXiv   pre-print
We present MILABOT: a deep reinforcement learning chatbot developed by the Montreal Institute for Learning Algorithms (MILA) for the Amazon Alexa Prize competition. MILABOT is capable of conversing with humans on popular small talk topics through both speech and text. The system consists of an ensemble of natural language generation and retrieval models, including template-based models, bag-of-words models, sequence-to-sequence neural network and latent variable neural network models. By applying reinforcement learning to crowdsourced data and real-world user interactions, the system has been trained to select an appropriate response from the models in its ensemble. The system has been evaluated through A/B testing with real-world users, where it performed significantly better than many competing systems. Due to its machine learning architecture, the system is likely to improve with additional data.
arXiv:1709.02349v2 fatcat:ocymhb6py5cyjpubii2kiwc7u4
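
The selection step the abstract describes (a policy, trained with reinforcement learning, choosing among candidates produced by the ensemble) can be sketched as scoring candidate responses and returning the argmax. The feature function and linear scorer below are illustrative placeholders, not MILABOT's model.

```python
# Hedged sketch of response selection over an ensemble's candidates: a learned
# policy scores each candidate and the best one is returned.
import numpy as np

rng = np.random.default_rng(1)
theta = rng.normal(size=16)                 # policy parameters (placeholder)

def features(dialogue_history, candidate):
    # placeholder featurization (e.g. word overlap, length, model confidence)
    return rng.normal(size=16)

def select_response(dialogue_history, candidates):
    scores = [theta @ features(dialogue_history, c) for c in candidates]
    return candidates[int(np.argmax(scores))]

ensemble_outputs = ["Hello!", "Tell me more about that.", "I like movies."]
print(select_response(["Hi there"], ensemble_outputs))
```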

A Deep Reinforcement Learning Chatbot (Short Version) [article]

Iulian V. Serban, Chinnadhurai Sankar, Mathieu Germain, Saizheng Zhang, Zhouhan Lin, Sandeep Subramanian, Taesup Kim, Michael Pieper, Sarath Chandar, Nan Rosemary Ke, Sai Rajeswar, Alexandre de Brebisson (+5 others)
2018 arXiv   pre-print
We present MILABOT: a deep reinforcement learning chatbot developed by the Montreal Institute for Learning Algorithms (MILA) for the Amazon Alexa Prize competition. MILABOT is capable of conversing with humans on popular small talk topics through both speech and text. The system consists of an ensemble of natural language generation and retrieval models, including neural network and template-based models. By applying reinforcement learning to crowdsourced data and real-world user interactions, the system has been trained to select an appropriate response from the models in its ensemble. The system has been evaluated through A/B testing with real-world users, where it performed significantly better than other systems. The results highlight the potential of coupling ensemble systems with deep reinforcement learning as a fruitful path for developing real-world, open-domain conversational agents.
arXiv:1801.06700v1 fatcat:g52sngttjrh67os6vzywhmwrxq

MelGAN: Generative Adversarial Networks for Conditional Waveform Synthesis [article]

Kundan Kumar, Rithesh Kumar, Thibault de Boissiere, Lucas Gestin, Wei Zhen Teoh, Jose Sotelo, Alexandre de Brebisson, Yoshua Bengio, Aaron Courville
2019 arXiv   pre-print
Previous works (Donahue et al., 2018a; Engel et al., 2019a) have found that generating coherent raw audio waveforms with GANs is challenging. In this paper, we show that it is possible to train GANs reliably to generate high-quality coherent waveforms by introducing a set of architectural changes and simple training techniques. A subjective evaluation metric (Mean Opinion Score, or MOS) shows the effectiveness of the proposed approach for high-quality mel-spectrogram inversion. To establish the generality of the proposed techniques, we show qualitative results of our model in speech synthesis, music domain translation and unconditional music synthesis. We evaluate the various components of the model through ablation studies and suggest a set of guidelines to design general-purpose discriminators and generators for conditional sequence synthesis tasks. Our model is non-autoregressive and fully convolutional, with significantly fewer parameters than competing models, and it generalizes to unseen speakers for mel-spectrogram inversion. Our PyTorch implementation runs more than 100x faster than real-time on a GTX 1080Ti GPU and more than 2x faster than real-time on CPU, without any hardware-specific optimization tricks.
arXiv:1910.06711v3 fatcat:nt7pffnvongujezmazuaksxjiq
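
A hedged sketch of the kind of non-autoregressive, fully convolutional generator the abstract describes: transposed convolutions upsample a mel-spectrogram to a raw waveform. Channel counts, kernel sizes and the 256x total upsampling factor below are assumptions, not the paper's exact configuration.

```python
# Toy mel-spectrogram-to-waveform generator in the spirit described above:
# fully convolutional, non-autoregressive, upsampling via transposed convs.
import torch
import torch.nn as nn

class TinyMelGenerator(nn.Module):
    def __init__(self, n_mels=80):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(n_mels, 256, kernel_size=7, padding=3),
            nn.LeakyReLU(0.2),
            nn.ConvTranspose1d(256, 128, kernel_size=16, stride=8, padding=4),
            nn.LeakyReLU(0.2),
            nn.ConvTranspose1d(128, 64, kernel_size=16, stride=8, padding=4),
            nn.LeakyReLU(0.2),
            nn.ConvTranspose1d(64, 32, kernel_size=8, stride=4, padding=2),
            nn.LeakyReLU(0.2),
            nn.Conv1d(32, 1, kernel_size=7, padding=3),
            nn.Tanh(),                   # waveform in [-1, 1]
        )

    def forward(self, mel):             # mel: (batch, n_mels, frames)
        return self.net(mel)            # (batch, 1, frames * 256)

wave = TinyMelGenerator()(torch.randn(1, 80, 50))
print(wave.shape)  # 8 * 8 * 4 = 256x upsampling: 50 frames -> 12800 samples
```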

An Exploration of Softmax Alternatives Belonging to the Spherical Loss Family [article]

Alexandre de Brébisson, Pascal Vincent
2016 arXiv   pre-print
ACKNOWLEDGMENTS We would like to thank Harm de Vries for helpful discussions about the optimization of the log spherical softmax and for providing us with a good baseline model for CIFAR10.  ... 
arXiv:1511.05042v3 fatcat:2ufmren6f5bi3gqkciwsze5diu
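
Two members of the spherical family explored in this line of work are the spherical softmax, which squares and normalizes the scores, and the Taylor softmax, which replaces exp with its second-order expansion 1 + o + o²/2 (positive for every real o). A short sketch; the stabilizing epsilon is an assumption, not necessarily the paper's exact formulation.

```python
# Sketch of two softmax alternatives from the spherical loss family.
import numpy as np

def spherical_softmax(o, eps=1e-8):
    s = o ** 2 + eps                # squared scores, eps for stability
    return s / s.sum()

def taylor_softmax(o):
    t = 1.0 + o + 0.5 * o ** 2     # 2nd-order Taylor expansion of exp(o), > 0
    return t / t.sum()

o = np.array([1.0, 2.0, -1.0])
print(spherical_softmax(o), taylor_softmax(o))  # both sum to 1
```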

The Z-loss: a shift and scale invariant classification loss belonging to the Spherical Family [article]

Alexandre de Brébisson, Pascal Vincent
2016 arXiv   pre-print
Several spherical loss functions have already been investigated (Brébisson and Vincent, 2016) but they do not seem to perform as well as the log-softmax on large output problems.  ...  Although the Taylor softmax performs slightly better than the softmax on small output problems such as MNIST and CIFAR10, it does not scale well with the number of output classes (Brébisson and Vincent  ... 
arXiv:1604.08859v2 fatcat:sp2k3l2sa5e2lik6poekf5tvxi
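
The property named in the title (invariance to shifting and scaling the scores) can be illustrated by standardizing the scores before a spherical-family loss. The check below demonstrates the invariance only; it is not the paper's exact Z-loss formula.

```python
# Illustration of shift and scale invariance: standardized scores are
# unchanged when a constant is added or all scores are rescaled.
import numpy as np

def standardized_spherical_loss(o, target):
    z = (o - o.mean()) / o.std()      # shift- and scale-invariant statistics
    p = z ** 2 / np.sum(z ** 2)
    return -np.log(p[target] + 1e-12)

o = np.array([0.3, 2.0, -1.2, 0.5])
base = standardized_spherical_loss(o, target=1)
assert np.isclose(base, standardized_spherical_loss(o + 7.0, 1))   # shift
assert np.isclose(base, standardized_spherical_loss(o * 3.0, 1))   # scale
print(base)
```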

Efficient Exact Gradient Update for training Deep Networks with Very Large Sparse Targets [article]

Pascal Vincent, Alexandre de Brébisson, Xavier Bouthillier
2015 arXiv   pre-print
An important class of problems involves training deep neural networks with sparse prediction targets of very high dimension D. These occur naturally in e.g. neural language models or the learning of word-embeddings, often posed as predicting the probability of next words among a vocabulary of size D (e.g. 200,000). Computing the equally large, but typically non-sparse D-dimensional output vector from a last hidden layer of reasonable dimension d (e.g. 500) incurs a prohibitive O(Dd) computational cost for each example, as does updating the D × d output weight matrix and computing the gradient needed for backpropagation to previous layers. While efficient handling of large sparse network inputs is trivial, the case of large sparse targets is not, and has thus so far been sidestepped with approximate alternatives such as hierarchical softmax or sampling-based approximations during training. In this work we develop an original algorithmic approach which, for a family of loss functions that includes squared error and spherical softmax, can compute the exact loss, gradient update for the output weights, and gradient for backpropagation, all in O(d^2) per example instead of O(Dd), remarkably without ever computing the D-dimensional output. The proposed algorithm yields a speedup of D/4d, i.e. two orders of magnitude for typical sizes, for that critical part of the computations that often dominates the training time in this kind of network architecture.
arXiv:1412.7091v3 fatcat:e2iibolqnbatbm7xksqi646hrm
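
Complementing the loss identity sketched above, the weight update itself never needs to touch all D × d parameters: the SGD step for squared error is rank-one, W ← W(I − 2ηhh^T) + 2ηyh^T, so keeping W factored as W = V·U lets the dense part fold into the small d × d factor U while the sparse target touches only K rows of V. A hedged numerical check of that identity; an efficient version would maintain U⁻¹ incrementally rather than solving each step.

```python
# Check that the factored update V @ U reproduces the explicit O(Dd) SGD step
# while only computing O(d^2 + Kd) work. Sizes are illustrative.
import numpy as np

D, d, K, lr = 1000, 20, 3, 0.01
rng = np.random.default_rng(0)
V, U = rng.normal(size=(D, d)), np.eye(d)    # W is represented as V @ U
h = rng.normal(size=d)
idx = [1, 42, 77]                            # support of the K-sparse target
y = np.zeros(D); y[idx] = rng.random(K)

W_explicit = V @ U - 2 * lr * np.outer(V @ U @ h - y, h)   # naive O(Dd) step

U_new = U @ (np.eye(d) - 2 * lr * np.outer(h, h))          # dense part, O(d^2)
z = np.linalg.solve(U_new.T, h)                            # U_new^{-T} h
V_new = V.copy()
V_new[idx] += 2 * lr * np.outer(y[idx], z)                 # only K rows, O(Kd)

assert np.allclose(V_new @ U_new, W_explicit)
```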

Theano: A Python framework for fast computation of mathematical expressions [article]

The Theano Development Team: Rami Al-Rfou, Guillaume Alain, Amjad Almahairi, Christof Angermueller, Dzmitry Bahdanau, Nicolas Ballas, Frédéric Bastien, Justin Bayer, Anatoly Belikov, Alexander Belopolsky, Yoshua Bengio, Arnaud Bergeron, James Bergstra, Valentin Bisson (+90 others)
2016 arXiv   pre-print
Save and reload optimized graphs: optimized computation graphs, such as the ones in Theano functions, can now be serialized using the pickle module and de-serialized without being optimized again.  ... 
arXiv:1605.02688v1 fatcat:2lcqwrk2zrbt5dyjmcofn6shhu
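
The snippet refers to pickling compiled Theano functions. A minimal sketch of that workflow; the expression and file name are illustrative.

```python
# Minimal sketch: serialize a compiled (optimized) Theano function with
# pickle and reload it without triggering graph optimization again.
import pickle
import theano
import theano.tensor as T

x = T.dmatrix('x')
f = theano.function([x], T.nnet.sigmoid(x) * 2)   # graph optimized at compile time

with open('compiled_fn.pkl', 'wb') as fh:
    pickle.dump(f, fh)                            # save the optimized graph

with open('compiled_fn.pkl', 'rb') as fh:
    g = pickle.load(fh)                           # reloaded, no re-optimization

print(g([[0.0, 1.0]]))
```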

Nécrologie [Obituary]

1912 Bulletin de la Société Botanique de France  
Later he turned his investigations toward cryptogamy, notably describing the fungus that grows in arsenical solutions and that de Brébisson had named Hygrocrocis arsenicus.  ...  Advised and guided by Adolphe Brongniart, likewise born, in 1801, at the Sèvres Manufactory, of which his father Alexandre, the geologist, was director, it was toward flowers and botany that he was drawn  ... 
doi:10.1080/00378941.1912.10832450 fatcat:msrm2hognbam3atgamzw7xlzxa

On Learning from Taxi GPS Traces (Preamble)

João Mendes-Moreira, Luís Moreira-Matias
2015 European Conference on Principles of Data Mining and Knowledge Discovery  
Alexandre de Brébisson, Étienne Simon, Alex Auvolat, Pascal Vincent and Yoshua Bengio, all working partial/full time at MILA lab, University of Montréal, Canada won subproblem (a) using multi-layer perceptrons  ... 
dblp:conf/pkdd/Mendes-MoreiraM15 fatcat:wgbbhjhci5a43ew7vcyjto3vvq
Showing results 1 — 15 out of 35 results