Improving transfer learning accuracy by reusing Stacked Denoising Autoencoders

Chetak Kandaswamy, Luís M. Silva, Luís A. Alexandre, Ricardo Sousa, Jorge M. Santos, Joaquim Marques de Sá
2014 IEEE International Conference on Systems, Man, and Cybernetics (SMC)
Transfer learning is a process that allows reusing a learning machine trained on one problem to solve a new problem. Transfer learning studies on shallow architectures show low performance, as they are generally based on hand-crafted features obtained from experts. It is therefore interesting to study transference on deep architectures, which are known to extract features directly from the input data. A Stacked Denoising Autoencoder (SDA) is a deep model able to represent the hierarchical features needed for solving classification problems. In this paper we study the performance of SDAs trained on one problem and reused to solve a different problem with not only a different distribution but also a different task. We propose two approaches: 1) unsupervised feature transference, and 2) supervised feature transference using deep transfer learning. We show that SDAs using unsupervised feature transference outperform randomly initialized machines on a new problem. We achieved a 7% relative improvement in average error rate and 41% in average computation time when classifying typed uppercase letters. With supervised feature transference, we achieved a 5.7% relative improvement in average error rate by reusing the first and second hidden layers, and an 8.5% relative improvement in average error rate with a 54% speed-up over the baseline by reusing all three hidden layers on the same data. We also explore transfer learning between geometrical shapes and canonical shapes, achieving a 7.4% relative improvement in average error rate with the supervised feature transference approach.
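Both approaches share the same mechanics: copy the hidden layers of an SDA trained on the source problem into a network for the target problem, then either freeze them and train only the classifier (unsupervised feature transference) or fine-tune them together with the classifier (supervised feature transference). Below is a minimal PyTorch sketch of that weight reuse; the layer widths, sigmoid activations, class count, and training details are illustrative assumptions, not the authors' exact setup, and the denoising pre-training itself is omitted.

```python
import torch
import torch.nn as nn

def make_encoder(sizes):
    # Stack of dense layers standing in for a trained SDA encoder.
    layers = []
    for d_in, d_out in zip(sizes[:-1], sizes[1:]):
        layers += [nn.Linear(d_in, d_out), nn.Sigmoid()]
    return nn.Sequential(*layers)

# Hypothetical architecture: 784 inputs, three hidden layers of 500 units.
sizes = [784, 500, 500, 500]

# Source encoder: assume it was already pre-trained as a denoising
# autoencoder on the source problem (pre-training omitted here).
source_encoder = make_encoder(sizes)

# Target network: same architecture with weights copied from the source,
# plus a fresh classifier head for the new task (e.g. 26 letter classes).
target_encoder = make_encoder(sizes)
target_encoder.load_state_dict(source_encoder.state_dict())
classifier = nn.Linear(sizes[-1], 26)
model = nn.Sequential(target_encoder, classifier)

# 1) Unsupervised feature transference: freeze the reused layers and
#    train only the classifier on the target data.
for p in target_encoder.parameters():
    p.requires_grad = False

# 2) Supervised feature transference would instead leave the reused
#    layers trainable and fine-tune them together with the classifier.

optimizer = torch.optim.SGD(
    [p for p in model.parameters() if p.requires_grad], lr=0.1)

# One illustrative supervised step on dummy target-problem data.
x = torch.randn(32, 784)
y = torch.randint(0, 26, (32,))
loss = nn.functional.cross_entropy(model(x), y)
optimizer.zero_grad()
loss.backward()
optimizer.step()
print(f"target-problem loss: {loss.item():.4f}")
```

Transferring only the first one or two hidden layers, as in the 5.7% result above, would correspond to copying just those layers' weights and leaving the remaining layers randomly initialized.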
doi:10.1109/smc.2014.6974107 dblp:conf/smc/KandaswamySASSS14