In this work we introduce a simple new regularization technique, aptly named Floor, which drops low-weight connections on every forward pass whenever they fall below a specified "event horizon" threshold. We compare Floor and regular Dropout side by side on identical network architectures. We report similar or improved regularization with Floor, used either in place of regular Dropout or in concert with it.
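As a rough illustration of the idea described above, the following minimal sketch masks weak connections during a single dense-layer forward pass. It is not the authors' implementation: the function names `floor_mask` and `dense_forward`, the use of weight magnitude (absolute value) as the comparison, and the example threshold of 0.05 are all assumptions made for illustration.

```python
def floor_mask(weights, threshold):
    """Return a copy of `weights` with weak connections zeroed.

    Assumption: a connection counts as "low weight" when its magnitude
    is below `threshold`; the abstract does not specify the exact rule.
    """
    return [[w if abs(w) >= threshold else 0.0 for w in row] for row in weights]


def dense_forward(x, weights, bias, threshold):
    """Forward pass of one dense layer with the mask applied.

    weights[i][j] connects input i to output j. The mask is recomputed
    on every call, so the stored weights themselves are never modified.
    """
    masked = floor_mask(weights, threshold)
    return [
        sum(x_i * row[j] for x_i, row in zip(x, masked)) + bias[j]
        for j in range(len(bias))
    ]


# Hypothetical 2x2 layer: two connections fall below the 0.05 threshold.
weights = [[0.50, -0.01], [0.02, -0.80]]
x = [1.0, 2.0]
bias = [0.0, 0.0]
out = dense_forward(x, weights, bias, threshold=0.05)
```

Because the mask is derived from the current weights on each pass rather than written back into them, a connection in this sketch can re-enter the computation if training later pushes its weight back above the threshold. Whether the published method behaves this way is not stated in the abstract.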
In this paper we also describe our research into transfer learning through the sharing of probability distribution parameters. We investigated methods of transferring Gaussian prior parameters, derived from the latent output of a Variational Auto-encoder, to a subsequent 'posterior' classification network in an attempt to regularize and internally organize that network along class boundaries. The investigation explored ways to increase the density of classification networks by promoting regional specialization and co-regional collaboration across labels, making networks more expressive and discriminating with fewer nodes by directing specific classes down specific branches of a network. The technique extracted Gaussian priors from a Variational Auto-encoder and then used these parameters in a specially constructed branching layer, in which each branch sampled from the prior distribution to generate a class-specific filter to apply to the feed-forward operation. Our theory was that this would encourage specific classes to travel down specific branches, and thus reduce the interference between feature detectors in a standard ML feed-forward network. Unfortunately, this research did not reveal any significant improvements in classification, sparsity, or expressive ability over standard feed-forward networks with Dropout.
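The branching layer described above could be sketched roughly as follows. This is a loose illustration under stated assumptions, not the authors' construction: the names `sample_filter` and `branching_layer`, the per-branch `(mu, sigma)` layout, the hand-picked prior values, and the choice to squash each Gaussian sample through a sigmoid to form a multiplicative gate are all hypothetical.

```python
import math
import random


def sample_filter(mu, sigma, rng):
    """Draw z ~ N(mu, sigma^2) per unit and squash it to a gate in (0, 1).

    Assumption: the sigmoid gate is an illustrative choice; the abstract
    only says each branch samples from the prior to build a filter.
    """
    return [1.0 / (1.0 + math.exp(-rng.gauss(m, s))) for m, s in zip(mu, sigma)]


def branching_layer(activations, branch_priors, rng):
    """Apply one freshly sampled filter per branch to the same activations."""
    outputs = []
    for mu, sigma in branch_priors:
        gate = sample_filter(mu, sigma, rng)
        outputs.append([a * g for a, g in zip(activations, gate)])
    return outputs


rng = random.Random(0)
# Hypothetical priors: in the described setup these would come from the
# latent output of a Variational Auto-encoder, one set per branch.
branch_priors = [
    ([2.0, -2.0], [0.1, 0.1]),  # branch 0 favors the first unit
    ([-2.0, 2.0], [0.1, 0.1]),  # branch 1 favors the second unit
]
branches = branching_layer([1.0, 1.0], branch_priors, rng)
```

In this sketch each branch passes a differently weighted view of the same activations, which is one way to picture how class-specific priors might steer inputs toward particular branches.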
Byrne, Daniel; Smith, Stacey; Duran, Joanna; and Santerre, John
"Floor Regularization and Investigation of Transfer Learning through Sharing of Probability Distribution Parameters,"
SMU Data Science Review: Vol. 3, Article 8.
Available at: https://scholar.smu.edu/datasciencereview/vol3/iss2/8
Creative Commons License
This work is licensed under a Creative Commons Attribution-Noncommercial 4.0 License