Knet.jl

Optimization

References

  • http://www.deeplearningbook.org/contents/numerical.html (basic intro in 4.3)
  • http://www.deeplearningbook.org/contents/optimization.html (8.1 generalization, 8.2 problems, 8.3 algorithms, 8.4 initialization, 8.5 adaptive learning rates, 8.6 approximate second-order methods, 8.7 meta-algorithms)
  • http://andrew.gibiansky.com/blog/machine-learning/gauss-newton-matrix/ (great posts on optimization)
  • https://www.cs.cmu.edu/~quake-papers/painless-conjugate-gradient.pdf (excellent tutorial on conjugate gradient, gradient descent, eigenvectors, etc.)
  • http://arxiv.org/abs/1412.6544 (Goodfellow paper)
  • https://d396qusza40orc.cloudfront.net/neuralnets/lecture_slides/lec6.pdf (Hinton slides)
  • https://d396qusza40orc.cloudfront.net/neuralnets/lecture_slides/lec8.pdf (Hinton slides)
  • http://www.denizyuret.com/2015/03/alec-radfords-animations-for.html
  • http://machinelearning.wustl.edu/mlpapers/paper_files/icml2010_Martens10.pdf
  • http://arxiv.org/abs/1503.05671
  • http://arxiv.org/abs/1412.1193
  • http://www.springer.com/us/book/9780387303031 (Nocedal and Wright)
  • http://www.nrbook.com (Numerical Recipes)
  • https://maths-people.anu.edu.au/~brent/pub/pub011.html (Brent, minimization without derivatives)
  • http://stanford.edu/~boyd/cvxbook/ (only convex optimization)

Revision 8a10ace4.