 
Knet.jl

Optimization

References

  • http://www.deeplearningbook.org/contents/numerical.html (basic introduction in Section 4.3)
  • http://www.deeplearningbook.org/contents/optimization.html (8.1 generalization, 8.2 problems, 8.3 algorithms, 8.4 initialization, 8.5 adaptive learning rates, 8.6 approximate second-order methods, 8.7 meta-algorithms)
  • http://andrew.gibiansky.com/blog/machine-learning/gauss-newton-matrix/ (great posts on optimization)
  • https://www.cs.cmu.edu/~quake-papers/painless-conjugate-gradient.pdf (excellent tutorial on conjugate gradient, gradient descent, eigenvectors, etc.)
  • http://arxiv.org/abs/1412.6544 (Goodfellow paper)
  • https://d396qusza40orc.cloudfront.net/neuralnets/lecture_slides/lec6.pdf (Hinton slides)
  • https://d396qusza40orc.cloudfront.net/neuralnets/lecture_slides/lec8.pdf (Hinton slides)
  • http://www.denizyuret.com/2015/03/alec-radfords-animations-for.html
  • http://machinelearning.wustl.edu/mlpapers/paper_files/icml2010_Martens10.pdf
  • http://arxiv.org/abs/1503.05671
  • http://arxiv.org/abs/1412.1193
  • http://www.springer.com/us/book/9780387303031 (Nocedal and Wright)
  • http://www.nrbook.com (Numerical Recipes)
  • https://maths-people.anu.edu.au/~brent/pub/pub011.html (Brent, minimization without derivatives)
  • http://stanford.edu/~boyd/cvxbook/ (only convex optimization)

Revision dd76cb25.