Gradient Descent for Unconstrained Smooth Minimization - Quanming Yao [1]
1. Rudiments
- Taylor's Theorem
- Lipschitz Constant
- Convexity and Strong Convexity
- Hessian: Condition Number and Bounds on the Hessian
- Class of Differentiable Functions (the key definitions are recalled below)
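For reference, a brief recap of the standard definitions these notes lean on; using L for the Lipschitz constant of the gradient and μ for the strong-convexity modulus is an assumption here, following common convention:

```latex
% L-smoothness (Lipschitz-continuous gradient) and its quadratic upper bound:
\|\nabla f(x) - \nabla f(y)\| \le L\|x - y\|
\;\Longrightarrow\;
f(y) \le f(x) + \nabla f(x)^{\top}(y - x) + \tfrac{L}{2}\|y - x\|^{2}.

% mu-strong convexity gives the matching quadratic lower bound:
f(y) \ge f(x) + \nabla f(x)^{\top}(y - x) + \tfrac{\mu}{2}\|y - x\|^{2}.

% For twice-differentiable f, both bounds read off the Hessian,
% and their ratio is the condition number:
\mu I \preceq \nabla^{2} f(x) \preceq L I, \qquad \kappa = L/\mu.
```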
2. Gradient Descent
- Gradient descent as a general descent method:
choose the step size so that the iterates converge to a stationary point of f(x) (a minimal sketch follows this list).
- Gradient Descent under Strong Convexity
convergence rate: (1 - 1/κ)^k, where κ = L/μ is the condition number (linear convergence)
- Gradient Descent under Weak Convexity
convergence rate: A0/(c·A0·k + 1) = O(1/k), where A0 = f(x0) - f* is the initial suboptimality
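A minimal sketch of fixed-step gradient descent, assuming a generic L-smooth objective; the names grad_f and L, and the quadratic example, are placeholders chosen here for illustration, not taken from [1]:

```python
import numpy as np

def gradient_descent(grad_f, x0, L, iters=1000):
    """Fixed-step gradient descent: x_{k+1} = x_k - (1/L) * grad_f(x_k).

    With step size 1/L (L = Lipschitz constant of grad_f), each step
    decreases f and the iterates approach a stationary point; the rate
    is O(1/k) for weakly convex f and linear for strongly convex f.
    """
    x = np.asarray(x0, dtype=float)
    for _ in range(iters):
        x = x - (1.0 / L) * grad_f(x)
    return x

# Example: minimize f(x) = 0.5 * x^T A x - b^T x, whose gradient is A x - b.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, 1.0])
L_const = np.linalg.eigvalsh(A).max()  # Lipschitz constant of the gradient
x_star = gradient_descent(lambda x: A @ x - b, np.zeros(2), L_const)
```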
3. Message from Quadratic Programming
first-order gradient descent: O(1/k) rate for weakly convex functions; O((1 - 1/κ)^k) for strongly convex functions.
- Heavy ball: needs to know the function's parameters (the curvature bounds μ and L) to set its step size and momentum; see the sketch after this list.
- Conjugate Gradient: avoids needing the function's parameters, since its step sizes are computed from inner products along the way.
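A sketch of Polyak's heavy-ball iteration under the assumption μI ⪯ ∇²f ⪯ LI; the parameter choices below are the classical ones for quadratics, and grad_f, L, mu are hypothetical inputs:

```python
import numpy as np

def heavy_ball(grad_f, x0, L, mu, iters=1000):
    """Heavy-ball method: x_{k+1} = x_k - alpha*grad_f(x_k) + beta*(x_k - x_{k-1}).

    alpha and beta are set from the curvature bounds mu and L, which is
    exactly the 'need to know the function's parameters' issue noted above.
    """
    sL, smu = np.sqrt(L), np.sqrt(mu)
    alpha = 4.0 / (sL + smu) ** 2
    beta = ((sL - smu) / (sL + smu)) ** 2
    x_prev = x = np.asarray(x0, dtype=float)
    for _ in range(iters):
        x, x_prev = x - alpha * grad_f(x) + beta * (x - x_prev), x
    return x
```

Conjugate gradient attains comparable speed on quadratics without knowing mu and L (e.g., scipy.sparse.linalg.cg), which is exactly the contrast drawn above.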
4. Accelerated Gradient Descent
- Weak Convexity: rate improves to O(1/k²) via Nesterov momentum (sketch below)
- Strong Convexity: linear rate with constant momentum weight (√κ - 1)/(√κ + 1)
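A sketch of Nesterov's accelerated method for the weakly convex, L-smooth case, using the standard t-sequence; grad_f and L are hypothetical inputs (for the strongly convex case, the momentum weight becomes the constant (√κ - 1)/(√κ + 1)):

```python
import numpy as np

def nesterov_agd(grad_f, x0, L, iters=1000):
    """Nesterov's accelerated gradient for weakly convex, L-smooth f.

    Achieves O(1/k^2) instead of gradient descent's O(1/k) by taking
    the gradient step at an extrapolated point y rather than at x.
    """
    x = y = np.asarray(x0, dtype=float)
    t = 1.0
    for _ in range(iters):
        x_next = y - (1.0 / L) * grad_f(y)                  # gradient step at y
        t_next = 0.5 * (1.0 + np.sqrt(1.0 + 4.0 * t * t))   # standard t-sequence
        y = x_next + ((t - 1.0) / t_next) * (x_next - x)    # momentum extrapolation
        x, t = x_next, t_next
    return x
```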
5. Newton-Type Methods
The proof of Lemma 1.2 in [1] refers to Lecture 2.
It is still very difficult for me to understand everything in the paper.
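A minimal sketch of the pure Newton iteration, kept generic since the notes do not pin down a specific variant; grad_f and hess_f are assumed callables:

```python
import numpy as np

def newton_method(grad_f, hess_f, x0, iters=20):
    """Pure Newton step: x_{k+1} = x_k - H(x_k)^{-1} grad_f(x_k).

    Converges quadratically near a minimizer where the Hessian is
    positive definite; far from it, damping or line search is needed.
    """
    x = np.asarray(x0, dtype=float)
    for _ in range(iters):
        x = x - np.linalg.solve(hess_f(x), grad_f(x))  # solve, don't invert
    return x
```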