AI Tutorial - Mathematical Foundations Course 1.7 - Optimization Methods 4-7: The Core of Step Two of Optimization; Constraints and KKT

Nissa · Updated: 2024-09-21 · 945 reads

The core of step two of optimization

Determination of a search direction $d_k$:

Basis of the methods: the Taylor series of $f(x)$ in a small vicinity of the point $x_k$:

$$f(x_k+\delta)=f(x_k)+\nabla^T f(x_k)\,\delta+\frac{1}{2}\delta^T\,\nabla^2 f(x_k)\,\delta+O(\|\delta\|^3)$$

Steepest descent method: $d_k=-\nabla f(x_k)$, i.e., $d_k=-g(x_k)$

Newton method: $d_k=-H^{-1}(x_k)\,g(x_k)$
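The two direction formulas can be sketched numerically; the quadratic objective below is an illustrative choice (not from the text), picked because its gradient and Hessian are easy to write down:

```python
import numpy as np

# Illustrative quadratic f(x) = 0.5 x^T A x - b^T x (an assumption, not from the text):
# gradient g(x) = A x - b, Hessian H(x) = A (constant).
A = np.array([[4.0, 1.0],
              [1.0, 3.0]])
b = np.array([1.0, 2.0])

def g(x):
    """Gradient of f at x."""
    return A @ x - b

x_k = np.array([2.0, 2.0])

# Steepest descent direction: d_k = -g(x_k)
d_sd = -g(x_k)

# Newton direction: d_k = -H^{-1}(x_k) g(x_k); solve the linear system
# rather than forming the inverse explicitly.
d_newton = -np.linalg.solve(A, g(x_k))

# For a convex quadratic, a single full Newton step lands exactly on the
# minimizer A^{-1} b.
x_next = x_k + d_newton
```

Note that the Newton direction rescales the gradient by the local curvature, which is why it converges in one step on this quadratic while steepest descent would need many iterations.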

Constraints and KKT: Constrained Optimization and the Karush-Kuhn-Tucker (KKT) Conditions

In many cases, with nothing else given, an optimization problem without constraints is meaningless.

The constrained optimization problem:
minimize $f(x)$
subject to: $a_i(x)=0$ for i = 1, 2, ..., p
$c_j(x)\geq 0$ for j = 1, 2, ..., q

Feasible region: $\{x : a_i(x)=0 \text{ for } i = 1, 2, \dots, p, \text{ and } c_j(x)\geq 0 \text{ for } j = 1, 2, \dots, q\}$

A point x is said to be feasible if it is in the feasible region (inside or on the boundary). An inequality constraint $c_j(x)\geq 0$ is said to be active at a feasible point x if $c_j(x)=0$, and inactive at a feasible point x if $c_j(x)>0$.
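This active/inactive classification is easy to mechanize. A minimal sketch, using the three inequality constraints from the worked example later in this post (the tolerance `tol` is an assumption needed for floating-point comparison):

```python
# Constraints from the worked example:
# c1(x) = x1 >= 0, c2(x) = x2 >= 0, c3(x) = -x1 - x2 + 1 >= 0.
constraints = [
    lambda x: x[0],
    lambda x: x[1],
    lambda x: -x[0] - x[1] + 1,
]

def classify(x, tol=1e-9):
    """Return (feasible, indices of active constraints) at point x."""
    values = [c(x) for c in constraints]
    feasible = all(v >= -tol for v in values)          # all c_j(x) >= 0
    active = [j for j, v in enumerate(values) if abs(v) <= tol]
    return feasible, active

# At x = (1, 0): c1 = 1 (inactive), c2 = 0 (active), c3 = 0 (active).
```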

Similarly, finding the minimum of a one-variable function over the whole real line can be meaningless, but constraining the search to an interval [a, b] makes it meaningful.

KKT conditions (first-order necessary conditions for a point $x^*$ to be a minimum point):
If $x^*$ is a local minimizer of the constrained problem, then
(a) $a_i(x^*)=0$ for i = 1, 2, ..., p
(b) $c_j(x^*)\geq 0$ for j = 1, 2, ..., q
(c) there exist Lagrange multipliers $\lambda_i^*$ for $1\leq i\leq p$ and $\mu_j^*$ for $1\leq j\leq q$ such that
$$\nabla f(x^*)=\sum_{i=1}^p\lambda_i^*\nabla a_i(x^*)+\sum_{j=1}^q\mu_j^*\nabla c_j(x^*)$$
(d) $\mu_j^*c_j(x^*)=0$ for j = 1, 2, ..., q
(e) $\mu_j^*\geq 0$ for j = 1, 2, ..., q
Ex:

$f(x_1,x_2)=-x_1+x_2$ is meaningless on its own (it is unbounded below).
Adding the following three constraints makes it meaningful:

$c_1(x)=x_1\geq 0$
$c_2(x)=x_2\geq 0$ (active at $x^*$)
$c_3(x)=-x_1-x_2+1\geq 0$ (active at $x^*$; its boundary is the line $x_2=-x_1+1$)

$x^*=\begin{bmatrix} 1\\ 0 \end{bmatrix}$

$$\nabla f=\begin{bmatrix} -1\\ 1 \end{bmatrix}=0\begin{bmatrix} 1\\ 0 \end{bmatrix}+2\begin{bmatrix} 0\\ 1 \end{bmatrix}+1\begin{bmatrix} -1\\ -1 \end{bmatrix}$$

The multipliers are $\mu_1^*=0$ (since $c_1$ is inactive at $x^*$), $\mu_2^*=2$, and $\mu_3^*=1$, so conditions (c), (d), and (e) all hold.
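The multiplier check for this example can be verified numerically. A small sketch with numpy; the gradient of $f(x_1,x_2)=-x_1+x_2$ is $(-1, 1)$, and the multipliers $(0, 2, 1)$ are the ones given above:

```python
import numpy as np

# KKT check for the worked example: f(x) = -x1 + x2, x* = (1, 0),
# multipliers mu = (0, 2, 1). There are no equality constraints, so the
# lambda terms drop out of condition (c).
x_star = np.array([1.0, 0.0])
grad_f = np.array([-1.0, 1.0])           # gradient of f(x) = -x1 + x2

# Constraint values and gradients at x*.
c_vals = np.array([x_star[0], x_star[1], -x_star[0] - x_star[1] + 1])
grad_c = np.array([[1.0, 0.0],           # grad c1
                   [0.0, 1.0],           # grad c2
                   [-1.0, -1.0]])        # grad c3
mu = np.array([0.0, 2.0, 1.0])

stationarity = np.allclose(grad_f, mu @ grad_c)   # condition (c)
complementary = np.allclose(mu * c_vals, 0.0)     # condition (d)
dual_feasible = bool(np.all(mu >= 0))             # condition (e)
```

Complementary slackness (d) is what forces $\mu_1^*=0$: since $c_1(x^*)=1>0$ is inactive, its multiplier must vanish.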


Author: KuFun AI


