AI Tutorial - Mathematical Foundations Course 1.7 - Optimization Methods, Part 1: Optimization Scenarios and the Basic Approach


Optimization Scenarios

In some scenarios there are many variables and the problem is very large, making it hard to handle with calculus directly. Such problems can be solved with optimization methods.

$$P_1(x) = 0$$
$$P_2(x) = 0$$
$$\vdots$$
$$P_m(x) = 0$$

$$x = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}$$

This is a linear equation-solving problem; it is not an optimization problem.

An optimization problem, by contrast, might have, say, 3 unknowns that must satisfy 1000 equations, and so on.
The hardest part is converting a problem that is not posed as an optimization problem into one.

Ex:
$$P_1(x) = 0 \leftrightarrow P_1^2(x) = 0$$
$$P_2(x) = 0 \leftrightarrow P_2^2(x) = 0$$
$$\vdots$$
$$P_m(x) = 0 \leftrightarrow P_m^2(x) = 0$$

Linear equations $\rightarrow$ quadratic equations

Requiring $\sum_{i=1}^m P_i^2(x) = 0$ is then the same as solving $\min f(x)$ where $f(x) = \sum_{i=1}^m P_i^2(x)$.

Basic optimization problem: minimize $f(x)$ for $x \in \mathbb{R}^n$.
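To see the conversion concretely, here is a small worked example of my own (not from the original course): take the inconsistent pair $P_1(x) = x - 1 = 0$ and $P_2(x) = 2x - 3 = 0$. No single $x$ satisfies both, yet

$$\min_x f(x) = (x - 1)^2 + (2x - 3)^2$$

is perfectly well posed: $f'(x) = 2(x - 1) + 4(2x - 3) = 10x - 14 = 0$ gives $x = 1.4$, a least-squares compromise between the two equations.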

Assumption: f(x) is twice continuously differentiable.

Furthermore, if some equations are considered unimportant and others extremely important, a weighted objective can be used: $f(x) = \sum_{i=1}^m w_i P_i^2(x)$ with $w_i > 0$.

After obtaining a result, the weights can be adjusted and the problem solved again; tuning them is, in effect, the kind of problem artificial intelligence tackles. A minimal code sketch of the conversion follows.
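Below is a minimal Python sketch of my own (not part of the original material): the residual functions $P_i$ and the weights are made-up examples, and `scipy.optimize.minimize` serves as an off-the-shelf numerical minimizer.

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical system of m = 3 equations in n = 2 unknowns:
#   P_1(x) = x1 + x2 - 2 = 0
#   P_2(x) = x1 - x2     = 0
#   P_3(x) = x1 - 3      = 0   (inconsistent with the first two)
def residuals(x):
    return np.array([x[0] + x[1] - 2.0,
                     x[0] - x[1],
                     x[0] - 3.0])

w = np.array([1.0, 1.0, 0.1])  # weights w_i > 0; smaller = less important

# Converted objective: f(x) = sum_i w_i * P_i(x)^2
def f(x):
    return np.sum(w * residuals(x) ** 2)

x0 = np.zeros(2)          # initial point
result = minimize(f, x0)  # generic unconstrained minimizer
print(result.x, f(result.x))
```

Re-running with different weights `w` shows how the compromise solution shifts toward the equations deemed more important.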

The Approach

Basic structure of a numerical algorithm for minimizing f(x)

Step 1: pick a point you consider reasonable, or generate one at random; choose a tolerance $\varepsilon$ that defines what you consider a satisfactory result; and set an iteration counter $k = 0$.

Step 1: choose an initial point $x_0$, set a convergence tolerance $\varepsilon$, and set a counter $k = 0$.

Step 2 (the most critical step): determine the search direction.

Step 2: determine a search direction $d_k$ for reducing $f(x)$ from the point $x_k$.

Step 3 (the next most critical): determine the step size.

Step 3: determine a step size $\alpha_k$ such that $f(x_k + \alpha d_k)$ is minimized over $\alpha \geq 0$, and construct $x_{k+1} = x_k + \alpha_k d_k$.

Step 4: check how far this step moved. If the move was short, stop immediately.

Step 4: if $\|\alpha_k d_k\| < \varepsilon$, stop and output the solution $x_{k+1}$; else set $k := k + 1$ and repeat from Step 2.

Comments:

a) Steps 2 and 3 are the key steps of an optimization algorithm.
b) Different ways to accomplish Step 2 lead to different algorithms.
c) Step 3 is a one-dimensional optimization problem and is often called the line search step. A Python sketch of this overall structure follows.
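The four steps translate almost line for line into code. Below is a minimal sketch of my own, under two stated assumptions: Step 2 uses the negative gradient as the search direction (just one of many possible choices), and the line search in Step 3 uses `scipy.optimize.minimize_scalar` over a bounded interval $[0, 1]$, which is itself a simplification.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def minimize_f(f, grad_f, x0, eps=1e-6, max_iter=1000):
    # Step 1: initial point x_0, tolerance eps, counter k = 0
    x = np.asarray(x0, dtype=float)
    for k in range(max_iter):
        # Step 2: search direction d_k; here, steepest descent
        d = -grad_f(x)
        # Step 3: step size alpha_k minimizing f(x + alpha d) for alpha >= 0
        # (searched over the assumed interval [0, 1])
        line = minimize_scalar(lambda a: f(x + a * d),
                               bounds=(0.0, 1.0), method='bounded')
        step = line.x * d
        x = x + step                      # x_{k+1} = x_k + alpha_k d_k
        # Step 4: stop when the move ||alpha_k d_k|| is short enough
        if np.linalg.norm(step) < eps:
            break
    return x

# Made-up test: f(x) = x1^2 + 10 x2^2, minimum at the origin
f = lambda x: x[0]**2 + 10 * x[1]**2
grad_f = lambda x: np.array([2 * x[0], 20 * x[1]])
print(minimize_f(f, grad_f, [3.0, -2.0]))
```

Swapping in a different rule for `d` in Step 2 (e.g., a Newton direction) changes the algorithm while the surrounding structure stays the same, which is exactly comment b).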

First-order partial derivatives:

First-order necessary condition: if $x^*$ is a minimum point (minimizer), then $\nabla f(x^*) = 0$. In other words, if $x^*$ is a minimum point, then it must be a stationary point.

$$\nabla f = \begin{bmatrix} \frac{\partial f}{\partial x_1} \\ \vdots \\ \frac{\partial f}{\partial x_n} \end{bmatrix}$$

Second-order partial derivative matrix:

Second-order sufficient condition: if $\nabla f(x^*) = 0$ and $H(x^*)$ is a positive definite matrix, i.e., $H(x^*) > 0$, then $x^*$ is a minimum point (minimizer).

$$H(x) = \begin{bmatrix} \frac{\partial^2 f}{\partial x_1^2} & \frac{\partial^2 f}{\partial x_1 \partial x_2} & \cdots & \frac{\partial^2 f}{\partial x_1 \partial x_n} \\ \frac{\partial^2 f}{\partial x_1 \partial x_2} & \frac{\partial^2 f}{\partial x_2^2} & \cdots & \frac{\partial^2 f}{\partial x_2 \partial x_n} \\ \vdots & \vdots & \ddots & \vdots \\ \frac{\partial^2 f}{\partial x_1 \partial x_n} & \frac{\partial^2 f}{\partial x_2 \partial x_n} & \cdots & \frac{\partial^2 f}{\partial x_n^2} \end{bmatrix}$$

The entry in row $i$, column $j$ is $\frac{\partial^2 f}{\partial x_i \partial x_j}$. This matrix of second-order partial derivatives is symmetric; it is called the Hessian matrix. When $n = 1$, $H(x) = \frac{d^2 f}{dx^2}$.

Relation between the Hessian matrix and the gradient: $H(x) = \nabla(\nabla^T f(x))$
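As an illustration of checking both conditions numerically, here is a small Python sketch of my own: the quadratic $f$, its gradient, and its Hessian below are made-up examples, and positive definiteness is tested through the eigenvalues of the symmetric Hessian via `numpy.linalg.eigvalsh`.

```python
import numpy as np

# Hypothetical function f(x) = (x1 - 1)^2 + 2 x2^2 with minimizer x* = (1, 0)
def grad_f(x):
    return np.array([2.0 * (x[0] - 1.0), 4.0 * x[1]])

def hessian_f(x):
    # Constant for this quadratic; entry (i, j) is d^2 f / (dx_i dx_j)
    return np.array([[2.0, 0.0],
                     [0.0, 4.0]])

x_star = np.array([1.0, 0.0])

# First-order necessary condition: the gradient vanishes at x*
print(np.allclose(grad_f(x_star), 0.0))      # True -> stationary point

# Second-order sufficient condition: H(x*) is positive definite,
# i.e., all eigenvalues of the symmetric Hessian are strictly positive
eigvals = np.linalg.eigvalsh(hessian_f(x_star))
print(eigvals, bool(np.all(eigvals > 0)))    # [2. 4.] True -> minimizer
```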