This article is a study note on "14 Lectures on Visual SLAM" (视觉SLAM十四讲): implementing the Gauss-Newton method from scratch with the Eigen library.
Hand-Writing the Gauss-Newton Method (Implemented with Eigen)
1. Introduction to the Gauss-Newton Method
A simple least-squares problem can be stated as follows:
$$\begin{aligned} \underset{x}{\min}\,F(x)=\frac{1}{2}\|f(x)\|^2 \end{aligned}$$
The Gauss-Newton method is one of the simplest optimization algorithms, and we use it here to solve this least-squares problem. Its idea is to apply a first-order Taylor expansion to $f(x)$. (Note that the expansion is of $f(x)$ itself, not of the objective function $F(x)$.)
First, recall the Taylor expansion:
$$\begin{aligned} f(x)=\frac{f(x_0)}{0!}+\frac{f'(x_0)}{1!}(x-x_0)+\frac{f''(x_0)}{2!}(x-x_0)^{2}+\cdots+\frac{f^{(n)}(x_0)}{n!}(x-x_0)^{n}+R_n(x) \end{aligned}$$
The first-order Taylor expansion of $f(x)$ is:
$$\begin{aligned} f(x+\Delta x)\approx f(x)+J(x)^{T}\Delta x \end{aligned}$$
where $J(x)$ is the derivative of $f(x)$ with respect to $x$, an $n\times 1$ column vector. Our goal is to find an increment $\Delta x$ such that $\|f(x+\Delta x)\|^2$ is minimized. To find $\Delta x$, we need to solve a linear least-squares problem:
$$\begin{aligned} \Delta x^{*}=\arg\,\underset{\Delta x}{\min}\,\frac{1}{2}\|f(x)+J(x)^{T}\Delta x\|^{2} \end{aligned}$$
By the first-order optimality condition, we differentiate this objective with respect to $\Delta x$ and set the derivative to zero. To do so, first expand the squared term:
$$\begin{aligned} \frac{1}{2}\|f(x)+J(x)^{T}\Delta x\|^{2}&=\frac{1}{2}\left(f(x)+J(x)^{T}\Delta x\right)^{T}\left(f(x)+J(x)^{T}\Delta x\right)\\ &=\frac{1}{2}\left(\|f(x)\|^{2}_{2}+2f(x)J(x)^{T}\Delta x+\Delta x^{T}J(x)J(x)^{T}\Delta x\right) \end{aligned}$$
Taking the derivative of the above with respect to $\Delta x$ and setting it to zero:
$$\begin{aligned} J(x)f(x)+J(x)J(x)^{T}\Delta x=0 \end{aligned}$$
This yields the following system of equations:
$$\begin{aligned} J(x)J(x)^{T}\Delta x=-J(x)f(x) \end{aligned}$$
Let $H(x)=J(x)J(x)^{T}$ and $g(x)=-J(x)f(x)$; the system can then be written as:
$$\begin{aligned} H(x)\Delta x=g(x) \end{aligned}$$
This is a linear equation in the variable $\Delta x$, and solving it is the core of the whole optimization problem. If the equation can be solved, the Gauss-Newton algorithm proceeds as follows:
Steps of the Gauss-Newton method:
1. Choose an initial value $x_0$;
2. At the $k$-th iteration, compute the current Jacobian $J(x_k)$ and the error $f(x_k)$;
3. Solve the increment equation $H\Delta x_k=g$;
4. If $\Delta x_k$ is sufficiently small, stop; otherwise set $x_{k+1}=x_k+\Delta x_k$ and return to step 2.
2. Implementing the Gauss-Newton Method with Eigen
2.1 Constructing the Nonlinear Function
Consider a curve satisfying the following equation:
$$\begin{aligned} y=e^{ax^2+bx+c}+w \end{aligned}$$
where $a,b,c$ are the curve parameters and $w$ is Gaussian noise satisfying $w\sim N(0,\sigma^2)$. Suppose we have $N$ observed data points $(x_i,y_i)$; the curve parameters are estimated from these observations by solving the least-squares problem:
$$\begin{aligned} \underset{a,b,c}{\min}\,\frac{1}{2}\sum_{i=1}^{N}\left\|y_i-e^{ax_i^2+bx_i+c}\right\|^2 \end{aligned}$$
Define the error term as:
$$\begin{aligned} e_i=y_i-e^{ax_i^2+bx_i+c} \end{aligned}$$
The derivative of each error term with respect to the state variables is then:
$$\begin{aligned} \frac{\partial e_i}{\partial a}&=-x_i^2e^{ax_i^2+bx_i+c}\\ \frac{\partial e_i}{\partial b}&=-x_ie^{ax_i^2+bx_i+c}\\ \frac{\partial e_i}{\partial c}&=-e^{ax_i^2+bx_i+c} \end{aligned}$$
Hence $J_i=\left[\frac{\partial e_i}{\partial a},\frac{\partial e_i}{\partial b},\frac{\partial e_i}{\partial c}\right]^{T}$, and the increment equation is:
$$\begin{aligned} \left(\sum_{i=1}^{100}J_iJ_i^{T}\right)\Delta x=\sum_{i=1}^{100}-J_ie_i \end{aligned}$$
2.2 Eigen Implementation
#include <iostream>
#include <vector>
#include <cmath>
#include <chrono>
#include <opencv2/opencv.hpp>
#include <Eigen/Dense>
using namespace std;
using namespace Eigen;
int main()
{
    /* Part 1: generate the observations (xi, yi) */
    double ar = 1.0, br = 2.0, cr = 1.0;  // ground-truth parameter values
    double ae = 2.0, be = -1.0, ce = 5.0; // initial parameter estimates
    int N = 100;                          // number of data points
    double w_sigma = 1.0;                 // sigma of the Gaussian noise
    cv::RNG rng;                          // OpenCV random number generator
    vector<double> x_data, y_data;        // observations
    for (int i = 0; i < N; i++)
    {
        double x = i / 100.0;
        x_data.push_back(x);
        y_data.push_back(exp(ar*x*x + br*x + cr) + rng.gaussian(w_sigma*w_sigma)); // add Gaussian noise
    }
    /* Part 2: run the Gauss-Newton iterations */
    int iterations = 100;
    double cost = 0, lastCost = 0; // cost of the current and previous iterations
    // start timing
    chrono::steady_clock::time_point t1 = chrono::steady_clock::now();
    // iterate at most `iterations` times
    for (int iter = 0; iter < iterations; iter++)
    {
        Matrix3d H = Matrix3d::Zero(); // H = J * J^T
        Vector3d b = Vector3d::Zero(); // b = -J * f
        cost = 0;
        // accumulate the residual of every observation, i.e. the f in F = 1/2*||f||^2
        for (int i = 0; i < N; i++)
        {
            double xi = x_data[i], yi = y_data[i]; // the i-th data point
            double error = yi - exp(ae*xi*xi + be*xi + ce);
            Vector3d J; // Jacobian of the error term
            J[0] = -xi*xi*exp(ae*xi*xi + be*xi + ce); // de/da
            J[1] = -xi*exp(ae*xi*xi + be*xi + ce);    // de/db
            J[2] = -exp(ae*xi*xi + be*xi + ce);       // de/dc
            H += J*J.transpose();
            b += -error*J;
            cost += error * error;
        }
        // solve the linear system H*dx = b
        Vector3d dx = H.ldlt().solve(b);
        if (isnan(dx[0]))
        {
            std::cout << "result is nan!" << std::endl;
            break;
        }
        // stop if this iteration did not decrease the objective
        if (iter > 0 && cost >= lastCost)
        {
            std::cout << "cost: " << cost << ">= lastCost:" << lastCost << ",break." << std::endl;
            break;
        }
        // update the parameters
        ae += dx[0];
        be += dx[1];
        ce += dx[2];
        // record the cost of this iteration
        lastCost = cost;
        // print the iteration result
        std::cout << "iteration:" << iter+1 << "\ttotal cost:" << cost << "\t\tupdate:" << dx.transpose() << "\t\testimated params:" << ae << "," << be << "," << ce << endl;
    }
    chrono::steady_clock::time_point t2 = chrono::steady_clock::now();
    chrono::duration<double> time_used = chrono::duration_cast<chrono::duration<double>>(t2 - t1);
    std::cout << "solve time cost=" << time_used.count() << "seconds." << endl;
    std::cout << "estimated abc=" << ae << "," << be << "," << ce << endl;
    return 0;
}
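For reference, one way to build the program above, assuming Eigen's headers live under /usr/include/eigen3, OpenCV 4 is discoverable via pkg-config, and the source file is named gauss_newton.cpp (all three are assumptions about your setup):

```shell
# Paths, package names, and the file name are assumptions; adjust to your installation.
g++ -std=c++11 -O2 gauss_newton.cpp -o gauss_newton \
    -I /usr/include/eigen3 $(pkg-config --cflags --libs opencv4)
./gauss_newton
```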
2.3 Output
The final estimates of a, b, c are 0.890912, 2.1719, 0.943629, obtained in about 9 ms.
iteration:1 total cost:3.19575e+06 update:0.0455771 0.078164 -0.985329 estimated params:2.04558,-0.921836,4.01467
iteration:2 total cost:376785 update: 0.065762 0.224972 -0.962521 estimated params:2.11134,-0.696864,3.05215
iteration:3 total cost:35673.6 update:-0.0670241 0.617616 -0.907497 estimated params:2.04432,-0.0792484,2.14465
iteration:4 total cost:2195.01 update:-0.522767 1.19192 -0.756452 estimated params:1.52155,1.11267,1.3882
iteration:5 total cost:174.853 update:-0.537502 0.909933 -0.386395 estimated params:0.984045,2.0226,1.00181
iteration:6 total cost:102.78 update:-0.0919666 0.147331 -0.0573675 estimated params:0.892079,2.16994,0.944438
iteration:7 total cost:101.937 update:-0.00117081 0.00196749 -0.00081055 estimated params:0.890908,2.1719,0.943628
iteration:8 total cost:101.937 update: 3.4312e-06 -4.28555e-06 1.08348e-06 estimated params:0.890912,2.1719,0.943629
iteration:9 total cost:101.937 update:-2.01204e-08 2.68928e-08 -7.86602e-09 estimated params:0.890912,2.1719,0.943629
cost: 101.937>= lastCost:101.937,break.
solve time cost=0.00926409seconds.
estimated abc=0.890912,2.1719,0.943629
3. References
3.1 Books
"14 Lectures on Visual SLAM" (视觉SLAM十四讲) – Gao Xiang