Solving for Linear Regression Coefficients

Article link: https://blog.csdn.net/suxiaorui/article/details/89505256

I recently studied linear regression, so here is a record of how to solve for the coefficients in $y=\theta _{0}+\theta _{1}x$.

Method 1: Solve for theta with the least-squares normal equation
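
As a brief reminder of where the normal equation comes from (a sketch, assuming $X_b^{T}X_b$ is invertible, with $X_b$ denoting the feature matrix with a leading column of ones, as in the code that follows): least squares minimizes the squared error, and setting its gradient to zero gives the closed-form solution.

$$
J(\theta) = \lVert X_b\theta - y \rVert^2, \qquad
\nabla_\theta J = 2X_b^{T}(X_b\theta - y) = 0
\;\Longrightarrow\;
\hat{\theta} = (X_b^{T}X_b)^{-1}X_b^{T}y
$$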

import numpy as np
import matplotlib.pyplot as plt


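# Randomly generate the single feature X1; rand draws from a uniform distribution on [0, 1)
X = 2 * np.random.rand(100, 1)
# Construct the true y column; np.random.randn(100, 1) is the error term, drawn from the standard normal distribution
y = 4 + 3 * X + np.random.randn(100, 1)
# Combine X0 (a column of ones for the intercept) with X1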
X_b = np.c_[np.ones((100, 1)), X]

# Solve for theta with the normal equation
theta_best = np.linalg.inv(X_b.T.dot(X_b)).dot(X_b.T).dot(y)
print(theta_best)

# Create X1 for the test set
X_new = np.array([[0], [2]])
X_new_b = np.c_[(np.ones((2, 1))), X_new]
y_predict = X_new_b.dot(theta_best)
print(y_predict)

plt.plot(X_new, y_predict, 'r-')
plt.plot(X, y, 'b.')
plt.axis([0, 2, 0, 15])
plt.show()

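In this run the computed $\theta _{0}$ is 4.09549294 and $\theta _{1}$ is 3.03864514 (theta_best lists the intercept first, then the slope), both fairly close to the 4 and 3 used to generate the data. The plot shows the fitted line in red over the blue data points.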
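
Inverting $X_b^{T}X_b$ explicitly works for this small problem, but a least-squares solver is usually the more numerically robust route. The sketch below is not part of the original post. It uses `np.linalg.lstsq` on data generated the same way as in Method 1, with a fixed seed (an assumption added here for repeatability), and checks that both approaches give the same coefficients.

```python
import numpy as np

rng = np.random.default_rng(42)            # fixed seed (assumption, for repeatability)
X = 2 * rng.random((100, 1))               # uniform feature on [0, 2)
y = 4 + 3 * X + rng.standard_normal((100, 1))
X_b = np.c_[np.ones((100, 1)), X]          # prepend the column of ones

# Normal equation with an explicit inverse, as in Method 1
theta_inv = np.linalg.inv(X_b.T @ X_b) @ X_b.T @ y

# Least-squares solver: same minimizer, without forming the inverse
theta_lstsq, residuals, rank, sv = np.linalg.lstsq(X_b, y, rcond=None)

print(theta_inv.ravel())
print(theta_lstsq.ravel())
```

Both printed vectors hold the intercept first and the slope second, and they should agree to within floating-point error.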

Method 2: Use LinearRegression from the sklearn library.

import numpy as np
from sklearn.linear_model import LinearRegression


X = 2 * np.random.rand(100, 1)
y = 4 + 3 * X + np.random.randn(100, 1)

lin_reg = LinearRegression()
lin_reg.fit(X, y)
print(lin_reg.intercept_, lin_reg.coef_)

X_new = np.array([[0], [2]])
print(lin_reg.predict(X_new))

Method 3: Use the keras library

import keras
from keras import layers
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 100, 30)
y = 3 * x + 4 + np.random.randn(30) * 6

# plt.scatter(x, y)
# plt.show()
model = keras.Sequential()  # sequential model
# A single Dense unit with one input computes y = w * x + b
model.add(layers.Dense(1, input_dim=1))
# Compile the model
model.compile(optimizer='adam',
              loss='mse'
)
# Train the model
model.fit(x, y, epochs=5000)
plt.scatter(x,y,c='r')
plt.plot(x,model.predict(x))
plt.show()
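
The keras model above finds the coefficients iteratively by minimizing the mean squared error. To make that idea concrete, here is a minimal NumPy sketch of plain batch gradient descent on the same kind of data. It is not what keras runs internally (the model above uses the Adam optimizer); the seed, the learning rate `lr`, the iteration count, and the standardization of `x` are all assumptions chosen for this illustration.

```python
import numpy as np

rng = np.random.default_rng(0)             # fixed seed (assumption, for repeatability)
x = np.linspace(0, 100, 30)
y = 3 * x + 4 + rng.standard_normal(30) * 6

# Standardize x so that one learning rate suits both parameters
x_mean, x_std = x.mean(), x.std()
xs = (x - x_mean) / x_std

a, b = 0.0, 0.0                            # intercept and slope on the standardized scale
lr = 0.1                                   # learning rate (assumed value)
for _ in range(1000):
    residual = a + b * xs - y              # prediction error for every sample
    grad_a = 2 * residual.mean()           # d(MSE)/da
    grad_b = 2 * (residual * xs).mean()    # d(MSE)/db
    a -= lr * grad_a
    b -= lr * grad_b

# Convert back to the original scale: y = theta0 + theta1 * x
theta1 = b / x_std
theta0 = a - b * x_mean / x_std
print(theta0, theta1)
```

The recovered `theta0` and `theta1` can be compared with `np.polyfit(x, y, 1)`, which returns the slope and intercept of the least-squares line directly.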