Chapter 1 Introduction
Frequentist vs. Bayesian
- The frequentist school holds that a variable's probability distribution is fixed. The Bayesian school, by contrast, treats the distribution as uncertain: it is revised as new information and new observations arrive. One first posits a prior distribution based on existing experience, then combines it with the observed data to obtain the posterior distribution.
Proponents of the frequentist approach consider the source of
uncertainty to be the randomness inherent in realizations of a random
variable. The probability distributions of variables are not subject
to uncertainty. In contrast, Bayesian statistics treats probability
distributions as uncertain and subject to modification as new
information becomes available. Uncertainty is implicitly incorporated
by probability updating. The probability beliefs based on the existing
knowledge base take the form of the prior probability. The posterior
probability represents the updated beliefs.
Chapter 2 The Bayesian Framework - The Likelihood Function
Poisson Distribution
- The Poisson probability mass function:
$$p(X=k)=\frac{\theta^{k}}{k!} e^{-\theta}, \quad k=0,1,2,\ldots$$
- Given 20 observations $x_{1}, x_{2}, \ldots, x_{20}$, the joint probability (the likelihood) is:
$$\begin{aligned} L\left(\theta | x_{1}, x_{2}, \ldots, x_{20}\right) &=\prod_{i=1}^{20} p\left(X=x_{i} | \theta\right)=\prod_{i=1}^{20} \frac{\theta^{x_{i}}}{x_{i}!} e^{-\theta} \\ &=\frac{\theta^{\sum_{i=1}^{20} x_{i}}}{\prod_{i=1}^{20} x_{i}!} e^{-20\theta} \end{aligned}$$
- Dropping the constant factor, this is proportional to:
$$L\left(\theta | x_{1}, x_{2}, \ldots, x_{20}\right) \propto \theta^{\sum_{i=1}^{20} x_{i}} e^{-20\theta}$$
- Maximum likelihood estimation yields the estimator
$$\widehat{\theta}=\bar{x}=\frac{\sum_{i=1}^{20} x_{i}}{20},$$
i.e., the mean of the 20 observations.
- An implicit assumption in this derivation is that the 20 observations are mutually independent.
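To sanity-check the result above, the sketch below (with a hypothetical sample of 20 counts) evaluates the Poisson log-likelihood and confirms that the sample mean scores at least as high as nearby candidate values of $\theta$:

```python
import math

# Hypothetical sample of 20 Poisson counts.
data = [2, 3, 1, 4, 2, 2, 5, 3, 0, 2,
        3, 1, 2, 4, 3, 2, 1, 3, 2, 3]

def log_likelihood(theta, xs):
    # log L(theta | x) = sum_i [ x_i*log(theta) - theta - log(x_i!) ]
    return sum(x * math.log(theta) - theta - math.lgamma(x + 1) for x in xs)

# The MLE derived above is the sample mean.
mle = sum(data) / len(data)

# The sample mean should beat nearby candidate values of theta.
best = max([mle - 0.5, mle, mle + 0.5], key=lambda t: log_likelihood(t, data))
print(mle, best)  # -> 2.4 2.4
```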
Normal Distribution
- Probability density function:
$$f(y)=\frac{1}{\sqrt{2\pi}\, \sigma} e^{-\frac{(y-\mu)^{2}}{2\sigma^{2}}}$$
Bayes' Formula
- $P(E | D)=\frac{P(D | E) \times P(E)}{P(D)}$
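A minimal numeric sketch of the formula, with all probabilities hypothetical: take a prior $P(E)=0.01$, likelihood $P(D|E)=0.9$, and $P(D|\neg E)=0.05$; the denominator $P(D)$ then expands by the law of total probability:

```python
# All numbers are hypothetical: prior, likelihood, false-positive rate.
p_e = 0.01            # prior P(E)
p_d_given_e = 0.90    # likelihood P(D | E)
p_d_given_not_e = 0.05

# Law of total probability: P(D) = P(D|E)P(E) + P(D|~E)P(~E).
p_d = p_d_given_e * p_e + p_d_given_not_e * (1 - p_e)

# Bayes' formula: P(E|D) = P(D|E) * P(E) / P(D).
p_e_given_d = p_d_given_e * p_e / p_d
print(round(p_e_given_d, 4))  # -> 0.1538
```

Note how a rare event stays fairly unlikely even after positive evidence: the small prior keeps the posterior well below the likelihood.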
Bayesian Inference and the Binomial Distribution
- The Beta distribution is the conjugate prior for the binomial distribution.
The beta distribution is the conjugate prior distribution for the
binomial parameter θ. This means that the posterior distribution of θ
is also a beta distribution (of course, with updated parameters).
- Sometimes, when the sample is very large, the choice among different prior distributions makes little practical difference to the posterior, because a large sample drowns out the prior information.
The two posterior estimates and the maximum-likelihood estimate are
the same for all practical purposes. The reason is that the sample
size is so large that the information contained in the data sample
"swamps out" the prior information. In Chapter 3, we further
illustrate and comment on the role sample size plays in posterior
inference.
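The conjugate update and the swamping effect can both be sketched in a few lines; the hyperparameter choices below are purely illustrative:

```python
def beta_binomial_posterior(a, b, successes, failures):
    """Conjugate update: a Beta(a, b) prior plus binomial data
    gives a Beta(a + successes, b + failures) posterior."""
    return a + successes, b + failures

def beta_mean(a, b):
    # Mean of a Beta(a, b) distribution.
    return a / (a + b)

# Two deliberately different priors (hyperparameters are illustrative).
priors = {"flat": (1, 1), "skeptical": (2, 18)}

# Small sample (7 successes, 3 failures): the priors disagree visibly.
for name, (a, b) in priors.items():
    pa, pb = beta_binomial_posterior(a, b, 7, 3)
    print(name, round(beta_mean(pa, pb), 3))

# Large sample (7000 successes, 3000 failures): both posteriors land
# next to the maximum-likelihood estimate 0.7 -- the data swamp the prior.
for name, (a, b) in priors.items():
    pa, pb = beta_binomial_posterior(a, b, 7000, 3000)
    print(name, round(beta_mean(pa, pb), 4))
```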
Chapter 3 Prior and Posterior Information, and Predictive Inference
The posterior is proportional to the likelihood times the prior:
- $p(\theta | \boldsymbol{y}) \propto L(\theta | \boldsymbol{y}) \pi(\theta)$
where:
- θ = unknown parameter whose inference we are interested in.
- y = a vector (or a matrix) of recorded observations.
- π(θ) = prior distribution of θ, depending on one or more parameters, called hyperparameters.
- L(θ|y) = likelihood function for θ.
- p(θ|y) = posterior (updated) distribution of θ.
From the formula above we can see:
- Two factors shape the posterior distribution: the prior information and the observed data.
- The influence of the prior typically weakens as the number of observations grows.
- If the observed data set is small, the prior largely determines the posterior.
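These points can be illustrated with a simple grid approximation of $p(\theta|y) \propto L(\theta|y)\pi(\theta)$ for Poisson data; the data and the Exponential(1) prior below are assumptions chosen purely for illustration:

```python
import math

# Hypothetical Poisson counts.
data = [3, 1, 4, 2, 3, 5, 2, 3]
n = len(data)

def poisson_log_lik(theta, xs):
    # log-likelihood of Poisson data at rate theta.
    return sum(x * math.log(theta) - theta - math.lgamma(x + 1) for x in xs)

def prior(theta):
    # Assumed Exponential(1) prior density, purely for illustration.
    return math.exp(-theta)

# Discretize theta and form the unnormalized posterior: likelihood x prior.
grid = [0.01 * i for i in range(1, 1001)]  # theta in (0, 10]
unnorm = [math.exp(poisson_log_lik(t, data)) * prior(t) for t in grid]
z = sum(unnorm)
posterior = [u / z for u in unnorm]

# Posterior mean vs. the MLE (the sample mean): the prior pulls the
# estimate toward smaller theta; with more data the pull would shrink.
post_mean = sum(t * p for t, p in zip(grid, posterior))
mle = sum(data) / n
print(round(post_mean, 3), round(mle, 3))
```

Since Exponential(1) is Gamma(1, 1), the exact posterior here is Gamma(24, 9) with mean 24/9 ≈ 2.667, which the grid approximation recovers; the MLE is 2.875, so the prior visibly pulls the estimate down with only 8 observations.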