Heteroscedasticity异方差

Published: 17 Jul 2019 Category: stats

Heteroscedasticity「异方差」是相对于Homoscedastic「同方差」而言,其定义如下

Heteroscedasticity refers to data with unequal variability (scatter) across a set of second, predictor variables.

如果出现异方差,一般回归分析会受影响。简单讲是variability(常以variance衡量)沿着某个预测变量,不是恒定的,如果作图「预测变量与目标变量」,典型的会呈现锥形的散点图

Heteroscedasticity.png (556×357)

一个例子,比如用家庭年收入预测在奢侈品上的花费,则奢侈品花费即是异方差的。

As expected, there is a strong, positive association between income and spending. Upon examining the residuals we detect a problem – the residuals are very small for low values of family income (almost all families with low incomes don’t spend much on luxury items) while there is great variation in the size of the residuals for wealthier families (some families spend a great deal on luxury items while some are more moderate in their luxury spending). This situation represents heteroscedasticity because the size of the error varies across values of the independent variable.

Refs