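F********E posts: 1025 | 1 Is there a numerical measure for evaluating this? (The textbook mentions covariance, but that doesn't seem to be it.)
Ideally, is there a criterion similar to relative standard deviation that does not depend on the scale of the original data?
For example: 0 for a perfect fit, 1 for a terrible fit.
| G******y posts: 1831 | 2 Don't people also look at MSE and prediction error?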
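A minimal sketch of the two ideas above, with made-up numbers and hypothetical function names. MSE depends on the units of y, while SSE/SST (that is, 1 - R^2) is scale-free: it is 0 for a perfect fit and 1 when the model does no better than predicting the mean. Note the assumption: for a nonlinear model this ratio is not guaranteed to stay below 1, so treat it as a rough guide only.

```python
def mse(y_true, y_pred):
    """Mean squared error; its value depends on the units of y."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def fraction_unexplained(y_true, y_pred):
    """SSE / SST, i.e. 1 - R^2.

    0 for a perfect fit, 1 when the predictions are no better than
    the mean of y. It can exceed 1 if the fit is worse than the mean.
    """
    mean_y = sum(y_true) / len(y_true)
    sse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    sst = sum((t - mean_y) ** 2 for t in y_true)
    return sse / sst


# Illustrative data only.
y = [1.0, 2.0, 3.0, 4.0]
perfect = [1.0, 2.0, 3.0, 4.0]
mean_only = [2.5, 2.5, 2.5, 2.5]

print(fraction_unexplained(y, perfect))    # 0.0
print(fraction_unexplained(y, mean_only))  # 1.0
```

Rescaling y and the predictions by the same factor changes the MSE but leaves the SSE/SST ratio unchanged.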
| c****t posts: 19049 | 3 It depends on whether you are doing prediction, inference, or just exploration.
Since you are in insurance, I am guessing you are building a predictive model. In
most cases you should ask your boss this question rather than anyone else,
or your job could be in danger, because most people in this field in the insurance
industry have never been trained in statistics.
In theory, how well your model performs on the holdout dataset is the key,
more than any statistical measure. Just hope your boss at least
understands t
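To make the holdout point above concrete, here is a rough sketch on synthetic data. Everything in it (the data, the straight-line model, the 30% holdout fraction, the function names) is an assumption for illustration, not anyone's actual workflow. It also assumes the records are independent, so a random split is reasonable.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    return mean_y - b * mean_x, b


def mse(xs, ys, a, b):
    """Mean squared error of the fitted line on the given points."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys)) / len(xs)


def split_indices(n, holdout_frac, seed):
    """Randomly split range(n) into training and holdout index lists."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    cut = n - int(round(n * holdout_frac))
    return idx[:cut], idx[cut:]


# Synthetic data, made up for illustration: y = 2 + 3x + Gaussian noise.
rng = random.Random(0)
xs = [rng.uniform(0, 10) for _ in range(200)]
ys = [2 + 3 * x + rng.gauss(0, 1) for x in xs]

train_idx, hold_idx = split_indices(len(xs), holdout_frac=0.3, seed=1)
train_x = [xs[i] for i in train_idx]
train_y = [ys[i] for i in train_idx]
hold_x = [xs[i] for i in hold_idx]
hold_y = [ys[i] for i in hold_idx]

# Fit on the training part only, then score both parts.
a, b = fit_line(train_x, train_y)
train_mse = mse(train_x, train_y, a, b)
holdout_mse = mse(hold_x, hold_y, a, b)
print("training MSE:", train_mse)
print("holdout MSE: ", holdout_mse)
```

The holdout MSE is the number to compare across candidate models; the training MSE is shown only to see how optimistic it is.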
| w**********y posts: 1691 | 4 Likelihood
A personal, subjective suggestion: simply divide your log-likelihood by the
number of data points; that gives you a sense of the goodness of fit.
Mean error
Training error and true (predictive) error
- I didn't know how people used "cross validation" with 'holdout' data
until I worked at an insurance company. Theoretically, what they did is not
that perfect.
AIC, BIC
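The likelihood-based items in the list above can be sketched as follows. This is only an illustration under an assumed Gaussian error model; the function names are hypothetical, and k is taken to count every estimated parameter, including the error variance.

```python
import math


def gaussian_loglik(residuals):
    """Maximized Gaussian log-likelihood given the model residuals.

    Uses the maximum-likelihood variance estimate sigma^2 = SSE / n.
    """
    n = len(residuals)
    sigma2 = sum(r * r for r in residuals) / n
    return -0.5 * n * (math.log(2 * math.pi * sigma2) + 1)


def aic(loglik, k):
    """Akaike information criterion; smaller is better."""
    return 2 * k - 2 * loglik


def bic(loglik, k, n):
    """Bayesian information criterion; smaller is better."""
    return k * math.log(n) - 2 * loglik


# Illustrative residuals from some fitted model (made up).
residuals = [1.0, -1.0, 1.0, -1.0]
n = len(residuals)
k = 3  # assumed: two regression coefficients plus the error variance

ll = gaussian_loglik(residuals)
print("log-likelihood per observation:", ll / n)
print("AIC:", aic(ll, k))
print("BIC:", bic(ll, k, n))
```

AIC and BIC are only meaningful when comparing models fitted to the same data; the per-observation log-likelihood makes fits on datasets of different sizes roughly comparable.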
In theory there is no big difference in model evaluation between linear and
non-linear regression. Just harder
| w**********y posts: 1691 | 5 I agree with casact.
As far as I know, in my company people make the judgment subjectively, based on their
market sense and on whether the model fit is meaningful and explainable.
Feature selection is similar: they just try, try, and try. They try
different categorizations and different models, linear or nonlinear, and finally
analyze based on the output tabulation and simple p-values.
They never use advanced statistical methods. The way they utilize
smoothing methods (Spline, LOWESS..), Simulation (
| s********a posts: 1100 |
| c****t posts: 19049 | 7 "with such huge dataset (millions records)"
That's exactly where the insurance industry went very wrong (although this is just one
example).
Insurance data has dramatic temporal and spatial correlations, which greatly distort the
modeling results. That's why, no matter what method or
software you use, you won't end up far from an "average" result. Since
spatial analysis (or, essentially, spatio-temporal analysis) is only now an active topic in
statistics research, and there is no math (probability) development
supporting i |