Built a model, then found at the last step that the classification table is lopsided. How to explain it? - Statistics board - 未名存档

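b********1 Posts: 291	1 It took a lot of work to build this regression model, and all the diagnostics looked great: the f, g, p, a, o checks all passed, and the lift curve and gains chart looked decent too, so I thought I was done. Then my boss asked me to also compare observed values against predicted values. So I used the p= option in the output statement and ran:

proc freq;
  tables f_dv*i_dv / list missing;
  title 'compare predicted value vs observed value';
run;

The observed and predicted values did not match at all; they were miles apart. I then used ctable to produce a classification table and found that the table really did look off. In the classification tables I have built before, sensitivity and specificity both reach their best values together at cutoffs within [.4, .6], at roughly 70% to 80%. In today's table, sensitivity drops straight from 100% to below 40%, while specificity shoots from 0 to 94%. The odd thing is that my ROC and lift curves do not look bad. Can anyone who understands this weigh in?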
y**3 Posts: 267	2 Is your dependent variable binary (yes or no)? What percentage is Yes? Maybe it is too rare.
g******2 Posts: 234	3 I think your model probably depends heavily on a single binary (or categorical) variable.
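To illustrate the suggestion in post 3, here is a minimal Python sketch of what a classification table looks like when the fitted probabilities take only two values. The counts are hypothetical, chosen only to mimic the jump described in post 1, and the rule "positive when fitted probability >= cutoff" is an assumption of this sketch. With a single binary predictor, a logistic fit with an intercept reproduces the observed event rate within each level of that predictor, so sensitivity and specificity can only change when the cutoff crosses one of those two rates.

```python
# Hypothetical counts, chosen only to mimic the pattern in post 1.
# Each tuple: (value of the binary predictor x, observed outcome y, number of rows)
cells = [(1, 1, 38), (1, 0, 54), (0, 1, 62), (0, 0, 846)]


def fitted_probability(x):
    """Event rate within one level of x, which is what a logistic model
    with an intercept and this single binary predictor would fit."""
    events = sum(n for xv, y, n in cells if xv == x and y == 1)
    total = sum(n for xv, _, n in cells if xv == x)
    return events / total


def sens_spec(cutoff):
    """Sensitivity and specificity when a row is called positive
    if its fitted probability is >= cutoff (assumed rule)."""
    tp = sum(n for x, y, n in cells if y == 1 and fitted_probability(x) >= cutoff)
    tn = sum(n for x, y, n in cells if y == 0 and fitted_probability(x) < cutoff)
    positives = sum(n for _, y, n in cells if y == 1)
    negatives = sum(n for _, y, n in cells if y == 0)
    return tp / positives, tn / negatives


# Walk a grid of cutoffs, as a classification table would.
for cutoff in [0.05, 0.10, 0.20, 0.30, 0.40, 0.50]:
    sens, spec = sens_spec(cutoff)
    print(f"cutoff={cutoff:.2f}  sensitivity={sens:.2f}  specificity={spec:.2f}")
```

In this made-up table there is no cutoff at which sensitivity and specificity are both around 70% to 80%, because only three distinct classifications are possible: everyone positive, only the x=1 rows positive, or no one positive.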
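t*****a Posts: 459	4 What is your sample size? How many independent variables? Is the goal of the model hypothesis testing or predicting future outcomes? To evaluate a model, look at both AUC and calibration; both matter. If your model over-estimates more the higher the predicted risk group (or under-estimates uniformly in one direction), the AUC will still look good but the calibration will be poor.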
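The point in post 4 can be sketched in a few lines of Python. The data below are hypothetical: three risk groups whose observed event rates equal their predicted risks, plus a second set of predictions inflated by a factor of 1.5 (an arbitrary choice for illustration), so the over-estimate grows with the risk level. Because AUC depends only on how the rows are ranked, the inflated predictions keep the same AUC while their calibration is clearly off.

```python
def auc(scores, labels):
    """AUC as the share of (positive, negative) pairs ranked correctly;
    tied pairs count as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def calibration_table(scores, labels):
    """One (predicted probability, observed event rate) pair per distinct score."""
    groups = {}
    for s, y in zip(scores, labels):
        groups.setdefault(s, []).append(y)
    return [(s, sum(ys) / len(ys)) for s, ys in sorted(groups.items())]


# Hypothetical data: (predicted risk, events, non-events) per risk group.
risk_groups = [(0.1, 1, 9), (0.3, 3, 7), (0.6, 6, 4)]
labels, calibrated = [], []
for risk, events, non_events in risk_groups:
    labels += [1] * events + [0] * non_events
    calibrated += [risk] * (events + non_events)

# Same ranking, but every prediction is 1.5 times too high.
inflated = [1.5 * p for p in calibrated]

print("AUC calibrated:", auc(calibrated, labels))
print("AUC inflated:  ", auc(inflated, labels))
for name, scores in [("calibrated", calibrated), ("inflated", inflated)]:
    for predicted, observed in calibration_table(scores, labels):
        print(f"{name:10s} predicted={predicted:.2f} observed={observed:.2f}")
```

This is why a comparison of observed against predicted values can look poor even when the ROC and lift curves look fine: the two checks measure different things.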
b********1 Posts: 291	5 OK, I will go back and take another look. When you build models, how high does the AUC usually need to be to count as passing? [In reply to y**3, post 2]
w*******9 Posts: 1433	6 Post the ROC so we can take a look. [In reply to b*******1, post 1]
b********1 Posts: 291	7 OK, thanks, though I do not quite follow. [In reply to t*****a, post 4]
A****1 Posts: 33	8 I think your data might have a rare positive event. If the response is binary and one value (negative) is dominant, the model can only predict the dominant one, so sensitivity = true positives / actual positives is low. [In reply to b*******1, post 1]
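A small Python sketch of the rare-event explanation in post 8, using hypothetical scores that are perfectly calibrated with an 8% event rate. Few rows ever receive a predicted probability of 0.5 or more, so a 0.5 cutoff gives low sensitivity and very high specificity. Moving the cutoff down to the overall event rate is one common heuristic, shown here only for comparison and not as a recommendation from the thread.

```python
# Hypothetical, well-calibrated scores:
# (predicted probability, number of events, number of non-events)
risk_groups = [(0.01, 10, 990), (0.05, 30, 570), (0.20, 60, 240), (0.60, 60, 40)]

total_events = sum(e for _, e, _ in risk_groups)
total_non_events = sum(n for _, _, n in risk_groups)
prevalence = total_events / (total_events + total_non_events)


def sens_spec(cutoff):
    """Sensitivity and specificity when a row is called positive
    if its predicted probability is >= cutoff (assumed rule)."""
    tp = sum(e for p, e, _ in risk_groups if p >= cutoff)
    fp = sum(n for p, _, n in risk_groups if p >= cutoff)
    sensitivity = tp / total_events
    specificity = (total_non_events - fp) / total_non_events
    return sensitivity, specificity


# Compare the conventional 0.5 cutoff with a cutoff at the event rate.
for label, cutoff in [("cutoff 0.5", 0.5), ("cutoff = prevalence", prevalence)]:
    sens, spec = sens_spec(cutoff)
    print(f"{label:20s} ({cutoff:.2f}): sensitivity={sens:.3f} specificity={spec:.3f}")
```

The model and its ranking are identical in both rows of output; only the cutoff changes, which is why the ROC curve can look reasonable while the classification table at 0.5 looks lopsided.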
