Statistics board - A beginner's question about logistic regression
w*********8
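Posts: 70
1
Suppose we have binary data, for example the male/female split within
different age groups. Now we transform the 0/1 sex variable: among
25-year-olds, n=100 with 75 female, so 75/100=0.75 is our first value; among
26-year-olds, n=200 with 100 female, so 100/200=0.5 is the second value, and
so on. The resulting data are then continuous. Question: can linear
regression be used on these data? I have already run a normality test and it
is satisfied. The other assumptions also hold.
Thanks!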
k*******a
Posts: 772
2
But how would you interpret your model?
The y values of a linear regression cannot be restricted to (0,1).
a****r
Posts: 1486
3
Is this really much different from logistic regression?

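[w*********8 wrote:]
: Suppose we have binary data, for example the male/female split within
: different age groups. Now we transform the 0/1 sex variable: among
: 25-year-olds, n=100 with 75 female, so 75/100=0.75 is our first value; among
: 26-year-olds, n=200 with 100 female, so 100/200=0.5 is the second value, and
: so on. The resulting data are then continuous. Question: can linear
: regression be used on these data? I have already run a normality test and it
: is satisfied. The other assumptions also hold.
: Thanks!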

w*********8
Posts: 70
4
That shouldn't be a problem. The range of Y can be restricted to (0,1); it
just follows a normal distribution within (0,1).

[k*******a wrote:]
: But how would you interpret your model?
: The y values of a linear regression cannot be restricted to (0,1).

M****e
Posts: 178
5
I think this is a fairly typical case for logistic regression with binomial
responses. If you reduce the data to percentages, you lose information in
some sense. For example, in one population with n=200 and n(female)=100, the
percentage is 0.5; in another population with n=4 and n(female)=2, the
percentage is still 0.5. But clearly you can be more certain that the first
population has a "real" female percentage of 0.5. The data from these two
populations carry different "weights" in a logistic regression model.
There also seem to be other reasons to avoid using percentages in statistical
analysis... (I forget them)
BTW, how did you do the normality test? For linear regression, the response
variable does not need to be normally distributed (the residuals after
fitting the model do).
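
To make the point about certainty concrete, here is a minimal Python sketch. It assumes a plain binomial model for each population, and the helper name `proportion_se` is made up for illustration. Under that assumption the standard error of a sample proportion is sqrt(p(1-p)/n), so the same 0.5 is far less precise at n=4 than at n=200.

```python
import math

def proportion_se(successes, n):
    """Standard error of a sample proportion under a binomial model."""
    p = successes / n
    return math.sqrt(p * (1 - p) / n)

# The two populations from the post: same proportion, different sizes.
for successes, n in [(100, 200), (2, 4)]:
    p = successes / n
    se = proportion_se(successes, n)
    print(f"n={n:3d}  proportion={p:.2f}  standard error={se:.4f}")
```

The standard error is roughly 0.035 for n=200 and 0.25 for n=4. That is the information discarded when only the percentage is kept.
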
A*******s
Posts: 3942
6
I am curious what makes modeling proportions difficult. Is it because it is
hard to assume a distribution for them?

[M****e wrote:]
: I think this is kind of a typical case for logistic regression with binomial
: responses. If you reduce the data to percentage, in some sense you are
: losing information. For example, in one population, n=200, n(female)=100,
: you have percentage of 0.5; in another population, n=4, n(female)=2, you
: still get percentage of 0.5. But apparently you have more certainty that the
: first population has a "real" female percentage of 0.5. The data of these
: two populations will add different "weights" to logistic regression model.
: There seem also other reasons avoiding using percentage in statistical
: analysis... (I forget)
: BTW, how did you do normality test? For linear regression, response

s*r
Posts: 2757
7
You originally had 300 observations.
Now you only have two.

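[w*********8 wrote:]
: Suppose we have binary data, for example the male/female split within
: different age groups. Now we transform the 0/1 sex variable: among
: 25-year-olds, n=100 with 75 female, so 75/100=0.75 is our first value; among
: 26-year-olds, n=200 with 100 female, so 100/200=0.5 is the second value, and
: so on. The resulting data are then continuous. Question: can linear
: regression be used on these data? I have already run a normality test and it
: is satisfied. The other assumptions also hold.
: Thanks!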

w*********8
Posts: 70
8
Thanks!
As for the normality test: my understanding is that the normality assumption
means the response variable follows a normal distribution within each
sub-population, i.e. Y|X ~ N. There are many tests for normality, for example
the Shapiro-Wilk (S-W) test.

[M****e wrote:]
: I think this is kind of a typical case for logistic regression with binomial
: responses. If you reduce the data to percentage, in some sense you are
: losing information. For example, in one population, n=200, n(female)=100,
: you have percentage of 0.5; in another population, n=4, n(female)=2, you
: still get percentage of 0.5. But apparently you have more certainty that the
: first population has a "real" female percentage of 0.5. The data of these
: two populations will add different "weights" to logistic regression model.
: There seem also other reasons avoiding using percentage in statistical
: analysis... (I forget)
: BTW, how did you do normality test? For linear regression, response

w*********8
Posts: 70
9
So the power goes down?

[s*r wrote:]
: You originally had 300 observations.
: Now you only have two.

D*D
Posts: 236
10
An actual proportion lies between 0 and 1, but linear regression can give
predictions outside [0,1].
This is how it was done before logistic regression came into use.

[A*******s wrote:]
: I am curious what makes modeling proportions difficult. Is it because it is
: hard to assume a distribution for them?
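
D*D's point about out-of-range predictions can be sketched in a few lines of Python. This is only an illustration: the first two proportions come from the thread, the other two are invented, and `fit_line` is a hypothetical helper implementing ordinary least squares with one predictor.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Proportion female by age. Ages 25 and 26 are from the thread;
# ages 27 and 28 are made-up values for illustration only.
ages = [25, 26, 27, 28]
proportions = [0.75, 0.50, 0.30, 0.10]

intercept, slope = fit_line(ages, proportions)
for age in (22, 26, 30):
    prediction = intercept + slope * age
    print(f"age {age}: predicted proportion {prediction:.3f}")
```

With these numbers the fitted line predicts a proportion above 1 at age 22 and below 0 at age 30, which no real proportion can take.
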

w*********8
Posts: 70
11
What if we assume there are enough age groups?
Hmm, but that would still leave far fewer observations.

[s*r wrote:]
: You originally had 300 observations.
: Now you only have two.

s*r
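Posts: 2757
12
It shouldn't change much; a binomial is just a sum of several Bernoulli
variables to begin with.
But the covariates can then only enter as aggregated values.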

[w*********8 wrote:]
: So the power goes down?
A*******s
Posts: 3942
13
Oh, I thought he was referring to the difference between a binary response {0, 1} and a continuous response bounded in (0, 1), not the difference between the logit and identity link functions.

[D*D wrote:]
: actual proportion is between 0 and 1 but linear regression can give
: predictions beyond [0,1]
: This is how it was done before logistic regression came into use.

D*********2
Posts: 535
14
Re.
The likelihood is the same.

[s*r wrote:]
: It shouldn't change much; a binomial is just a sum of several Bernoulli
: variables to begin with.
: But the covariates can then only enter as aggregated values.
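
The claim that the likelihood is the same can be checked numerically. The Python sketch below assumes everyone in a group shares the same covariate value and therefore the same probability p; the function names are made up for illustration. The grouped (binomial) and ungrouped (Bernoulli) log-likelihoods then differ only by log C(n, y), which does not depend on p.

```python
import math

def bernoulli_loglik(outcomes, p):
    """Log-likelihood of individual 0/1 outcomes sharing probability p."""
    return sum(math.log(p) if y == 1 else math.log(1 - p) for y in outcomes)

def binomial_loglik(successes, n, p):
    """Log-likelihood of a grouped count: `successes` out of `n`."""
    return (math.log(math.comb(n, successes))
            + successes * math.log(p)
            + (n - successes) * math.log(1 - p))

# The 25-year-old group from the thread: n=100, 75 female.
n, successes = 100, 75
outcomes = [1] * successes + [0] * (n - successes)

# The gap between the two log-likelihoods is the same for every p.
constant = math.log(math.comb(n, successes))
diffs = [binomial_loglik(successes, n, p) - bernoulli_loglik(outcomes, p)
         for p in (0.2, 0.5, 0.75, 0.9)]
print(constant)
print(diffs)
```

Because the gap is constant in p, both forms are maximized by the same estimate, which is why grouping by itself does not change the fit.
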

w*********8
Posts: 70
15
I asked my instructor, who said: if you have ungrouped data, you cannot do
that; if you have grouped data, you can do it through percentages.
p***r
Posts: 920
16
I think that must mean both are fine when you program it, but the sample
size cannot be ignored.

[w*********8 wrote:]
: I asked my instructor, who said: if you have ungrouped data, you cannot do
: that; if you have grouped data, you can do it through percentages.
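
One way to read the remarks about grouped data and sample size is that a grouped fit should use the counts, so that each group's size enters the estimate. The Python sketch below is an assumption-laden illustration and not anyone's method from the thread: `fit_grouped_logistic` is a hypothetical helper that runs Newton-Raphson on the binomial log-likelihood for logit(p) = b0 + b1*x, age is centered at 25, and the last two groups are invented data.

```python
import math

def fit_grouped_logistic(xs, successes, totals, iterations=25):
    """Newton-Raphson fit of logit(p) = b0 + b1*x to grouped binomial counts."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y, n in zip(xs, successes, totals):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            w = n * p * (1.0 - p)   # group size n scales the weight
            r = y - n * p           # observed minus expected count
            g0 += r
            g1 += r * x
            h00 += w
            h01 += w * x
            h11 += w * x * x
        # Solve the 2x2 Newton system (information matrix times step = score).
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

# Age minus 25. The first two groups are from the thread;
# the last two are made-up values for illustration only.
xs = [0, 1, 2, 3]
females = [75, 100, 45, 5]
totals = [100, 200, 150, 50]

b0_hat, b1_hat = fit_grouped_logistic(xs, females, totals)
for x in xs:
    p_hat = 1.0 / (1.0 + math.exp(-(b0_hat + b1_hat * x)))
    print(f"age {25 + x}: fitted proportion {p_hat:.3f}")
```

Unlike a straight line through the percentages, the fitted proportions stay inside (0,1), and a group with n=200 pulls the fit harder than one with n=50.
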
