boards

本页内容为未名空间相应帖子的节选和存档,一周内的贴子最多显示50字,超过一周显示500字 访问原贴
Statistics版 - Gamma distribution for power analysis,请大拿指点
相关主题
gamma distribution + mgf
求助:Power analysis for poisson regression?
Proc Genmod 模型
关于lasso的variable selection问题
proc genmod 结果的问题
++++求解惑PROC GENMOD里的working matrix 问题+++++
how to do data fitting to find distribution (转载)
How to test a sample is negative binomial distribution
问几题sas adv, 多谢了!
请问SAS advanced macro global 和local
相关话题的讨论汇总
话题: gamma话题: proc话题: scale话题: data
进入Statistics版参与讨论
1 (共1页)
y*****w
发帖数: 1350
1
各位distribution大拿好,我最近有个sample size calculation project。是count
data,具体说是number of patient visits, with number of episodes for each
number of visit。请各位帮着参谋看一下我做的是否合适,给些意见。
Sample data如下:
# patient visit episode
---------------------------------
1 156
2 287
3 589
4 899
5 1535
6 1408
7 1017
8 870
9 743
10 498
... ...
一般来说分析count data用Poisson distribution / Negative binomial
distribution / Gamma distribution。我用的程序是SAS里面的PROC POWER with
the TWOSAMPLEWILCOXON statement。它里面没有negative binomial distribution的
选项,所以只好放弃negative binomial distribution。另外,Poisson distribution
要求mean=variance,此点在我的DATA里面不满足,所以也只好放弃Poisson
distribution。下面就是如何做Gamma distribution了。
1)PROC POWER with the TWOSAMPLEWILCOXON statement里面要求输入Gamma
distribution PDF的shape parameter和scale parameter。我用PROC GENMOD得到mean
和shape parameter(注意,根据http://support.sas.com/kb/33/068.html,PDF里面的shape parameter是GENMOD里面的scale parameter)。然后根据公式mean=shape*scale,算出PDF里面的scale paramter。
2)To visualize it, 我作了比较Gamma distribution和原始distribution的曲线图(
见图,是另一个类似data,visit从5开始)。
下面的是得到Gamma distribution PDF的shape parameter和scale parameter以及作图
的code:
proc genmod data=mydata;
model visit = / dist=gamma link=log;
freq episode;
ods output parameterestimates=pe;
output out=outmean pred=mu;
run;
proc transpose data=pe out=tpe;
var estimate;
id parameter;
run;
data _null_;
set outmean;
call symputx("mean", mu);
stop;
run;
data _null_;
set tpe;
call symputx("scale", scale);
stop;
run;
data pmf;
do t = 1 to 80; /*1 to max(x) */
Y = pdf("GAMMA", t, &scale, &mean/&scale);
output;
end;
run;
data mydata_tomerge (drop=visit);
set mydata;
t=visit;
run;
proc sort data=mydata_tomerge; by t; run;
proc sort data=pmf; by t; run;
data mydata_merge;
merge pmf (in=a) mydata_tomerge;
by t;
if a;
run;
proc gplot data=mydata_merge;
plot (Y percent)*t / vaxis=axis1 haxis=axis2 legend=legend1 noframe
overlay;
title1 font=&font height=4 "Gamma Distribution for Patient Visits" move=(+
50,+0) ;
run;
3)分别算处理组和对照组,然后把各自的shape parameter和scale parameter输入
TWOSAMPLEWILCOXON,设置好power,就可以算出sample size。下面是PROC POWER的
code:
proc power;
twosamplewilcoxon
vardist("gamma_original") = gamma (&scale_orig, &mean_orig/&scale_orig)
vardist("gamma_alternative") = gamma (&scale_alter, &mean_alter/&scale_
alter)
variables = "gamma_original" | "gamma_alternative"
power = 0.80 0.90
ntotal = .;
run;
我最初想用lognormal distribution,但是那得整geometric mean & std,需要log
transformation来log transformation去的,好像误差挺大的,而且看distribution也
不是完全符合lognormal,就放弃了。
y*****w
发帖数: 1350
2
statcompute, songkun, todayonline, weiwei859, please give your opinions.
Thanks!
y*****w
发帖数: 1350
3
Actually the intercept in the ODS output is the estimated log mean of the
fitted gamma distribution, so I don't need the output statement in PROC
GENMOD. The pertinent code becomes:
proc genmod data=mydata;
model visit = / dist=gamma link=log;
freq episode;
ods output parameterestimates=pe;
run;
proc transpose data=pe out=tpe;
var estimate;
id parameter;
run;
data _null_;
set tpe;
call symputx("logmean", intercept);
call symputx("scale", scale);
stop;
run;
data pmf;
do t = 1 to 80; /*1 to max(x) */
Y = pdf("GAMMA", t, &scale, exp(&logmean)/&scale);
output;
end;
run;
o**m
发帖数: 828
4
wh y not nb model ?

【在 y*****w 的大作中提到】
: 各位distribution大拿好,我最近有个sample size calculation project。是count
: data,具体说是number of patient visits, with number of episodes for each
: number of visit。请各位帮着参谋看一下我做的是否合适,给些意见。
: Sample data如下:
: # patient visit episode
: ---------------------------------
: 1 156
: 2 287
: 3 589
: 4 899

y*****w
发帖数: 1350
5
Because as I mentioned, SAS PROC POWER with the TWOSAMPLEWILCOXON statement
, which I used for the sample size calculation, has no NEGB option.
Actually I found I don't need PROC GENMOD in this context, because I found
that lognormal distribution better fits my data than Gamma distribution,
whereas PROC GENMOD does not have the lognormal option as a distribution. I
found the following SAS example using PROC UNIVARIATE very useful in that it
provides simple and straightforward statistics and graphs to help make a
decision on distribution:
http://support.sas.com/documentation/cdl/en/procstat/63104/HTML

【在 o**m 的大作中提到】
: wh y not nb model ?
1 (共1页)
进入Statistics版参与讨论
相关主题
请问SAS advanced macro global 和local
求助,SAS ADV 130 中94
SAS 求助: filenames
Poisson and Binomial
在sas 里面,poisson regression model里面,response variable可以是binary吗?
请问poisson regression和 Log-linear model之间的联系?
need help SAS Macro
test count data distribution in SAS
请教个问题
请教logistic regression问题
相关话题的讨论汇总
话题: gamma话题: proc话题: scale话题: data