G*****u posts: 1222 | 1 Time Series Analysis: Univariate & Multivariate Methods, by William W.S. Wei |
|
x**g posts: 807 | 2 Use the TREPLAY command in PROC GREPLAY. |
|
y*******g posts: 115 | 3 If you are on SAS 9.2, you can use PROC SGPLOT; the graphs it produces look quite good.
|
|
m*********n posts: 413 | 4 SGPLOT is better.
But you could try
ods rtf file='XXX' startpage=no;
or another ODS destination; others should have the startpage option too
histograms, |
|
y*****w posts: 1350 | 5 If I have a "change from baseline to follow-up" variable Y as the dependent
variable, and its baseline counterpart X and another baseline variable Z as
the independent variables, then for the model Y=X Z, the coefficient for X
was negative whereas the coefficient for Z was positive. Nevertheless,
univariably Y was negatively associated with X but there was no relation
between Y and Z. On the other hand, X and Z were highly positively
correlated, which makes it confusing when interpreting the mod |
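The pattern described above (Z unrelated to Y on its own, but positive once X is in the model) is what a suppressor variable looks like. A minimal simulated sketch, assuming made-up coefficients and a correlation of 0.9 between X and Z; the data set name `sim` and all numbers are illustrative and are not the poster's data:

```sas
/* Sketch only: simulated data with assumed coefficients, not the poster's data. */
data sim;
  call streaminit(2024);                     /* fixed seed so the run is repeatable */
  do i = 1 to 500;
    z = rand('normal');
    x = 0.9*z + sqrt(0.19)*rand('normal');   /* var(X) = 1 and corr(X,Z) = 0.9 */
    y = -1.0*x + 0.9*z + rand('normal');     /* true coefficients: X = -1, Z = +0.9 */
    output;
  end;
  drop i;
run;

proc corr data=sim;   /* marginal view: Y vs X is negative, Y vs Z is near zero */
  var y x z;
run;

proc reg data=sim;    /* joint model: X stays negative, Z becomes positive */
  model y = x z;
run;
quit;
```

With these assumed values, cov(Y,Z) = -0.9 + 0.9 = 0 and cov(Y,X) = -1 + 0.81 = -0.19 in the population, so the univariable and multivariable results can disagree without either being wrong: the coefficient for Z is its effect at a fixed value of X.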
|
m*******s posts: 469 | 6 proc univariate
proc freq
proc glm
proc lifetest
proc phreg
proc mixed
Once you truly understand and master these few SAS procedures, you can handle 95% of the projects at a pharma company. |
|
a********a posts: 3176 | 7 You can also use MODE in proc univariate to get the most prevalent value.
and C |
|
s******n posts: 95 | 8 Multivariate ANOVA & Univariate Repeated-Measures ANOVA, are they the same
thing?
Any idea is welcome, thank you so much! |
|
q**j posts: 10612 | 9 Something like proc univariate in SAS or fivenum in R. Many thanks. |
|
g*******y posts: 380 | 10 Hi all,
I use the following code to generate the cdf plot:
proc univariate data=temp1 noprint;
var PPB;
by zone;
class type;
cdfplot PPB/overlay vref = 5 95
cvref = black
vreflabels = '5%' '95%' ;
run;
The goal is to get overlaid plots of the cumulative distribution function
of "PPB" for the two types within each zone.
Now I got those plots by zone, and the title is "cumulative distribution
function for PPB", how cou |
|
t*********e posts: 313 | 11 I actually ended up randomly selecting one data set for the univariate
table. Any better ideas are very welcome. |
|
j*****7 posts: 4348 | 12 http://www.flcdatacenter.com/CasePerm.aspx
A few hourly workers were excluded
Year 2008:
The UNIVARIATE Procedure
Variable: WAGE_OFFER_FROM
Quantiles (Definition 5)
Quantile Estimate
100% Max 175000.0
99% 140000.0
95% 118080.0
90% 104467.0
75% Q3 90000.0
50% Median 74370.0
25% Q1 60500.0
10% 53000.0
5% 46057.4
1% 38303.0
0% Min 29400.0
Extreme Observations |
|
s*r posts: 2757 | 13 hist m/MIDPOINTS=0 to 101 by 2 ; |
|
S******y posts: 1123 | 14 Thanks, Sir!
Do you know what the default midpoints are for
..
hist m;
..
?
Or how does SAS decide on the default? |
|
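n******e posts: 476 | 16 Is it a pharma company? A CRO?
It is mainly the Base SAS part: the DATA step, proc freq, proc univariate, proc means.
Slightly more advanced are transpose and merge, then the various SQL joins, plus basic graphing knowledge.
For modeling, knowing a bit of ANOVA, t-tests, and linear regression is enough. It does not matter much if you do not,
but if a statistician sits in on the interview, they will not know much programming and can only ask about statistics, so it helps to know a little.
Good luck! |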
|
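w*********y posts: 7895 | 17 I have no real experience. When I was job hunting, DATA MANAGEMENT kept showing up,
so I googled related material and found some articles on data cleansing that mention
somewhat more involved methods.
As I recall, the one you mentioned is commonly used to check for typos. PROC UNIVARIATE is also used
to check for typos, as are various SAS functions and all kinds of graphs.
name |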
|
p**********l posts: 1160 | 18 What we learnt during a lecture was to use the sphericity test to check whether the
within-subject variance-covariance matrix has a type H (Huynh & Feldt)
structure, which others call the HF structure.
A covariance matrix Sigma is of type H iff its quadratic form with an orthonormal
contrast matrix C is spherical, i.e. C Sigma C' = sigma^2 I.
H0: C Sigma C' = sigma^2 I.
Ha: C Sigma C' = unstructured form.
If the test is nonsignificant, use the univariate tests for within-subject
effects, since they are more powerful than the multivariate tests.
If the test is signifi |
|
d*******o posts: 493 | 19 1) How do you do a multiple merge efficiently in SAS? What are good methods?
1. array 2. sort-sort-merge;3. proc sql; 4. proc format; 5. hash object
Coding efficiency: 3>2>4>1>5
I/O resource: 5>4>1>3>2
flexibility: 3>>others
2) What are the common methods for large-scale database data cleaning?
Using proc sql to access database via DBMS. Then use it to check outlier/
missing/invalid/duplicate values, do hard
coding correction, and update integrity constraint. Also use proc sort/means
/freq/univariate/datasets/compare/rank and SAS functions
(date/regular express |
|
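s*******y posts: 2977 | 20 A large R-square is not necessarily overfitting; it depends on both your number of variables and your sample size.
I suggest looking at Frank Harrell's Regression Modeling Strategies.
The OP has many variables, so I suggest checking collinearity before fitting the model. Among highly correlated
variables, keep one (for example, the one with the best R-square in univariate fitting), then
use stepwise or penalized variable selection. That said, I think a Canadian author published a paper in 2007
with a very detailed simulation showing that backward selection, with or without bootstrapping,
does not give very good results. Heh, eye-popping conclusions. |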
|
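f******h posts: 46 | 21 Thanks. I just replied above that the variables left after handling multicollinearity turned out not to be the best
pool... I think my way of handling multicollinearity is off. At each step I drop the variable with the largest
VIF, and I have noticed that this easily ends up dropping the predictors that are most
correlated with the dependent variable... very frustrating.
I would like to try your method: within each group of highly correlated variables, keep the one with the largest univariate R^2. But
there are problems here too: 1) because there are so many variables, these high-correlation groups are not mutually exclusive,
i.e. it is hard to avoid some large correlations between variables in different groups; of course, this could be handed to
SAS via cluster analysis; 2) is keeping one variable per group reliable? Or is this
simply customary practice?
correlated |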
|
p********a posts: 5352 | 22 ☆─────────────────────────────────────☆
jhsph07 (银杏) wrote on (Wed Feb 24 14:47:45 2010, US Eastern):
http://www.flcdatacenter.com/CasePerm.aspx
A few hourly workers were excluded
Year 2008:
The UNIVARIATE Procedure
Variable: WAGE_OFFER_FROM
Quantiles (Definition 5)
Quantile Estimate
100% Max 175000.0
99% 140000.0
95% 118080.0
90% 104467.0
75% Q3 90000.0
50% Median 74370.0
25% Q1 60500.0
10% 53000.0
5% 46057.4
1% 38303.0
0 |
|
o****o posts: 8077 | 23 Why not try PROC UNIVARIATE? |
|
o****o posts: 8077 | 24 data BPressure;
do patientID=1 to 20;
x=ranpoi(4, 10);
do j=1 to x;
Systolic=rannor(888);
diastolic=ranuni(8888);
output;
drop j;
end;
end;
run;
proc sort data=BPressure; by patientID; run;
ods select none;
ods output ExtremeValues=XPval;
proc univariate data=BPressure nextrval=4;
by PatientID;
var Systolic Diastolic;
output out=_mean mean=sysmean Diamean
median=sysmedian diamedia |
|
B******y posts: 9065 | 25 PROC REPORT, MEANS, FREQ, GLM and LOGISTIC, together with close variations on the same theme such as UNIVARIATE,
MIXED, and GENMOD, cover more than 90% of the work. The remaining 10% is PROC LIFETEST,
PHREG, POWER, PLAN, and so on. |
|
g********0 posts: 90 | 26 proc univariate data=xx plot;
id xx;
var xx;
run; |
|
S*****U posts: 99 | 27 data test;
do i=1 to 10000;
disease=ranpoi(0,5);
control=ranpoi(0,2);
output;
end;
run;
proc univariate data=test plot;
var disease control;
run;
This doesn't seem to work, but thanks to the two posters above; the baozi have been sent. |
|
s***r posts: 1121 | 28 Thanks. I sent you 4 baozi (20 dollars). One more question:
I also need to run the regression like this:
b1 = e1
b1 = r1
b1 = f1
b1 = e2
b1 = r2
b1 = f2
b2 = e1
b2 = r1
b2 = f1
...
...
that is, I also need to run univariate regression. Can you help me with the
macro? Many thanks. |
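One way to loop over every response/predictor pair is a nested %DO over two word lists. This is only a sketch: the data set name `mydata`, the macro name `unireg`, and the choice of PROC REG are assumptions, since the post does not say which procedure or data set is in use.

```sas
/* Sketch: one single-predictor regression per (y, x) pair.           */
/* MYDATA, UNIREG and PROC REG are assumed; swap in your own names.   */
%macro unireg(data=, yvars=, xvars=);
  %local i j y x;
  %do i = 1 %to %sysfunc(countw(&yvars, %str( )));
    %let y = %scan(&yvars, &i, %str( ));
    %do j = 1 %to %sysfunc(countw(&xvars, %str( )));
      %let x = %scan(&xvars, &j, %str( ));
      title "Univariate regression: &y = &x";
      proc reg data=&data;
        model &y = &x;      /* one predictor at a time */
      run;
      quit;
    %end;
  %end;
  title;                    /* clear the title afterwards */
%mend unireg;

%unireg(data=mydata, yvars=b1 b2, xvars=e1 r1 f1 e2 r2 f2)
```

The two lists are crossed, so the call above runs 2 x 6 = 12 models in the same order as the list in the post.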
|
s*****r posts: 790 | 29 In your magnum opus, near the beginning, you say
"We believe that a mathematical expectation E(X) can determine the location
of a distribution of the X;"
Can you define what you mean by the location of a distribution? And who
believes that? Can you provide any reference for it?
As for your important properties,
Property 1.1: Uniqueness, any random variable in a population is unique and
differs from others.
In what sense do you mean uniqueness? Assume X is a random variable
following a univariate standard normal distribution. anoth |
|
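s**f posts: 365 | 30 A question: I am using SAS proc logistic for ordinal regression, and the dependent variable has several levels. Does proc logistic by default treat these levels as having a linear relationship (univariate regression) or a nonlinear one (multivariate regression)?
I searched the SAS help for a long time today and could not find it. Where is this explained?
If any expert knows, please help me out! Many thanks!!! |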
|
s********9 posts: 74 | 31 outcome: A
independent variables: B C D E F G H
univariable logistic regression: B C D E F G H all have a significant
influence on A.
multivariable logistic regression: Only B has a significant influence on A.
Is factor B the only factor that should be considered as influencing A? |
|
s********9 posts: 74 | 32 Suppose there are independent variables like income, food cost, house rent/
payment, education cost ...... They are correlated, but each of them is
important. If each of them is significant in univariable analysis, but only
income is significant in multivariable analysis, what can be
concluded about the factors that affect the outcome? |
|
|
l**********9 posts: 148 | 34 Pfft.. the Four Heavenly Kings.. so that's a thing...
As I recall, proc univariate has a normality test option |
|
c*******n posts: 300 | 35 For the univariate case, the Shapiro-Wilk test is good enough. The procedure uses Royston's approximation, and the output includes the test statistic and the p-value. |
|
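T*******I posts: 5138 | 36 Try using the SAS ODS system to output the test results, then process them in a DATA step to get the result
you want. However, as someone suggested above, you should decide which test's result to use based on your sample size.
Please refer to the UNIVARIATE procedure. |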
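The ODS route suggested here could look roughly like the sketch below. The data set `mydata` and variable `resid` are placeholder names; `TestsForNormality` is the ODS table that PROC UNIVARIATE produces when the NORMAL option is given. The column name `Test` is as I recall it, so check with PROC CONTENTS if the subsetting step returns nothing.

```sas
/* Sketch: capture the normality tests as a data set, then keep one row. */
/* MYDATA and RESID are placeholder names.                               */
ods output TestsForNormality=normtests;
proc univariate data=mydata normal;   /* NORMAL requests the normality tests */
  var resid;
run;

data shapiro;
  set normtests;
  if Test =: 'Shapiro';               /* keep only the Shapiro-Wilk row */
run;

proc print data=shapiro;
run;
```

On the sample-size point: as far as I remember, PROC UNIVARIATE only reports Shapiro-Wilk for samples of up to 2,000 observations, so for larger data the remaining rows in `normtests` (Kolmogorov-Smirnov and the others) are the ones to look at.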
|
y*****w posts: 1350 | 37 If your count data does not include zero values, you can use PROC UNIVARIATE
with the HISTOGRAM statement to get goodness-of-fit test statistics for
several distributions, however those distributions do not include NB.
http://support.sas.com/documentation/cdl/en/procstat/63104/HTML
I have recently run a project in which I had to choose a best distribution
for my non-zero historical count data for the purpose of sample size
calculation. I applied the HISTOGRAM statement to compare GAMMA, LOGNORMA... |
|
w*******9 posts: 1433 | 38 It's a good idea, but the chi-square stuff is quite sensitive to the bins you use.
UNIVARIATE
, |
|
S********a posts: 359 | 39 I have 5 datasets, namely a1 a5 a7 a30 a180, and want to run the same procedure on each one
%macro normcheck(k) ;
proc univariate data=&k ;
var resid;
qqplot;
histogram / normal;
run;
%mend ;
%normcheck(a1);
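But this way I have to keep manually changing the dataset name in %normcheck().
How can I get all five procedure runs from a single call? I suspect I need a do loop; how should I modify the code?
Baozi as thanks!! |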
|
a*****3 posts: 601 | 40 ods html ;
%macro normcheck(namelist) /parmbuff ;
%let i = 1;
%let name_p = %scan(&namelist, &i);
%do %while ( &name_p NE );
proc univariate data = &name_p ;
var resid;
qqplot;
histogram / normal;
run;
%let i= %eval(&i+1);
%let name_p = %scan(&namelist,&i);
%end;
%mend ;
option mlogic mprint;
%normcheck(a1 a5 a7 a30 a180)
ods html close;
quit; |
|
s******y posts: 352 | 41 The purpose of using %nrstr is to delay macro resolution. Without
%nrstr, the macro is resolved while the DATA step is still running: the statements in
the macro are generated and put onto the statement stack, and after
the DATA step finishes they are popped off the stack for execution.
the statements in the stack without using %nrstr:
proc univariate data=a1 ;
var resid;
qqplot;
histogram / normal;
run;
But with %nrstr, the % sign is quoted and invisible to the macro
com... |
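For reference, the CALL EXECUTE pattern being discussed usually looks like the sketch below. It reuses the %normcheck macro and the data set names a1 to a180 from the question earlier in the thread; those data sets and their variable `resid` are assumed to exist.

```sas
/* Sketch: drive %normcheck from a DATA step with CALL EXECUTE.   */
/* Data sets a1, a5, a7, a30, a180 (with RESID) are assumed.      */
%macro normcheck(k);
  proc univariate data=&k;
    var resid;
    qqplot;
    histogram / normal;
  run;
%mend normcheck;

data _null_;
  length dsn $32;
  do dsn = 'a1', 'a5', 'a7', 'a30', 'a180';
    /* %nrstr hides the % from the macro processor, so the macro call */
    /* is only resolved after this DATA step has finished.            */
    call execute(cats('%nrstr(%normcheck(', dsn, '))'));
  end;
run;
```

For a macro this simple the result is the same with or without %nrstr. The delay starts to matter once the macro contains CALL SYMPUT or other logic that depends on steps actually having run.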
|
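a****a posts: 3411 | 42 A newbie question about macros
I want to build a categorical variable from the percentile values of a continuous variable,
for example a categorical variable with 100 categories.
With many groups the typing is inconvenient, and changing the variable names even once is exhausting.
How can I modify the macro below so that it can split into an arbitrary number of categories?
Many thanks (not many baozi, just 2)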
%macro quint(dsn,var,quintvar);
proc univariate noprint data=&dsn;
var &var;
output out=quintile pctlpts=25,50,75,100 pctlpre=pct;
run;
data _null_;
set quintile;
call symput('q1',pct25) ;
call symput('q2',pct50) ;
call symput('q3',pct75) ;
call symput('q4',pct100) ;
run;
data &dsn;
set &dsn;
if &var =. ... |
|
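a*****3 posts: 601 | 43 I want to modify the original code and not use proc rank for now. The question is how to use a macro to generate 10,20,30,40,50,60,
70,80,90,100, so that it can be dropped into proc univariate. Can anyone give a hint? |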
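One possible hint, as a sketch: a small macro "function" that builds the list with a %DO loop and leaves it as text where it is called. The macro name `pctlist`, the data set `mydata`, and the variable `x` are made up for illustration.

```sas
/* Sketch: return an evenly spaced percentile list as text.     */
/* PCTLIST, MYDATA and X are hypothetical names.                */
%macro pctlist(ngroups);
  %local i list;
  %do i = 1 %to &ngroups;
    %let list = &list %sysevalf(100 * &i / &ngroups);
  %end;
  &list
%mend pctlist;

%put %pctlist(10);        /* writes the generated list to the log */

proc univariate data=mydata noprint;
  var x;
  /* PCTLPTS= should also accept the range form 10 to 100 by 10 */
  output out=pctls pctlpts=%pctlist(10) pctlpre=pct;
run;
```

The list is space separated here rather than comma separated; PCTLPTS= takes either. When the number of groups does not divide 100 evenly the percentiles become fractional, and the generated variable names change accordingly, so it is worth checking the output data set before relying on the names.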
|
q**j posts: 10612 | 44 summary() does not include the standard deviation, number of obs, etc. Is there a command like SAS proc
univariate that lists a whole pile of things whether you need them or not? Many thanks. |
|
d******e posts: 7844 | 45 univariate spline
MSSQL |
|
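s******e posts: 101 | 46 I signed up for the SAS Base exam two weeks ago. I went from not taking it very seriously, to panicking that I might fail, to passing smoothly today.
I got a lot of useful information from the statistics board, so now it is my turn to give back :)
If you know nothing about SAS but do not really want to study it systematically, and just want to work through a few sets of practice questions to scrape a pass,
trust me: you will spend more time than if you studied it systematically once. I started with the SAS 50 questions. Although they come with detailed
answers, the rules still struck me as downright bizarre, with nothing in common with the R and MATLAB we normally
use. So after suffering for quite a while I gave up and borrowed a few books from the library instead. Of course, reading the
official Little SAS Book is the most direct choice. But first, that book is an electronic copy and printing it wastes paper;
second, it does not show output for every block of code, which makes it a bit too hard for beginners. So it is better
read once you have some familiarity with SAS. I am glad I found a very accessible introductory book, Data Analysis
Using SAS, by C. Y. Joanne Peng. There are of course other good textbooks. My conclusion is that you should
find a textbook you are comfortable reading and willing to keep reading. You can skip the prose
entirely. On my first pass I basically just ran all the commands in SAS once, ... |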
|
a********i posts: 205 | 47 data sum;
input year id $ sequence x;
datalines;
2000 id1 1 35
2001 id1 2 60
2002 id1 3 80
2005 id1 1 76
2006 id1 2 95
2001 id2 1 108
2002 id2 2 87
2004 id2 1 76
2007 id2 1 84
2009 id2 1 98
2000 id3 1 123
2001 id3 2 198
2002 id3 3 82
2003 id3 4 90
2004 id3 5 23
2008 id3 1 90
;
run;
... |
|
o****o posts: 8077 | 48 Reposted from:
http://www.globalstatements.com/sas/jobs/technicalinterview.htm
*****************************************
SAS Technical Interview Questions
You can go into a SAS interview with more confidence if you know that you
are prepared to respond to the kind of technical questions that an
interviewer might ask you. I do not provide the specific answers here, both
because these questions can be asked in a variety of ways and because it is
not my objective to help those who have little actual int... |
|
M******8 posts: 22 | 49 One suggested test for trend in a univariate population is as follows:
Group the unordered scores into nonoverlapping groups of 3 adjacent
observations. The test statistic D is the number of monotonic (either
increasing or decreasing) triples. Assume for now that N is always a
multiple of 3 (no partial triples).
Example (triples are underlined)
33, 16, 20    61, 44, 38    29, 40, 56    22, 43, 31
no order      decrease      increase      no order      D = 2
H0: random ordering (independent, rando... |
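The statistic D for the example above can be computed with a short DATA step. This is a sketch using the twelve scores from the post; the data set name `triples` and the variable names are made up.

```sas
/* Sketch: count monotone triples (the statistic D) for the example scores. */
data triples;
  input x1 x2 x3 @@;                  /* read three adjacent scores per row */
  /* 1 if strictly increasing or strictly decreasing, otherwise 0 */
  monotone = (x1 < x2 < x3) or (x1 > x2 > x3);
  datalines;
33 16 20 61 44 38 29 40 56 22 43 31
;
run;

proc sql;
  select sum(monotone) as D, count(*) as n_triples
  from triples;
quit;
```

This should reproduce the D = 2 shown in the example. If the scores are continuous with no ties and H0 is a random ordering, each triple is monotone with probability 2/6 = 1/3, so D would be binomial with N/3 trials and success probability 1/3; that null distribution is my assumption, since the post is cut off before it states one.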
|
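n*****n posts: 3123 | 50 The reason classification performs poorly is that features selected with a t-test or other univariate methods are highly redundant.
Quite possibly only two or three of the 50 are useful, and the rest are highly correlated with those two or three, meaning they
add little extra information.
You could use multivariate methods, such as selecting with the lasso, though with such high dimensionality I am not sure the lasso
can cope. SVM-based methods are another option. Many papers discuss variable selection in high dimensions; you
can search for them.
40000 |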
|