Page 2 - Discussions about univariate - 话题女王


G*****u
Posts: 1222

From thread: Statistics board - Time Series Analysis book

Time Series Analysis: Univariate & Multivariate Methods, by William W.S. Wei

x**g
Posts: 807

From thread: Statistics board - A question about printing PROC UNIVARIATE output

Use the TREPLAY command in PROC GREPLAY.

y*******g
Posts: 115

From thread: Statistics board - A question about printing PROC UNIVARIATE output

If you are using SAS 9.2, you can use PROC SGPLOT; it produces very nice graphs.

m*********n
Posts: 413

From thread: Statistics board - A question about printing PROC UNIVARIATE output

SGPLOT is better.
But you could try
ods rtf file='XXX' startpage=no;
Other ODS destinations should also have the startpage option.


y*****w
Posts: 1350

From thread: Statistics board - A question on linear regression

If I have a "change from baseline to follow-up" variable Y as the dependent
variable, and its baseline counterpart X and another baseline variable Z as
the independent variables, then for the model Y=X Z, the coefficient for X
was negative whereas the coefficient for Z was positive. Nevertheless,
univariably Y was negatively associated with X but there was no relation
between Y and Z. On the other hand, X and Z were highly positively
correlated, which makes it confusing when interpreting the mod
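
A minimal Python sketch (not from the thread) of how this pattern can arise. It simulates data under assumed values: a correlation of 0.8 between X and Z, and a true Z coefficient chosen so that Y and Z are marginally uncorrelated. Z then shows no univariate association with Y yet gets a positive coefficient once X is in the model. All names and numbers are illustrative.

```python
import random


def cov(a, b):
    """Sample covariance of two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((u - ma) * (v - mb) for u, v in zip(a, b)) / (len(a) - 1)


def corr(a, b):
    """Pearson correlation."""
    return cov(a, b) / (cov(a, a) * cov(b, b)) ** 0.5


def two_predictor_ols(y, x, z):
    """Slopes of y ~ x + z, solved from the 2x2 normal equations."""
    sxx, szz, sxz = cov(x, x), cov(z, z), cov(x, z)
    sxy, szy = cov(x, y), cov(z, y)
    det = sxx * szz - sxz ** 2
    bx = (szz * sxy - sxz * szy) / det
    bz = (sxx * szy - sxz * sxy) / det
    return bx, bz


rng = random.Random(2024)   # fixed seed, illustrative only
rho = 0.8                   # assumed correlation between X and Z
n = 5000
x = [rng.gauss(0, 1) for _ in range(n)]
z = [rho * xi + (1 - rho ** 2) ** 0.5 * rng.gauss(0, 1) for xi in x]
# Assumed true model: Y = -X + rho*Z + noise, which makes cov(Y, Z) = 0.
y = [-xi + rho * zi + rng.gauss(0, 0.5) for xi, zi in zip(x, z)]

bx, bz = two_predictor_ols(y, x, z)
print("univariate corr(Y, X):", round(corr(y, x), 3))  # clearly negative
print("univariate corr(Y, Z):", round(corr(y, z), 3))  # close to zero
print("joint slopes (X, Z):  ", round(bx, 3), round(bz, 3))
```

In this setup Z acts as a suppressor: its positive direct contribution is cancelled marginally by its correlation with X, so the univariate and multivariable results answer different questions rather than contradicting each other.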

m*******s
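Posts: 469

From thread: Statistics board - What statistical knowledge and techniques are most commonly used in pharma jobs?

proc univariate
proc freq
proc glm
proc lifetest
proc phreg
proc mixed
Truly master these SAS procedures and you can handle 95% of pharma projects.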

a********a
Posts: 3176

From thread: Statistics board - Help with a SAS question

You can also use MODE in proc univariate to get the most prevalent value.


s******n
Posts: 95

From thread: Statistics board - Are these two tests the same thing?

Multivariate ANOVA & Univariate Repeated-Measures ANOVA, are they the same
thing?
Any idea is welcome, thank you so much!

q**j
Posts: 10612

From thread: Statistics board - How can I quickly view descriptive statistics in MATLAB?

Something like proc univariate in SAS or fivenum in R. Many thanks.
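
For comparison, the quantities such a summary reports can be sketched in a few lines of Python using only the standard library. This is not MATLAB code, and the quartile rule used here ("inclusive" interpolation) is an assumption that may differ from SAS's default percentile definition or from the hinges used by R's fivenum. The function name `describe` is hypothetical.

```python
import statistics


def describe(values):
    """Return count, mean, standard deviation and a five-number summary."""
    data = sorted(values)
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return {
        "n": len(data),
        "mean": statistics.fmean(data),
        "std": statistics.stdev(data),   # sample standard deviation (n - 1)
        "min": data[0],
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": data[-1],
    }


summary = describe(range(1, 10))
for name, value in summary.items():
    print(f"{name:>6}: {value}")
```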

g*******y
Posts: 380

From thread: Statistics board - Modify the label of cdf plot in SAS?

Hi,all
I use the following code to generate the cdf plot:
proc univariate data=temp1 noprint;
var PPB;
by zone;
class type;
cdfplot PPB/overlay vref = 5 95
cvref = black
vreflabels = '5%' '95%' ;
run;
The goal is to get overlaid plots of the cumulative distribution function
of "PPB" for the two different types within each zone.
Now I got those plots by zone, and the title is "cumulative distribution
function for PPB", how cou

t*********e
Posts: 313

From thread: Statistics board - How to get summary statistics from multiple imputed data sets

I actually ended up randomly selecting one data set for the univariate
table. Any better idea is very welcome.

j*****7
Posts: 4348

From thread: Statistics board - According to the Department of Labor's PERM statistics, statistician income in 2009 vs. 2008...

http://www.flcdatacenter.com/CasePerm.aspx
Excluding a few hourly workers.
Year 2008:
The UNIVARIATE Procedure
Variable: WAGE_OFFER_FROM
Quantiles (Definition 5)
Quantile      Estimate
100% Max      175000.0
99%           140000.0
95%           118080.0
90%           104467.0
75% Q3         90000.0
50% Median     74370.0
25% Q1         60500.0
10%            53000.0
5%             46057.4
1%             38303.0
0% Min         29400.0
Extreme Observations

s*r
Posts: 2757

From thread: Statistics board - odd resulted graph from PROC UNIVARIATE (histogram) results

hist m/MIDPOINTS=0 to 101 by 2 ;

S******y
Posts: 1123

From thread: Statistics board - odd resulted graph from PROC UNIVARIATE (histogram) results

Thanks, Sir!
Do you know what the default midpoints are for
..
hist m;
..
?
Or how does SAS decide on the default?

n******e
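Posts: 476

From thread: Statistics board - healthcare data analyst: what should I prepare for the phone interview?

Is it a pharma company? A CRO?
Mainly the SAS Base material: data step, proc freq, proc univariate, proc means.
Slightly more complex topics are transpose and merge, then the various SQL joins, plus the basics of plotting.
For models, knowing a bit of ANOVA, t-test, and linear regression is enough. It doesn't matter much if you don't,
but if a statistician sits in on the interview, they won't know much programming and can only ask about statistics, so it's better to know a little.
Good luck!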

w*********y
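Posts: 7895

From thread: Statistics board - interview questions about data management

I don't have experience with this. When I was job hunting, I kept seeing DATA MANAGEMENT come up,
so I googled related material and found some articles on data cleansing that mention
somewhat more complex methods.
As I recall, the one you mentioned is commonly used to check for typos. PROC UNIVARIATE is also used to
check for typos. There are also checks using various SAS functions, as well as various graphs and plots.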


p**********l
Posts: 1160

From thread: Statistics board - A humble question about one-way repeated-measures ANOVA

What we learnt during a lecture was to use the sphericity test to check whether the
within-subject variance-covariance matrix has a type H (Huynh & Feldt)
structure, which others call HF structure.
A covariance matrix Sigma is of type H iff its quadratic form C' Sigma C with an orthonormal
contrast matrix C is proportional to the identity.
H0: C' Sigma C = sigma^2 I.
Ha: C' Sigma C = unstructured form.
If the test is nonsignificant, use the univariate tests for within-subject
effects, since they are more powerful than the multivariate test.
If the test is signifi

d*******o
Posts: 493

From thread: Statistics board - Sharing SAS Programmer interview questions and asking for answers

1) How do you efficiently do a multiple merge in SAS? Any good methods?
1. array 2. sort-sort-merge; 3. proc sql; 4. proc format; 5. hash object
Coding efficiency: 3>2>4>1>5
I/O resource: 5>4>1>3>2
flexibility: 3>>others
2) What are the common methods for large-scale database data cleaning?
Using proc sql to access database via DBMS. Then use it to check outlier/
missing/invalid/duplicate values, do hard-coded corrections, and update
integrity constraints. Also use proc sort/means
/freq/univariate/datasets/compare/rank and SAS functions
(date/regular express

s*******y
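Posts: 2977

From thread: Statistics board - Going crazy! Why are the selected predictors all so poor?

A large R-square does not necessarily mean overfitting; it depends on your number of variables and sample size.
I suggest looking at Frank Harrell's Regression Modeling Strategies.
The OP has many variables, so I suggest checking collinearity before fitting the model. Among highly correlated
variables, keep one (for example, the one with the best R-square in univariate fitting), then
use stepwise or penalized variable selection. However, I think in 2007 a Canadian author published a
paper with a very detailed simulation comparing backward selection with and without bootstrapping, and neither
gave very good results. Heh, eye-popping conclusions.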

f******h
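Posts: 46

From thread: Statistics board - Going crazy! Why are the selected predictors all so poor?

Thanks. I just replied above saying that the variables left after handling multicollinearity turned out not to be the best
pool... I think my way of handling multicollinearity is wrong. At each step I removed the variable with the
largest VIF, and I noticed that this approach easily ends up removing the predictors most correlated with the
dependent variable... very frustrating.
I'd like to try your method: within each group of highly correlated variables, keep the one with the largest univariate R^2. But
there are problems here too: 1) because there are so many variables, these highly correlated groups are not mutually exclusive,
that is, variables in different groups may also be highly correlated with each other; of course, this could be
handed to SAS via cluster analysis; 2) the other question is whether keeping one per group is reliable. Or is
this an accepted practice based on experience?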
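
One way to sidestep the overlapping-groups problem is a greedy filter: rank predictors by univariate R^2, then walk down the ranking and keep a predictor only if it is not highly correlated with anything already kept. The Python sketch below is just one possible heuristic, not the method proposed in the thread; the function names, the 0.8 threshold, and the toy data are all assumptions.

```python
import random


def corr(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    sab = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    saa = sum((u - ma) ** 2 for u in a)
    sbb = sum((v - mb) ** 2 for v in b)
    return sab / (saa * sbb) ** 0.5


def greedy_filter(predictors, y, threshold=0.8):
    """Keep predictors in order of univariate R^2, skipping any whose
    absolute correlation with an already kept predictor reaches threshold."""
    ranked = sorted(predictors,
                    key=lambda name: corr(predictors[name], y) ** 2,
                    reverse=True)
    kept = []
    for name in ranked:
        column = predictors[name]
        if all(abs(corr(column, predictors[k])) < threshold for k in kept):
            kept.append(name)
    return kept


# Toy data: "a_noisy" is nearly a copy of "a"; "b" is independent of both.
rng = random.Random(7)
a = [rng.gauss(0, 1) for _ in range(500)]
a_noisy = [v + rng.gauss(0, 0.1) for v in a]
b = [rng.gauss(0, 1) for _ in range(500)]
y = [u + 0.5 * v + rng.gauss(0, 0.5) for u, v in zip(a, b)]

kept = greedy_filter({"a": a, "a_noisy": a_noisy, "b": b}, y)
print(kept)  # one of a / a_noisy, plus b
```

Because each candidate is compared only against what has already been kept, the groups never need to be defined explicitly, so it does not matter that they overlap.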


p********a
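Posts: 5352

From thread: Statistics board - [Collection] According to the Department of Labor's PERM statistics, statistician income in 2009 vs. 2008...

☆─────────────────────────────────────☆
jhsph07 (银杏) wrote (Wed Feb 24 14:47:45 2010, US Eastern):
http://www.flcdatacenter.com/CasePerm.aspx
Excluding a few hourly workers.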
Year 2008:
The UNIVARIATE Procedure
Variable: WAGE_OFFER_FROM
Quantiles (Definition 5)
Quantile      Estimate
100% Max      175000.0
99%           140000.0
95%           118080.0
90%           104467.0
75% Q3         90000.0
50% Median     74370.0
25% Q1         60500.0
10%            53000.0
5%             46057.4
1%             38303.0
0

o****o
Posts: 8077

From thread: Statistics board - proc sql: find 4 highest and mean, median

why not try PROC UNIVARIATE?

o****o
Posts: 8077

From thread: Statistics board - proc sql: find 4 highest and mean, median

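data BPressure;
   do patientID=1 to 20;
      x=ranpoi(4, 10);            /* Poisson(10) number of readings for this patient */
      do j=1 to x;
         Systolic=rannor(888);    /* standard normal draw */
         diastolic=ranuni(8888);  /* uniform(0,1) draw */
         output;
         drop j;
      end;
   end;
run;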
proc sort data=BPressure; by patientID; run;
ods select none;
ods output ExtremeValues=XPval;
proc univariate data=BPressure nextrval=4;
by PatientID;
var Systolic Diastolic;
output out=_mean mean=sysmean Diamean
median=sysmedian diamedia

B******y
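Posts: 9065

From thread: Statistics board - Which SAS procs are commonly used in CROs?

PROC REPORT, MEANS, FREQ, GLM and LOGISTIC, plus variations on the same theme such as UNIVARIATE,
MIXED, GENMOD, etc., basically cover more than 90% of the work. The remaining 10% is PROC LIFETEST,
PHREG, POWER, PLAN, and so on.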

g********0
Posts: 90

From thread: Statistics board - How do I make a boxplot-like graph? Baozi as thanks

proc univariate data=xx plot;
id xx;
var xx;
run;

S*****U
Posts: 99

From thread: Statistics board - How do I make a boxplot-like graph? Baozi as thanks

data test;
do i=1 to 10000;
disease=ranpoi(0,5);
control=ranpoi(0,2);
output;
end;
run;
proc univariate data=test plot;
var disease control;
run;
This doesn't seem to work, but thanks to the two posters above; the baozi have been sent.

s***r
Posts: 1121

From thread: Statistics board - Question about a SAS regression macro (baozi offered)

thanks. I sent you 4 baozi (20 dollars). one more question:
I also need to run the regression like this:
b1 = e1
b1 = r1
b1 = f1
b1 = e2
b1 = r2
b1 = f2
b2 = e1
b2 = r1
b2 = f1
...
...
that is, I also need to run univariate regression. Can you help me with the
macro? Many thanks.
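
The loop being asked for, one simple regression per (response, predictor) pair, can be sketched in Python as below. This is only an illustration of the iteration pattern, not the SAS macro requested; the toy columns and the helper name `simple_ols` are assumptions, and the variable names merely mirror the b/e/r/f naming in the post.

```python
from itertools import product


def simple_ols(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


# Hypothetical toy data, one list per column.
data = {
    "b1": [1.0, 2.0, 3.0, 4.0],
    "b2": [2.0, 1.0, 4.0, 3.0],
    "e1": [0.0, 1.0, 2.0, 3.0],
    "r1": [1.0, 0.0, 1.0, 0.0],
    "f1": [3.0, 2.0, 1.0, 0.0],
}
responses = ["b1", "b2"]
predictors = ["e1", "r1", "f1"]

# One univariate regression for every response/predictor combination.
fits = {
    (resp, pred): simple_ols(data[pred], data[resp])
    for resp, pred in product(responses, predictors)
}
for (resp, pred), (intercept, slope) in fits.items():
    print(f"{resp} = {intercept:.3f} + {slope:.3f} * {pred}")
```

The same nested iteration over two name lists is what a macro would need: an outer loop over the responses and an inner loop over the predictors, with one model fit per pair.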

s*****r
Posts: 790

From thread: Statistics board - A few questions for Master Chen

At the beginning of your magnum opus, you say
"We believe that a mathematical expectation E(X) can determine the location
of a distribution of the X;"
Can you define what you mean by the location of a distribution? And who
believes that? Can you provide any reference for it?
For your important properties,
Property 1.1: Uniqueness, any random variable in a population is unique and
differs from others.
In what sense do you mean uniqueness? Assume X is a random variable
following a univariate standard normal distribution. anoth

s**f
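Posts: 365

From thread: Statistics board - A question about ordinal regression!

When using SAS proc logistic for ordinal regression, the dependent variable has several levels. Does proc logistic by default treat these levels as having a linear relationship (univariate regression) or a nonlinear relationship (multivariate regression)?
I searched the SAS help for a long time today and couldn't find it. Where is this explained?
If any expert knows, please help me! Thank you so much!!!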

s********9
Posts: 74

From thread: Statistics board - Does multivariable logistic regression allow correlated independent variables?

outcome: A
independent variables: B C D E F G H
univariable logistic regression: B C D E F G H all have significant
influence on A.
multivariable logistic regression: Only B has significant influence on A.
Is factor B the only factor that should be considered as influencing A?

s********9
Posts: 74

From thread: Statistics board - Does multivariable logistic regression allow correlated independent variables?

Suppose there are independent variables like income, food cost, house renting/
payment, education cost ...... They are correlated, but each of them is
important. If each of them is significant in univariable analysis, but only
income is significant in multivariable analysis, what could be the
conclusion about the factors that influence the outcome?

d*******o
Posts: 493

From thread: Statistics board - How to test 2000 groups of data for normality at the same time

The "Four Heavenly Kings" of proc univariate

l**********9
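Posts: 148

From thread: Statistics board - How to test 2000 groups of data for normality at the same time

Pfft.. the Four Heavenly Kings.. so that's a saying...
As I recall, proc univariate has an option for normality tests (NORMAL)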

c*******n
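Posts: 300

From thread: Statistics board - How to test 2000 groups of data for normality at the same time

For univariate data, the Shapiro-Wilk test is good enough. The procedure uses Royston's approximation, and the output includes the test statistic and p-value.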

T*******I
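Posts: 5138

From thread: Statistics board - How to test 2000 groups of data for normality at the same time

Try using the SAS ODS system to output the test results, then process them in a data step to get the
result you want. However, as someone above suggested, you should decide which test's result to use based on your sample size.
Please refer to the UNIVARIATE procedure.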

y*****w
Posts: 1350

From thread: Statistics board - test count data distribution in SAS

If your count data does not include zero values, you can use PROC UNIVARIATE
with the HISTOGRAM statement to get goodness-of-fit test statistics for
several distributions, however those distributions do not include NB.
http://support.sas.com/documentation/cdl/en/procstat/63104/HTML
I have recently run a project in which I had to choose a best distribution
for my non-zero historical count data for the purpose of sample size
calculation. I applied the HISTOGRAM statement to compare GAMMA, LOGNORMA...

w*******9
Posts: 1433

From thread: Statistics board - test count data distribution in SAS

It's a good idea but the chisq stuff is quite sensitive to the bins you use.


S********a
Posts: 359

From thread: Statistics board - [Big baozi] A macro question

I have 5 datasets, namely a1 a5 a7 a30 a180, and want to run the same procedure on each of them
%macro normcheck(k) ;
proc univariate data=&k ;
var resid;
qqplot;
histogram / normal;
run;
%mend ;
%normcheck(a1);
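But this way I have to keep manually changing the dataset name in %normcheck().
How can I get all five procedure runs from a single call? I feel it needs a do loop; how should I modify the code?
Baozi as thanks!!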

a*****3
Posts: 601

From thread: Statistics board - [Big baozi] A macro question

ods html ;
%macro normcheck(namelist) /parmbuff ;
%let i = 1;
%let name_p = %scan(&namelist, &i);
%do %while ( &name_p NE );
proc univariate data = &name_p ;
var resid;
qqplot;
histogram / normal;
run;
%let i= %eval(&i+1);
%let name_p = %scan(&namelist,&i);
%end;
%mend ;
option mlogic mprint;
%normcheck(a1 a5 a7 a30 a180)
ods html close;
quit;

s******y
Posts: 352

From thread: Statistics board - [Big baozi] A macro question

The purpose of using %nrstr is to delay the macro resolution. If there is no
%nrstr, the macro will be resolved within the data step: the statements in
the macro will be generated and put onto the statement stack, and after
the data step finishes, the statements will be popped for execution.
The statements in the stack without using %nrstr:
proc univariate data=a1 ;
var resid;
qqplot;
histogram / normal;
run;
But with the %nrstr, the % sign would be quoted and invisible to the macro
com...

a****a
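Posts: 3411

From thread: Statistics board - A macro question

A beginner's macro question.
I want to create a categorical variable from the percentile values of a continuous variable, for
example a categorical variable with 100 categories.
With many groups, typing everything in is inconvenient, and changing the variable names each time is exhausting.
How can I modify the macro below so that it can split into an arbitrary number of categories?
Many thanks (I don't have many baozi; 2 on offer)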
%macro quint(dsn,var,quintvar);
proc univariate noprint data=&dsn;
var &var;
output out=quintile pctlpts=25,50,75,100 pctlpre=pct;
run;
data _null_;
set quintile;
call symput('q1',pct25) ;
call symput('q2',pct50) ;
call symput('q3',pct75) ;
call symput('q4',pct100) ;
run;
data &dsn;
set &dsn;
if &var =. ...

a*****3
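Posts: 601

From thread: Statistics board - A macro question

I'd like to modify the original code and not use proc rank for now. The question is how to generate
10,20,30,40,50,60,70,80,90,100 with a macro so it can be put into univariate. Can anyone give a hint?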
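
The logic of the requested loop, building an evenly spaced percentile list for any number of groups and then mapping a value to its category, can be sketched in Python as follows. This illustrates the idea only and is not SAS macro code; the function names are hypothetical, and the rule "category k holds values at or below the k-th cut value" is an assumption about what the truncated macro above does.

```python
import bisect


def percentile_points(n_groups):
    """Evenly spaced percentile points, e.g. 10 groups -> 10, 20, ..., 100."""
    step = 100 / n_groups
    return [round(step * i, 4) for i in range(1, n_groups + 1)]


def assign_category(value, cut_values):
    """1-based index of the first cut value that is >= value.

    cut_values must be sorted ascending, for example the percentile
    values that PROC UNIVARIATE writes to its OUT= data set.
    """
    return bisect.bisect_left(cut_values, value) + 1


# Comma-separated list in the form the pctlpts= option expects.
pctlpts = ",".join(f"{p:g}" for p in percentile_points(10))
print(pctlpts)

# Hypothetical quartile cut values for a variable ranging up to 40.
cuts = [10, 20, 30, 40]
print([assign_category(v, cuts) for v in (5, 10, 15, 40)])
```

In a macro the same list would come from a counter that runs from 1 to the number of groups and appends step*i on each pass, so the number of categories becomes a single parameter.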

q**j
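Posts: 10612

From thread: Statistics board - R question: detailed summary.

summary() doesn't include standard deviation, number of obs, etc. Is there a command like SAS proc
univariate that lists a whole pile of things whether you need them or not? Many thanks.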

d******e
Posts: 7844

From thread: Statistics board - Given two sets of data, x and y, find the function f(x)=y

univariate spline


s******e
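Posts: 101

From thread: Statistics board - How I scraped through the SAS Base exam

Two weeks ago I signed up for the SAS Base exam. I went from not taking it seriously, to sensing the danger that I might fail, to passing smoothly today.
I got a lot of useful information from the Statistics board, so now it's my turn to give back :)
If you know nothing about SAS but don't want to study it systematically and just want to work through a few sets of practice questions to scrape by,
trust me, you'll spend more time than if you studied it systematically once. I started with the SAS 50 questions. Although they come with detailed
solutions, I still found the rules utterly bizarre, with nothing in common with the R and MATLAB we normally use.
So after suffering for a while I gave up and borrowed a few books from the library instead. Of course, reading the
official Little SAS Book is the most direct choice. But first, that book is an electronic copy and printing it wastes paper;
second, it doesn't show output for every code segment, which is a bit too hard for beginners. So that one is better read
once you have some understanding of SAS. I'm glad I found a very accessible introductory book, Data Analysis
Using SAS, by C. Y. Joanne Peng. Of course there are other good textbooks too. My conclusion is that you should
find a textbook you feel comfortable reading and are willing to keep reading. You can skip all the text
entirely. On my first pass I basically just ran all the commands in SAS, getting familiar...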

a********i
Posts: 205

From thread: Statistics board - A question about summing in SAS

data sum;
input year id $ sequence x;
datalines;
2000 id1 1 35
2001 id1 2 60
2002 id1 3 80
2005 id1 1 76
2006 id1 2 95
2001 id2 1 108
2002 id2 2 87
2004 id2 1 76
2007 id2 1 84
2009 id2 1 98
2000 id3 1 123
2001 id3 2 198
2002 id3 3 82
2003 id3 4 90
2004 id3 5 23
2008 id3 1 90
;
run;
...

o****o
Posts: 8077

From thread: Statistics board - SAS Technical Interview Questions

Reposted from:
http://www.globalstatements.com/sas/jobs/technicalinterview.htm
*****************************************
SAS Technical Interview Questions
You can go into a SAS interview with more confidence if you know that you
are prepared to respond to the kind of technical questions that an
interviewer might ask you. I do not provide the specific answers here, both
because these questions can be asked in a variety of ways and because it is
not my objective to help those who have little actual int...

M******8
Posts: 22

From thread: Statistics board - This nonparametric problem has fried my brain; experts of all levels, please take a look~~

One suggested test for trend in a univariate population is as follows:
Group the unordered scores into nonoverlapping groups of 3 adjacent
observations. The test statistic D is the number of monotonic (either
increasing or decreasing) triples. Assume for now that N is always a
multiple of 3 (no partial triples).
Example (triples are underlined)
33, 16, 20    61, 44, 38    29, 40, 56    22, 43, 31
no order      decrease      increase      no order      D = 2
H0: random ordering (independent, rando...
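
A minimal Python sketch of the statistic described above, applied to the example scores. The null distribution used here is an added assumption rather than something stated in the truncated post: if the observations are independent draws from a continuous distribution (no ties), each of the 6 orderings of a triple is equally likely and 2 of them are monotonic, so D would follow a Binomial(N/3, 1/3) distribution. The function names are hypothetical.

```python
from math import comb


def count_monotone_triples(scores):
    """D = number of consecutive, nonoverlapping triples that are strictly
    increasing or strictly decreasing."""
    if len(scores) % 3 != 0:
        raise ValueError("number of scores must be a multiple of 3")
    d = 0
    for i in range(0, len(scores), 3):
        a, b, c = scores[i:i + 3]
        if a < b < c or a > b > c:
            d += 1
    return d


def upper_tail_probability(d, n_triples, p=1 / 3):
    """P(D >= d) when D ~ Binomial(n_triples, p); p = 1/3 under the
    assumed null hypothesis of random ordering without ties."""
    return sum(
        comb(n_triples, k) * p ** k * (1 - p) ** (n_triples - k)
        for k in range(d, n_triples + 1)
    )


scores = [33, 16, 20, 61, 44, 38, 29, 40, 56, 22, 43, 31]
d = count_monotone_triples(scores)
print("D =", d)  # 2, matching the worked example
print("P(D >= D_observed) under H0:", upper_tail_probability(d, len(scores) // 3))
```

A trend in either direction makes monotonic triples more likely, which is why large values of D count as evidence against random ordering in this sketch.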

n*****n
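Posts: 3123

From thread: Statistics board - A beginner's classification question

The classification performs poorly because the variables selected by t-test or other univariate methods are highly redundant.
Quite possibly only two or three of the 50 are useful, and the rest are highly correlated with those two or three, meaning they
provide no additional information.
You could use some multivariate methods, such as selecting with lasso, though the dimension is so high that I'm not sure lasso can
cope. You could also use SVM-based methods. Many papers discuss variable selection in high dimensions; you
can search for them.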
