d******r Posts: 1389 | 1 From topic: Statistics board - A SAS question data x;
input ID $;
datalines;
0121K
1021I
0102H
;
proc contents data=x;run;
data new;
set x;
ID_n=input(ID,5.0);
run;
proc contents data=new;run;
Something like this. I'm a beginner too, so I don't know whether there are any problems. |
|
x*z Posts: 67 | 2 data xxx;
input one;
datalines;
8
.
.
6
.
4
.
.
.
7
.
.
.
;
Question:
How can I replace each missing value (.) with the last non-missing value?
Like this:
8
8
8
6
6
4
4
4
4
7
7
7
7
Thank you!
|
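One common way to carry the last value forward is a RETAIN variable. A minimal sketch, assuming the dataset xxx above was read successfully (with `datalines;` spelled in full); `last_one` and the output name `filled` are hypothetical names, not from the post:

```sas
data filled;
  set xxx;
  retain last_one;                        /* keeps its value across rows */
  if not missing(one) then last_one = one; /* remember the latest real value */
  else one = last_one;                     /* fill the gap with it */
  drop last_one;
run;
```

If the data starts with missing rows, those rows stay missing, because nothing has been seen yet to carry forward.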
|
x*z Posts: 67 | 3 Another similar question:
data xxx;
input one $;
datalines;
H1
.
.
H1
.
H1
.
.
H1
.
.
.
;
Question:
How can I tell the 'H1' groups apart?
For example, by creating another variable,
like
H1 a
. a
. a
H1 b
. b
H1 c
. c
. c
H1 d
. d
. d
. d
;
Thank you! |
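One possible approach is to count the 'H1' rows with a sum statement and turn the count into a letter. This is only a sketch: `grp`, `tag` and the output name `tagged` are made-up names, and the letter mapping assumes an ASCII session and no more than 26 groups.

```sas
data tagged;
  set xxx;
  /* sum statement: grp is retained automatically and starts at 0 */
  if one = 'H1' then grp + 1;
  /* map 1, 2, 3, ... to a, b, c, ... (ASCII assumed) */
  tag = byte(rank('a') + grp - 1);
run;
```

Rows that come before the first 'H1' would get grp=0 and a non-letter tag, so that case needs its own handling if it can occur.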
|
x*z Posts: 67 | 4 I will try it, gutenacht, though I don't quite understand it at the moment.
One more question, please help. Thank you!
data xxx;
input one;
datalines;
.
.
.
8
.
.
6
.
4
.
.
.
7
.
.
.
8
;
Question:
How can I replace each missing value (.) with the 'next' non-missing value?
Like this:
8
8
8
8
6
6
6
4
4
7
7
7
7
8
8
8
8
Thank you! |
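Filling from the next value can be treated as a carry-forward on the reversed data. A rough sketch, assuming xxx was read as intended; `seq`, `next_one`, `step1`, `step2` and `backfilled` are hypothetical names:

```sas
data step1;
  set xxx;
  seq = _n_;                 /* remember the original row order */
run;

proc sort data=step1;
  by descending seq;         /* reverse the rows */
run;

data step2;
  set step1;
  retain next_one;
  if not missing(one) then next_one = one;
  else one = next_one;       /* in reversed order, "last seen" is "next" */
  drop next_one;
run;

proc sort data=step2 out=backfilled(drop=seq);
  by seq;                    /* restore the original order */
run;
```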
|
C******t Posts: 72 | 5 Maybe this is what you want. Just be aware that the variable type is
changed to character to accommodate "stop".
data origin;
input uniqueID $ observedate $ observer $ growthrate $;
datalines;
11111111 20020908 John 0.22
11111111 20021210 John 0.21
22222222 19980101 Peter 0.33
22222222 19980401 Peter 0.33
33333333 19950505 Smith 0.34
;
run;
data addstop;
set origin;
by uniqueID;
if last.uniqueID then do;
output;
growthrate='stop';
output;
end;
else output;
run; |
|
b******e Posts: 539 | 6 Add one line of code before the input statement:
infile datalines truncover; |
|
p********a Posts: 5352 | 7 data test;
input v1 $ v2 $ v3 $ v4 $ v5 $;
datalines;
a b c d e
7 3 4 6 9
f g h i j
10 3 23 4 6
;
run;
data test1(keep=var1 var2);
length n1-n5 $8; /* otherwise retain ... '' fixes the length at 1 */
retain n1 n2 n3 n4 n5 '';
set test;
array nn(5) n1-n5;
array vv(5) v1-v5;
if mod(_N_,2)=1 then do;
n1=v1;n2=v2;n3=v3;n4=v4;n5=v5;
end;
else if mod(_N_,2)=0 then do;
do i=1 to 5;
var1=nn(i);var2=vv(i); output;
end;
end;
run;
proc print;
run; |
|
l*******n Posts: 13 | 8 data mydata;
input A B;
datalines;
1 1
1 1
1 3
1 3
1 5
1 23
2 4
2 4
2 6
2 6
2 12
2 22
;
proc rank data=mydata out=results ties=low;
by A;
var B;
ranks NewB ;
run;
proc print data=results;
by A;
run;
But I got this. Why? |
|
o****o Posts: 8077 | 9 data mydata;
input A B;
datalines;
1 1
1 1
1 3
1 3
1 5
1 23
2 4
2 4
2 6
2 6
2 12
2 22
;
run;
proc freq data=mydata noprint order=internal;
table A*B/out=order(drop=COUNT PERCENT);
run;
data order;
set order; by A;
retain Order;
if first.A then Order=1; else Order+1;
run;
proc sql;
create table outdata as
select a.*, b.Order
from mydata as a join order as b
on a.A=b.A & a.B=b.B
order by a.A, a.B
;
quit; |
|
m***6 Posts: 884 | 10 What about just counting manually?
data mydata;
input A B;
datalines;
1 1
1 1
1 3
1 3
1 5
1 23
2 4
2 4
2 6
2 6
2 12
2 22
;
run;
proc sort data = mydata; by A B; run;
data d; retain a b x; set mydata; by a b;
if first.a then x=0; if first.b then x=x+1;
run; |
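If the goal is a rank without gaps inside each A group, PROC RANK can also do it directly. A sketch, assuming SAS 9.2 or later (where TIES=DENSE is available) and that mydata is already sorted by A, as it is above:

```sas
proc rank data=mydata out=results ties=dense;
  by A;
  var B;
  ranks NewB;   /* tied B values share a rank; the next distinct value gets the next integer */
run;

proc print data=results;
  by A;
run;
```

With TIES=LOW, tied values all get the lowest rank of the tie and the following ranks are skipped, which would explain gaps in the ranks.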
|
o******6 Posts: 538 | 11 ☆─────────────────────────────────────☆
careerchange (Stupid) wrote on (Tue Feb 17 15:06:39 2009):
In the following, I used proc sql twice. The two queries look very
similar, except that the first asks for the mean treatment time and the
second asks for the median treatment time. But the result is 2 rows
for the 1st query and 6 rows for the 2nd. Does anybody know
why?
How can I get 2 rows from the 2nd query? Thanks a lot.
data source_data;
input treatment time;
datalines;
0 124
0 234
0 23
1 34
1 46
1 44
;
run;
proc sql;
|
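The queries themselves are cut off in this excerpt, so the following is only a guess at the cause: in older SAS releases MEDIAN is not a PROC SQL summary function, so it is applied row by row and every input row comes back. One way to sidestep this is PROC MEANS. A sketch; `med` and `median_time` are hypothetical names:

```sas
proc means data=source_data noprint nway;
  class treatment;
  var time;
  output out=med(drop=_type_ _freq_) median=median_time;
run;

proc print data=med;   /* one row per treatment */
run;
```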
|
f*******e Posts: 51 | 12 Use transpose:
data old;
input aabbbbCC ghhjjhCC jggettCC ghggghBB ;
datalines;
1 2 3 4
4 3 2 1
;
run;
proc transpose data=old
out=new;
run;
data new1;
set new;
if substr(reverse(_name_),1,2)="CC" then output;
run;
proc transpose data=new1
out=new2 (drop=_name_);
run; |
|
z****e Posts: 2024 | 13 Nice!
Another question:
data tmp2;
input day mon year;
date=mdy(mon, day, year);
output;
datalines;
25 12 2005
1 1 1960
21 10 1946
;
How do I keep only the date variable in the output dataset? |
|
f*******e Posts: 51 | 14 data tmp2 (keep=date);
input day mon year;
date=mdy(mon, day, year);
output;
datalines;
25 12 2005
1 1 1960
21 10 1946
;
run;
as desired (in data step, only keep log(n), but not n)? |
|
o******6 Posts: 538 | 15 ☆─────────────────────────────────────☆
acervulina (acervulina) wrote on (Tue Mar 10 13:46:16 2009):
I have the following two questions. If you know the reason, please explain a
little bit.
Thanks
111.
data exp;
input obs name$ level;
datalines;
1 frank 1
2 joan 2
3 sui 2
4 jose 3
5 burt 4
6 kelly .
7 juan 1
;
run;
data exp1;
set exp;
if level=. then exper='Unknown';
else if level=1 then exper='Low';
else if level=2 or 3 then exper='Medium';
else exper='High';
run;
Which of the following value |
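The quiz text is cut off here, but the step above is a classic illustration of one point: `level=2 or 3` is evaluated as `(level=2) or 3`, and a nonzero, non-missing constant counts as true in SAS. Every row that reaches that branch therefore becomes 'Medium', and 'High' is never assigned. Below is a sketch of what the condition was presumably meant to be; that intent is an assumption, since the question is truncated:

```sas
data exp1;
  set exp;
  length exper $7;                          /* long enough for 'Unknown' */
  if level = . then exper = 'Unknown';
  else if level = 1 then exper = 'Low';
  else if level in (2, 3) then exper = 'Medium';
  else exper = 'High';
run;
```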
|
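z**k Posts: 378 | 16 Everything you describe is invalid. A literal like '01Jan1960'd assigns a value to a variable and is meant to be written
at coding time, so
its format is rather rigid. You can write x=1000, so why would you write x='1,000'n?
Reading data with datalines or infile is more flexible: 1,000 can be read with the comma8. informat.
As for
"1993-09-07", I'm not sure; SAS seems to require the year at the end, either mmddyy or ddmmyy |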
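For what it's worth, both values can be read from raw data with informats. A small sketch; `demo`, `amount` and `d` are hypothetical names. comma8. strips the thousands separator, and yymmdd10. handles the year-first date:

```sas
data demo;
  /* the colon modifier reads up to the next blank using the given informat */
  input amount : comma8. d : yymmdd10.;
  format d date9.;
  datalines;
1,000 1993-09-07
;
run;

proc print data=demo;
run;
```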
|
z**k Posts: 378 | 17 Sigh, you need to explain things clearly.
data table_a;
input id $ date $ char $;
datalines;
ID1 date1 A
ID1 date2 A
ID1 date3 L
ID1 date4 L
ID1 date5 A
ID1 date6 A
ID2 date7 B
;
data table_b;
set table_a;
by id char notsorted;
if first.id then inx=0;
if first.char then inx+1;
;
run;
:) |
|
z**k Posts: 378 | 18 Doing it in a single data step is a lot more trouble. Maybe an expert can weigh in.
data table_c (drop=oldid oldchar);
length oldid $8.;
length oldchar $8.;
retain oldchar "-";
retain oldid "-";
input id $ date $ char $;
if id ne oldid then do;
inx=0;
oldid=id;
oldchar="-"; /* force a new count when the id changes, like first.char */
end;
if char ne oldchar then do;
inx+1;
oldchar=char;
end;
datalines;
ID1 date1 A
ID1 date2 A
ID1 date3 L
ID1 date4 L
ID1 date5 A
ID1 date6 A
ID2 date7 B
; |
|
z**k Posts: 378 | 19 data a;
input v1 $ v2;
datalines;
aa .
aa 1234
aa .
bb .
bb 12345
bb 12345
cc .
cc .
cc 123456
;
proc sql;
select t1.v1 as variable1, t2.v2 as variable2
from (select v1 from a) as t1,
(select distinct v1, v2 from a
where v2 is not missing) as t2
where t1.v1 = t2.v1;
quit;
This works if you can guarantee that every group has at least one non-missing value. |
|
t**i Posts: 688 | 20 options symbolgen mprint mlogic ;
Data b;
Input ID apple orange banana papaya ;
Datalines;
1 1 2 3 4
2 5 6 7 8
;
Run;
%let fruit = good ;
%let fruit1 = apple ;
%let fruit2 = orange ;
%let fruit3 = banana ;
data a;
set b;
do I = 1 to 3;
keep &&fruit&i ;
end;
run; |
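A data step DO loop does not drive macro variable resolution, so `&i` has no value in the step above. One way to get the intended KEEP list is a macro %DO loop. A sketch, assuming the %let statements above have been run; `keep_fruit` is a hypothetical macro name:

```sas
%macro keep_fruit(n);
  data a;
    set b;
    keep
    %do i = 1 %to &n;
      &&fruit&i          /* &&fruit&i -> &fruit1 -> apple, and so on */
    %end;
    ;                    /* ends the KEEP statement */
  run;
%mend keep_fruit;

%keep_fruit(3)
```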
|
g******h Posts: 266 | 21 I just tried it; the program below runs:
data price;
input price;
datalines;
3.5
2.3
2.1
2.9
3.4
;
data ploting;
set price;
line_number=_n_;
proc gplot data=ploting;
plot price*line_number;
run; |
|
h*****d Posts: 295 | 22 Sorry, I cannot type Chinese right now.
The original data is saved in a txt file. The last character of some of the
data lines is missing. When that happens, SAS jumps to the next line and reads its first character
instead.
Any idea how to make SAS know that a character is missing at the end of
these lines?
Thanks and bow~
data is like:
a134M
b246F
c321F
d234
e345M
g456
h987M
..
code I am trying
data temp;
infile xxxx;
input id $ 1. score 3.0 gender $ 1.;
run;
the output of the data is like
a134M
b246F
c |
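The usual remedy for short records is the TRUNCOVER option on INFILE, which stops SAS from moving to the next line when a record ends early. A sketch that keeps `xxxx` as the same placeholder for the file reference used in the post:

```sas
data temp;
  infile xxxx truncover;   /* xxxx: placeholder for the actual file, as in the post */
  input id $1. score 3. gender $1.;
run;
```

With this option, gender is simply left missing for lines such as d234.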
|
a***r Posts: 420 | 23 I misunderstood; here is another rough attempt to get the discussion going:
data a(keep=a);
input A :$15. B C $;
datalines;
11/asdsd/890.00 89 gh
123/yuu/8.9 89 ji
;
run;
data a;
set a;
file "e:\temp.txt";
put a;
run;
data b;
infile "e:\temp.txt" dlm='/';
input var1 var2 $ var3;
run; |
|
n******0 Posts: 298 | 24 This is my data. There are 100 observations in the data set.
data exp;
input t @@;
datalines;
0.46008226 0.41727405 0.20728763 0.49158278 0.05372043 0.74767695
0.11532318 0.59817944 0.28878044 0.34212305
...........
;
run; |
|
l**********s Posts: 255 | 25 data one;
input id x y;
datalines;
1 1 0
2 0 1
3 1 1
4 0 0
5 1 0
;
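The above is my data (data one). I want to build a new dataset as shown below (data two): the first column holds the variable names from data one
(x, y), the second column is the number of times "1" occurs, and the third column is the ratio of that count to the total (5).
How should I compute this? Many thanks.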
data two------
x 3 60%
Y 2 40% |
|
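l**********s Posts: 255 | 26 Also, suppose the data (DATA ONE) is changed slightly so that x has MISSING DATA, and I don't want the MISSING
DATA counted when computing the ratio; that is, the total is 5-1=4. What should I do then?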
data one;
input id x y;
datalines;
1 1 0
2 0 1
3 1 1
4 0 0
5 . 0
; |
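One possible route is to reshape the data to long form and summarize per variable. This is only a sketch: `long`, `two`, `varname`, `value`, `n_ones` and `pct_ones` are hypothetical names, and it assumes data one is sorted by id, as it is here.

```sas
proc transpose data=one
               out=long(rename=(_name_=varname col1=value));
  by id;
  var x y;
run;

proc sql;
  create table two as
  select varname,
         sum(value)  as n_ones,                    /* count of 1s for 0/1 data */
         mean(value) as pct_ones format=percent8.1 /* missing values are ignored */
  from long
  group by varname;
quit;
```

Because MEAN skips missing values, the denominator for x drops to 4 once row 5 is missing. Listing more variables on the VAR statement extends this to wider data.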
|
l**********s Posts: 255 | 27 data one;
input id x y Z A B C;
datalines;
1 1 0 1 1 1 1
2 0 1 1 1 1 1
3 1 1 1 1 1 1
4 0 0 1 1 1 1
5 1 0 0 0 0 0
;
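A follow-up: my real data has more than 100 variables like X and Y. How do I bring a MACRO into this? MACRO
is something I have tried to master several times and failed at; I experimented for a long time and kept getting errors.
I changed DATA ONE again and added a few variables: Z, A, B, C. Could the experts show me how, with a
MACRO, to compute the results for X, Y, Z, A, B, C all at once? |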
|
o****o Posts: 8077 | 28 data one;
input id x y Z A B C;
datalines;
1 1 0 1 1 1 1
2 0 1 1 1 1 1
3 1 1 1 1 1 1
4 0 0 1 1 1 1
5 1 0 0 0 0 0
;
run;
proc contents data=one out=onevars(keep=name varnum where=(upcase(name)^='ID')) noprint; run;
data _null_;
set onevars end=eof;
if _n_=1 then call execute('proc freq data=one noprint;');
call execute('table '||name||' /out'||compress('=_'||name)||'(where=('||name||'=1));');
if eof then call execute ('run;');
run;
options source ;
data _null_;
c |
|
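l**********s Posts: 255 | 29 My data is shown below (data three). I want to sort it from smallest to largest so that it looks like data four, but the variable
is not numeric. The only approach I can think of is to convert the variable to numeric (trim) and then sort. Does anyone
have a simpler way? Many thanks.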
data three;
input name $1-5;
datalines;
a1b10
a1b11
a1b12
a1b1
a1b2
a1b3
;
run;
******************************
data four----
a1b1
a1b2
a1b3
a1b10
a1b11
a1b12 |
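One option that may avoid the conversion is linguistic sorting with numeric collation, which treats runs of digits inside a string as numbers. A sketch, assuming SAS 9.2 or later where this SORTSEQ suboption is available:

```sas
proc sort data=three out=four
          sortseq=linguistic(numeric_collation=on);
  by name;
run;

proc print data=four;
run;
```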
|
f*******e Posts: 51 | 30 data test;
input customer $ item $;
datalines;
a bacon
a apple
b bacon
c cream
c apple
c bacon
;
run;
data test;
set test;
num=1;
run;
proc transpose data=test out=test1(drop=_name_);
by customer;
var num;
id item;
run;
data test1;
set test1;
array num _numeric_;
do over num;
if num=. then num=0;
end;
run; |
|
a***r Posts: 420 | 31 A "tu" (crude) method that works somehow:
data example;
input firstdg 1.;
datalines;
12
23
244
...
;
run; |
|
A****t Posts: 141 | 32 data one;
input id $;
datalines;
12
23
244
5678
99
100
;
run;
data two;
set one;
a=substr(id,1,1);
keep a;
run; |
|
u*****a Posts: 54 | 33 data course;
input exam;
datalines;
50.1
;
run;
proc format;
value score 1 - 50 = 'Fail'
51 - 100 = 'Pass';
run;
proc report data =course nowd;
column exam;
define exam / display format=score.;
run;
Why does this return 50.1 instead of Pass?
Many thanks! |
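A likely explanation is that 50.1 falls in the gap between the two ranges (1 to 50 and 51 to 100), so no label applies and the raw value is printed. A sketch of a format without the gap, using the range operator that excludes the lower endpoint:

```sas
proc format;
  value score 1 - 50    = 'Fail'
              50 <- 100 = 'Pass';   /* greater than 50, up to and including 100 */
run;

proc report data=course nowd;
  column exam;
  define exam / display format=score.;
run;
```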
|
b*******g Posts: 513 | 34 You are correct, this program works:
data one;
input name $ gender $ age income $ county $;
datalines;
name gender age income county
jason M 26 40K MORRIS
ADAM M 21 100K CLIFF
JORGE M 30 20K WARREN
ERIKA F 21 43K CLIFF
;run;
data two;
set one;
if county="CLIFF" then delete;
run;
proc print data=two;
run; |
|
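l**********s Posts: 255 | 35 Sorry, sorry. What I actually want is to change the data as shown below (DATA ONE) and count the "0"s. I tweaked the CODE
slightly (it used to count "1", so this time I only changed "0" to "1"). I must have changed it wrong, because it gives an error. What it finally
generates is the _NEW below.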
data one;
input id X Y A B C;
datalines;
1 0 1 1 1 1 1
2 0 1 1 1 1 1
3 0 1 1 1 1 1
4 0 1 1 2 1 1
5 0 1 0 0 0 0
;
run; |
|
x********u Posts: 64 | 36 Say there are two variables, x1 and x2.
x1 has 2 levels,
x2 has 3 levels,
and there is a response variable y
with 2 levels.
How should I run proc freq for this?
data test;
input y $ x1 $ x2 $ count ;
datalines;
N e b 2398
N e m 3686
N e t 3004
N m b 4549
N m m 7653
N m t 4853
Y e b 58
Y e m 82
Y e t 77
Y m b 130
Y m m 172
Y m t 114
;
run; |
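Since the data is already aggregated, the count column can be supplied through a WEIGHT statement. A minimal sketch; the choice of table layout and options is an assumption about what is wanted:

```sas
proc freq data=test;
  weight count;                      /* each row stands for COUNT observations */
  tables x1*x2*y / nocol nopercent;  /* one x2-by-y table per level of x1 */
run;
```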
|
s*******2 Posts: 791 | 37 I have the following dataset Test:
data Test;
input input $ outcome $ @@;
datalines;
A 0 A 0 A 0
A 1 A 1 A 1
A 2 A 2 A 2
B 0 B 0 B 0
B 1 B 1 B 1
B 2 B 2 B 2
;
How can I get the data below (with outcome in the order 0, 1, 2)? Thanks
Obs input outcome
1 A 0
2 A 1
3 A 2
4 A 0
5 |
|
n***p Posts: 508 | 38 I am not good at programming. I have a clumsy way; not sure if this is
what you need.
data Test;
input input $ outcome $ @@;
datalines;
A 0 A 0 A 0
A 1 A 1 A 1
A 2 A 2 A 2
B 0 B 0 B 0
B 1 B 1 B 1
B 2 B 2 B 2
;
run;
proc sort data = test out= test_sorted nodupkeys;
by input outcome;
run;
data a b;
set test_sorted;
if input = 'A' then output a;
else output b;
run;
data aaa;
set a a a;
run;
data bbb;
set b b b;
run;
data combine;
|
|
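s*******2 Posts: 791 | 39 Thank you. I ran your code and the output is exactly what I wanted. There is one problem, though. The
Test I gave happens to have 18 observations, so proc sort removes the duplicate rows and leaves A 0
A 1 A 2 B 0 B 1 B 2; stacking the dataset three times then gives the result I want. But what if
the number of observations is not a multiple of 3?
For example, 16 observations: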
data Test;
input input $ outcome $ @@;
datalines;
A 0 A 0
A 1 A 1 A 1
A 2 A 2 A 2
B 0 B 0 B 0
B 1 B 1
B 2 B 2 B 2
;
run;
The result should be
Obs input outcome
1 A 0
|
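One way that does not depend on the group sizes is to number the repeats within each input/outcome pair and then sort on that number. The expected output above is cut off, so this is a sketch under the assumption that the rows should cycle through outcomes 0, 1, 2 for as long as values remain; `sorted`, `numbered`, `want` and `rep` are hypothetical names:

```sas
proc sort data=test out=sorted;
  by input outcome;
run;

data numbered;
  set sorted;
  by input outcome;
  if first.outcome then rep = 0;  /* restart the counter for each pair */
  rep + 1;                        /* 1st, 2nd, 3rd copy of this pair */
run;

proc sort data=numbered out=want(drop=rep);
  by input rep outcome;           /* all 1st copies, then all 2nd copies, ... */
run;
```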
|
g********d Posts: 2022 | 40 Heh, I assumed it was another beginner asking a basic question and didn't read carefully, sorry. You are very polite, kudos.
data Test;
input input $ outcome $ @@;
datalines;
A 0 A 0 A 0
A 1 A 1 A 1
A 2 A 2 A 2
B 0 B 0 B 0
B 1 B 1 B 1
B 2 B 2 B 2
;
run;
proc sort;
by input outcome ;run;
proc print;run;
data test1;
set test;
by input outcome;
retain x 0;
if input="A" and outcome=0 then do;
if first.outcome then x=0;x+1;end;
else if input="A" and outcome=1 then do;
if first.outcome then x=0;x+1;end;
else if input="A" and outcome=2 then do;
if first |
|
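s*******2 Posts: 791 | 41 Thanks, gosummerod and sherryyyf.
It seems I still haven't mastered first., last. and retain well enough.
I originally wanted to write the code below (incomplete), running the data step (sequent test_sorted)
3 times, appending the results together, and then sorting by input counter. But it now looks like that won't produce the result I
expected. First, the value of counter does not run from 0-15 but 0-6; second, if I run it >=3 times, the rows with id=5, 8, 16
can never be created in my test_New.
Although everyone above has already solved the problem for me, I am still hung up on my own code. Can someone help me see
where it went wrong and fix it? Thanks.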
proc datasets library=work;
delete sequent test_sorted test_New;
run;
data Test;
input input $ outcome @@;
datalines;
A 0 A 0
B 0 B 0 B 0
A 1 A 1 A 1
A 2 A |
|
b*********e Posts: 29 | 42 data Test;
input input $ outcome $ @@;
datalines;
A 0 A 0 A 0
A 1 A 1 A 1
A 2 A 2 A 2
B 0 B 0 B 0
B 1 B 1 B 1
B 2 B 2 B 2
;
run;
proc sort data=test;
by input outcome;
run;
data test2; set test;
input class @@;
cards;
1 2 3 1 2 3
1 2 3 1 2 3
1 2 3 1 2 3
;
run;
proc sort data = test2;
by input class;
run;
I forget how to store two of a dataset's three variables in another dataset.
You could consider using sql. |
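For the "two of three variables" part, the KEEP= dataset option is one simple possibility. A sketch, assuming test2 was created as above; `subset` is a hypothetical name:

```sas
data subset;
  set test2(keep=input class);   /* only these two variables are read */
run;
```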
|
h******e Posts: 1791 | 43 A tedious method:
data test;
input var1 $ var2 var3;
datalines;
a . 1.1
a 5 .
a 6 .
b 5 0
b 5 0
b 7 .
;
run;
data t1;
set test;
by var1;
if first.var1 then var4 = 1;
else if first.var1 = 0 and last.var1 = 0 then var4 = 2;
else if last.var1 then var4 = 3;
run;
proc transpose data = t1 out = t2;
by var1 var2;
id var4;
var var3;
run;
data t3;
set t2;
if _1 > 0 or _2 >0 or _3 >0 then do;
_1 = 1.1;
_2 = 1.1;
_3 |
|
R******d Posts: 1436 | 44 Not sure whether I understood correctly.
data test;
input var1 $ var2 var3;
datalines;
a . 1.1
a 5 .
a 6 .
b 5 0
b 5 0
b 7 .
;
run;
proc sql noprint;select distinct var1 into:vars separated by ' ' from test;quit;
proc sql noprint;select count(distinct var1) into:nvar from test; quit;
%macro test;
%do i=1 %to &nvar;
proc sql noprint;
create table tmp as
select *,max(var3>0) as index,max(var3) as max from test where var1="%qscan(&vars,&i,' ')";
quit;
proc append base |
|
R******d Posts: 1436 | 45 Looking back, what I wrote earlier was needlessly roundabout.
data test;
input var1 $ var2 var3;
datalines;
a . 1.1
a 5 .
a 6 .
b 5 0
b 5 0
b 7 .
;
run;
proc sql noprint;
create table result as
select *, max(var3) as max, max(var3)>0 as index from test group by var1;
quit;
data result;
set result;
if index=1 then var3=max;
run; |
|
D******n Posts: 2836 | 46 God, SAS is killing me...
Why is the following not working?
data a1;
input a $ cap;
datalines;
5.4 0.8
;
run;
data a2;
set a1;
call execute('b = put(cap,'||strip(a)||');');
run;
proc print;run; |
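The statement generated by CALL EXECUTE runs after the data step has finished, outside any step, which is probably why it fails. If the aim is just to apply a format whose name is stored in a variable, PUTN may be enough, since it takes the format at run time. A sketch; the length of `b` is an arbitrary choice:

```sas
data a2;
  set a1;
  length b $12;
  b = putn(cap, strip(a));   /* a holds the format text, e.g. 5.4 */
run;

proc print data=a2;
run;
```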
|
D******n Posts: 2836 | 47 Cool, it works.
What is open code?
Actually, what I want is
=========================
data a1;
input a $ cap;
datalines;
5.4 0.8
5.4 0.9
5.3 1.8
;
run;
data _null_;
set a1 end=eof;
if _n_=1 then do; call execute('data a2;'); end;
call execute('b = put('||cap||','||strip(a)||');output;');
if eof then call execute('run;');
run;
<----output------->
Obs b
1 |
|
p********a Posts: 5352 | 48 data a;
input id status day;
datalines;
1 1 0
1 1 5
1 1 10
1 2 15
1 3 20
1 3 25
1 2 30
1 4 35
;
run;
data b(keep=ID Pre_status End_status Pre_day End_day);
retain Pre_status End_status Pre_day End_day;
set a;
if _N_=1 then do; Pre_status=status; Pre_day=day;end;
if status ne pre_status then do;End_status=Status;End_day=day;output;
Pre_status=status; Pre_day=day; end;
run;
proc print;
run; |
|
x*********o Posts: 7 | 49 I ran the following SAS code, but I cannot get the correct answer. Can anybody
help me?
data test3;
input employee_name $1-4;
if employee_name='Sue' then input age 7-8;
else input idnum 10-11;
datalines;
Ruth 39 11
Jose 32 22
Sue 30 33
John 40 44
;
run;
proc print data=test3;
run; |
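A second INPUT statement normally reads from the next data line, so each observation here consumes two lines. A trailing `@` on the first INPUT holds the current line for the second one. This is a sketch; the data lines are re-spaced on the assumption that the original columns (age in 7-8, idnum in 10-11) were collapsed when the post was displayed:

```sas
data test3;
  input employee_name $ 1-4 @;      /* trailing @ holds this line */
  if employee_name = 'Sue' then input age 7-8;
  else input idnum 10-11;
  datalines;
Ruth  39 11
Jose  32 22
Sue   30 33
John  40 44
;
run;

proc print data=test3;
run;
```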
|
p********a Posts: 5352 | 50 data a;
input A $ I L R;
datalines;
a 0 1 2
a 0 3 4
a 1 5 6
a 1 2 3
a 0 4 5
a 0 5 9
;
run;
data b(drop=I1 L1 R1);
set a;
retain I1 L1 R1;
if _N_=1 then do; I1=I;L1=L;R1=R;end;
else do; if I=I1 then do;L=L1;output;end;
else do;I1=I;L1=L;R1=R;end;
end;
run; |
|