s******r Posts: 1524 | 1 baozi pls.
data new;
merge old old(firstobs=2 rename=(id=id2) ) old(firstobs=3 rename=(id=id3))
old(firstobs=4 rename=(id=id4)) old(firstobs=5 rename=(id=id5))
;
if mod(_n_,5) ne 1 then delete;
run;
It works fine with a small dataset. For a huge dataset, it is probably better
to create 5 temp files and then merge them. |
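The "five temp files, then merge" idea could be sketched roughly as below. This is only an illustration: the demo data set and the names t1 to t5 are made up here, and it assumes the goal is the same as in the code above (every block of five rows becomes one row).

```sas
/* Demo input: 10 rows, so the result has 2 rows (hypothetical data) */
data old;
  do id = 1 to 10;
    output;
  end;
run;

/* One pass: route row k of each block of five to temp data set tk */
data t1
     t2(rename=(id=id2))
     t3(rename=(id=id3))
     t4(rename=(id=id4))
     t5(rename=(id=id5));
  set old;
  select (mod(_n_, 5));
    when (1) output t1;
    when (2) output t2;
    when (3) output t3;
    when (4) output t4;
    otherwise output t5;   /* mod = 0, i.e. every fifth row */
  end;
run;

/* One-to-one merge (no BY): row i of each temp set lines up */
data new;
  merge t1 t2 t3 t4 t5;
run;
```

Each temp set is read once instead of re-reading old five times with different firstobs= values, which is presumably where the saving on a huge dataset would come from.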
|
l**********8 Posts: 305 | 2
************************************************************
Here is the complete LOG. Thanks, everyone.
****************************************************************
NOTE: PROCEDURE PRINTTO used (Total process time):
real time 0.00 seconds
cpu time 0.01 seconds
25
26 *-----------------------------------------------------;
27 %macro process(year,yr,file,first,numobs);
28
29 data cartemp;
30 set loc.carl&year.
31 (firsto... |
|
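F*******1 Posts: 75 | 3 A question about reading data into SAS. I have a sample file; see the attachment.
I only need the data in lines 6 through 11. In the DATA step I can use firstobs=6 to set
the starting line. What do I use to set the last line? I tried lastobs=11 and it doesn't work. Thanks!
Alternatively, I read in all the data first and then extract what I need. My SAS script is below, but the result is wrong: the data in lines 7
through 11 are lost. What is the cause? Thanks!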
data RT1;
%let _EFIERR_ = 0; /* set the ERROR detection macro variable */
infile "\\mkscsas01\saswork\20090108_asm_rtmcp_final.csv" delimiter =
',' MISSOVER DSD lrecl=55010 firstobs=6 ;
format Pnode $12. MCPType $12.;
INPUT Pnode $ Zone $ MCPType $ HE1 - HE24;
run; |
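On the "last line" question: the INFILE option that pairs with FIRSTOBS= is OBS=, which gives the number of the last record to read (not a count). A small self-contained sketch with made-up inline data, reading only lines 6 through 11:

```sas
data lines6to11;
  /* FIRSTOBS= is the first record to read, OBS= is the last one */
  infile datalines firstobs=6 obs=11;
  input lineno;
  datalines;
1
2
3
4
5
6
7
8
9
10
11
12
;
run;
```

With the real file, the same two options would go on the INFILE statement that points at the CSV. This does not explain why lines 7 to 11 went missing in the script above; that would need the attachment.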
|
v********9 Posts: 35 | 4 merge test(rename=(v1=Var1) firstobs=1 obs=3)
test(rename=(v1=Var2) firstobs=4 obs=6) ; |
|
j******o Posts: 127 | 5 data c;
if _n_=1 then do;
set b (rename=(nu=nu1 ran=ran1) firstobs=1 obs=1);
set b (rename=(nu=nu2 ran=ran2) firstobs=2 obs=2);
end;
set a;
run; |
|
d******9 Posts: 404 | 6 How would you do that in a DATA step? Do you mean FirstObs=?
It will not work. It failed for me before: when using a
wildcard, FirstObs= skips the non-data lines ONLY for the first external
file; for all the remaining files it does NOT skip them. So it
failed.
That's why I warned the LOUZHU (original poster) to be careful when using a wildcard. |
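One commonly used workaround for the wildcard case is the EOV= option on INFILE, which flags the first record of each subsequent file. The sketch below is a rough illustration, assuming every file has exactly one header line; the demo file names and the use of the WORK folder are made up for the example.

```sas
%let dir = %sysfunc(pathname(work));   /* scratch folder; any writable path would do */

/* Create two small demo files, each with a one-line header */
data _null_;
  file "&dir/demo_a.csv";
  put "id,x" / "1,10" / "2,20";
run;
data _null_;
  file "&dir/demo_b.csv";
  put "id,x" / "3,30" / "4,40";
run;

data all;
  length fname $ 260;
  infile "&dir/demo_*.csv" dsd truncover filename=fname eov=newfile;
  input @;                          /* load the record and hold it */
  if _n_ = 1 or newfile then do;    /* first record of each file = header */
    newfile = 0;                    /* EOV= is not reset automatically */
    delete;
  end;
  input id x;
  source = fname;                   /* FILENAME= variable is not kept, so copy it */
run;
```

The `_n_ = 1` test covers the first file, because EOV= is only set once SAS moves on to a later file.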
|
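b2 Posts: 427 | 7 [The following text is reposted from the Statistics board]
From: b2 (维生素), Board: Statistics
Subject: Urgent help needed with reading data into SAS; baozi offered as thanks, thank you in advance
Posted at: BBS 未名空间站 (Sun Jun 13 00:05:40 2010, US Eastern)
I have a data set with 20 variables and about 7 million observations stored in a CSV file.
1. If I double-click the file and open part of it in Excel, will that damage the original file?
2. The twenty variables are spread across 3 columns in the CSV file and separated by |. Specifically:
1) The number of variables in each column varies: for some observations column 1 may hold 5 variables, while for
other observations
column 1 may hold 8 variables;
2) For some observations, the content of one variable is split across different columns;
3) Within the same variable, the length of the value differs from observation to observation.
I tried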
data _null;
infile 'path' dsd firstobs=2 dlm=',' dlm='|';
input v1 $ v2 $ ... v20 $;
run;
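or
1. read the file into SAS; 2. write each column out to a new CSV file. But each column again contains a different number of variables, which is
rather troublesome.
Could anyone give me some pointers? Thanks! |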
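A note on the attempt above: when DLM= is given twice, only one of the two takes effect, so both characters need to go into a single DLM= string. A minimal sketch with made-up inline data (five variables instead of twenty, and an assumed length of 20):

```sas
data parsed;
  /* ',|' means: either a comma or a pipe ends a field */
  infile datalines dlm=',|' truncover;
  length v1-v5 $ 20;
  input v1-v5;
  datalines;
a|b,c|d|e
f,g|h,i|j
;
run;
```

Without DSD, consecutive delimiters count as one; adding DSD would make each delimiter separate a (possibly empty) field. Which behavior is right depends on how the empty values appear in the real file, which the post does not show.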
|
F*******1 Posts: 75 | 8 Site1 - Site1000 are character variables. If I put Site1-Site1000 $ in the
INPUT statement as shown below, only Site1000 shows up as a character
variable. Can somebody help me with this? Thanks a lot!
data WindSummary;
%let _EFIERR_ = 0; /* set the ERROR detection macro variable */
infile myfile delimiter = ',' MISSOVER DSD lrecl=102590 firstobs=2;
format DataItem $60. ;
INPUT DataItem $ Site1-Site1000 $;
run; |
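One way to sidestep the question of how far the trailing $ reaches is to declare the whole range as character before the INPUT statement. A minimal sketch with inline demo data; the three Site variables and the length 20 are assumptions for illustration only:

```sas
data WindSummary;
  infile datalines delimiter=',' missover dsd;
  /* Declaring the type up front makes every variable in the range
     character; the length 20 is an assumed value, not from the post. */
  length DataItem $ 60 Site1-Site3 $ 20;
  input DataItem Site1-Site3;
  datalines;
speed,A12,B7,C3
gust,A15,,C9
;
run;
```

For the real file the range would be Site1-Site1000 and the INFILE statement would keep the original fileref and options.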
|
A**P Posts: 260 | 9 I need to read in data in the following format:
Date,Open,High,Low,Close,Volume,Adj Close
05Mar2009,47.56,51.95,46.98,,0,50.17
04Mar2009,48.02,48.83,45.02,47.56,0,47.56
To handle the missing value in the first row correctly, I used the DSD option. The program is as follows:
data index.vix;
infile "Z:\public\vix.csv" dlm=',' dsd firstobs=2;
input Date anydtdte. Open High Low Close Volume AdjClose;
run;
SAS always assigns missing values to the variable Open. Can anyone help? |
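A likely explanation, offered as a sketch rather than a diagnosis: without a colon, `Date anydtdte.` is formatted input, which reads a fixed number of columns and leaves the pointer sitting on the comma, so with DSD the next variable sees an empty field. Adding the colon modifier switches to modified list input. The data lines below are the two rows from the post:

```sas
data vix;
  infile datalines dlm=',' dsd truncover;
  /* The colon means: read up to the next delimiter, then apply the informat */
  input Date :date9. Open High Low Close Volume AdjClose;
  format Date date9.;
  datalines;
05Mar2009,47.56,51.95,46.98,,0,50.17
04Mar2009,48.02,48.83,45.02,47.56,0,47.56
;
run;
```

In the first row Close should come out missing (the empty field between the two commas) while Open is read normally.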
|
p********a Posts: 5352 | 10 Yahoo stock market data? Just change the date format to DATE9. and it will work.
Take a look at my macro:
%macro getdata(tic);
FILENAME myurl URL "http://ichart.finance.yahoo.com/table.csv?s=&tic";
DATA &tic;
INFILE myurl FIRSTOBS=2 missover dsd;
format date yymmdd10.;
INPUT Date: yymmdd10. Open High Low Close Volume Adj_Close ;
if date>=today()-180;
RUN;
%mend getdata; |
|
d*****g Posts: 4081 | 11 I have this statement that I want to translate into R:
INFILE LOCATION(**.txt) truncover FIRSTOBS = 2 DLM = ',.';
Could an expert explain it? Thanks!! |
|
w*********e Posts: 1 | 12 Has anyone here run into LOST CARD when using SAS?
Today I needed to read in a CSV file with nearly 60 variables:
data;
infile "C:/..../.csv" dsd dlm="," firstobs=2;
input ID $ ......;
run;
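The log reports
...LRECL=256
LOST CARD.
*************
After adding LRECL=400 to the INFILE statement, all the variables seem to be read in, but ID turns into a number
with two zeros after the decimal point.
Can any expert explain this? Many thanks! |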
|
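t**i Posts: 688 | 13 My data are tab-delimited, with tens of thousands of columns. All are character variables, even though they consist of digits. How do I make sure
SAS reads them in as character variables?
The following doesn't seem to work.
infile "myfile" firstobs=1 lrecl=1000000 truncover;
input v1 - v50000 $ ; |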
|
t**i 发帖数: 688 | 14 Example code? Like the following?
data test;
infile 'myfile' firstobs=1 dlm=',' lrecl=10000000 truncover;
length v1 - v50000 $12 ; /* LENGTH takes $ plus a length, without a colon */
input v1 - v50000;
run; |
|
g********d Posts: 2022 | 16 I give up; so much fuss over such a simple problem.
%macro split;
%do i=1 %to 10000 %by 100;
data dataset&i ; set one (firstobs=&i obs=%eval(99+&i));run;
%end;
%mend;
%split; |
|
D******n Posts: 2836 | 17 %macro split;
%do i=1 %to 10000 %by 100;
data _null_ ; set one (firstobs=&i obs=%eval(99+&i)); file "smallfile&i"; put (_all_) (+0);run;
%end;
%mend;
%split; |
|
o******6 Posts: 538 | 18 This is the simplest and most effective way.
%macro split;
%do i=1 %to 10000 %by 100;
data _null_ ;
set temp (firstobs=&i obs=%eval(99+&i));
file "...\data%eval((&i-1)/100+1).txt";
put (_all_) (+0);
run;
%end;
%mend;
%split;
So is DATA step subsetting with a WHERE statement the most efficient approach?
Loading 10,000 observations once versus loading 100 observations 100 times. |
|
y*****t Posts: 1367 | 19 For the first five, use
data abcd;
set data (obs=5);
run;
For the last five, if you know how many rows the data set has (say 100), you can use:
data abcd;
set data (firstobs=96);
run; |
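If the row count is not known in advance, it can be fetched first with NOBS= and passed to firstobs= through a macro variable. A small sketch; the data set name have and the macro variable start are made up for the example, and it assumes an ordinary data set (NOBS= is not reliable for views):

```sas
data have;
  do i = 1 to 100;
    output;
  end;
run;

/* Read only the header of HAVE to get its row count */
data _null_;
  if 0 then set have nobs=n;        /* never executes; n is set at compile time */
  call symputx('start', max(1, n - 4));
  stop;
run;

data last5;
  set have(firstobs=&start);        /* rows 96-100 for this demo data */
run;
```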
|
A*******s Posts: 3942 | 20 Use a macro, or alternatively use the FILE statement with the FILEVAR= option, which is a bit more trouble.
%macro abc;
%let fname=result;
%do i=1 %to 1000;
data &fname&i;
set result(firstobs=%eval((&i-1)*1599+1) obs=%eval(&i*1599));
run;
%end;
%mend;
%abc |
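For comparison, the FILEVAR= route mentioned above could look roughly like this when the pieces are written out as text files. It is a sketch only: the demo data, the chunk size of 4 (the macro uses 1599), the output names and the use of the WORK folder are all assumptions.

```sas
%let dir = %sysfunc(pathname(work));   /* assumed output folder */

data result;                            /* demo data: 10 rows */
  do id = 1 to 10;
    x = id * 2;
    output;
  end;
run;

data _null_;
  set result;
  length outname $ 260;
  /* rows 1-4 -> part1.txt, rows 5-8 -> part2.txt, rows 9-10 -> part3.txt */
  outname = cats("&dir/part", ceil(_n_ / 4), ".txt");
  file out filevar=outname;   /* switches file whenever outname changes */
  put id x;
run;
```

The input is read once in a single DATA step, rather than once per chunk as in the macro loop.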
|
w*******n Posts: 469 | 21 proc sort data=one; by id; run;
%macro search();
%do i=1 %to &num; /* assumes a macro variable num is already defined */
%searchone(&i);
%end;
%mend;
%macro searchone(index);
data oneobs;
set DatA(firstobs=&index obs=&index);
run;
data one;
if 1=1 then delete;
run;
data dataB one;
merge oneobs(in=inone) dataB(in=B);by id;
if inone & B then output one;
if inone & B then delete;
output dataB;
run;
data match;
set match one; run;
%mend; |
|
p********a Posts: 5352 | 22 %macro getdata(tic);
FILENAME myurl URL "http://ichart.finance.yahoo.com/table.csv?s=&tic";
DATA &tic;
INFILE myurl FIRSTOBS=2 missover dsd;
format date yymmdd10.;
INPUT Date: yymmdd10. Open High Low Close Volume Adj_Close ;
*if date>=today()-180;
RUN;
%mend;
%getdata(SPY); |
|
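j**********e Posts: 442 | 24 You are far too modest. This does indeed work. I sent you 20 weibi (forum coins) as a small token of thanks.
One problem: with the code above, there can only be a colon between company and 000007, with no spaces at all. Is there a way to allow spaces after the colon?
I just figured it out: it is a Chinese-character issue. A Chinese character takes up two bytes, so reading the value after the colon runs into trouble (exactly why, I don't understand). If I delete the colon and retype it under an English input method, it works. But there are too many files, and fixing them one by one takes too long. Does anyone have a good approach?
The whole macro is attached (for testing, so num only runs from 0 to 1):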
%macro input_file;
%do num=0 %to 1;
data file_sub;
infile "C:\research\A (&num).txt"
firstobs=1
delimiter=":" truncover;
input col_1 $20. ;
n=_N_;
if n=1 then company=scan(col_1,2,':');
retain company;
if n=4 then date=in |
|
j**********e Posts: 442 | 25 Sorry, it doesn't work.
Part of the raw data:
date
4-Dec-08
18-Dec-04
13-Sep-07
15-Sep-07
8-May-07
Program:
data date;
infile 'C:\research\date.csv' firstobs=2 dsd missover;
input date date9.;
format date yymmdd10.;
run;
What gets read:
.
2004-12-18
2007-09-13
2007-09-15
.
That is, the first and fifth observations cannot be read.
Also, how do I output a date variable in the form 20041218?
Thanks for any advice! |
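A possible explanation: the two values that fail (4-Dec-08 and 8-May-07) are the only ones shorter than 9 characters. With MISSOVER, formatted input such as `date9.` returns missing when the line ends before all 9 columns are available. Adding the colon modifier reads to the end of the field whatever its width, and the YYMMDDN8. format prints dates without separators. A sketch using the values from the post as inline data:

```sas
data date;
  infile datalines dsd missover;
  input date :date9.;       /* colon: field width no longer has to be exactly 9 */
  format date yymmddn8.;    /* displays as 20041218 */
  datalines;
4-Dec-08
18-Dec-04
13-Sep-07
15-Sep-07
8-May-07
;
run;
```

With the real CSV, the same INPUT and FORMAT statements would be used with the original INFILE statement (firstobs=2 to skip the header).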
|
D******n Posts: 2836 | 26 Because SAS wasn't designed with any overall logic in the first place; its logic was
just line-by-line processing. As more and more people used SAS and asked
for more and more functionality, the SAS developers just kept adding
weird syntax and logic to SAS, like patches.
One example: you have firstobs in the DATA step, but you don't have lastobs,
because obs= is it!
Why? Because the SAS developers first added obs=, as users wanted to print the first
couple of lines, and then people wanted to get started no |
|
p********a Posts: 5352 | 27 infile .........firstobs=2; |
|
p*****o Posts: 543 | 28 I have only seen dsd before......
so would you mind telling me how to use it?
INFILE "test.txt" DLM=',' DSD MISSOVER FIRSTOBS=2 sds='"'---is it right? |
|
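b*****e Posts: 223 | 29 The data have 500 columns, mostly numeric with a few char. I actually only need about ten of the
columns. I wanted to do it like this, trying to read all the columns as char, but it doesn't work. Is there
a good way? I don't want to hunt down the char columns one by one and list them separately, because most of them
won't be used anyway.
Or is there a way to read only the columns I need? I have never read data with this many columns before,
so I have no experience.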
data ALLPAGE;
infile "\....\My Documents\MYDATA.txt" delimiter='09'x
firstobs=4 obs=1410 dsd lrecl=10000 missover;
input COL1-COL500 ; /* this way the char columns all come out blank */
input COL1-COL500 $ ; /* this doesn't work? */
input COL1-COL222 COL223 $ ..... COLxxx-COL500; /* too tedious */
run; |
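One possible approach is to declare the whole range as character with a LENGTH statement, read everything, and let a KEEP= data set option drop what is not needed. A sketch with made-up inline data; the five columns, the kept columns COL2 and COL4, and the length 32 are assumptions for illustration:

```sas
data allpage(keep=COL2 COL4);            /* keep only the needed columns (hypothetical choice) */
  infile datalines dlm=',' dsd missover; /* the real file would use delimiter='09'x */
  length COL1-COL5 $ 32;                 /* assumed width; declares all five as character */
  input COL1-COL5;
  datalines;
1,abc,3.5,x,9
2,def,,y,10
;
run;
```

Any kept column that is really numeric can be converted afterwards with the INPUT function, for example `num = input(COL2, best32.);`.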
|
o****o Posts: 8077 | 30 data new;
merge one one(firstobs=2 in=_2);
by &key;
&statement;
if _2;
run;
more baozi pls |
|
D******n Posts: 2836 | 31 Nice, I have done a similar thing.
Hehe, but my way is more Unix-like: my script has options.
viewsas -firstobs 10 -numberofobs 100 -variable "year age" xxx.sasbdat |
|
o****o Posts: 8077 | 32 data new;
merge old old(firstobs=2 rename=(LD=LD2));
Change=LD2-LD;
drop LD2;
run; |
|
a********i Posts: 205 | 33 Many thanks to all the experts.
To sum up, this still seems the best way to write it:
data old;
input patient LD;
datalines;
1001 5
1001 45
1001 70
1002 10
1002 20
1002 35
run;
proc sort data=old;
by patient;
run;
Data new(drop=LD2);
merge old old(firstobs=2 rename=(LD=LD2));
by patient;
change=LD2-LD;
if last.patient then change=.;
run; |
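One thing to check in the version above: with a BY statement, rows are paired within each patient, and firstobs=2 only removes the very first row of the whole data set. As far as I can tell, patient 1002 then gets paired with its own rows (10 with 10, 20 with 20), giving changes of 0. A sketch of an alternative that does the look-ahead without BY and compares the patient IDs instead; the name patient2 is made up here:

```sas
data old;
  input patient LD;
  datalines;
1001 5
1001 45
1001 70
1002 10
1002 20
1002 35
;
run;

data new(drop=LD2 patient2);
  /* No BY statement: a plain one-to-one merge pairs row i with row i+1 */
  merge old old(firstobs=2 rename=(patient=patient2 LD=LD2));
  if patient = patient2 then change = LD2 - LD;
  else change = .;          /* last row of each patient has no successor */
run;
```

For this demo data, change comes out as 40, 25, missing, 10, 15, missing.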
|
o****o Posts: 8077 | 34 What are you all thinking?
Just use firstobs=2 and output a . value when the file reaches the end; isn't that all there is to it? |
|
D******n Posts: 2836 | 35 data _new;
merge
_old(keep=A firstobs=2)
_old(drop=A )
;
run;
should work but it is not tested. |
|
o****o Posts: 8077 | 36 data _new;
merge _old(in=_1)
_old(rename=(A=B) firstobs=2)
;
if _1;
run; |
|
t*****2 Posts: 94 | 37 ID Name Sex Age Date Height Weight ActLevel Fee
2458 Murray, W M 27 1 72 168 HIGH 85.2
2462 Almers, C F 34 3 66 152 HIGH 124.8
There is an embedded comma in the names. I tried many input methods, but none
of them worked. Can anybody help me out here?
The code I used:
DATA newadmit;
INFILE 'C:\Users\labuser\Desktop\newadmit.txt' FIRSTOBS = 2 OBS = max;
INPUT ID 1-4 Name : $40. Sex $ 26 Age Date Height Weight ActLevel ... |
|
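u*********r Posts: 1181 | 38 I am running a program that reads data in from a CSV file,
but I found that it reads only every other line: each variable should have 24 values, yet only 12 were read in the end.
I don't know what went wrong; I am a SAS tadpole (beginner).
Experts, please advise.
Here is the beginning of the code: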
%let _EFIERR_ = 0; /* set the ERROR detection macro variable */
infile 'k:\Kaiwang\run20\run20.CSV' delimiter = ',' firstobs=2 dsd ;
informat LVCYP1A1 best32. ; |
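One common cause of "every other line" (a guess here, since the rest of the code is cut off): the INFILE statement above has neither MISSOVER nor TRUNCOVER, so under the default FLOWOVER behavior a record with fewer values than variables makes INPUT continue on the next line. A small demonstration with made-up inline data:

```sas
/* Default FLOWOVER: a short record makes INPUT continue on the next line */
data flow;
  infile datalines dlm=',' dsd;
  input a b c;
  datalines;
1,2
3,4
5,6
7,8
;
run;

/* TRUNCOVER: a short record just leaves the remaining variables missing */
data trunc;
  infile datalines dlm=',' dsd truncover;
  input a b c;
  datalines;
1,2
3,4
5,6
7,8
;
run;
```

The first step ends up with half as many observations as input lines, and the log carries a note that SAS went to a new line when INPUT reached past the end of a line; the second step keeps one observation per line.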
|
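t******m Posts: 58 | 39 Frustrated... I had already tried the approaches from the two experts above, and neither works: sex is always missing. I am now
posting some code and would appreciate further pointers...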
OPTIONS LS=132 PS=10000 NOCENTER;
/*
Format
DIN=DIN
CD=Claim Date
DS=Days Supply
NU=Number of Units
UP=UNIT_PRICE
ICP=Ingredient Cost Paid
DFP=Dispensing Fee Paid
TAP=Total Amount Paid
RPS=Random Pharmacy Store
RPN=Random Patient Number
PDOB=Patient DOB
PG=Patient Gender
*/
data f1;
/*INFORMAT CD YYMMDD13.2; FORMAT CD YYMMDD10.;
INFORMAT PDOB YYMMDD13.2; FORMAT PDOB YYMMDD10.;*/
INFORMAT UP ICP DFP TAP DOLLAR7.2; FORMAT UP ICP DFP... |
|
l*********s Posts: 5409 | 40 I am hitting a very weird bug while trying to write a macro that can
expand shorthand notation like var1--var11 used in SAS.
The "shorthand" macro works fine on its own, but fails to work when called
by the "formula" macro. The error message seems to say that "the set
statement in the data step is not valid or not in proper order", what's
going on?
Many thanks!
////////////////////////////////////////////////////////////////////
//////////////////////////////////////////////////////////////... |
|
x*******u Posts: 500 | 41 I tried firstobs and it doesn't work: the values still stay in character form and don't turn into date or time
formats,
unless I delete the preceding lines. |
|
a*****3 Posts: 601 | 42 Take it one step at a time; earning a baozi really isn't easy.
filename read "your-file-name";
data target ;
infile read firstobs=6;
input char1 $ 1-10
char2 $ 15-25
char3 $ 30 -33
; run; |
|
x*******u Posts: 500 | 43 Judging from the data, the first variable has length 10, but reading with your code gives this result:
char1 char2 char3
1 0 / 3 0 0 0 9 .1 4 1
There are still spaces in between.
After I read the data in with PROC IMPORT, the log shows this:
data WORK.READASC ;
%let _EFIERR_ = 0; /* set the ERROR detection macro variable */
infile 'myfile.csv' delimiter = ',' MISSOVER DSD lrecl=32767 firstobs=7 ;
informat VAR1 $21. ;
informat VAR2 $23. ;
informat VAR3 $9. ;
format VAR1 $21. ;
format VAR2 $23. ;
format VAR3 $9. ... |
|
s******r Posts: 1524 | 44 data score;
do score=1 to 88;
output;
end;
run;
%macro subgrp(set=,var=, grp=);
proc sort data=&set;by &var;run;
data _null_;set &set end=eof;
if eof then call symput ('total',_n_);
run;
%let step=%eval(&total/&grp);
%if %eval(&step*&grp) < &total %then %let step=%eval(&step+1);
%do i =0 %to %eval(&grp-1);
%let first_obs=%eval(&i*&step+1);
%let obs=%eval(&first_obs+&step -1);
data &set&i;set &set(firstobs=&first_obs obs=&obs);run;
%end;
%mend;
%subgrp(set=score,var=score, grp=5);
bz pls. |
|
c****y Posts: 94 | 45 Try this:
data fun;
infile "C:\Documents and Settings\Toledo_Edison_MailingList_03_09_2012 7MWH.txt" dsd firstobs=3 dlm=",";
length Marketing_ID $7 customer_Name $20 Mail_address $30 Mail_City $10
Mail_state $2 post_code $5;
input Marketing_ID $ Customer_Name $ Mail_Address $ Mail_City $ Mail_State
$ post_code $;
run; |
|
s******y Posts: 352 | 46 %let dirpath=d:;
filename csvfile pipe "dir /b &dirpath.\*.csv";
data allcsv;
infile csvfile lrecl=1000;
input;
fname=cats("&dirpath.\",_infile_); /* build d:\name.csv; the backslash was missing */
infile dummy filevar=fname filename=myfile end=done firstobs=4
dsd truncover;
do while(not done);
input date :date9. Tier :$50. Ccy :$50.
Doc :$50. Sd1y :percent. Sd2y :percent.;
output;
end;
put 'Done with ' myfile=;
run;
external |
|
k******u Posts: 250 | 47 To create a new variable z, I use the following code:
data bank;
infile 'C:bankdata.txt' firstobs =2;
input Name $ 1-15
Acct $ 16-20
x 21-26
y 27-30;
z = x * y;
run;
proc print data = bank;
run;
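This program works fine,
but when I remove the INFILE statement and enter the x and y values with a DATALINES statement, while computing
z=x*y in the same DATA step, I am told that the statement is not valid.
Why is that? |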
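A probable cause: DATALINES has to be the last statement in the DATA step, so any assignment placed after the data lines is not treated as part of the step. Putting the computation before DATALINES should work. A minimal sketch with made-up values, using list input for brevity rather than the column input from the post:

```sas
data bank;
  input Name $ Acct $ x y;
  z = x * y;          /* must come before DATALINES */
  datalines;
Smith A1234 12.5 4
Jones B5678 3 10
;
run;

proc print data=bank;
run;
```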
|
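R******d Posts: 1436 | 48 I have a very large data set and need to compute the correlations between 25,000 variables and nearly 3,000 variables.
In the results, many rows are empty with no correlation coefficients, for example L19 and L33. If I take the empty data points
(L19, L33) out and run them separately, the results can be computed. What is the problem?
Thanks.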
data mydata;
infile "/dir/file" firstobs=2 lrecl=2000000;
input C1-C25000 L1-L3000;
run;
proc corr data=mydata outp=corr(where=(_NAME_ ne "") drop=_TYPE_) noprint;
var C1-C25000;
with L1-L3000;
run;
proc export data=corr
outfile="/dir/out"
dbms=tab replace;
run; |
|
l**********8 Posts: 305 | 49 Why does the log keep reporting an error saying it cannot find the file work.cartemp?
%let dataloc = /disk/agedisk1/medicare/data/20pct/car;
libname bworkhos "/disk/agebulk5/medicare.work/lbaker-DUA23466/ausmita/temp"
;
libname temp "/disk/agebulk5/medicare.work/lbaker-DUA23466/lbaker/data/xw";
ods listing file = "/disk/agebulk5/medicare.work/lbaker-DUA23466/ausmita/temp/carrier_12212015.lst";
proc printto
log="/disk/agebulk5/medicare.work/lbaker-DUA23466/ausmita/temp/carrier_12212015.log";
run;
*------------------------------------------------... |
|
l**********8 Posts: 305 | 50 I just can't debug this; experts, please save me.
******************* Complete CODE ***************
%let dataloc = /disk/agedisk1/medicare/data/20pct/car;
libname bworkhos "/disk/agebulk5/medicare.work/lbaker-DUA23466/ausmita/temp"
;
*libname temp "/disk/agebulk5/medicare.work/lbaker-DUA23466/lbaker/data/xw";
ods listing file = "/disk/agebulk5/medicare.work/lbaker-DUA23466/ausmita/temp/carrier_12222015.lst";
proc printto
log="/disk/agebulk5/medicare.work/lbaker-DUA23466/ausmita/temp/carrier_12222015.log";
run;
*--------------... |
|