s****y 发帖数: 76 | 1 【 以下文字转载自 Statistics 讨论区 】
发信人: smiley (smiley), 信区: Statistics
标 题: help with SAS sql
发信站: BBS 未名空间站 (Fri Nov 29 22:27:30 2013, 美东)
Hi friends,
I have a question on how to eliminate duplicate rows only when there is a
null value for duplicate ids:
id score
1 55
1 .
2 .
3 74
3 80
4 67
5 40
How do I get rid of second row only (id=1 and score=.)?
Thanks much!!! |
c*****a 发帖数: 177 | 2 不知道下面这样是否最优,权当抛砖引玉。另外用SAS里的SORT PROC应该更方便。
select * from tmp_t
where not (score is null and id in
(select t2.id
from tmp_t t2
inner join (select id from tmp_t
where score is not null) t1
on t1.id=t2.id
where t2.score is null)) |
s****y 发帖数: 76 | 3 谢谢解答, proc sort只能去掉第一个ob,given duplicate id吧? |
c*****a 发帖数: 177 | 4 试下下面这个方法:
proc sort data=id_score;
by id descending score;
run;
data data_out(drop=keep);
set id_score;
by id descending score;
retain keep;
if first.id then
if score=. then keep=1;
else keep=0;
if ^first.id & keep=0 & score=. then delete;
run; |