K*******5 Posts: 98 | 1 I need help with some SAS code, or any ideas you may have. Thank you very much.
I have two data sets: cases and controls.
I want to create a new data set containing a random sample of controls,
matched to each observation in cases at a ratio of 5:1. Don't worry about the
number of controls; my original data has enough controls for each case.
Data cases;
Input site$ year$ id$;
Datalines;
A01 1996 c01
A02 1996 c02
A02 2000 c03
A03 1999 c04
……..
;
Data controls;
Input site$ year$ id;
Datalines;
A01 1996 1
A01 1996 2
A01 1996 3
A01 1996 4
A01 1999 5
A02 1996 6
A02 1996 7
A02 1996 8
A02 1996 9
A02 1996 10
A02 1996 11
A02 1996 12
A02 1997 13
A02 1997 14
A02 1997 15
A02 1997 16
A02 1997 17
A02 2000 18
……..
; |
t*****w Posts: 254 | 2 Data controls;
Input site$ year$ id;
Datalines;
A01 1996 1
A01 1996 2
A01 1996 3
A01 1996 4
A01 1999 5
A02 1996 6
A02 1996 7
A02 1996 8
A02 1996 9
A02 1996 10
A02 1996 11
A02 1996 12
A02 1997 13
A02 1997 14
A02 1997 15
A02 1997 16
A02 1997 17
A02 2000 18
;
data ratio;
set controls;
/* keep each control independently with probability 0.2 */
x = RAND('BERNOULLI',0.2);
if x eq 1 then output;
run; |
K*******5 Posts: 98 | 3 Thank you for your help.
It looks like I did not make my question clear. I need random samples from
CONTROLS. The sampled controls need to match each observation in
CASES by site and year. The ratio is 5:1, that is, 5 controls to 1 case.
|
a****g Posts: 8131 | 4 How about this:
for each case, draw a matched sample of 5 controls,
iterating over all cases.
Clearly this will be computationally expensive.
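One way to get the per-case matching without looping over the cases one at a time might be to shuffle the controls within each site/year group and hand them out in blocks of 5. This is only a sketch: it assumes the `cases` and `controls` data sets from the first post, and the names `ctrl_rand`, `ctrl_seq`, `case_sorted`, `case_seq`, `matched`, `u`, `k`, `j` and the seed value are made up for illustration.

```sas
/* Sketch only: data set names, variable names and the seed are illustrative. */

/* 1. Give every control a random sort key. */
data ctrl_rand;
  set controls;
  if _n_ = 1 then call streaminit(20240101);  /* arbitrary seed */
  u = rand('uniform');
run;

/* 2. Shuffle controls within each site/year group and number them 1, 2, 3, ... */
proc sort data=ctrl_rand;
  by site year u;
run;

data ctrl_seq;
  set ctrl_rand;
  by site year;
  if first.year then k = 0;
  k + 1;          /* sum statement: k is retained across rows */
  drop u;
run;

/* 3. Number the cases 1, 2, 3, ... within each site/year group. */
proc sort data=cases out=case_sorted;
  by site year id;
run;

data case_seq;
  set case_sorted;
  by site year;
  if first.year then j = 0;
  j + 1;
run;

/* 4. Case j in a group receives shuffled controls 5(j-1)+1 through 5j. */
proc sql;
  create table matched as
  select a.id as case_id, a.site, a.year, b.id as control_id
  from case_seq as a
       inner join ctrl_seq as b
       on a.site = b.site
      and a.year = b.year
      and ceil(b.k / 5) = a.j
  order by case_id, control_id;
quit;
```

Because the blocks do not overlap, no control is given to two cases. If a site/year group runs out of controls, the later cases in that group simply get fewer than 5 rows (or none), so it may be worth counting rows per `case_id` afterwards.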
Alternatively, suppose you have 100 cases:
ask for a random sample of 500 controls, with equal weight on the labels.
I haven't done any sampling in a long time, so I can't write the code for you.
These are just some thoughts to share.
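The pooled-draw idea maps fairly naturally onto PROC SURVEYSELECT with the matching variables as strata. Here is a rough sketch, assuming "labels" means the site/year combinations, that every site/year group in `cases` also appears in `controls` (as the first post says), and reusing the `cases` and `controls` data sets from that post. The names `need`, `pool`, `sampled_controls` and the seed value are illustrative.

```sas
/* Sketch only: need, pool, sampled_controls and the seed are illustrative. */

proc sql;
  /* 1. Controls needed per site/year group = 5 x number of cases in it.
        PROC SURVEYSELECT reads per-stratum sizes from a variable named _NSIZE_. */
  create table need as
  select site, year, 5 * count(*) as _NSIZE_
  from cases
  group by site, year
  order by site, year;

  /* 2. Keep only controls whose site/year group contains at least one case,
        sorted the same way as the sample-size table. */
  create table pool as
  select c.*
  from controls as c
       inner join need as n
       on c.site = n.site and c.year = n.year
  order by c.site, c.year;
quit;

/* 3. Simple random sample without replacement inside each group.
      SELECTALL takes every control when a group has fewer than requested. */
proc surveyselect data=pool out=sampled_controls
                  method=srs sampsize=need selectall seed=20240101;
  strata site year;
run;
```

This gives the right number of controls per site/year group but does not say which control belongs to which case; if that link is needed, the sampled controls would still have to be dealt out to the cases within each group.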
|