a********a Posts: 346 | 1 I have a correlation output dataset like the following:
Variable A11 B23 C124
A11 1.00000 0.21692 0.06551
B23 0.21692 1.00000 0.05283
C124 0.06551 0.05283 1.00000
How can I write a SAS or R program that combines the listed variables
pairwise and produces output like the following?
Variable correlation
A11A11 1.00000
B23A11 0.21692
C124A11 0.06551
B23B23 1.00000
B23C124 0.05283
C1 |
t**i Posts: 688 | 2 R functions that may be useful:
upper/lower triangle of a matrix: upper.tri() or lower.tri()
vectorize a matrix: as.vector() |
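A minimal sketch of how the two hints above might fit together, assuming the correlation table is held as a numeric matrix with matching row and column names. The names `cm`, `keep`, `idx`, and `pairs` are chosen here for illustration and do not come from the thread. `upper.tri(..., diag = TRUE)` keeps each unordered pair once, including the diagonal.

```r
# Assumed input: the 3 x 3 correlation matrix from the question,
# stored as a numeric matrix named `cm` (illustrative name).
vars <- c("A11", "B23", "C124")
cm <- matrix(c(1.00000, 0.21692, 0.06551,
               0.21692, 1.00000, 0.05283,
               0.06551, 0.05283, 1.00000),
             nrow = 3, byrow = TRUE, dimnames = list(vars, vars))

# Logical mask for the upper triangle plus the diagonal,
# so each pair of variables is selected only once.
keep <- upper.tri(cm, diag = TRUE)

# Row and column positions of the selected cells (column-major order,
# the same order in which cm[keep] returns the values).
idx <- which(keep, arr.ind = TRUE)

pairs <- data.frame(
  Variable    = paste0(rownames(cm)[idx[, 1]], colnames(cm)[idx[, 2]]),
  correlation = cm[keep],
  stringsAsFactors = FALSE
)
print(pairs)
```

Using `diag = FALSE` instead would drop the self-correlations, and `lower.tri()` would give labels in the other order (for example B23A11 rather than A11B23).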
c*******o Posts: 3829 | 3 Use an array. Here you go:
data two (keep=CV correlation rename=(CV=Variable));
  length CV $ 8;
  /* names of the correlation columns, in the same order as list2 */
  array list1{3} $ _temporary_ ('A11', 'B23', 'C124');
  array list2{*} A11 B23 C124;
  set one;
  do i = 1 to 3;
    /* row name followed by column name, e.g. A11B23 */
    CV = trim(left(Variable)) || trim(left(list1{i}));
    correlation = list2{i};
    output;
  end;
run; |
a********a Posts: 346 | 4 Wonderful idea. Thanks, guys.
To Capriccio, do you know how to remove the 'repeated' observations such as
A11B23 and B23A11, which are basically the same in terms of correlation? I
only want to keep one of them. Thanks. |
d********h Posts: 2048 | 5 Try using PROC TRANSPOSE. |
k********e Posts: 448 | 6 tosi already told you: use upper.tri() or lower.tri() in R
|
c*******o Posts: 3829 | 7 Just change the do loop: do i = _N_ to 3;
|
k********e Posts: 448 | 8 I can't help saying it: shouldn't the OP try a little harder first... or is this just too urgent?
|
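f**********e Posts: 48 | 9 # R code, simple implementation; assumes your data frame is df,
# with the variable names as both its row names and its column names
cl <- colnames(df)
rw <- rownames(df)
cl_len <- length(cl)
rw_len <- length(rw)
new_rname <- character(cl_len * rw_len)
new_data <- numeric(cl_len * rw_len)
index <- 0
for (i in seq_len(cl_len)) {
  for (j in seq_len(rw_len)) {
    index <- index + 1
    # paste0() joins without a space, giving e.g. "B23A11"
    new_rname[index] <- paste0(rw[j], cl[i])
    # row j, column i, to match the label built above
    new_data[index] <- df[j, i]
  }
}
data.frame(new_rname, new_data)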
|
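y****n Posts: 46 | 10 data old;
  input Variable$ A11 B23 C124;
  cards;
A11 1.00000 0.21692 0.06551
B23 0.21692 1.00000 0.05283
C124 0.06551 0.05283 1.00000
;
run;
/* where= drops every row whose correlation is exactly 1 (the diagonal here) */
data new(where=(cv ne 1));
  length name $8;
  retain variable;
  array cr(3) A11 B23 C124;
  set old;
  do i=1 to 3;
    /* vname() returns the name of the array element's variable */
    new_var=vname(cr(i));
    /* cats() strips blanks and concatenates: row name + column name */
    name=cats(variable,new_var);
    cv=cr(i);
    output;
  end;
  keep name cv;
run; |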
a********a Posts: 346 | 11 Thank you, guys.
To Capriccio, I do not understand why 'do i = _N_ to 3' removes the
repeated observations. Could you explain a little bit?
To keylimepie, I did try to program this in SAS, but I could not get it to work. Thanks. |
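On the `_N_` question: in a SAS DATA step, `_N_` is the automatic variable that counts iterations of the step, so with a single `set` statement it equals the number of the row currently being read. With `do i = _N_ to 3`, row 1 outputs columns 1 to 3, row 2 outputs columns 2 to 3, and row 3 outputs only column 3. That is the upper triangle including the diagonal, so each pair is written once. The sketch below mimics the same loop bounds in R to make the pattern visible; `cm`, `n`, `rows`, and `result` are illustrative names, not taken from the posts above.

```r
# Assumed input: the correlation matrix from the question (illustrative name `cm`).
vars <- c("A11", "B23", "C124")
cm <- matrix(c(1.00000, 0.21692, 0.06551,
               0.21692, 1.00000, 0.05283,
               0.06551, 0.05283, 1.00000),
             nrow = 3, byrow = TRUE, dimnames = list(vars, vars))

rows <- list()
for (n in seq_along(vars)) {        # n plays the role of _N_ (current row number)
  for (i in n:length(vars)) {       # same bounds as: do i = _N_ to 3
    rows[[length(rows) + 1]] <- data.frame(
      Variable    = paste0(vars[n], vars[i]),  # row name then column name
      correlation = cm[n, i],
      stringsAsFactors = FALSE
    )
  }
}
result <- do.call(rbind, rows)
print(result)
```

Because the inner loop never starts before the current row number, a pair such as B23A11 (row 2, column 1) is never produced, while A11B23 (row 1, column 2) is.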