j***3 Posts: 142 | 1 If I have a table like this:
1 1.234
2 1.657
3 1.564
4 2.343
..
..
The first column is the order information and the second column is the value.
Is there a way to sample all possible continuous subsets of x rows from the table?
Thanks |
r********0 Posts: 65 | 2 How do you define "continuous subset" here? |
l***a Posts: 12410 | 3 If there are no duplicates and the order variable has no missing values, try this:
%macro subset;
/* n = number of rows in data0 */
proc sql;
select count(order) into :n
from data0;
quit;
/* i = block length, j = first row of the block */
%do i=1 %to &n.;
%do j=1 %to &n.-&i.+1;
data data&i._&j.;
set data0;
/* keep rows j through j+i-1 */
if order>=&j. and order<=&j.+&i.-1;
run;
%end;
%end;
%mend subset;
|
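For readers more at home in R than in SAS, the enumeration done by the macro above (every block length, every starting row) can be sketched as follows. This is only an illustration that assumes rows numbered 1 to n; `all_blocks` is a hypothetical helper name, not something from the thread, and it lists row ranges rather than creating one dataset per block.

```r
# Hypothetical helper: list the (start, end) row numbers of every
# contiguous block in a table with n rows, for every block length.
all_blocks <- function(n) {
  counts <- n - seq_len(n) + 1          # number of blocks of length 1, 2, ..., n
  len    <- rep(seq_len(n), times = counts)
  start  <- sequence(counts)            # 1..counts[1], 1..counts[2], ...
  data.frame(start = start, end = start + len - 1)
}

blocks <- all_blocks(4)
nrow(blocks)   # 10 blocks for a 4-row table
```

The number of blocks grows quadratically with n, so listing them all is only practical for small tables.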
j***3 Posts: 142 | 4 Thanks for the reply.
A continuous subset just means one block of consecutive rows from the table.
I was wondering whether there is a way to avoid loops (or use fewer of them), because the
dataset is huge and R is not efficient at handling loops. |
r********0 Posts: 65 | 5 row <- 1:100
row <- sample(row) ## randomly permutes the 100 row numbers
Then you can pick the first n numbers (n being the size of each subset
you need) as your subset index:
idx <- row[1:n]
data[idx, ]
If you want all possible subset sizes, I'm afraid you still need to
use a loop.
Hope this helps. |
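The snippet above draws individual rows at random, so the selected rows are generally not consecutive. If the goal is one random block of x consecutive rows, a minimal sketch is to randomize only the starting row. The names `sample_block`, `data`, and `x` are illustrative assumptions, and the toy table simply mirrors the one in the question.

```r
# Hypothetical helper: draw one random block of x consecutive rows.
sample_block <- function(data, x) {
  n <- nrow(data)
  stopifnot(x >= 1, x <= n)
  start <- sample.int(n - x + 1, 1)     # random first row of the block
  data[start:(start + x - 1), , drop = FALSE]
}

# Toy table shaped like the one in the question
data <- data.frame(order = 1:4, value = c(1.234, 1.657, 1.564, 2.343))
set.seed(1)
sample_block(data, 2)
```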
D******n Posts: 2836 | 6 That's n(n+1)/2 subsets in total, if every block length counts.
|
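The count can be checked directly: a block of length m can start at any of n - m + 1 rows, so summing over all lengths gives

```latex
\sum_{m=1}^{n} (n - m + 1) \;=\; \sum_{k=1}^{n} k \;=\; \frac{n(n+1)}{2}
```

For the 4-row example this is 4 + 3 + 2 + 1 = 10. If the block length is fixed at x, only the single term n - x + 1 remains.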
j***3 Posts: 142 | 7 Maybe I did not express myself clearly enough.
What I need is subsets with consecutive row numbers; the solution
rabbit1860 gave uses random row numbers.
Is the solution libra gave in SAS? The syntax looks strange to me.
DaShagen, I think it is basically a sliding window of size x and step 1.
|
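g********r Posts: 8017 | 8 How large is your N? If it is in the tens of thousands, the speed of a loop is still acceptable.
## n = number of rows in a, m = window size
for (i in 1:(n - m + 1))
{
  do.something(a[i:(i + m - 1), ])
}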
|
D******n Posts: 2836 | 9 result <- lapply(1:(nrow(data) - x + 1), function(t) data[t:(t + x - 1), ])
|
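When only the value column is needed, the windows can also be laid out as the rows of a matrix with no explicit loop or lapply. The following is a sketch using base R's `embed()`, assuming the values fit in memory as a numeric vector; the result has (n - x + 1) * x entries, so for very large tables it would likely have to be built in chunks. The names `value` and `x` are illustrative.

```r
value <- c(1.234, 1.657, 1.564, 2.343)  # second column of the toy table
x <- 2                                   # window size

# embed() returns one window per row, with the latest value first,
# so the columns are reversed to restore the original row order.
windows <- embed(value, x)[, x:1, drop = FALSE]
windows
# row i holds value[i], ..., value[i + x - 1]
```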
j***3 Posts: 142 | 10 Thanks goldmember, it is on the order of billions, so I'd better avoid loops.
Thanks DaShagen, I think that might be the best solution.
An alternative is to use rollapply in the zoo package, but lapply is more
straightforward. |
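If the per-window computation happens to be a sum or a mean, neither lapply nor rollapply is strictly needed: every window sum can be read off a single cumulative sum. This is a sketch under that assumption only; `rolling_sum` is a hypothetical helper, and very long cumulative sums can accumulate floating-point error.

```r
# Hypothetical helper: sums of every window of x consecutive values.
rolling_sum <- function(v, x) {
  n  <- length(v)
  stopifnot(x >= 1, x <= n)
  cs <- c(0, cumsum(v))                  # cs[k + 1] = sum of v[1..k]
  cs[(x + 1):(n + 1)] - cs[1:(n - x + 1)]
}

value <- c(1.234, 1.657, 1.564, 2.343)
rolling_sum(value, 2)        # sums of rows 1-2, 2-3, 3-4
rolling_sum(value, 2) / 2    # the corresponding window means
```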