[Discussion welcome] for loop in R - Statistics board

This page is an excerpt and archive of the corresponding thread on 未名空间 (MITBBS).


f***a
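Posts: 329

As usual, let me ramble a bit first. :-)
In R, avoid for loops whenever you can, and try to handle everything in a vectorized way.
For operations on the rows or columns of a matrix/data.frame, use apply (btw, same for arrays).
For operations on the elements of a list, data.frame (essentially a list), or vector, use
lapply or sapply.
For operations by id (group), use tapply.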
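A minimal sketch of the rules of thumb above, on made-up toy data (the names m, lst, id, and value are illustrative and not from the thread):

```r
# Toy data; every name here is illustrative.
m <- matrix(1:6, nrow = 2)            # 2 x 3 matrix with rows (1,3,5) and (2,4,6)
row_sums <- apply(m, 1, sum)          # MARGIN = 1: operate on each row
col_sums <- apply(m, 2, sum)          # MARGIN = 2: operate on each column

lst <- list(a = 1:3, b = 4:6)
as_list <- lapply(lst, mean)          # always returns a list
as_vec  <- sapply(lst, mean)          # simplifies to a named numeric vector

id    <- c("x", "y", "x", "y")
value <- c(1, 2, 3, 4)
by_id <- tapply(value, id, sum)       # one result per distinct id
```

Which function fits depends mostly on the shape of the input and the shape of result you want back, not on speed.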
Here are my questions.
1)
# Way I:
for (i in 1:n) {
    res[i] <- myfunction(a[i], b[i], c[i])
}
# Way II:
res <- apply(cbind(a, b, c), 1, function(t)
    myfunction(t[1], t[2], t[3])
)
Are these two approaches equivalent, or is Way II better?
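For comparison, here is a small sketch of question 1 with two further options, mapply and a direct vectorized call. The body of myfunction is a hypothetical stand-in, since the thread never defines it, and the direct call only works when the function is built from vectorized operations:

```r
# Hypothetical stand-in for the thread's myfunction(); assumed to be vectorized.
myfunction <- function(x, y, z) x + y * z

a <- c(1, 2, 3); b <- c(4, 5, 6); c <- c(7, 8, 9)
n <- length(a)

# Way I: explicit loop with a preallocated result vector
res_loop <- numeric(n)
for (i in seq_len(n)) res_loop[i] <- myfunction(a[i], b[i], c[i])

# Way II from the post: apply over the rows of cbind(a, b, c)
res_apply <- apply(cbind(a, b, c), 1, function(t) myfunction(t[1], t[2], t[3]))

# mapply walks the three vectors in step, with no intermediate matrix
res_mapply <- mapply(myfunction, a, b, c)

# If myfunction only uses vectorized operations, call it on whole vectors
res_vec <- myfunction(a, b, c)
```

Note that cbind() coerces everything to one type, so Way II can behave differently from Way I when a, b, and c are not all of the same type.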
2)
# Way I:
for (i in 1:n) {
    input <- i
    ...... # some heavy calculation
    res[i] <- output
}
# Way II:
res <- lapply(1:n, function(t) {
    input <- t
    ...... # some heavy calculation
    output
})
Are these two approaches equivalent, or is Way II better?
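A runnable sketch of question 2, assuming a placeholder function heavy() in place of the elided calculation. One difference worth noticing is the return type: lapply gives back a list, while the loop fills a vector, and vapply is a third option that returns a vector and checks each result:

```r
# Hypothetical placeholder for the "heavy calculation" elided in the post.
heavy <- function(i) sum(sqrt(seq_len(i)))

n <- 5

# Way I: for loop, with res allocated once up front
res_loop <- numeric(n)
for (i in seq_len(n)) res_loop[i] <- heavy(i)

# Way II: lapply returns a list, so unlist() is needed to compare with Way I
res_list <- lapply(seq_len(n), heavy)

# vapply returns an atomic vector and checks each result's type and length
res_vapply <- vapply(seq_len(n), heavy, numeric(1))
```

When each iteration is expensive, the per-iteration overhead of either construct is likely to be small next to the calculation itself.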
3)
# Way I:
for (i in 2:n) {    # start at 2: res[1] must be set before the loop
    input <- res[i-1]
    ... # some calculation
    res[i] <- output
}
Is there a way to do this without a for loop?
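One possible answer to question 3 is Reduce() with accumulate = TRUE, sketched below. The update rule step() and the starting value init are made up for illustration. Reduce() is itself written in R, so this is a tidier way to express the recurrence rather than a faster one:

```r
# Hypothetical update rule standing in for the elided calculation.
step <- function(prev, i) 0.5 * prev + i

n <- 5
init <- 1

# Loop version: res[1] is set first, then each element uses the previous one
res_loop <- numeric(n)
res_loop[1] <- init
for (i in 2:n) res_loop[i] <- step(res_loop[i - 1], i)

# Reduce() threads the previous result through; accumulate keeps every step
res_reduce <- Reduce(step, 2:n, init = init, accumulate = TRUE)
```

Because each step depends on the one before it, the work is inherently sequential; for hot loops of this kind, moving the recurrence to compiled code is the usual next step.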
4)
# Way I:
for (i in 1:n) {
    res[[i]] <- read.table(paste("file_", i, ".txt", sep=""))
}
# Way II:
res <- lapply(1:n, function(t)
    read.table(paste("file_", t, ".txt", sep=""))
)
What about tasks that are not numeric computation, such as data I/O? Is the effect the same?
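A self-contained sketch of question 4. It first writes three small example files to a temporary directory (the data and the tmp_dir name are made up), then reads them back with lapply:

```r
# Create three small example files in a temporary directory (illustrative data).
tmp_dir <- tempfile("demo_")
dir.create(tmp_dir)
n <- 3
for (i in seq_len(n)) {
  write.table(data.frame(id = i, value = i * 10),
              file.path(tmp_dir, paste0("file_", i, ".txt")),
              row.names = FALSE)
}

# Build all file names at once, then read each file into one list element
files <- file.path(tmp_dir, sprintf("file_%d.txt", seq_len(n)))
res <- lapply(files, read.table, header = TRUE)

# Optionally stack the pieces into a single data.frame
combined <- do.call(rbind, res)
```

For I/O the time is typically dominated by reading the files, so the choice between for and lapply here is mainly about readability.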
Please share your views, or any experience you have with apply vs. for.

w*****t
Posts: 49

apply is much faster than for.

f***a
Posts: 329

For vectorized computation in general, no doubt apply is faster than a
regular for loop (as I stated in the beginning). But for the cases in my
questions, is the efficiency gain of apply still that significant? (BTW, what
internal procedure/algorithm makes apply more efficient than a for
loop?)
To extend the discussion, a "for loop" can be replaced by {foreach}
looping for parallel computing. In that case, how efficient is it
compared with the parallel-type "apply" functions in the {snow} and {multicore}
packages?
Hope someone can share experience...
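As a rough illustration of a parallel-type "apply", here is a sketch using parLapply() from the parallel package that ships with R and builds on {snow} and {multicore}. This is a swapped-in example, not a benchmark of {foreach}; slow_square() and the cluster size of 2 are assumptions chosen for the demo:

```r
library(parallel)  # ships with R; provides socket clusters and parLapply()

# Hypothetical stand-in for an expensive, independent per-item task.
slow_square <- function(i) { Sys.sleep(0.01); i^2 }

cl <- makeCluster(2)                      # two local worker processes
res_par <- parLapply(cl, 1:8, slow_square)
stopCluster(cl)

res_seq <- lapply(1:8, slow_square)       # same work, done sequentially
```

Whether the parallel version wins depends on how the per-task cost compares with the overhead of starting workers and shipping data to them.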

[Quoting w*****t:]

: apply is much faster than for.

P****D
Posts: 11146

I'd like to know the answer to 3) as well. Sitting down to wait.

M*P
Posts: 6456

You're already using R, and you still care about this?

[Quoting f***a:]

: For vectorized computation in general, no doubt apply is faster than a
: regular for loop (as I stated in the beginning). But for the cases in my
: questions, is the efficiency gain of apply still that significant? (BTW, what
: internal procedure/algorithm makes apply more efficient than a for
: loop?)
: To extend the discussion, a "for loop" can be replaced by {foreach}
: looping for parallel computing. In that case, how efficient is it
: compared with the parallel-type "apply" functions in the {snow} and {multicore}
: packages?
: Hope someone can share experience...

d******e
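Posts: 7844

Heh, would you believe that two pieces of R code doing the same job can differ in speed by a factor of 100?

[Quoting M*P:]

: You're already using R, and you still care about this?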

f***a
Posts: 329

It seems the actual looping of "lapply" is done internally in C code, and
"apply" isn't really faster than writing a loop. Is the main advantage of
"apply" that it simplifies code writing?
colMeans()/rowSums() and vectorized functions are faster than a loop,
though.
Anyway, I think that for algorithms involving heavy computation, C/C++ should
be used for the computing part. I strongly recommend {Rcpp}, which
provides a much better API than R's original one.
(My previous questions remain unanswered.... T_T)

M*V
Posts: 11

Sometimes while can be used for loops, which is faster than if. For
computationally intense task, maybe it's good to link with C. Just my 2
cents.

r*g
Posts: 3159

In R, apply is just a for loop. The claim that apply is faster than for is a superstition. Quoting one of R's authors:
apply() is just a wrapper for a for loop. So it is not faster than at
least one implementation using a for loop: it may be neater and easier to
understand than an explicit for loop.

P****D
Posts: 11146

!!!!!!!

[Quoting r*g:]

: In R, apply is just a for loop. The claim that apply is faster than for is a superstition. Quoting one of R's authors:
: apply() is just a wrapper for a for loop. So it is not faster than at
: least one implementation using a for loop: it may be neater and easier to
: understand than an explicit for loop.


D******n
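Posts: 2836

That's like saying it is a superstition that any programming language is faster than machine code, because they are all just wrappers around machine code.

[Quoting r*g:]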

g********r
Posts: 8017

I tried it, and it really is true:
> a <- matrix(rnorm(10000000), ncol=100)
> dim(a)
[1] 100000    100
> r <- 1:100000
> system.time(for(i in 1:100000) r[i] <- mean(a[i,]))
   user  system elapsed
  2.239   0.053   2.276
> system.time(r2 <- apply(a, 1, mean))
   user  system elapsed
  2.731   0.048   2.763
> system.time(r3 <- rowMeans(a))
   user  system elapsed
  0.033   0.001   0.034
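A smaller, reproducible variant of this experiment is sketched below. It checks that the three approaches agree before timing them; the seed and the reduced matrix size are my own choices, and no timings are quoted because they vary by machine and R version:

```r
set.seed(1)
a <- matrix(rnorm(1000 * 100), ncol = 100)   # smaller than the 100000 x 100 above

r1 <- numeric(nrow(a))                        # preallocated as double
for (i in seq_len(nrow(a))) r1[i] <- mean(a[i, ])
r2 <- apply(a, 1, mean)
r3 <- rowMeans(a)

# All three should agree up to floating-point tolerance
same <- isTRUE(all.equal(r1, r2)) && isTRUE(all.equal(r1, r3))

# Elapsed seconds for each approach on this machine
elapsed <- c(
  loop     = system.time(for (i in seq_len(nrow(a))) r1[i] <- mean(a[i, ]))[["elapsed"]],
  apply    = system.time(apply(a, 1, mean))[["elapsed"]],
  rowMeans = system.time(rowMeans(a))[["elapsed"]]
)
```

Printing elapsed after running this gives the comparison for your own setup.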

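[Quoting D******n:]

: That's like saying it is a superstition that any programming language is faster than machine code, because they are all just wrappers around machine code.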

P****D
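Posts: 11146

That's actually correct, isn't it? You weren't being sarcastic, were you...

[Quoting D******n:]

: That's like saying it is a superstition that any programming language is faster than machine code, because they are all just wrappers around machine code.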

D******n
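Posts: 2836

If the for loop he means is R's own for loop, then I am surprised and disappointed by R's apply family.
In that case his statement that apply is just a for loop makes sense.
I had understood it as a for loop in general; that is, if apply were written in C, say,
there would inevitably be a for loop somewhere in that code anyway.

[Quoting P****D:]

: That's actually correct, isn't it? You weren't being sarcastic, were you...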

F****n
Posts: 3271

From what I learned, in R apply <= loop in terms of performance. The
apply family is a bunch of convenient tools *BUILT ON loops*. You
don't need to be surprised and disappointed, because "loops" in R are
actually not that slow. Most perceived performance issues with R loops are
not related to the loops themselves, but are more or less due to R data
objects, which are immutable and must be indexed.
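One common way this cost shows up is growing a result inside the loop. The sketch below assumes that is the kind of issue meant here; grow() and prealloc() are illustrative names:

```r
# Growing the result one element at a time: each c() call copies the vector
grow <- function(n) {
  res <- c()
  for (i in seq_len(n)) res <- c(res, i^2)
  res
}

# Allocating once and filling by index avoids the repeated copies
prealloc <- function(n) {
  res <- numeric(n)
  for (i in seq_len(n)) res[i] <- i^2
  res
}

n <- 1e4
results_match <- identical(grow(n), prealloc(n))
```

Wrapping each call in system.time() with a larger n makes the gap visible: both versions use the same for loop, and only the handling of res differs.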

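[Quoting D******n:]

: If the for loop he means is R's own for loop, then I am surprised and disappointed by R's apply family.
: In that case his statement that apply is just a for loop makes sense.
: I had understood it as a for loop in general; that is, if apply were written in C, say,
: there would inevitably be a for loop somewhere in that code anyway.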

F****n
Posts: 3271

Again, it is a common misunderstanding that in R apply is faster than
loops. They are the same.
Enhancing performance using vectorization means using built-in optimized
functions such as rowMeans, not using apply.
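To make the distinction concrete, here is a sketch computing a row-wise sum of squares three ways on made-up data. Only the last version is vectorized in the sense described above, since it hands whole-matrix work to built-in functions:

```r
set.seed(42)
m <- matrix(runif(200 * 50), nrow = 200)   # illustrative data

# Explicit loop over rows
ss_loop <- numeric(nrow(m))
for (i in seq_len(nrow(m))) ss_loop[i] <- sum(m[i, ]^2)

# apply: same row-by-row work, just written more compactly
ss_apply <- apply(m, 1, function(row) sum(row^2))

# Vectorized: whole-matrix arithmetic plus a built-in row sum
ss_vec <- rowSums(m^2)

# Element-wise work needs no loop at all
scaled <- (m - mean(m)) / sd(as.vector(m))
```

The first two call an R function once per row, while the third makes two calls in total regardless of the number of rows.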

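[Quoting f***a:]

: As usual, let me ramble a bit first. :-)
: In R, avoid for loops whenever you can, and try to handle everything in a vectorized way.
: For operations on the rows or columns of a matrix/data.frame, use apply (btw, same for arrays).
: For operations on the elements of a list, data.frame (essentially a list), or vector, use
: lapply or sapply.
: For operations by id (group), use tapply.
: Here are my questions.
: 1)
: # Way I:
: for(i in 1:n){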

D******n
Posts: 2836

After a little research: for apply this is true, but not for the entire
"apply family":
R loop -> apply
C code -> lapply --> sapply
             |
             +-----> tapply
C code -> mapply
I haven't tested it yet, but I guess the other members of the apply
family do much better than a for loop.

[Quoting F****n:]

: Again, it is a common misunderstanding that in R apply is faster than
: loops. They are the same.
: Enhancing performance using vectorization means using built-in optimized
: functions such as rowMeans, not using apply

F****n
Posts: 3271

Yeah, you are right, but I think since
C code -> R loop too,
lapply == loop >= apply
In other words, lapply is faster than apply but not necessarily faster than
a loop.

[Quoting D******n:]

: After a little research: for apply this is true, but not for the entire
: "apply family":
: R loop -> apply
: C code -> lapply --> sapply
:              |
:              +-----> tapply
: C code -> mapply
: I haven't tested it yet, but I guess the other members of the apply
: family do much better than a for loop.

D******n
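Posts: 2836

You're the one talking nonsense. Leaving aside that your understanding is wrong: we are discussing a technical question here, so why say such things?
We're talking about R; where does political correctness come in?

[Quoting r*g:]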

P****D
Posts: 11146

Oh, so you were surprised and disappointed because the apply family in R isn't optimized.

[Quoting D******n:]

D******n
Posts: 2836

Yeah, because people keep saying the apply family is faster, I was
surprised to find out (see post 17) that apply alone is not based on C code
(or other non-R code), while the other family members are.
Just type apply in R and you will see that it is entirely R code with a loop:
.....
    else for (i in 1L:d2) {
        tmp <- FUN(array(newX[, i], d.call, dn.call), ...)
        if (!is.null(tmp))
            ans[[i]] <- tmp
    }
.....
For lapply it looks like this:
function (X, FUN, ...)
{
    FUN <- match.fun(FUN)
    if (!is.vector(X) || is.object(X))
        X <- as.list(X)
    .Internal(lapply(X, FUN))
}
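The same inspection can be done programmatically. This sketch deparses both functions and searches the text for the constructs shown above; the result reflects whichever R version is installed, so treat it as a check on your own setup rather than a fixed fact:

```r
# Deparse both functions to text and look for the tell-tale constructs.
apply_src  <- deparse(base::apply)
lapply_src <- deparse(base::lapply)

# Does apply contain an R-level for loop?
apply_has_r_loop <- any(grepl("for (", apply_src, fixed = TRUE))

# Does lapply hand the iteration to internal (C) code?
lapply_is_internal <- any(grepl(".Internal(lapply", lapply_src, fixed = TRUE))
```

Even when the iteration itself runs in C, lapply still calls the R function FUN once per element, so the loop overhead is only part of the cost.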

[Quoting P****D:]

: Oh, so you were surprised and disappointed because the apply family in R isn't optimized.
