All topics - Topic: runif
f***a
Posts: 329
1
From the description of "runif()"
"runif will not generate either of the extreme values unless max = min or
max-min is small compared to min, and in particular not for the default
arguments."
It seems 0 won't be generated, so ceiling(runif(1)*N) should work OK.
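A quick way to check the claim above is to draw many values and look at the range. This is only a sketch: N and n_draws are illustrative values, and sample.int() is shown as a base R alternative that does not depend on how runif() treats its endpoints.

```r
# Sketch: draw integers in 1..N two ways (N and n_draws are illustrative).
set.seed(1)
N <- 6
n_draws <- 10000

u <- runif(n_draws)
draws_ceiling <- ceiling(u * N)   # relies on runif() never returning exactly 0 or 1
draws_sample <- sample.int(N, n_draws, replace = TRUE)  # base R alternative

range(draws_ceiling)              # should stay within 1..N
table(draws_ceiling) / n_draws    # roughly 1/N each
```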
I*****a
Posts: 5425
2
I don't think yours really counts.
n = 1000 # training size
ntest = 1000 # test size; make this big only for illustration
id.train = 1:n
id.test = (n + 1):(n + ntest)
ratio = 0.99
n0 = round(n * ratio)
n1 = n - n0
nsimu = 100
res = NULL
for (i in 1:nsimu){
p = c(runif(n0, 0, 0.5), runif(n1, 0.5, 1), runif(ntest, 0.6, 1) )
y = sapply(p, function(x){rbinom(n = 1, size = 1, prob = x)})
x = log(p / (1 - p)) # beta is c(0, 1)
dat = data.frame(x = x, y = y)
f...
o*****p
Posts: 2977
3
Data I made up in R
> d1 <- rep(2:6,c(1,2,25,2,1))
> d1
[1] 2 3 3 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 5 5 6
> d2 <- rep(0:8,4)
> d2
[1] 0 1 2 3 4 5 6 7 8 0 1 2 3 4 5 6 7 8 0 1 2 3 4 5 6 7 8 0 1 2 3 4 5 6 7 8
> ks.test(d1,d2)
Two-sample Kolmogorov-Smirnov test
data: d1 and d2
D = 0.34767, p-value = 0.03566
alternative hypothesis: two-sided
Warning message:
In ks.test(d1, d2) : cannot compute exact p-value with ties
> wilcox.test(d1,d2)
Wilcoxon rank sum test with continuity cor...
V*********n
Posts: 198
4
From topic: Quant board - Asking about a problem
A simple simulation would show otherwise. I'm wondering why?
Simple R code:
nIter <- 200000
a <- runif(nIter)
b <- rep(0,nIter)
c <- rep(0,nIter)
ind <- rep(0,nIter)
for ( i in 1:nIter)
{
b[i] <- runif(1, 0, 1 - a[i])
c[i] <- 1 - a[i] - b[i]

# The sum of two sides exceeds the third, and the difference of two sides is less than the third.
if( (a[i] + b[i] > c[i]) && ( abs(a[i]-b[i]) < c[i]) )
{
ind[i] <- 1
}
}
summary(ind)
k***n
Posts: 997
5
From topic: Quant board - Asking about a problem
I changed your code and it gives 0.25.
The two break points a and b should be sampled independently from the segment, but in your
code b depends on a.
nIter <- 80000
c <- rep(0,nIter)
ind <- rep(0,nIter)
for ( i in 1:nIter)
{
a = runif(1)
b = runif(1)
c =max(a,b) ;d=min(a,b)
# The sum of two sides exceeds the third, and the difference of two sides is less than the third.
if( (d-.5)<0 &&(c-.5) > 0 && d-c>-.5 )
{
ind[i] <- 1
}
}
sum(ind)/nIter
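The difference between the two sampling schemes can be seen side by side in a vectorized sketch. The variable names here are illustrative, and the value ln(2) - 1/2 for the sequential scheme is my own calculation, not something stated in the thread.

```r
# Sketch: triangle probability under two ways of breaking a unit stick.
set.seed(42)
n_iter <- 200000

# Scheme 1: two independent uniform break points.
u <- runif(n_iter)
v <- runif(n_iter)
lo <- pmin(u, v)
hi <- pmax(u, v)
# A triangle exists when the longest piece is shorter than 1/2.
p_independent <- mean(pmax(lo, hi - lo, 1 - hi) < 0.5)

# Scheme 2: first piece a ~ U(0, 1), second piece b ~ U(0, 1 - a).
a <- runif(n_iter)
b <- runif(n_iter, 0, 1 - a)
third <- 1 - a - b
p_sequential <- mean(pmax(a, b, third) < 0.5)

c(independent = p_independent, sequential = p_sequential)
```

Under these assumptions the first estimate should land near 0.25 and the second near ln(2) - 1/2, about 0.193, which would explain why the two posts disagree.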
f***a
Posts: 329
6
From topic: Statistics board - An interview question
Here you are. coded in R. It is easy to read and get the idea directly from
the code. Have fun!
###############
n <- 10 # how many Bernoulli variables
m <- 6 # how many "1"s you want
nout <- 100 # how many such samples you want
p <- runif(n) # pre-specified probability of Bernoulli variables
pp <- p/sum(p)
out <- matrix(0,nout,n)
for(irun in 1:nout)
{
ind <- numeric(m)
iter <- 1
while(iter<=m)
{
res <- sum(runif(1) > cumsum(pp))+1
if(!(res %in% ind)) {ind[iter] <- res; iter <- iter+1}
}
out[irun,i...
p*********g
Posts: 226
7
From topic: Statistics board - An interview question
Let me first confirm that I understand your algorithm correctly:
S = empty
loop until |S| = m
sample i with prob p_i / sum_{j=1}^n p_j
if i is not in S
add i to S
end
end
return S
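
The loop above can be written as a small R function. This is a sketch of the pseudocode only: the function name sample_fixed_ones is hypothetical, and sample.int() stands in for the cumsum-based draw used in the quoted code.

```r
# Hypothetical helper (name not from the thread): draw indices with
# probability proportional to p until m distinct ones are collected.
sample_fixed_ones <- function(p, m) {
  pp <- p / sum(p)
  S <- integer(0)
  while (length(S) < m) {
    i <- sample.int(length(p), 1, prob = pp)
    if (!(i %in% S)) S <- c(S, i)   # keep only indices not already in S
  }
  out <- integer(length(p))
  out[S] <- 1L                      # 0/1 vector with exactly m ones
  out
}

set.seed(7)
p <- runif(10)
x <- sample_fixed_ones(p, 6)
x
```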

n <- 10 # how many Bernoulli variables
m <- 6 # how many "1"s you want
nout <- 100 # how many such samples you want
p <- runif(n) # pre-specified probability of Bernoulli variables
pp <- p/sum(p)
out <- matrix(0,nout,n)
for(irun in 1:nout)
{
ind <- numeric(m)
iter <- 1
while(iter<=m)
{
res <- sum(runif(1) > cumsum(pp))+1
if(!(res %in% ind)) {ind[iter] <- ...
k*******a
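Posts: 772
10
I tried your method in R, and it doesn't seem to give the plot you want
(my assumption, though, is that the size of the balls is negligible). Given an initial condition, I randomly generate many time points,
compute each ball's position at that time, and take the distance between the 9th and 10th balls.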
x<-runif(20)*20
x<-sort(x)
v<-rnorm(20,0,10)
d<-c()
for (i in 1:10000){
t<-runif(1)*100000
pos<-x+v*t
pos<-pos%%40
pos<-pos*(pos<20)+(40-pos)*(pos>=20)
pos<-sort(pos)
d<-c(d,pos[10]-pos[9])
}
s*********e
Posts: 1051
11
From topic: Statistics board - R is a bit disappointing
> n <- 1000000
> set.seed(2013)
> ldf <- data.frame(id1 = sample(n, n), id2 = sample(n / 100, n, replace =
TRUE), x1 = rnorm(n), x2 = runif(n))
> rdf <- data.frame(id1 = sample(n, n), id2 = sample(n / 100, n, replace =
TRUE), y1 = rnorm(n), y2 = runif(n))
>
> # METHOD 1: MERGE
> system.time(join1 <- merge(ldf, rdf, by = c("id1", "id2")))
user system elapsed
54.028 11.229 65.673
>
> # METHOD 2: PLYR
> # library(plyr)
> # system.time(join2 <- plyr::join(ldf, rdf, by = c("id1", "id2"), type =...
d******e
Posts: 7844
12
This shows you haven't understood where the problem lies.
> n = 100000
> X = matrix(runif(n*2),n,2)
> y0 = sign((X[,1]<0.1)-0.5)
> y = (y0*sign(runif(n)-0.1)+1)/2
> sum(y==1)
[1] 17998
> sum(y==0)
[1] 82002
> out = glm(y~X,family="binomial")
> yhat=sign(cbind(X,rep(1,n))%*%out$coefficients>0)
> sum((yhat==1)*(y==1))
[1] 2
> sum(yhat==y)
[1] 82003
> idx1 = which(y==1)
> idx0 = which(y==0)[1:length(idx1)]
> out = glm(y[c(idx0,idx1)]~X[c(idx0,idx1),],family="binomial")
> yhat=sign(cbind(X,rep(1,n))%*%out$coefficients>0)
> sum((yhat==1)*(y==1...
I*****a
Posts: 5425
13
n = 1000000
y = runif(n)
x = runif(n)
fit = lm(y ~ x)
summary(fit)
d******e
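Posts: 7844
14
From topic: Statistics board - A probability interview question for quant analyst
After ten rounds, the remaining women's incomes are basically all higher than the remaining men's, so very few can still be matched.
n = 10 # number of rounds
m = 100 # number of men
rep.num = 1e5 # number of simulation replications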
z = rep(0,rep.num)
for(j in 1:rep.num){
a = runif(m)
b = runif(m)
len = m;
for(i in 1:n){
idx.rnd = sample(len,len)
idx.acc = which(a ...
len = length(idx.acc)
a = a[idx.acc]
b = b[idx.rnd[idx.acc]]
}
z[j] = len
}
My simulation result for 3 rounds is
> summary(z)
Min. 1st Qu. Median Mean ...
d******e
Posts: 7844
15
From topic: Statistics board - A probability interview question for quant analyst
By the way, do you all really understand MCMC?
n = 10
m = 100
rep.num = 1e4
z = rep(0,rep.num)
for(j in 1:rep.num){
a = runif(m)
b = runif(m)
len = m;
for(i in 1:n){
idx.rnd = sample(len,len)
idx.rmn = which(a ...
len = length(idx.rmn)
a = a[idx.rmn]
b = b[idx.rnd[idx.rmn]]
}
z[j] = len
}
.....
g******2
Posts: 234
16
From topic: Statistics board - A question about a random distribution
1. do you know the initial position of each particle of A?
2. The probability formula you provided is not probability, but a density.
If you calculate probability, the probability for any given one pair to be
annihilated is always less than 0.5. The probability for an A particle to be
annihilated with any B particle is probably the right probability you want
to consider, in which case you should use the formula I wrote above.
3. I think my suggestion above should be valid, either use a random
su...
c*****t
Posts: 1879
17
From topic: Programming board - Two languages I never want to touch
You didn't even read the article I linked. It points out that scoping in R is a very big problem.
f = function() {
if (runif(1) > .5)
x = 10
x
}
The x being returned by this function is randomly local or global.
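The behaviour can be reproduced with a short sketch. The global x and its value "global" are assumptions added here so that both outcomes are visible; the function body is the one from the post.

```r
x <- "global"            # assumed global x that f() can fall back on

f <- function() {
  if (runif(1) > .5)
    x = 10               # creates a local x only on this branch
  x                      # the local x if it exists, otherwise the global one
}

set.seed(1)
# as.character() keeps the two possible return types in one vector.
results <- vapply(1:20, function(i) as.character(f()), character(1))
table(results)
```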
k*******a
Posts: 772
18
I just coded up a simulation, and the result is fairly close to 104 :)
R code
x<-c()
for (i in 1:100000)
{
sum<-0
while(sum<101)
{
sum<-sum+floor(runif(1)*10)+1
}
x<-c(x,sum)
}
mean(x) # 103.9928
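The simulated mean can be cross-checked without random numbers. The sketch below assumes the same rule as the loop above (add a uniform integer from 1 to 10 until the total reaches at least 101) and computes the expected final total by backward recursion; the names target and f are illustrative.

```r
# f[s + 1] holds the expected final total when the running sum is s.
target <- 101
f <- numeric(target + 10)
for (s in (target + 9):0) {
  # Once s >= target the process has stopped; otherwise average over the
  # ten equally likely next totals s + 1, ..., s + 10.
  f[s + 1] <- if (s >= target) s else mean(f[s + 1 + 1:10])
}
f[1]   # expected final total starting from 0
```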
l******n
Posts: 9344
19
N = 10000
alpha = 10
set.seed(12345)
samples = alpha-log(1-runif(N))
z = 1/(sqrt(2*pi))*exp(-alpha)*exp(samples-samples^2/2)
mean(z)
sd(z)
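The code above reads like an importance-sampling estimate of the normal tail probability P(Z > alpha) with a shifted exponential proposal; that interpretation is my assumption, since the post does not say so. Under it, the estimate can be compared with pnorm() directly:

```r
set.seed(12345)
N <- 10000
alpha <- 10

x <- alpha + rexp(N)               # same law as alpha - log(1 - runif(N))
w <- dnorm(x) / dexp(x - alpha)    # target density over proposal density
est <- mean(w)
exact <- pnorm(alpha, lower.tail = FALSE)

c(estimate = est, exact = exact, rel_err = est / exact - 1)
```

The weight w is algebraically the same expression as z in the post, written with dnorm() and dexp() instead of the explicit exponentials.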
a********a
Posts: 346
20
From topic: Statistics board - help in R
I have the following program:
beta1=2
beta2=8
#for (i in 1:2){
obs=2
x1=matrix(NA,obs,1)
x2=matrix(NA,obs,1)
for (g in 1:obs){
x1[g,]=runif(1,1,2)
x2[g,]=rnorm(1,1)
}
data=cbind(x1,x2)
data
#d=rbind(data[i])
#}
I ran the program twice and got data like the following each time:
> data
[,1] [,2]
[1,] 1.452696 0.2718456
[2,] 1.514587 1.2971293
> data
[,1] [,2]
[1,] 1.172726 -0.9674824
[2,] 1.159079 0.5483838
how can I combine these data by row, i.e. I want to get d
s*****n
Posts: 2174
21
From topic: Statistics board - help in R
beta1=2
beta2=8
result <- NULL
for (i in 1:2){
obs=2
x1=matrix(NA,obs,1)
x2=matrix(NA,obs,1)
for (g in 1:obs){
x1[g,]=runif(1,1,2)
x2[g,]=rnorm(1,1)
}
data=cbind(x1,x2)
result <- rbind(result, data)
}
result
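Growing result with rbind() inside the loop works, but it copies the matrix on every pass. An alternative sketch builds each block in a helper and binds them once at the end; one_run and n_runs are hypothetical names, not from the thread.

```r
set.seed(1)
n_runs <- 2
obs <- 2

# Hypothetical helper: one obs-by-2 block, same distributions as the post.
one_run <- function(obs) cbind(runif(obs, 1, 2), rnorm(obs, 1))

# Build all blocks first, then stack them by row in a single call.
result <- do.call(rbind, lapply(seq_len(n_runs), function(i) one_run(obs)))
dim(result)
```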
D*********2
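Posts: 535
22
From topic: Statistics board - Help: R sample in matrix form
Sorry to trouble all the R experts here.
I have two matrices, A and B, plus a probability matrix of the same shape, and I want to sample from A and B to build a new matrix U.
Each element of U is drawn from the corresponding position of A or B according to Bernoulli(P), where P is also taken at the
corresponding position.
In principle this is a very regular layout, but R's sample seems to work only element by element?
Is there a time-saving way to do this? Many thanks.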
> (A <- matrix(1:12, 4, 3))
[,1] [,2] [,3]
[1,] 1 5 9
[2,] 2 6 10
[3,] 3 7 11
[4,] 4 8 12
> (B <- matrix(rep(c(1:3),4), 4, 3))
[,1] [,2] [,3]
[1,] 1 2 3
[2,] 2 3 1
[3,] 3 1 2
[4,] 1 2 3
> (P <- matrix(runif(12), 4, 3))
[,1] [,2] [,
o****o
Posts: 8077
23
From topic: Statistics board - Help: R sample in matrix form
for (1), you can do
(matrix(runif(12), 4, 3)>P)*1
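The indicator matrix can then drive the element-wise choice between A and B. The sketch below assumes P[i, j] is the probability of taking the element from A, which the thread does not specify; note that a comparison written as runif > P yields 1 with probability 1 - P, so the direction of the inequality matters.

```r
set.seed(3)
A <- matrix(1:12, 4, 3)
B <- matrix(rep(1:3, 4), 4, 3)
P <- matrix(runif(12), 4, 3)

# TRUE with probability P[i, j], independently for each cell.
take_A <- matrix(runif(12), 4, 3) < P

# ifelse() is vectorized and keeps the 4 x 3 shape of take_A.
U <- ifelse(take_A, A, B)
U
```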
h***t
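Posts: 2540
24
From topic: Statistics board - A question about an R command
If I generate random numbers in R and split them into several bins, how do I count the number of
observations in each bin? Preferably something that runs efficiently.
e.g. x<-runif(1000)
How do I get the count in each bin from [0,0.1],[0.1,0.2],....[0.9,1]?
Many thanks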
s*****n
Posts: 2174
25
From topic: Statistics board - A question about an R command
x <- runif(1000)
tapply(x, round(x + 0.05, 1), length)
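The tapply() call appears to label each bin by its upper edge. Two other base R options are sketched below for comparison: cut() with table() gives labelled intervals, and findInterval() with tabulate() is a plain integer count that tends to be fast on long vectors. The variable names are illustrative.

```r
set.seed(10)
x <- runif(1000)
breaks <- seq(0, 1, by = 0.1)

# Labelled counts per interval; include.lowest closes the first bin at 0.
counts_cut <- table(cut(x, breaks = breaks, include.lowest = TRUE))

# Integer bin index per value, then a plain count for bins 1..10.
bin <- findInterval(x, breaks, rightmost.closed = TRUE)
counts_fast <- tabulate(bin, nbins = 10)

counts_cut
counts_fast
```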
z**k
Posts: 378
26
From topic: Statistics board - A quiz for everyone: an R question
Why wouldn't it work...
> system.time(for (i in 1:1000) x = runif(1000))
user system elapsed
0.17 0.00 0.16
D******n
Posts: 2836
27
This is only a little short of being a single command... lol
y=runif(100);
y=c(y,1-sum(y));
lol
a***r
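Posts: 420
28
Heh, I took a vacation and haven't worked for half a month, so I'm a bit out of practice ^_^
But what is this?? I don't get it (⊙o⊙)
What I wrote is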
A <-runif(n,0,100)
A <-A/sum(A)
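A small check of this normalization, assuming the goal is n non-negative numbers that sum to 1 (the post does not state the goal, and n = 100 is an illustrative value). Normalized uniforms do sum to 1 but are not uniformly spread over all such vectors; normalized exponentials are a common alternative when that matters.

```r
set.seed(5)
n <- 100                       # illustrative size; n is not defined in the post

A <- runif(n, 0, 100)
A <- A / sum(A)                # non-negative weights that sum to 1

D <- rexp(n)
D <- D / sum(D)                # normalized exponentials: flat Dirichlet weights

c(sum_A = sum(A), sum_D = sum(D))
```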
z**k
Posts: 378
29
From topic: Statistics board - A time series question
Try it yourself:
#!/usr/bin/R
a <- runif(1000)
k <- 10
a.f <- filter(a, filter=rep(1/k,k))
opar <- par(mfrow=c(2,1))
acf(a)
acf(na.omit(a.f))
par(opar)
z**k
Posts: 378
30
From topic: Statistics board - R question
u <- runif(1000)
x <- as.integer(u>.5)
f***a
Posts: 329
31
R code:
ceiling(runif(1)*100000)
f***a
Posts: 329
32
From topic: Statistics board - An interview question
I'm back, I'm back.
Thinking it over again, this is really just adding a constraint across a bunch of independent variables. It doesn't
look hard to sample.
Take the simplest case, n=2, m=1, as an example:
Without the constraint, the output space is {(0,0),(0,1),(1,0),(1,1)}.
The corresponding probabilities are {(1-p1)*(1-p2), ..., p1*p2}.
With the constraint, the output space is O={(0,1),(1,0)}.
The corresponding probabilities are P={(1-p1)*(p2), p1*(1-p2)}.
Under the constraint, standardize the probabilities into
P.std={P1/(P1+P2),P2/(P1+P2)}.
Then under the constraint, output (0,1) has probability P1/(P1+P2) of being
sam...
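For small n the standardization described above can be done by brute-force enumeration. This is only a sketch under that assumption: the values of p are illustrative, and enumerating every subset with combn() becomes impractical as n grows.

```r
set.seed(11)
p <- c(0.3, 0.6)            # illustrative p1, p2
n <- length(p)
m <- 1

subsets <- combn(n, m)      # each column lists the positions set to 1
# Unnormalized probability of each outcome: ones at s, zeros elsewhere.
w <- apply(subsets, 2, function(s) prod(p[s]) * prod(1 - p[-s]))
w_std <- w / sum(w)         # standardized probabilities under the constraint

# Draw one outcome; column 1 is (1,0) and column 2 is (0,1) here.
pick <- subsets[, sample.int(ncol(subsets), 1, prob = w_std)]
x <- integer(n)
x[pick] <- 1L
x
```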
a***r
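Posts: 420
34
From topic: Statistics board - Asking again about programming random sampling
Actually, I haven't asked this before.
But digging through the archives I found the remains of many threads on this topic; seniors I admire such as songkun, dashagen, and bullren all
left their wisdom, and there was even an opinion from Master Chen (posts from more than a year ago, ouch!!!)
Sadly I still couldn't get it working ><, so I have to ask again.
I have n subjects and want to draw q% of them (nq/100, rounded to an integer).
In SAS you can use surveyselect;
in R you can use sample (R is so nice...).
How do I implement this in C/C++??
The dataset is too large, so neither R nor SAS is feasible.
Typically, what I want to do is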
#
...
main (int argc,char * argv[]) {
int arg=0;
n=atoi(argv[++arg]);
p=atoi(argv[++arg]);
double AOD;
srand((unsigned)time(NULL));
for (int i =0; i {
AOD=((double) rand() / (RAND_MAX+1)) ;
if (AOD

do something I need to matr...

n********g
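Posts: 218
35
From topic: Statistics board - An R question I'd like to ask everyone
I don't know R well, so here's a silly question; please help.
I'm running a loop 5 times in R. Here is part of my code:
for (i in seq(5))
{ n=100
x=runif(n,-1,1)
y=1+x+rnorm(n,0,1)

# obtain S(x) from a fit
S(x)=fit$y
Z=qnorm(S(x))
fit2=lm(d~z)
# and so on
}
In this loop S(x) may contain 0, which makes Z contain Inf, and then fit2 fails. So
how do I write a condition so that, whenever S(x) has a 0 or Z has an Inf, the iteration is rerun, until
all 5 iterations finish without this problem?
Thanks in advance!!!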
j*****7
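Posts: 7
36
From topic: Statistics board - An R question I'd like to ask everyone
j<-1
while(j<=5)
{ n=100
x=runif(n,-1,1)
y=1+x+rnorm(n,0,1)
# obtain S(x) from a fit
S(x)=fit$y
Z=qnorm(S(x))
if(S(x) contains 0 or Z contains Inf) next # redo this iteration without advancing j
j<-j+1
fit2=lm(d~z)
# and so on
}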
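A runnable version of the same retry idea is sketched below. Because the thread never shows how S(x) is fitted, round(runif(n), 2) is used as a purely hypothetical stand-in that occasionally produces exact 0s or 1s; results and attempts are also illustrative names.

```r
set.seed(99)
results <- vector("list", 5)
attempts <- 0
j <- 1
while (j <= 5) {
  attempts <- attempts + 1
  n <- 100
  x <- runif(n, -1, 1)
  y <- 1 + x + rnorm(n, 0, 1)
  S <- round(runif(n), 2)        # hypothetical stand-in for the fitted S(x)
  Z <- qnorm(S)                  # qnorm(0) is -Inf and qnorm(1) is Inf
  if (any(!is.finite(Z))) next   # bad draw: redo without advancing j
  results[[j]] <- lm(y ~ Z)
  j <- j + 1
}
attempts                         # total passes needed for 5 clean fits
```

Checking is.finite(Z) covers both failure cases at once, since a 0 or 1 in S is exactly what makes Z infinite.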
d*******o
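Posts: 493
37
From topic: Statistics board - Actually, R is also pretty weird in some respects
There's an R question I've never figured out, and I'd like to ask songkun and the other experts: why does copy-on-change
of an integer vector make two copies? What is the intermediate copy for?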
> # Floating
> rm(list = ls())
> a <- runif(1:100)
> tracemem(a)
[1] "<0x026028e0>"
> b <- a
> b[1] <- 1
tracemem[0x026028e0 -> 0x02535490]:
>
> # Integer
> rm(list = ls())
> a <- 1:100
> tracemem(a)
[1] "<0x027dafe8>"
> b <- a
> b[1] <- 1
tracemem[0x027dafe8 -> 0x02788728]:
tracemem[0x02788728 -> 0x025b39e0]:
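One likely explanation, which the excerpt itself does not give, is type coercion: 1 is a double, so assigning it into an integer vector forces a conversion to double in addition to the usual copy-on-modify duplicate. The sketch below only checks the types; it does not call tracemem(), which needs an R build with memory profiling enabled.

```r
a <- 1:100        # integer vector
b <- a
b[1] <- 1         # 1 is a double, so b is coerced from integer to double

b2 <- a
b2[1] <- 1L       # 1L is an integer, so no type change is needed

c(a = typeof(a), b = typeof(b), b2 = typeof(b2))
```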
a****y
Posts: 91
38
From topic: Statistics board - Question for Stratify sampling.
I am trying to understand the sampling from the following description. Does
anyone know how they get the sample stratum sizes: 10,5,10,4,6. Thanks a lot!
Generates artificial data (a 235X3 matrix with 3 columns: state, region,
income).
# The variable "state" has 2 categories (nc and sc).
# The variable "region" has 3 categories (1, 2 and 3).
# The sampling frame is stratified by region within state.
data=rbind(matrix(rep("nc",165),165,1,byrow=TRUE),matrix(rep("sc",70),70,1,
byrow=TRUE))
data=cbi...
z**k
Posts: 378
39
From topic: Statistics board - Adaboost M1 - what's wrong with this code?
My understanding is, for Adaboost M1, the loss function mean(-y*F) is always
strictly decreasing, but this is not the case for the following code. Can
anyone help?
I m following the example of Hastie ESL-II chapter 10.1.
sorry cannot type Chinese here. Thank you very much for help.
#================R Script====================
## Data using example given in T. Hastie, ESL, chapter 10.1
dta <- matrix(rnorm(20000), 2000, 10)
pred <- apply(dta, 1, function(x) sum(x^2))
y <- (pred > qchisq(0.5, 10))...
d******e
Posts: 7844
40
From topic: Statistics board - About quantile regression
In your case it is equivalent to estimating the quantiles of y directly.
> y=log(1+runif(1000))
> summary(y)
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.0005815 0.2272000 0.4037000 0.3856000 0.5589000 0.6908000
> log(1+0.5)
[1] 0.4054651
> log(1+0.25)
[1] 0.2231436
> log(1+0.75)
[1] 0.5596158
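The comparison above relies on quantiles passing through a monotone increasing transformation: the p-quantile of log(1 + U) equals log(1 + p) when U is uniform on (0, 1). A sketch with a larger sample makes the agreement tighter; the sample size and names are illustrative.

```r
set.seed(8)
u <- runif(100000)
y <- log(1 + u)

probs <- c(0.25, 0.5, 0.75)
sample_q <- quantile(y, probs)   # empirical quantiles of y
theory_q <- log(1 + probs)       # transform of the uniform quantiles

rbind(sample_q, theory_q)
```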