s****r posts: 2386 | 1 There's no for loop or while loop in SQL? Or a macro, or a way to transpose to rows? I vaguely recall we used to use SAS for some weird calls at my first job. |
|
|
e***n posts: 286 | 3 [The following text is reposted from the Computation board]
From: erain (红花会大老板), Board: Computation
Subject: Urgent question: can a symmetric indefinite matrix A be factored as A = B * B'
Site: BBS 未名空间站 (Sat Apr 21 16:47:52 2007)
Urgent!
Any symmetric positive definite matrix can certainly be factored with the Cholesky
method. What about a symmetric indefinite matrix? I need to factor such a matrix
A exactly into the product form
A = B * B'
where B' is the transpose of B and B is some n x n matrix (not necessarily
triangular).
I know we can factor it with a LDLT method and fur |
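A short note on why the answer depends on whether B may be complex, offered as a sketch since the question above is cut off. The symbols x, Q and Λ are introduced here for illustration and do not appear in the post. For any real B,

```latex
x^{\top} A x \;=\; x^{\top} B B^{\top} x \;=\; \lVert B^{\top} x \rVert_2^{2} \;\ge\; 0
\qquad \text{for every real vector } x ,
```

so A = B * B' with real B forces A to be positive semidefinite, and an indefinite A (which has some x with x'Ax < 0) admits no such real factor. If complex entries are allowed and B' means the plain transpose rather than the conjugate transpose, the eigendecomposition gives one candidate:

```latex
A = Q \Lambda Q^{\top}, \qquad B = Q \Lambda^{1/2}, \qquad
B B^{\top} = Q \Lambda^{1/2} \Lambda^{1/2} Q^{\top} = A ,
```

where Q is real orthogonal and the square roots of the negative eigenvalues are purely imaginary.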
|
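p*******n posts: 4824 | 4 Expand it and start multiplying from the middle, then factor out the constants that are equal along the diagonal.
For example, the two innermost matrices are 9x27 and 27x9, and their product is (transpose(x)*y + 1)*I_{9x9};
the rest follows in the same way, so I think the answer above is very likely correct.
I just don't know whether there is a simpler expression. |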
|
f*****h posts: 1 | 5
The simplest way is as follows:
1) copy the data;
2) paste using paste special and choose transpose.
Not sure about the API |
|
z**********8 posts: 2049 | 6 HLOOKUP
VLOOKUP
TRANSPOSE
INDIRECT
MATCH
INDEX
CHOOSE
COUNTIF
COUNTIFS
COUNTBLANK
AVERAGEIF
AVERAGEIFS
FREQUENCY |
|
|
s********e posts: 893 | 8 Is the added time key ordered by the magnitude of the for1, for2, for3 values? |
|
w****w posts: 521 | 9 select ID,for1 as forecast, now() as timekey
from input
union
select ID,for2 as forecast, now() as timekey
from input
union
select ID,for3 as forecast, now() as timekey
from input
Then add an ORDER BY. |
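A minimal Python sketch of the same unpivot, for readers who would rather do it outside the database. The function name `unpivot` and the list-of-dicts row layout are assumptions made for illustration; only the column names ID, for1, for2, for3, forecast and timekey come from the query above. Note that UNION drops duplicate rows, whereas this sketch keeps every value, which is closer to UNION ALL.

```python
from datetime import datetime


def unpivot(rows, value_columns=("for1", "for2", "for3")):
    """Turn each wide row into one (ID, forecast, timekey) row per value column."""
    stamp = datetime.now()  # one timestamp for the whole batch, like now() in a single statement
    long_rows = []
    for row in rows:
        for col in value_columns:
            long_rows.append(
                {"ID": row["ID"], "forecast": row[col], "timekey": stamp}
            )
    # The final ORDER BY step: sort by ID, then by forecast value.
    return sorted(long_rows, key=lambda r: (r["ID"], r["forecast"]))


if __name__ == "__main__":
    sample = [{"ID": 1, "for1": 30, "for2": 10, "for3": 20}]
    for r in unpivot(sample):
        print(r["ID"], r["forecast"], r["timekey"])
```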
|
s********e posts: 893 | 10 I'm no expert, but the following works in Oracle. For reference.
select id, forecast, rank() over
(partition by id order by forecast) timekey
from
(
select id, for1 forecast
from input
union select id, for2
from input
union select id, for3
from input
); |
|
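s********e posts: 893 | 11 If your original data allows for1, for2, for3 of the same ID to share a value, for example
10 and 20.1 with one of them appearing twice, and you still want them numbered 1, 2, 3, then replace rank with row_number |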
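The difference between the two window functions is easy to see on a tiny list. The sketch below imitates RANK() and ROW_NUMBER() over an ascending order in plain Python; the function names `sql_rank` and `sql_row_number` are hypothetical helpers, not part of any library.

```python
def sql_rank(values):
    """Imitate RANK() with ascending order: ties share a rank and leave a gap after them."""
    ordered = sorted(values)
    # index() returns the first position of a value, so equal values get the same rank.
    return [(v, ordered.index(v) + 1) for v in ordered]


def sql_row_number(values):
    """Imitate ROW_NUMBER() with ascending order: every row gets a distinct number."""
    ordered = sorted(values)
    return [(v, i) for i, v in enumerate(ordered, start=1)]


if __name__ == "__main__":
    forecasts = [20.1, 10, 20.1]
    print(sql_rank(forecasts))
    print(sql_row_number(forecasts))
```

One caveat worth checking against your own data: UNION removes duplicate rows, so two identical (id, forecast) pairs would collapse into one before any ranking happens unless UNION ALL is used in the inner query.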
|
|
y****w posts: 3747 | 13 Input and output are both text, so why bother with a DB? Here's the shell version of unpivot I have on hand
#!/usr/bin/ksh
#echo "only processing file with below format:"
#echo "root v1,v2,v3,v4...."
#echo "root v1 v2 v3 v4...."
#echo '----------------------------------'
infile=$1
[[ ! -f $infile ]] && echo "not a file!" && return -1
tmpf=$(mktemp)
cat $infile |sed '/^$/d' | while read rt val
do
echo "$val" | sed 's/ /\
/g' | sed 's/,/\
/g' > $tmpf
cat $tmpf | sort -n | xargs -i echo "$rt {}"
rm -f $tmpf
done
|
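For comparison, here is a rough Python equivalent of the script above. It is a sketch under two assumptions: each line is `root` followed by values separated by commas and/or spaces, and every value is numeric (the shell version relies on `sort -n`, which is more forgiving of non-numeric input). The function name `unpivot_line` is made up for this example.

```python
import re


def unpivot_line(line):
    """Split 'root v1,v2 v3' into one 'root v' string per value, sorted numerically."""
    parts = line.split(None, 1)
    if len(parts) < 2:
        return []  # blank line, or a root with no values
    root, rest = parts
    values = [v for v in re.split(r"[,\s]+", rest) if v]
    values.sort(key=float)  # numeric sort, assuming all values parse as numbers
    return [f"{root} {v}" for v in values]


if __name__ == "__main__":
    for text in ["a 3,1,2", "b 10 9 8"]:
        for out in unpivot_line(text):
            print(out)
```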
|
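p*****n posts: 387 | 14 Many thanks to all the experts for the ideas~~
I learned a lot!
Transferring the baozi right away.
It's my first time sending baozi, so I hope it goes smoothly~~ |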
|
p*****n posts: 387 | 15 Thanks a lot!
Very inspiring!
My SQL is really terrible. I need to learn more from you~~ |
|
p*****n posts: 387 | 16 Thanks a lot!
The shell version is easier to make work across platforms! 3x! |
|
|
|
|
c*****d posts: 6045 | 20 Nice, I had never thought of this usage before: xargs -i echo "$rt {}" |
|
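D*********h posts: 73 | 21 There is no aggregated field; I just want to turn rows into columns and columns into rows. The data currently looks like this:
ID  Income  Outcome  Pending
1   2000    1000     50
2   3000    2000     20
Desired output:
ID       1     2
Income   2000  3000
Outcome  1000  2000
Pending  50    20
I did some research and it looks like PIVOT is the way, but it all seems rather cumbersome. Does anyone have a simple coding method? I'm using SQL, on SQL Server. Thanks! |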
|
d****n posts: 12461 | 22 Problems like this all come from bad design. If it needs transposing, let the front end do it. |
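If the transposition is done in application code, as suggested here, it can be very short. The sketch below assumes the query result arrives as a header list plus a list of row lists; `transpose_table` is a hypothetical helper name and the sample data is the table from the question.

```python
def transpose_table(header, rows):
    """Swap rows and columns: each original column becomes one output row."""
    # zip(header, *rows) pairs each column name with that column's values.
    return [list(column) for column in zip(header, *rows)]


if __name__ == "__main__":
    header = ["ID", "Income", "Outcome", "Pending"]
    rows = [[1, 2000, 1000, 50], [2, 3000, 2000, 20]]
    for line in transpose_table(header, rows):
        print(*line)
```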
|
d****n posts: 12461 | 23 Developers today who don't learn database fundamentals are just awful. Unless you're on Vertica or Mongo, an RDBMS should
be open for row insertion, closed for column addition. |
|
h***u posts: 214 | 24 How do I copy
this is a test
and paste it as
t
h
i
s
i
s
a
t
e
s
t
Thanks |
|
i***e posts: 3219 | 25 Put the cursor at the beginning and run "M-x replace-regexp". When prompted, type "\(.\)" and press Return.
When prompted again, type "\1" followed by C-q C-j (which inserts a literal newline), then press Return. |
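Outside Emacs, the same result can be sketched in a couple of lines of Python. The function name `one_char_per_line` is made up for this example. It drops whitespace so the output matches the listing in the question; as far as I can tell, the literal `\(.\)` pattern would also place each space on a line of its own.

```python
def one_char_per_line(text):
    """Return the non-whitespace characters of text, one per line."""
    return "\n".join(ch for ch in text if not ch.isspace())


if __name__ == "__main__":
    print(one_char_per_line("this is a test"))
```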
|
G*****7 posts: 1759 | 26 function fast_grid_lookup
%% prepare the data
% randomly generate the query points
num_points = 20e6;
points = rand(num_points,2);
% specify the evenly-spaced mesh grid
num_divs = 1028; % per side
num_cells = num_divs^2;
grid_spacing = 1/(num_divs-1);
% you do not have to instantiate the grid points by
% grid = meshgrid(linspace(0, 1, num_divs), linspace(0, 1, num_divs));
%% find the enclosing cell of each point
tic;
points_in_cell = ceil(points/grid_spacing)+1; % damn you, 1-based matlab
toc;
%% ... [read full post] |
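Since the post is cut off, here is a hedged Python sketch of the idea it describes: on an evenly spaced grid, the enclosing cell comes from a division and a floor, with no search over grid points. The function name `enclosing_cell`, the 0-based indexing, and the clamping of points that sit exactly on the upper boundary are choices made for this sketch, not taken from the original code.

```python
import math


def enclosing_cell(x, y, num_divs):
    """Return the 0-based (col, row) of the cell containing (x, y) in the unit square.

    The grid has num_divs evenly spaced points per side, hence num_divs - 1 cells per side.
    """
    spacing = 1.0 / (num_divs - 1)
    last = num_divs - 2  # index of the last cell along each side
    col = min(int(math.floor(x / spacing)), last)  # clamp x == 1.0 into the last cell
    row = min(int(math.floor(y / spacing)), last)
    return col, row


if __name__ == "__main__":
    print(enclosing_cell(0.3, 0.6, 5))
```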
|
|
m*f posts: 8162 | 28 Then just keep at it until you've turned it into a matrix with 200,000 columns and 20 rows... |
|
s*******d posts: 59 | 29 The question is probably how to do it without allocating new memory; n*n is easy, but m*n is a bit trickier. |
|
G***G posts: 16778 | 30 A matrix is stored in a file,
10000 lines with 250000 elements per line.
How can I transpose it to 250000 lines with 10000 elements per line? |
|
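d****n posts: 1637 | 31 No problem, since you already know it's a 10 by 5 matrix.
But you could also write it like this.
//creation
float *myarr = new float[m*n];
//deletion
if (myarr!=NULL ) delete [] myarr;
//iteration
int r, c;
for(r=0;r<m;r++)
  for(c=0;c<n;c++)
    cout<<myarr[r*n+c]<<" ";
//transpose 2D array to 1 dimension
There are five ways to write the character 茴, heh. Pardon the display. |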
|
X****r posts: 3557 | 32 Why append to strings at all? Just print as you read; with a hundred columns you repeat that a hundred times. |
|
v*******e posts: 11604 | 33 I'd guess this is genome data; use plink to convert it, it's very fast. |
|
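i***r posts: 1035 | 34 Yes, it's genome data.
I don't know how to use plink yet... this ped file is the plink format.
Why is a PED file laid out as one long row? Even viewing it with the less command is slow; a single line is 20 million...
I mainly figured I'd just process it directly in Python, but today it ended up taking several hours to finish |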
|
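i***r posts: 1035 | 35 At first I also thought about printing while reading.
I wonder whether the frequent I/O reads and writes would be even slower? I'll try it when I have time.
But constantly appending to the end of a string the way I do probably reallocates memory frequently, which is even slower? |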
|
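X****r posts: 3557 | 36 Python strings are immutable, so repeatedly appending to a string keeps allocating new strings.
The usual approach is to use a list, append each new string to the end of the list, and do a single join at the end. |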
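The list-then-join idiom described here can be sketched as follows. `transpose_lines` is a hypothetical helper; it assumes the input is rectangular (every line has the same number of fields) and small enough to hold in memory, which may not be true for the multi-gigabyte file discussed in this thread.

```python
def transpose_lines(lines, sep="\t"):
    """Transpose delimited text lines and return the transposed lines as strings."""
    rows = [line.rstrip("\n").split(sep) for line in lines]
    if not rows:
        return []
    out = []
    for col in range(len(rows[0])):
        fields = []  # collect the pieces in a list ...
        for row in rows:
            fields.append(row[col])
        out.append(sep.join(fields))  # ... and build each output string with one join
    return out


if __name__ == "__main__":
    for line in transpose_lines(["a\tb\tc\n", "d\te\tf\n"]):
        print(line)
```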
|
|
i***r posts: 1035 | 38 Makes sense. I'll benchmark it when I have time and see how big the difference is |
|
|
|
i***r posts: 1035 | 41 Code:
import sys
pop=98
def ped(genotype):
    if genotype=='00':
        ped='1 1 '
    elif genotype=='01':
        ped='1 2 '
    elif genotype=='10':
        ped='2 1 '
    elif genotype=='11':
        ped='2 2 '
    else:
        ped='0 0 '
    return ped
def write_ped(filename,out):
    fn=open(filename,'r')
    fn_ped=open(out+'.ped','w')
    fn_map=open(out+'.map','w')
    modern_human=range(9,93) # 15+69
    for ind in modern_human:
        ind_genotype=''
        fn.seek(0,0)
        for line in fn.readlines():
            lis=line[:-1].split("\t... [read full post] |
|
c*********e posts: 16335 | 42 Load the data into memory first, then do the math operations in memory. How much memory do you have? |
|
i***r posts: 1035 | 43 One line of the input file (22 million in total)
chr1 46669 46670 snp 11 + A G dbsnp.100:rs2548905
00 00 00 00 01 01 11 NN NN 11 01 NN 00
NN NN NN 00 00 00 00 00 11 NN 11 01
11 01 NN 00 00 00 11 NN 00 01 NN NN 00
00 00 NN 00 00 NN 01 00 11 NN 00 NN 01
01 NN 11 00 NN 11 NN 00 00 NN 00 01
01 NN NN 11 ... [read full post] |
|
i***r posts: 1035 | 44 There's 250 GB of memory.
The file is 7 GB; the actual processing probably uses about 2~3 GB? |
|
|