Page 2 - Roundup of discussions about fftw - 话题女王

g****e
Posts: 1829

After upgrading from 8.10 to 9.04 (online update), do packages that I previously compiled and
installed myself, such as fftw and gsl, need to be recompiled?

t*******f
Posts: 2634

Will the cuda driver work for RHEL 5.4? I have a C1060 GPU card.
Also, will cuda compile a new Linux kernel by itself?
How much faster can cufft be compared with fftw for
1024x1024 and 2048x2048 double complex FFT?

w***g
Posts: 5958

From thread: Programming board - The fast Fourier transform in Numerical Recipes

Use fftw. matlab uses it too.

D***n
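Posts: 6804

From thread: Programming board - Please don't blindly worship FP languages

Hahaha
SPARK can't even run without libgfortran, so of course SPARK has to be recompiled on every
different system. Otherwise where would libgfortran come from? Does it fall from the sky?
Kid, you're too naive. I bet you've never even heard of FFTW.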

D***n
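Posts: 6804

From thread: Programming board - Please don't blindly worship FP languages

I covered Fortran at the very start and won't repeat it; fill in the blanks yourself.
Refactoring, my foot. Saying that just shows you don't know what you're talking about.
Take the number-theoretic transform (NTT), a relative of the Fourier transform: if you use
Mersenne primes, 32-bit and 64-bit differ, because 31 bits gives one Mersenne prime (2^31 - 1)
and 61 bits gives another (2^61 - 1). Different primes give different precision (I haven't
worked on this for a long time, so I don't remember it clearly).
Can subtle differences like these be handled by a single integer type?
Optimization problems of this kind can in principle be handled with FP, because the final
answer is often a function. You pass the specifics in as parameters and generate different
optimized functions, which is what FFTW does.
That's the end of this topic. I'm a layman too, but I seem to know a little more than you.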
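
The "pass the specifics in as parameters and get back a specialized function" idea can be sketched in a few lines. This is only an illustration in pure Python, not FFTW code: `make_dft` is a hypothetical name, and the returned function is a plain O(n^2) DFT whose only specialization is a twiddle table precomputed for the given length. FFTW's real planner (for example `fftw_plan_dft_1d` followed by `fftw_execute`) does far more, choosing among algorithms for the given size and machine.

```python
import cmath

def make_dft(n):
    """Return a DFT function specialized for length n.

    Hypothetical sketch: the length-dependent work (the twiddle table)
    is done once here, and the returned closure reuses it on every call.
    """
    twiddles = [cmath.exp(-2j * cmath.pi * k / n) for k in range(n)]

    def dft(x):
        if len(x) != n:
            raise ValueError("this function was generated for length %d" % n)
        # Naive O(n^2) evaluation; index (j*k) mod n looks up w^(j*k).
        return [sum(x[j] * twiddles[(j * k) % n] for j in range(n))
                for k in range(n)]

    return dft
```

Calling `make_dft(4)` once and then applying the result to many length-4 inputs mirrors the plan-once, execute-many usage pattern.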

w***g
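Posts: 5958

From thread: Programming board - intel knights landing 72core CPU, has anyone used it?

Do you have a benchmark? What you say is news to me. Of the libraries I've seen, openblas has
OpenMP or thread versions, opencv uses tbb, and fftw uses openmp; I've never seen a
single-machine library that uses MPI. I think the fact that you didn't use 32 MPI ranks is
itself evidence that MPI can't carry it all the way. But even if 4x8 or 8x4 can beat OpenMP,
I still find that impressive.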

s******u
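Posts: 501

From thread: Programming board - intel knights landing 72core CPU, has anyone used it?

You can try it yourself: running fftw on a single machine, MPI is no worse than OpenMP.
Whether it is actually better I've forgotten, since I ran this long ago. With two or more
nodes, say 32x2 or 32x4, pure MPI no longer works well, though that may have more to do with
the MPI implementation and the hardware. The transpose in a 3d FFT needs alltoall, and if
that is poorly optimized it becomes the biggest performance bottleneck.
I'll look for the benchmark.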

w***g
Posts: 5958

From thread: Programming board - intel knights landing 72core CPU, has anyone used it?

So fftw really does have MPI!

s******u
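Posts: 501

From thread: Programming board - intel knights landing 72core CPU, has anyone used it?

Moreover, the MPI version of fftw is far faster than OpenMP, at least when I measured it three years ago.
The dashed line is MPI and the solid line is OpenMP; ignore the part beyond 32.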

g****t
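Posts: 31659

From thread: Programming board - FP's Achilles' heel is still performance

Several people seem to have overlooked compiled FP.
F#, Haskell, and OCaml are fast.
Here is what someone said at a link posted earlier:
Our modern commercial packages for numerical computing are written in F# and
it beats Fortran quite happily. FFTW provides the FFT routines in MATLAB
and is written in OCaml and beats everything else quite happily. – Jon
Harrop

: python's selling point is that it is easy to learn; it doesn't occupy the same niche as FP

: with python, anybody at all can start writing code after a few hours of study; which FP language can do that?

: besides, python is mostly used through libraries, many of which are written in C/C++

: today's mainstream FP languages, such as scala/clojure/F#, all run on virtual machines and are not necessarily faster than python calling
libraries

: these FP languages still lag behind java/C#/C/Cpp in performance

: if FP has no performance gap once the compiler has been optimized by AI, that is...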

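Posts: 1

From thread: Programming board - 150 lines of F# doing matrix operations faster than MKL

Simply beating mkl or fftw in specific cases isn't that hard; my former company also rewrote
its own fft.
mkl aims to be a general-purpose library, not just to be fast.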

s*******k
Posts: 71

From thread: Software board - Re: Can anyone recommend any web site to download a FFT subroutine for

Mathematical routines such as FFT shouldn't be machine dependent unless they
use hardware acceleration (for example, Intel MMX instructions), so what
you have for SGI should compile and run on any machine. If not,
try www.fftw.org; they have a free FFT library that works on almost all
platforms (well, maybe not PalmPC!)
SH

l******n
Posts: 9344

From thread: Unix board - A question about using fftw from fortran

I keep getting the following error message. What's going on?
Thanks
f90: Warning: Option -04 passed to ld, if ld is invoked, ignored otherwise
INTEGER FFTW_R2HC
^
"fftw3.f", Line = 1, Column = 7: ERROR: This unnamed main program unit is
missing an END statement.
f90comp: 95 SOURCE LINES
f90comp: 1 ERRORS, 0 WARNINGS, 0 OTHER MESSAGES, 0 ANSI
*** Error code 1

d*****w
Posts: 124

From thread: Computation board - fft algorithm

waste time. r u sure u can beat FFTW?


X****r
Posts: 3557

From thread: Computation board - fft algorithm

IMHO, the documentation of fftw is good enough.

y**********a
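Posts: 16

From thread: Computation board - [Repost] Pitfalls when doing FFTs

hehe, the second point also took me a lot of effort to figure out.
fftw is still simpler and easier to use, and a lot faster too.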

r****y
Posts: 1437

From thread: Computation board - [Repost] Pitfalls when doing FFTs

nod nod, strongly recommend fftw.
I heard the professor who taught me signal processing say that,
in the 1960s, the guys who invented fft could actually have patented it. But they did not.
If they had, they would be much richer than Bill Gates now.

g*****e
Posts: 19

From thread: Computation board - [Repost] Pitfalls when doing FFTs

is fftn & ifftn OK in matlab? It seems mathworks use fftw for fftn and ifftn?
Is that right?

O******e
Posts: 734

From thread: Computation board - a C language question regarding pointer usage

Using ** to deal with 2D arrays normally means that the array is not
stored contiguously in memory, and for nonsparse algorithms manipulating
the entire array at once the fragmentation can inhibit code optimization
since the compiler will generally have difficulty resolving aliasing and
flow dependency issues.
FFTW and other references suggest allocating the entire array using *,
then using ** to point to the beginning of each row if you want the
convenience of using [][] indexing.
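
The layout recommended in this post, one contiguous block plus a table of row handles, can be illustrated outside C as well. The sketch below is a Python analogue offered only as an illustration, not as code from FFTW or from the poster: `make_matrix` is a hypothetical helper, the flat `array('d')` stands in for the single `*` allocation, and the list of `memoryview` slices stands in for the `**` row pointers.

```python
from array import array

def make_matrix(rows, cols):
    """Allocate rows*cols doubles in one contiguous buffer and build row views.

    Hypothetical helper: `flat` plays the role of the single `*` allocation,
    `row_views` the role of the `**` table pointing at the start of each row.
    """
    flat = array("d", [0.0]) * (rows * cols)   # one contiguous block of doubles
    whole = memoryview(flat)                   # zero-copy window onto that block
    # Each slice shares storage with `flat`; nothing is copied.
    row_views = [whole[r * cols:(r + 1) * cols] for r in range(rows)]
    return flat, row_views
```

With `flat, m = make_matrix(3, 4)`, the assignment `m[1][2] = 5.0` writes into `flat[1 * 4 + 2]`, so code that wants [][] indexing and code that wants the whole contiguous buffer see the same data.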


m***0
Posts: 3

From thread: Computation board - FFT

fftw is more popular


s****y
Posts: 2052

From thread: Computation board - About the 3D fast Fourier transform (FFT)

google fftw

K*****n
Posts: 23

From thread: Computation board - About the 3D fast Fourier transform (FFT)

FFTW

A*g
Posts: 102

From thread: Computation board - Looking for Fortran source code for the DFT ...

fftw

O******e
Posts: 734

From thread: Computation board - Looking for Fortran source code for the DFT ...

Numerical Recipes if you need to understand the code but don't care
about performance, FFTW if you need speed but don't care about the code.

d*******2
Posts: 340

From thread: Computation board - A finding: doing the Fourier transform in C with fftw is 6x faster than Matlab (no body text)

s**b
Posts: 169

From thread: Computation board - A finding: doing the Fourier transform in C with fftw is 6x faster than Matlab (no body text)

hehe, reasonable

d*******2
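Posts: 340

From thread: Computation board - Ported matlab to C++ and even used fftw, billed as the fastest ever, and it came out twice as slow

In matlab each loop iteration takes only 21-29 minutes, but the C++ version now takes 50-60
minutes. What's going on? Can anyone give me a hint? Thanks in advance!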

p*****e
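Posts: 310

From thread: Computation board - Ported matlab to C++ and even used fftw, billed as the fastest ever, and it came out twice as slow

Did you compile with optimization options turned on? Or with debug options?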

d*******2
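Posts: 340

From thread: Computation board - Ported matlab to C++ and even used fftw, billed as the fastest ever, and it came out twice as slow

I don't know what optimization options and debug options are. I'm using dev C++ and pressing F9 to compile and run. I compared
again: averaged over 13 hours, matlab is still about 20% faster, 31 minutes versus 39.5 minutes.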

p*****e
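Posts: 310

From thread: Computation board - Ported matlab to C++ and even used fftw, billed as the fastest ever, and it came out twice as slow

Go look through the compiler settings and select the option that optimizes for speed.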

h***z
Posts: 233

From thread: Computation board - Ported matlab to C++ and even used fftw, billed as the fastest ever, and it came out twice as slow

A few things to try:
1. Compiler options: Turn on compiler optimization as well as any SIMD
2. Libraries: Link against pre-compiled libraries tuned for your computer
architecture and tune parameters of any libraries that you compile yourself.
3. Your code: Keep in mind the cache size of your CPU and write your code to
avoid cache thrashing.

l******n
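Posts: 9344

From thread: Computation board - A fairly basic question about using the cosine transform to solve PDEs

The boundary condition for the case you describe is usually
C(1)=C(-1)
The essence of using a cosine transform or sine transform is that it diagonalizes the resulting
matrix; once you have solved in transform space, the inverse transform gives the solution. Decide based on your own situation.
The fftw documentation covers this; for example, in 2D there are the choices 00, 01, 10, 11, which correspond to different
boundary values and require a shift.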
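
The statement that the cosine transform diagonalizes the matrix can be checked numerically. The sketch below is a pure Python illustration under stated assumptions, not FFTW code: it assumes a 1D second-difference operator with mirrored (zero-gradient) end nodes, and it uses the DCT-II basis, which is the transform FFTW exposes as the `FFTW_REDFT10` kind. The function names are hypothetical.

```python
import math

def neumann_laplacian(v):
    """Apply the 1D second difference with mirrored (zero-gradient) ends."""
    n = len(v)
    out = []
    for j in range(n):
        left = v[j - 1] if j > 0 else v[0]           # ghost node copies the edge value
        right = v[j + 1] if j < n - 1 else v[n - 1]  # same at the other end
        out.append(left - 2.0 * v[j] + right)
    return out

def cosine_mode(n, k):
    """k-th DCT-II basis vector: cos(pi * k * (j + 1/2) / n) for j = 0..n-1."""
    return [math.cos(math.pi * k * (j + 0.5) / n) for j in range(n)]

def mode_eigenvalue(n, k):
    """Eigenvalue of the operator above on the k-th cosine mode."""
    return 2.0 * math.cos(math.pi * k / n) - 2.0
```

For every `k`, `neumann_laplacian(cosine_mode(n, k))` equals `mode_eigenvalue(n, k)` times the mode, so in the cosine basis the operator is diagonal and solving reduces to a division per mode followed by the inverse transform. Other boundary treatments pair with the other kinds (00, 01, 11) and with the sine variants.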

g*****a
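Posts: 340

From thread: Computation board - A fairly basic question about using the cosine transform to solve PDEs

Many thanks
I still don't quite get it
Roughly speaking, you force the value at the outermost node to equal the value at the next-to-outermost node, so that dC/dx in the
outward direction is zero at the next-to-outermost node?
Is the fftw documentation inside matlab? I couldn't find anything relevant there. Could you give another link?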


l******n
Posts: 9344

From thread: Computation board - A fairly basic question about using the cosine transform to solve PDEs

google fftw
it is not matlab

l******n
Posts: 9344

From thread: Computation board - fftw has a memory leak

Damn, finally found it.
It shows up when calling from fortran.

l******n
Posts: 9344

From thread: Computation board - compile lammps using fftw-2.1.5 and intel compiler

did you write a small program to test it?
put your makefile here

O******e
Posts: 734

From thread: Computation board - compile lammps using fftw-2.1.5 and intel compiler

I also think you need to put -L before -l in SYSLIB.

l******n
Posts: 9344

From thread: Computation board - compile lammps using fftw-2.1.5 and intel compiler

i agree, you need to specify the absolute directory and it seems to be the
problem you have now

O******e
Posts: 734

From thread: Computation board - compile lammps using fftw-2.1.5 and intel compiler

I'm not sure what this means, as I have never installed FFTW2 before, but the .info
file is just documentation. It sounds like the rest of the stuff was
successfully installed.
If FFTW2 is anything like FFTW3, the compiled library will be installed
under /usr/local/lib by default. Do you see the library files there?
If they are there, you now have to fix your makefile as I indicated.

O******e
Posts: 734

From thread: Computation board - compile lammps using fftw-2.1.5 and intel compiler

Did you try
-L/usr/local/lib -ldfftw -ldrfftw
i.e., put the library path before the library?
The difference between dfftw, drfftw, etc. is explained in the manual.
You need to check the manuals for LAMMPS and FFTW2 to find out which one(s)
you need, and you might need to build FFTW2 several times with different
options to build the different (single and double) libraries.

l******n
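Posts: 9344

From thread: Computation board - compile lammps using fftw-2.1.5 and intel compiler

Use one compiler consistently for everything; otherwise there is no way around it.
ifort actually isn't great; it's slow.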

O******e
Posts: 734

From thread: Computation board - compile lammps using fftw-2.1.5 and intel compiler

Forget about lammps for a second. After you compile and install FFTW2,
have you tried writing just a small test program to make sure it works?

O******e
Posts: 734

From thread: Computation board - compile lammps using fftw-2.1.5 and intel compiler

Even though it is advisable to use the same compiler suite (gcc/g77/gfortran,
or icc/ifort), I'm not sure that this is the cause of the problem. I've
been mixing gcc/g77/icc/ifort all the time when testing my code, and I have
never had any problem.
(I don't use gfortran because it is too buggy. I'm also curious why you
say ifort is slow? In my experience the code it produces is way faster
than g77 or pgf77/pgf90, but you have to make good use of the Fortran 9x
language. Well, maybe you are c

l******n
Posts: 9344

From thread: Computation board - compile lammps using fftw-2.1.5 and intel compiler

I am saying ifort is slow from my own experience, and it depends on my
machine of course. For compiling the same program, pgf90 and mpif90 are much
faster than ifort for me.
Mixing compilers is doable, but you need to figure out the settings and all
the flags. Different compilers have different default settings, so reading the
whole manual is a must. There also might be something in one compiler that has no
counterpart in the others, which is very bad.


O******e
Posts: 734

From thread: Computation board - compile lammps using fftw-2.1.5 and intel compiler

Interesting. I might have to give PGF a try again some time.


O******e
Posts: 734

From thread: Computation board - compile lammps using fftw-2.1.5 and intel compiler

Read the FFTW2 manual. There should be very short examples in the first
few pages of the manual. You can fill an array of size N with random
numbers, do a forward FFT followed by a backward FFT, and see whether your
array is multiplied by N.
The "make test" or "make check" only tests FFTW2 after it is compiled but
before it is installed. If you run the test after you "make install", it
is still testing the uninstalled copy.
I use FFTW3, and my code is too big to send to you.
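
The round-trip check described in this post can be sketched without any library at all. The code below is a pure Python stand-in, not FFTW: `dft` is a naive unnormalized O(N^2) transform and `roundtrip_scale_error` is a hypothetical helper name. It relies only on the fact that FFTW's transforms are unnormalized, so a forward transform followed by a backward one returns the input scaled by N. To test an actual FFTW installation, the two `dft` calls would be replaced by calls into the installed library.

```python
import cmath
import random

def dft(x, sign):
    """Naive unnormalized DFT; sign=-1 for forward, sign=+1 for backward."""
    n = len(x)
    return [sum(x[j] * cmath.exp(sign * 2j * cmath.pi * j * k / n)
                for j in range(n))
            for k in range(n)]

def roundtrip_scale_error(n, seed=0):
    """Forward then backward transform of random data; compare against n * input."""
    rng = random.Random(seed)
    x = [complex(rng.random(), rng.random()) for _ in range(n)]
    y = dft(dft(x, -1), +1)
    # Largest deviation between the round-trip result and the input times n.
    return max(abs(y[j] - n * x[j]) for j in range(n))
```

A result near floating-point round-off, for example from `roundtrip_scale_error(16)`, indicates the pair of transforms behaves as expected; a large value points to a broken build, a wrong precision variant, or a mismatched array layout.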

l******n
Posts: 9344

From thread: Computation board - compile lammps using fftw-2.1.5 and intel compiler

Testing code is very short, just do a whatever fft transform and then
transform it back.

d*******2
Posts: 340

From thread: Computation board - Which is better, fftw or nag's fft?

Thanks in advance!

y*****n
Posts: 3

From thread: Computation board - Which is better, fftw or nag's fft?

c*******h
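Posts: 1096

From thread: Computation board - What library does everyone use for parallel fft?

I'm using fftw. Its data mapping is reasonable but inconvenient, mainly because once you use
FFTW_TRANSPOSED_ORDER you effectively have a different data mapping,
and avoiding MPI_AllGather in the subsequent operations becomes a real hassle.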
