t*****o Posts: 74 | 1 Can anyone recommend some good books or websites on Fortran code optimization,
i.e. how to improve code efficiency?
I know that if there are two nested loops, one with 10 iterations and the other with 1000,
do i = 1, 10
do j = 1, 1000
...
end do
end do
is better than
do i = 1, 1000
do j = 1, 10
...
end do
end do
Is there any other way to improve efficiency?
| t****n Posts: 39 | 2 Code optimization is actually machine dependent. However, on most popular
CPUs, code written in a cache-friendly manner can improve performance
significantly. For example, to compute the array addition C = A + B
with compiler optimization switched off,
code written as
do j = 1,jm
do i = 1,im
C(i,j) = A(i,j) + B(i,j)
end do
end do
would be way faster than
do i = 1,im
do j = 1,jm
C(i,j) = A(i,j) + B(i,j)
end do
end do
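The difference between the two orderings can be measured directly. Below is a minimal sketch, assuming hypothetical array sizes (`im = jm = 2000`) and the standard `system_clock` intrinsic for timing; the program name and the checksum print are illustrative only. Fortran stores arrays in column-major order, so the first index is the one that is contiguous in memory. With optimization enabled, a compiler may interchange the loops itself, so the gap may shrink or disappear.

```fortran
program loop_order
  implicit none
  integer, parameter :: i8 = selected_int_kind(18)
  integer, parameter :: im = 2000, jm = 2000   ! hypothetical sizes
  real, allocatable :: a(:,:), b(:,:), c(:,:)
  integer :: i, j
  integer(i8) :: t0, t1, t2, rate

  allocate(a(im,jm), b(im,jm), c(im,jm))
  call random_number(a)
  call random_number(b)

  call system_clock(t0, rate)

  ! Inner loop over the first index: consecutive memory accesses
  do j = 1, jm
     do i = 1, im
        c(i,j) = a(i,j) + b(i,j)
     end do
  end do
  call system_clock(t1)

  ! Inner loop over the second index: accesses are im elements apart
  do i = 1, im
     do j = 1, jm
        c(i,j) = a(i,j) + b(i,j)
     end do
  end do
  call system_clock(t2)

  print '(a,f10.6,a)', 'inner loop over i: ', real(t1 - t0) / real(rate), ' s'
  print '(a,f10.6,a)', 'inner loop over j: ', real(t2 - t1) / real(rate), ' s'
  ! Use the result so the loops cannot be discarded as dead code
  print *, 'checksum =', sum(c)
end program loop_order
```

To reproduce the effect described above, build without optimization (for example `-O0` with gfortran) and compare the two timings.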
There are a lot of other t
| t*****o Posts: 74 | 3 Thank you!
LOL
I always thought the second one was faster, because I thought arrays were stored with
the J column first and then the I row.
Someone said that for C there is a program called gprof that can find out which
part takes the most computation time. Do you know if there is such a program for
Fortran too?
| S***y Posts: 186 | 4 I think most optimizations come down to efficient use of the cache.
So knowing the cache structure of the specific machine is the starting point.
If you are not an expert, I think it is better not to spend too much
time on this. Trust the compiler's optimization options, such as
-O, -arch, -tp ... They really help.
Another point: call optimized library subroutines whenever possible,
for example for linear algebra operations, fast Fourier transforms ...
Profile utilities are a
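On the profiling question: gprof is not specific to C and can also be used on Fortran programs built with profiling instrumentation (for example gfortran's `-pg` option). Short of a full profiler, individual sections can be timed by hand with the standard `cpu_time` intrinsic. A minimal sketch, assuming a hypothetical two-stage computation (the array size, section names, and variables are made up for illustration):

```fortran
program time_sections
  implicit none
  integer, parameter :: n = 1000000   ! hypothetical problem size
  real, allocatable :: x(:)
  real :: t_start, t_fill, t_sum, total
  integer :: i

  allocate(x(n))

  call cpu_time(t_start)

  ! Section 1: fill the array
  do i = 1, n
     x(i) = sqrt(real(i))
  end do
  call cpu_time(t_fill)

  ! Section 2: reduce the array to a single value
  total = 0.0
  do i = 1, n
     total = total + x(i)
  end do
  call cpu_time(t_sum)

  print '(a,f10.6,a)', 'fill section: ', t_fill - t_start, ' s'
  print '(a,f10.6,a)', 'sum section:  ', t_sum - t_fill, ' s'
  ! Print the result so the work cannot be optimized away
  print *, 'result =', total
end program time_sections
```

Timing by hand only covers the sections you choose to bracket; a profiler reports on every routine without editing the source.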
| l******v Posts: 12 | 5 Yeah, that's the row-major vs. column-major difference between C and Fortran.
By the way, in Fortran 90 and some implementations of Fortran 77, array operations are much
simpler. Just write:
c = a + b, if a, b, and c have the same shape
dot_product(a, b) gives the dot product of two vectors of the same length
matmul(a, b) gives the product of two matrices
d = k*a, when k is a scalar and a and d have the same shape
The optimization is then handled by the compiler.
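The whole-array forms listed above can be tried in a few lines. This is a minimal sketch with small made-up inputs; the variable names and values are illustrative only, and the bracket array constructor assumes a Fortran 2003 or later compiler.

```fortran
program array_ops
  implicit none
  real :: a(2,2), b(2,2), c(2,2), d(2,2), m(2,2)
  real :: u(3), v(3)
  real :: k

  ! reshape fills column by column: column 1 is (1,2), column 2 is (3,4)
  a = reshape([1.0, 2.0, 3.0, 4.0], [2, 2])
  b = 1.0                      ! scalar broadcast to every element
  k = 2.0
  u = [1.0, 2.0, 3.0]
  v = [4.0, 5.0, 6.0]

  c = a + b                    ! element-wise addition, same shape
  d = k * a                    ! scalar times array
  m = matmul(a, b)             ! matrix product
  ! 1*4 + 2*5 + 3*6 = 32
  if (abs(dot_product(u, v) - 32.0) > 1.0e-6) error stop 'unexpected dot product'

  print *, 'c =', c
  print *, 'd =', d
  print *, 'm =', m
  print *, 'u.v =', dot_product(u, v)
end program array_ops
```

Printing a rank-2 array with `print *` lists its elements in storage order, i.e. column by column.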
| t*****o Posts: 74 | 6 Thanks Sunyy and Lyapunov,
I learned a lot!
| x*y Posts: 364 | 7 The compiler group in my university's CS department is developing HPCVIEW to optimize
code. I tested it on my Fortran code on an SGI Origin machine last year, and
it's really cool. Code can even be much faster if we pass arrays from
subroutine to subroutine in a good way. It would be really great if this tool
comes into use.