w****i posts: 964 | 1 My program needs to execute the following code many, many times:
D = A&B[C]^C;
where A, C, D are unsigned int and B is an array of unsigned int (a pre-calculated table is too big to fit
in memory).
Is it possible to find a faster way to calculate D?
Any comment is appreciated. |
h*******e posts: 225 | 2 Please post the real code snippet, especially the loops.
[Quoting w****i's post] : My program need to exec the following code many many times : D = A&B[C]^C; : where A, B, C, D are unsigned int, (pre-calculated table is too big to fit : in memory) : is it possible to find a faster way to calculate D? : Any comment is appreciated.
|
w****i posts: 964 | 3 Here is a simplified version (the real code is too long and messy):
for (A=0; A<0x10000; A++)
    for (C=0; C<0x10000; C++)
        D = A&B[C]^C;
B is a random array.
Thanks. |
r*******y posts: 290 | 4 At least you can optimize B[C]^C.
Switch the two for loops.
[Quoting w****i's post] : here is a simplified version. (real code is too long and messy) : like this: : for (A=0; A<0x10000; A++) : for (C=0; C<0x10000; C++) : D = A&B[C]^C; : B is a random array. : thanks
|
w****i posts: 964 | 5 But (A&B[C])^C != A&(B[C]^C), so you can't extract B[C]^C. |
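The point about grouping follows from C operator precedence: bitwise AND binds tighter than bitwise XOR, so A&B[C]^C is parsed as (A&B[C])^C. Here is a minimal sketch of the two groupings side by side. The function names are hypothetical and chosen only for illustration; they are not from the original code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Grouping the compiler uses for D = A&B[C]^C: AND first, then XOR. */
uint32_t d_and_then_xor(uint32_t a, const uint32_t *b, uint32_t c)
{
    assert(b != NULL);
    return (a & b[c]) ^ c;
}

/* The other grouping, which is NOT what the expression means. */
uint32_t d_xor_then_and(uint32_t a, const uint32_t *b, uint32_t c)
{
    assert(b != NULL);
    return a & (b[c] ^ c);
}
```

For example, with A = 0, B[C] = 0 and C = 1, the first grouping gives 1 and the second gives 0, so B[C]^C cannot be precomputed as a unit.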
l***8 posts: 149 | 6 What exactly is each D used for? Do you need to calculate every D?
And does the order matter?
I hope you don't need to dump every one of the 4 billion numbers... |
d****d posts: 699 | 7 There is not much room for improvement. You can take a look at the assembly
generated by the compiler, and also watch out for cache locality; the major overhead
of your code is probably memory reads. |
f*****e posts: 57 | 8 You can still switch the loops and pre-calculate B[C], so you don't need to
dereference the B array every time. |
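The loop switch suggested here could look roughly like the sketch below. Several things in it are assumptions, not taken from the original program: the function names, the `n` parameter (the thread uses 0x10000 for both loops), and the XOR accumulator `acc`, which is only a stand-in because the thread never says how D is consumed. The swap is only valid if the order in which the D values are produced does not matter.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Original order: A outer, C inner; B[C] is read on every iteration. */
uint32_t sweep_original(const uint32_t *B, uint32_t n)
{
    assert(B != NULL);
    uint32_t acc = 0;               /* hypothetical stand-in for using D */
    for (uint32_t A = 0; A < n; A++)
        for (uint32_t C = 0; C < n; C++)
            acc ^= (A & B[C]) ^ C;
    return acc;
}

/* Interchanged: C outer, so B[C] is loaded once per outer iteration. */
uint32_t sweep_interchanged(const uint32_t *B, uint32_t n)
{
    assert(B != NULL);
    uint32_t acc = 0;
    for (uint32_t C = 0; C < n; C++) {
        const uint32_t b = B[C];    /* hoisted table load */
        for (uint32_t A = 0; A < n; A++)
            acc ^= (A & b) ^ C;     /* b and C are constant in this loop */
    }
    return acc;
}
```

In the interchanged version the inner loop touches no memory at all, and since only A varies it is the kind of loop a compiler may be able to vectorize. Whether that gives a measurable speedup depends on the real code and should be checked by timing both versions.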
l*****c posts: 1153 | 9 Yeah. How do you use D? And what is the actual optimization target?
[Quoting l***8's post] : what exactly is each D used for? do you need to calculate every D? : and, does the order matter? : I hope you don't need to dump every one of the total 4 billion numbers...
|
b***i posts: 3043 | 10 Do you perform this calculation multiple times?
[Quoting w****i's post] : here is a simplified version. (real code is too long and messy) : like this: : for (A=0; A<0x10000; A++) : for (C=0; C<0x10000; C++) : D = A&B[C]^C; : B is a random array. : thanks
|