Q*******e Posts: 939 | 1 [The following text is reposted from the JobHunting board]
From: QuickTime (Work steadily, be an honest person), Board: JobHunting
Subject: Two interview questions?
Posted at: BBS 未名空间站 (Tue Oct 25 19:32:17 2005)
1) How to make memory copy fast?
2) When should we write malloc for ourselves?
I didn't answer these well, so I'm asking here. | o*****1 Posts: 4 | 2
1. I don't know; memcpy should be the fastest way, right? Or maybe you want
to use the assembly string-copy instructions? (They only exist on Intel
platforms, as far as I know.) I really can't find a way to make it faster.
2. Normally when you have tons of malloc and free calls and the memory could be
reused, or when memory is very limited and you have to be very careful when
allocating and deallocating it. For example, in the STL you can provide
your own allocator, and some implementation did us
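
As an illustration of the "tons of malloc and free with reusable data" case, here is a minimal sketch of a fixed-size block pool in C. It is not taken from the thread: the names `pool`, `pool_init`, `pool_alloc`, `pool_free` and the sizes `POOL_BLOCK_SIZE` and `POOL_BLOCK_COUNT` are all hypothetical, and the sketch assumes every request fits in one block and needs no more than pointer alignment.

```c
#include <stddef.h>

/* Hypothetical sizes, chosen only for illustration. */
#define POOL_BLOCK_SIZE  32   /* bytes handed out per allocation */
#define POOL_BLOCK_COUNT 64   /* number of blocks in the pool */

/* A free block stores the link to the next free block in its own bytes,
   so the free list costs no extra memory. */
typedef union pool_block {
    union pool_block *next;
    unsigned char payload[POOL_BLOCK_SIZE];
} pool_block;

typedef struct {
    pool_block blocks[POOL_BLOCK_COUNT];
    pool_block *free_list;
} pool;

/* Thread every block onto the free list. */
void pool_init(pool *p) {
    size_t i;
    p->free_list = NULL;
    for (i = 0; i < POOL_BLOCK_COUNT; i++) {
        p->blocks[i].next = p->free_list;
        p->free_list = &p->blocks[i];
    }
}

/* Pop one block in O(1); returns NULL when the pool is exhausted. */
void *pool_alloc(pool *p) {
    pool_block *b = p->free_list;
    if (b == NULL) {
        return NULL;
    }
    p->free_list = b->next;
    return b->payload;
}

/* Push a block back in O(1); ptr must have come from pool_alloc on p. */
void pool_free(pool *p, void *ptr) {
    pool_block *b = (pool_block *)ptr;
    b->next = p->free_list;
    p->free_list = b;
}
```

Both operations are a couple of pointer moves with no searching and no per-block header, which is the usual reason to bypass a general-purpose malloc when all objects have the same size.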
| Q*******e Posts: 939 | 3 Thanks.
I answered question 1) the same way you did, but the engineering manager
told me that was not right. He has his own way, but it's a little tricky! Faint!
Some people reminded me that a memory copy will be faster if we copy one
machine word at a time. That sounds reasonable, but it depends on the computer
architecture.
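
A rough sketch of the word-at-a-time idea in portable C, for illustration only. The function name `copy_by_words` is made up here, `uintptr_t` is assumed to match the machine word, and, like memcpy, the sketch does not handle overlapping buffers. Whether it beats the library memcpy depends entirely on the platform; most library versions already do something like this or better.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copy n bytes, moving one machine word per step
   once the destination pointer is word-aligned. */
void *copy_by_words(void *dst, const void *src, size_t n) {
    unsigned char *d = (unsigned char *)dst;
    const unsigned char *s = (const unsigned char *)src;

    /* Head: single bytes until the destination is word-aligned. */
    while (n > 0 && ((uintptr_t)d % sizeof(uintptr_t)) != 0) {
        *d++ = *s++;
        n--;
    }

    /* Body: one word per iteration. The fixed-size memcpy calls keep the
       possibly unaligned source load well-defined; compilers typically
       lower each of them to a single load or store instruction. */
    while (n >= sizeof(uintptr_t)) {
        uintptr_t w;
        memcpy(&w, s, sizeof w);
        memcpy(d, &w, sizeof w);
        d += sizeof w;
        s += sizeof w;
        n -= sizeof w;
    }

    /* Tail: whatever bytes are left over. */
    while (n > 0) {
        *d++ = *s++;
        n--;
    }
    return dst;
}
```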
| f*****r Posts: 229 | 4 I was thinking about how to use the cache line. For example, if the L1
cache line is 4 words (16 bytes), we move word1 to register a, then word2 to
register b, word3 to register c, and word4 to register d; after those loads, we
move register a to dest1, register b to dest2, and so on. Is that faster?
Another possible way is to use MMX. Each operation can then move 8 bytes (16
with SSE). Maybe MMX2 gives you a better choice. But I guess this is only good
for bulk data copies, since the mode switch has some ov
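
The load-four-then-store-four pattern described above could look roughly like this in C. This is only a sketch: `copy_four_words` is a hypothetical name, a word is assumed to be 32 bits so that four words match the 16-byte line in the example, and the compiler is free to reorder or vectorize the loop, so it says nothing about whether the result is actually faster.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: copy n_words 32-bit words, reading a whole
   16-byte group into locals before writing any of it back. */
void copy_four_words(uint32_t *dst, const uint32_t *src, size_t n_words) {
    size_t i = 0;

    /* Main loop: four loads first, then four stores. */
    for (; i + 4 <= n_words; i += 4) {
        uint32_t a = src[i];
        uint32_t b = src[i + 1];
        uint32_t c = src[i + 2];
        uint32_t d = src[i + 3];
        dst[i]     = a;
        dst[i + 1] = b;
        dst[i + 2] = c;
        dst[i + 3] = d;
    }

    /* Remainder: up to three leftover words, one at a time. */
    for (; i < n_words; i++) {
        dst[i] = src[i];
    }
}
```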
| u****u Posts: 229 | 5 There is no portable way to copy faster than memcpy(). If you are talking
about machine-specific approaches, there might be TONS of faster ways: MMX, DMA,
and others, just to mention a few.
| f*****r Posts: 229 | 6 I'm curious how DMA can be used to implement memcpy, and on which architecture?
| R****r Posts: 227 | 7 Take some CS intro classes...
| f*****r Posts: 229 | 8 From what you said, it seems that you don't really know this issue yourself.
Or point me to some intro classes that cover it...
I searched the web. The post http://blog.gmane.org/gmane.os.netbsd.devel.performance/month=20030701 said that "Seeing as certain embedded archetectures have speedy general purpose DMA devices". Maybe the PS2 has this kind of thing.
For how to write a faster memcpy() for Pentium processors, refer to the post
http://www.gamedev.net/community/forums/topic.asp?topic_id=16772. I just p
| R****r Posts: 227 | 9 Don't panic :P
| Q*******e Posts: 939 | 10 Thanks.
But this kind of interview question is really strange!
Frankly, I don't like to be tested against other guys on some special
trick. It is useless in a large project :)
| u****u Posts: 229 | 11 That goes back to my first sentence: there is no universal way. There might be a
system function call, there might be some address to write to, or something even weirder.
| f*****r Posts: 229 | 12 In fact, what I want to ask is this: DMA is typically used for data movement
between system memory and I/O devices, whereas memcpy is typically used between
two areas of system memory. How can you use DMA for the latter purpose, and on
which platform can you do that? We can even leave the cache coherence issue aside here.
| P*****f Posts: 2272 | 13 Google "fast memory copy".
| a***r Posts: 35 | 14
_-_! I misread "memory" as "money"...