x****3 Posts: 62 | 1 Assume a gene library exists for all 7 billion people on Earth. Each
person's gene sequence is 3 billion units long, drawn from 4 basic building
blocks. You are given the genetic sequence of one person. Describe how you
would find that person's closest genetic neighbor, where closeness is defined
by the edit distance between the two sequences. Describe how you would store
the data and conduct the search.
A friend got this at an onsite interview and had no idea how to approach it. Thanks |
n********y Posts: 66 | 2 suffix tree
check out Ukkonen's algorithm |
s******n Posts: 240 | 3 Edit distance allows insertions and deletions.
[Quoting n********y's post] : suffix tree : check out ukkonen's algorithm
|
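s******n Posts: 240 | 4 The basic approach should be dynamic programming, straight from the definition of edit distance. That is probably
the point the interviewer wanted to test.
But with such a small alphabet and such long strings, there should be room for optimization. |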
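For reference, the dynamic programming approach mentioned in post 4 can be sketched as below. This is a minimal illustration assuming unit cost for insert, delete, and substitute (the thread does not specify the costs), and the function name `edit_distance` is made up for this example. It keeps only two rows of the table, so memory is proportional to the shorter string.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, assuming unit-cost insert, delete, and substitute."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string as the row to reduce memory
    # prev[j] is the distance between the first i-1 chars of a and the first j chars of b
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # turning i chars of a into an empty prefix takes i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,           # delete ca
                            curr[j - 1] + 1,       # insert cb
                            prev[j - 1] + cost))   # match or substitute
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    print(edit_distance("AGTC", "ATC"))
```

The running time is proportional to the product of the two lengths, so for two sequences of 3 billion units each the plain table is on the order of 9 × 10^18 cell updates per comparison. That is why the thread goes on to look for optimizations.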
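s*a Posts: 267 | 5 Encode it: represent A, G, T, C as 00, 01, 10, 11 respectively, which can be packed into an 8-bit int
[Quoting s******n's post] : The basic approach should be dynamic programming, straight from the definition of edit distance. That is probably : the point the interviewer wanted to test. : But with such a small alphabet and such long strings, there should be room for optimization.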
|
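One possible reading of the encoding in post 5 is four bases per byte, two bits each. The sketch below assumes that reading; the bit order within a byte (first base in the lowest two bits) and the names `pack` and `unpack` are choices made for this example, not something stated in the thread.

```python
# 2-bit codes as given in post 5: A=00, G=01, T=10, C=11
CODE = {"A": 0b00, "G": 0b01, "T": 0b10, "C": 0b11}
BASES = "AGTC"  # index in this string equals the 2-bit code


def pack(seq: str) -> bytes:
    """Pack a base string into bytes, four bases per byte (assumed layout)."""
    out = bytearray((len(seq) + 3) // 4)
    for i, base in enumerate(seq):
        out[i // 4] |= CODE[base] << (2 * (i % 4))
    return bytes(out)


def unpack(data: bytes, length: int) -> str:
    """Recover the first `length` bases from packed bytes."""
    return "".join(
        BASES[(data[i // 4] >> (2 * (i % 4))) & 0b11] for i in range(length)
    )


if __name__ == "__main__":
    packed = pack("GATTACA")
    print(len(packed), unpack(packed, 7))
```

At two bits per base, one sequence of 3 billion units takes about 750 MB, and 7 billion such sequences take roughly 5.25 × 10^18 bytes before any compression, so the storage layout matters as much as the comparison algorithm.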
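s**********g Posts: 14942 | 6 8 bits? What structure does your whole DNA sequence have?
[Quoting s*a's post] : Encode it: represent A, G, T, C as 00, 01, 10, 11 respectively, which can be packed into an 8-bit int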
|
p******a Posts: 130 | 7 Is this an algorithm question or a system design question?
[Quoting xm1223 (天天想上):]
:Assume the gene library exist for all 7 Billion people on earth. Each
:person's gene sequence is 3 billion length of 4 basic construction unit.
:You are
:given the genetic sequence of one person. Describe how you can find his
:closest genetic sequence neighbor. The closeness is defined by the edit-
:distance between the two sequences. Describe how you store data and conduct
:search.
:A friend got this at an onsite interview and had no idea how to approach it. Thanks |