All topics - Topic: seq1
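b********e
Posts: 693
1
From thread: JobHunting board - One Phone Interview Problem
There is still some room for optimization.
Suppose the first split produces two arrays, one of size X and one of size Y.
The original SEQ1 has size N.
When we compare SEQ1 with X, we check whether every character of SEQ1 appears in X. If not, we find a character
that does not appear and split SEQ1 into two arrays at that character.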
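A*********r
Posts: 564
2
From thread: JobHunting board - One Phone Interview Problem
I don't think it is that complicated; the complexity looks like just the cost of a sort.
Using the OP's example:
seq1: ABCDEFG
seq2: DBCAPFG
For each letter of seq2, in order, we can look up its index in seq1; this should be doable in O(n).
3 1 2 0 -1 5 6
The letter P does not appear in seq1, so it is marked -1. Then sort each segment of this index array that is
delimited by -1, which gives
0 1 2 3 -1 5 6
Then just find the longest strictly increasing run of numbers.
The feasibility and complexity of this segmented sort are open to further discussion.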
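The three steps in this post can be sketched in Python roughly as follows. This is only an illustration of the idea as described, not a verified solution: it assumes seq1 has no repeated letters, the function names are made up for this sketch, and a later reply in this thread gives an input where the resulting run does not correspond to a valid window.

```python
def index_map(seq1, seq2):
    """Index in seq1 of each letter of seq2, or -1 if the letter is absent."""
    pos = {ch: i for i, ch in enumerate(seq1)}  # assumes no duplicate letters in seq1
    return [pos.get(ch, -1) for ch in seq2]


def sort_segments(indices):
    """Sort each run of indices between -1 markers, keeping the markers in place."""
    out, segment = [], []
    for v in indices:
        if v == -1:
            out.extend(sorted(segment))
            out.append(-1)
            segment = []
        else:
            segment.append(v)
    out.extend(sorted(segment))
    return out


def longest_increasing_run(values):
    """Length of the longest contiguous strictly increasing run; -1 resets the run."""
    best = cur = 0
    prev = None
    for v in values:
        if v == -1:
            cur, prev = 0, None
            continue
        cur = cur + 1 if prev is not None and v > prev else 1
        prev = v
        best = max(best, cur)
    return best


idx = index_map("ABCDEFG", "DBCAPFG")
print(idx)                                         # [3, 1, 2, 0, -1, 5, 6]
print(sort_segments(idx))                          # [0, 1, 2, 3, -1, 5, 6]
print(longest_increasing_run(sort_segments(idx)))  # 4
```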

h*****g
Posts: 312
3
From thread: JobHunting board - A problem from careerup 150: I don't understand the answer?
9.7 A circus is designing a tower routine consisting of people standing atop
one another’s shoulders. For practical and aesthetic reasons, each person
must be both shorter and lighter than the person below him or her. Given the
heights and weights of each person in the circus, write a method to compute
the largest possible number of people in such a tower.
EXAMPLE:
Input (ht, wt): (65, 100) (70, 150) (56, 90) (75, 190) (60, 95) (68, 110)
Output: The longest tower is length 6 and includes from to...
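One common way to attack this kind of problem (not necessarily the book's answer, which is not shown in this post) is to sort the people by height and then run a longest-increasing-subsequence style DP on weight. A minimal O(n^2) sketch, with the function name longest_tower made up for illustration:

```python
def longest_tower(people):
    """people: list of (height, weight) pairs. Returns the max tower size.

    A person may stand on another only if strictly shorter AND strictly lighter.
    """
    people = sorted(people)            # ascending by height, then weight
    best = [1] * len(people)           # best[i]: tallest tower with people[i] at the bottom
    for i, (h_i, w_i) in enumerate(people):
        for j in range(i):
            h_j, w_j = people[j]
            if h_j < h_i and w_j < w_i:
                best[i] = max(best[i], best[j] + 1)
    return max(best, default=0)


circus = [(65, 100), (70, 150), (56, 90), (75, 190), (60, 95), (68, 110)]
print(longest_tower(circus))  # 6
```

The explicit height check matters when two people share the same height: sorting alone would let one of them stand on the other.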
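g**********y
Posts: 423
5
From thread: Biology board - Why does Python use indentation to express loops
def ReverseComplement(seq):
    # Each block of 5 bases is followed by its block of complements, first upper case, then lower case.
    seq1 = 'ATCGNTAGCNatcgntagcn'
    # Map each base (i in 0-4 and 10-14) to its complement 5 positions later.
    seq_dict = { seq1[i]:seq1[i+5] for i in range(20) if i < 5 or 10<=i<15 }
    return "".join([seq_dict[base] for base in reversed(seq)])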
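s****n
Posts: 1237
6
From thread: JobHunting board - One Phone Interview Problem
Given two sequences of length N, how do you find the max window of matching
patterns? The patterns can be mutated.
For example, seq1 = "ABCDEFG", seq2 = "DBCAPFG", then the max window is 4
(ABCD from seq1 and DBCA from seq2). The starting positions need not be the same.
I had no idea at all and only came up with a brute-force approach. The interviewer said I could exploit the fact
that both have length N and that I could sort, but I still don't get it. How should this be done?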
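For reference, the brute-force baseline mentioned here might look like the sketch below. It reads "mutated" as "same letters in any order", which is what the ABCD / DBCA example suggests; that reading and the function name are assumptions of this sketch.

```python
def max_window_bruteforce(seq1, seq2):
    """Largest k such that some length-k window of seq1 is a permutation
    of some length-k window of seq2. Roughly cubic time; baseline only."""
    n1, n2 = len(seq1), len(seq2)
    for k in range(min(n1, n2), 0, -1):          # try the largest window first
        # Canonical form of a window = its letters in sorted order.
        seen = {"".join(sorted(seq1[i:i + k])) for i in range(n1 - k + 1)}
        for j in range(n2 - k + 1):
            if "".join(sorted(seq2[j:j + k])) in seen:
                return k
    return 0


print(max_window_bruteforce("ABCDEFG", "DBCAPFG"))  # 4
```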
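s****n
Posts: 1237
7
From thread: JobHunting board - One Phone Interview Problem
The problem is that if the lengths differ, you cannot decide directly.
Suppose seq1 = ABCDE, X = ACE.
Every character of X appears in seq1, but the max window size is 1, not 3.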
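b********e
Posts: 693
8
From thread: JobHunting board - One Phone Interview Problem
In that case, compare X against Seq1, starting from the beginning.
Seq1 then has the following combinations:
ABC
BCD
CDE
Each comparison can repeat the previous two steps: sort, then split.
That is a lot less work than brute force.
Because the final result comes out of the recursion, you only need to take the maximum at each branch.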
h**k
Posts: 3368
9
From thread: JobHunting board - one amazon interview problem
Given two sequences of length N, how do you find the max window of matching
patterns? The patterns can be mutated.
For example, seq1 = "ABCDEFG", seq2 = "DBCAPFG", then the max window is 4
(ABCD from seq1 and DBCA from seq2). The starting positions need not be the same.
I know there is an O(n log n) algorithm for this; I don't know whether there is an O(n) one.
t****a
Posts: 1212
10
From thread: JobHunting board - Help with the Wildcard Matching problem
How do you solve this one greedily? With DP it takes n*m work.
A memoized DP solution:
(defn is-match [str1 str2]
  (let [str1a (str str1 \a)
        str2a (str str2 \a)
        n1 (count str1a)
        n2 (count str2a)
        seq2 (apply concat (repeat n1 (range n2 -1 -1)))
        seq1 (mapcat #(repeat n2 %) (range n1 -1 -1))
        ]
    (do
      (def is-match-rec
        (memoize
          (fn [i1 i2]
            (let [l1 (- n1 i1)
                  l2 (- n2 i2)]
              (cond (and (== 0 l1) (== 0 l2)) true
                    (or (=...
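Since the Clojure listing above is cut off, here is an independent Python sketch of the n*m DP the post refers to. It is not a transcription of that code. It assumes the usual wildcard rules (the post does not spell them out): '?' matches exactly one character and '*' matches any sequence, including the empty one.

```python
def is_match(s, p):
    """Wildcard match of the whole string s against pattern p.

    Assumed rules: '?' = any single character, '*' = any (possibly empty) sequence.
    """
    n, m = len(s), len(p)
    # dp[i][j] is True when s[:i] matches p[:j].
    dp = [[False] * (m + 1) for _ in range(n + 1)]
    dp[0][0] = True
    for j in range(1, m + 1):
        if p[j - 1] == "*":
            dp[0][j] = dp[0][j - 1]      # leading '*'s can match the empty string
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if p[j - 1] == "*":
                # '*' matches nothing (drop it) or absorbs one more character of s.
                dp[i][j] = dp[i][j - 1] or dp[i - 1][j]
            elif p[j - 1] == "?" or p[j - 1] == s[i - 1]:
                dp[i][j] = dp[i - 1][j - 1]
    return dp[n][m]


print(is_match("adceb", "*a*b"))   # True
print(is_match("acdcb", "a*c?b"))  # False
```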
g******1
Posts: 295
11
From thread: Biology board - A blastn question
For the following two sequences, I am doing blastn on Seq2 against Seq1
blastn -task blastn-short -outfmt "6 nident"
Seq1: ATTAGAWACCCBDGTAGTCC
Seq2: ATTAGATACCCTGGTAGTCC
The length of each sequence is 20. blastn returns nident=17, with 3
mismatches.
However, T actually matches W and B, and G matches D.
With IUPAC characters, in this case, how can I also count ambiguous matches?
What I expect in this case is number of identity =20 instead of nident=17.
Is there any parameter for blastn so th...
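Independently of what blastn itself offers, the count being asked for can be checked position by position outside of BLAST. The sketch below is a workaround, not a blastn option: it assumes the two sequences are already aligned without gaps and have equal length, treats two symbols as matching when their IUPAC base sets overlap, and uses a made-up function name.

```python
# Standard IUPAC nucleotide codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def count_ambiguous_matches(a, b):
    """Number of positions where the IUPAC base sets of a and b overlap.

    Assumes an ungapped, equal-length comparison (not a real alignment).
    """
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(
        1
        for x, y in zip(a.upper(), b.upper())
        if set(IUPAC[x]) & set(IUPAC[y])
    )


seq1 = "ATTAGAWACCCBDGTAGTCC"
seq2 = "ATTAGATACCCTGGTAGTCC"
print(count_ambiguous_matches(seq1, seq2))  # 20
```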
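b********e
Posts: 693
12
From thread: JobHunting board - One Phone Interview Problem
The approach below may be a slight optimization, but it does not look like the right answer.
1. If the two strings have the same size, sort them and then compare. If they are equal, that is the maximum. Otherwise:
2. In seq2, find the first character that does not appear in seq1, and say its position is x.
Then the maximum lies within the two arrays (0, x) and (x+1, n-1); call these two strings A and B.
If every character appears, then the two strings are equal.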
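Step 2 can be sketched as below. As a simplification, this version splits seq2 at every character missing from seq1 in one pass, rather than recursing on the first one as the post describes; the function name is hypothetical.

```python
def split_on_missing(seq2, seq1):
    """Split seq2 into maximal pieces that use only characters present in seq1."""
    allowed = set(seq1)
    segments, current = [], []
    for ch in seq2:
        if ch in allowed:
            current.append(ch)
        elif current:                     # a missing character closes the current piece
            segments.append("".join(current))
            current = []
    if current:
        segments.append("".join(current))
    return segments


print(split_on_missing("DBCAPFG", "ABCDEFG"))  # ['DBCA', 'FG']
```

Any matching window has to lie entirely inside one of these pieces, which is what makes the split useful as a pruning step.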

c***2
Posts: 838
13
From thread: JobHunting board - One Phone Interview Problem
Let’s first simplify the problem:
1) Assume the elements of both sequences contain only 26 upper case letters
(A..Z)
2) Assume both sequences don't have duplicate elements,
for example, "AABBC" is not allowed
With these in mind, it will be quite easy:
1) Map each sequence to an unsigned int32 (lower 26 bits)
seq1 = "ABCDEFG",
maps/hashes to b1= (000000) 00000000000000000001111111
seq2 = "DBCAPFG",
maps/hashes to b2=(000000) 00000000001000000001101111
2) c=b1&b2
3) Find the longest
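Steps 1 and 2 can be tried out with a few lines of Python. This only illustrates the mapping and the AND; step 3 is cut off in the post and is not reconstructed here. The helper name to_mask is made up, and the input is assumed to contain only the letters A..Z.

```python
def to_mask(seq):
    """Set bit k for each letter, where A -> bit 0, B -> bit 1, ..., Z -> bit 25."""
    mask = 0
    for ch in seq:
        mask |= 1 << (ord(ch) - ord("A"))   # assumes ch is in 'A'..'Z'
    return mask


b1 = to_mask("ABCDEFG")
b2 = to_mask("DBCAPFG")
c = b1 & b2                                  # letters present in both sequences

print(format(b1, "026b"))  # 00000000000000000001111111
print(format(b2, "026b"))  # 00000000001000000001101111
print(format(c, "026b"))   # 00000000000000000001101111
```

Note that the mask records only which letters occur, not where they occur in the sequence.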
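c*******w
Posts: 63
14
From thread: JobHunting board - One Phone Interview Problem
Suppose:
Seq1: ABXCDEFG
Seq2: DBCAPFGX
The index in Seq1 of each letter of Seq2:
4 1 3 0 (-1) 6 7 2
After sorting:
0 1 3 4 -1 2 6 7
But Window[A,B], Window[C,D], and so on are not valid candidates, are they?
I'm not sure whether I have understood AprilFlower's approach correctly.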
c*******w
Posts: 63
15
From thread: JobHunting board - One Phone Interview Problem
Counterexample:
Seq1: ABCDEFG
Seq2: ABXCDYZ
maps/hashes to b1= (000000) 00000000000000000001111111
maps/hashes to b2= (000000) 11100000000000000000001111
Is the answer: ABCD? It is not correct.

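l*******m
Posts: 1096
16
From thread: CS board - A machine learning question
If the data set is not large, define a distance(seq1, seq2) and then use kNN or an SVM.
A fairly popular distance is dynamic time warping (DTW). DTW is O(n^2), which is somewhat
slow, but it can be simplified to speed it up.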
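For concreteness, the plain O(n*m) form of DTW can be sketched as follows. It assumes numeric sequences and uses the absolute difference as the per-step cost; both choices, and the function name, are assumptions of this sketch rather than anything stated in the post.

```python
def dtw_distance(a, b):
    """Classic dynamic time warping distance between two numeric sequences."""
    n, m = len(a), len(b)
    inf = float("inf")
    # d[i][j]: cost of the best warping path aligning a[:i] with b[:j].
    d = [[inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            d[i][j] = cost + min(d[i - 1][j],      # a[i-1] repeats against b[j-1]
                                 d[i][j - 1],      # b[j-1] repeats against a[i-1]
                                 d[i - 1][j - 1])  # both advance
    return d[n][m]


print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))  # 0.0
```

One common way to speed this up is to fill only a band of cells around the diagonal.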

l**********1
Posts: 5204
17
From thread: Biology board - Invitrogen Neon electroporation system?
Re:
sunnyday
Please send mail or a fax to
AP Sumiyama K., the corresponding author of the
paper:
//www.ncbi.nlm.nih.gov/pubmed/20219670
and ask for his help with your questions above.
His lab:
//sayer.lab.nig.ac.jp/~sumiyama/index-e.html
or
//www.nig.ac.jp/section/saitou/saitou-e.html
or
//sayer.lab.nig.ac.jp/index-e.html
You can also ask him to send this one:
pT2AL200R150G
by FedEx.
Of course, tell him your US-side FedEx recipient account number and that your side pays
all transportation charges.
...