d****n 发帖数: 233 | 1 Given a document and a query of K words, how do u find the smallest window
that covers all the words at least once in that document? (given you know
the inverted lists of all K words, that is, for each word, you have a list
of all its occurrrences). This one is really hard. Could someone propose an
algorithm in O(n)? |
d****n 发帖数: 233 | 2 I remember a similar problem was discussed not long ago. Any takers here?
an
【在 d****n 的大作中提到】 : Given a document and a query of K words, how do u find the smallest window : that covers all the words at least once in that document? (given you know : the inverted lists of all K words, that is, for each word, you have a list : of all its occurrrences). This one is really hard. Could someone propose an : algorithm in O(n)?
|
m*****g 发帖数: 226 | 3 前面贴过好象
for each work, maintain a queue
scan
if(doc[i]==word[j])
queue[j].push[doc[i]];
if( queue[0]...queue[i] all have 1 elem && queue[j] has 2 elem)
update smallestWindowSize;
dequeue queue[0]...queue[i] until every queue has only 1 elem.
endif
end scan. |
B*****t 发帖数: 335 | 4 可以用堆来做,复杂度O(nlogk)
这题overlapping 的subproblems不好定义,DP好像不是很适合
smallest window
you know
have a list
propose an
【在 d****n 的大作中提到】 : Given a document and a query of K words, how do u find the smallest window : that covers all the words at least once in that document? (given you know : the inverted lists of all K words, that is, for each word, you have a list : of all its occurrrences). This one is really hard. Could someone propose an : algorithm in O(n)?
|
d****n 发帖数: 233 | 5 can you give some link or details?
【在 B*****t 的大作中提到】 : 可以用堆来做,复杂度O(nlogk) : 这题overlapping 的subproblems不好定义,DP好像不是很适合 : : smallest window : you know : have a list : propose an
|
B*****t 发帖数: 335 | 6 维护一个最小堆和最大堆存储work的position,最小堆k个元素,最大堆的至少k个元
素,
1. 初始化的时候用k个word的最小position构造以上两个堆。
2. 检查是否已经处理了所有word,没有的话如果最大top-最小堆的top
min,否则输出min
3. 每次从min heap pop一个word的pos,并设置一个flag,标记已用,并把这个
work的下一个pos分别压入两个堆,
4. 检查max heap的top元素是否为已经标记,如果已经标记,就把它pop出来,go
back 4, 否则go 2
【在 d****n 的大作中提到】 : can you give some link or details?
|
d****n 发帖数: 233 | 7 Good idea. So the time complexity is O(N*logK)? I was studying DP and
tempted to solve problem by it. For this one, I'm thinking of another way to
solve it. Complexity is O(N*K):
Assuming we sort all the positions into a new array A, where each element A[
i] contains two members (position, word), 0<= position < N, 0<=word < K; the
elements are sorted by the position in ascending order.
Elem {
int position,
int word
}
Elem A[] = sorted Elems by position
Elem B[K];
【在 B*****t 的大作中提到】 : 维护一个最小堆和最大堆存储work的position,最小堆k个元素,最大堆的至少k个元 : 素, : 1. 初始化的时候用k个word的最小position构造以上两个堆。 : 2. 检查是否已经处理了所有word,没有的话如果最大top-最小堆的top: min,否则输出min : 3. 每次从min heap pop一个word的pos,并设置一个flag,标记已用,并把这个 : work的下一个pos分别压入两个堆, : 4. 检查max heap的top元素是否为已经标记,如果已经标记,就把它pop出来,go : back 4, 否则go 2
|
l*****a 发帖数: 14598 | 8 O(n)
please read
http://www.mitbbs.com/article/JobHunting/31561927_3.html
way to
element A[
the
【在 d****n 的大作中提到】 : Good idea. So the time complexity is O(N*logK)? I was studying DP and : tempted to solve problem by it. For this one, I'm thinking of another way to : solve it. Complexity is O(N*K): : Assuming we sort all the positions into a new array A, where each element A[ : i] contains two members (position, word), 0<= position < N, 0<=word < K; the : elements are sorted by the position in ascending order. : Elem { : int position, : int word : }
|
x******3 发帖数: 245 | 9 这题的解法是不是用k个指针,每个指针指向一个occurrence list (sorted), 然后每
次移动位置最小的指针, 在这过程中update min window size
难道我理解错题意? |
s*******s 发帖数: 27 | 10 Why can it give the minimum?
【在 x******3 的大作中提到】 : 这题的解法是不是用k个指针,每个指针指向一个occurrence list (sorted), 然后每 : 次移动位置最小的指针, 在这过程中update min window size : 难道我理解错题意?
|
x******3 发帖数: 245 | 11 证明不出来,只能说说我的理解
给定一个window,只能通过移动window的左边界来得到更小的window,所以通过不断调
整window的左边界(即最小的
position),总是能遇到最小的window
【在 s*******s 的大作中提到】 : Why can it give the minimum?
|
s*******s 发帖数: 27 | 12 直观上你的方法可行。试着证明一下:
最小的window符合以下特性的window中最小的:
A1:window的边界元素(左一个右一个)对应的keyword在该window中除了边界元素不能有其他
occurrence
证明:如果A1不符合,说明window还能够再挤扁一点。
所有符合A1的window:
a)或者能被你的算法直接hit到;
b)或者找到另一个符合A1的window w0具有相同的边界元素,而w0能被你的算法直接hit到。
证明:稍微有点麻烦,但是也是可以的。
【在 x******3 的大作中提到】 : 证明不出来,只能说说我的理解 : 给定一个window,只能通过移动window的左边界来得到更小的window,所以通过不断调 : 整window的左边界(即最小的 : position),总是能遇到最小的window
|