an interview question, find mode in a rolling window along data sequence - JobHunting版 - 未名存档

本页内容为未名空间相应帖子的节选和存档，一周内的贴子最多显示50字，超过一周显示500字访问原贴

JobHunting版 - an interview question, find mode in a rolling window along data sequence

相关主题
● 我发现我竟然学会了12种tree traversal的办法	● LC的BST iterator到底要考察什么？
● An interview question of finding the median in a moving window.	● how to judge a linked list is palindrome?
● 问一道题(9)	● leetcode中的binary tree level order traverse II总是有run t
● Bloomberg 电面	● 请问怎样写没有parent pointer的BST iterator?
● 算法题	● L家的高频题merge k sorted arrays giving iterators求讨论！
● 谁来解释下hashtable的iterator是怎么实现的	● reverse an array
● （求推荐）recursion以及把recursion转变为iteration的资料	● 大量数据里面找top 100
● FB 上周2电面	● Amazon 面经

相关话题的讨论汇总
话题: data话题: window话题: hashtable话题: frequency话题: each

进入JobHunting版参与讨论

1

(共1页)

m****r 发帖数: 141	1 Given a sequence of data (with duplicates), move a fix-sized window along the data sequence and find mode in the window at each iteraion, where the oldest data is removed and a new data is inserted to the window. I cannot find better solutions here. My idea: Use a hashtable, key is the data, key's data is the frequency of the data occuring in the window. At the first iteration, iterate each data in the window and put it to the hashtable, meanwhile cout the frequency of each data. After that, traverse the hashtable and return the data with the highest frequency. For each following iteration, search the oldest data in the hashtable and reduce its frequency by 1 if it is in the hahstable if it becoms 0 use new data to replace the old one. Otherwise, just insert the new data into the hahstable. Traverse the table and return the mode. It is O(n * m) where n is data seq size and m is window size. The drawback is : The hashtable size is not fixed, it may have resize overhead. Each iteration, the table has to be traversed, it is not effcient. Is it possble to do it with O(n lg m) or O(n) ? Any help is appreciated. thanks
s******n 发帖数: 3946	2 hashtable + linked-list
k*******r 发帖数: 355	3 维持一个按频率排序的数组，hashtable 中除了存frequency,还存下此元素在排序数组中的位置。因为每次移动窗口, 如果对新元素，就直接加到排序数组末尾或从排序数组末尾删除，如果是旧元素，其frequency只可能+1 或-1或不变，所以你只用根据hash table直接找到此元素在排序数组中位置，将它与前一个或后一个元素的值比较，看是否要swap就可以了。总时间复杂度O(n)
m****r 发帖数: 141	4 It is a good idea. Space : O(m + m), m is the window size, one m is for 按频率排序的数组, another m is for hashtable. Is it possible to reduce space to O(m) ? or even O(lg m). Any help is appreciated. Thanks 组中的位置。，如果是旧元素，其frequency只可能+1 或-1或不变，所以你只用根据hash table直接找到此元素在排序数组中位置，将它与前一个或后一个元素的值比较，【在 k*******r 的大作中提到】 : 维持一个按频率排序的数组，hashtable 中除了存frequency,还存下此元素在排序数组中的位置。 : 因为每次移动窗口, 如果对新元素，就直接加到排序数组末尾或从排序数组末尾删除，如果是旧元素，其frequency只可能+1 或-1或不变，所以你只用根据hash table直接找到此元素在排序数组中位置，将它与前一个或后一个元素的值比较， : 看是否要swap就可以了。 : 总时间复杂度O(n)
a****2 发帖数: 1458	5 everytime u update the array, you need to change all the elements in the hash table, right? 组中的位置。，如果是旧元素，其frequency只可能+1 或-1或不变，所以你只用根据hash table直接找到此元素在排序数组中位置，将它与前一个或后一个元素的值比较，【在 k*******r 的大作中提到】 : 维持一个按频率排序的数组，hashtable 中除了存frequency,还存下此元素在排序数组中的位置。 : 因为每次移动窗口, 如果对新元素，就直接加到排序数组末尾或从排序数组末尾删除，如果是旧元素，其frequency只可能+1 或-1或不变，所以你只用根据hash table直接找到此元素在排序数组中位置，将它与前一个或后一个元素的值比较， : 看是否要swap就可以了。 : 总时间复杂度O(n)
k*******r 发帖数: 355	6 不用阿，只需要修改hashtable 中对应元素的 frequncy值和排序location值，因为每次最多只涉及调整hashtable中的两个elements，所以每次操作是 O(1) 【在 a****2 的大作中提到】 : everytime u update the array, you need to change all the elements in the : hash table, right? : : 组中的位置。 : ，如果是旧元素，其frequency只可能+1 或-1或不变，所以你只用根据hash table直 : 接找到此元素在排序数组中位置，将它与前一个或后一个元素的值比较，

1

(共1页)

进入JobHunting版参与讨论

相关主题
● Amazon 面经	● 算法题
● input a string "hello word", print l:3 o:2 e:1 d:1 h:1 r:1 w:1.不知道哪错了	● 谁来解释下hashtable的iterator是怎么实现的
● Two problems from Google	● （求推荐）recursion以及把recursion转变为iteration的资料
● 面经（L）	● FB 上周2电面
● 我发现我竟然学会了12种tree traversal的办法	● LC的BST iterator到底要考察什么？
● An interview question of finding the median in a moving window.	● how to judge a linked list is palindrome?
● 问一道题(9)	● leetcode中的binary tree level order traverse II总是有run t
● Bloomberg 电面	● 请问怎样写没有parent pointer的BST iterator?

相关话题的讨论汇总
话题: data话题: window话题: hashtable话题: frequency话题: each

未名新帖统计// 7月16日

#	版面	帖数(主题数)
-	全站	4871 (796)
1	Military	3777 (569)
2	Stock	341 (51)
3	Joke	117 (17)
4	History	116 (3)
5	Automobile	100 (9)
6	USANews	55 (9)
7	Midlife	45 (1)
8	Headline	41 (41)
9	Dreamer	33 (13)
10	FleaMarket	32 (20)
11	Living	30 (7)

* 这里只显示发帖超过25的版面，努力灌水吧:-)