关于众数问题，求助。另附一个Amazon题目求问。 - JobHunting版 - 未名存档

本页内容为未名空间相应帖子的节选和存档，一周内的贴子最多显示50字，超过一周显示500字访问原贴

JobHunting版 - 关于众数问题，求助。另附一个Amazon题目求问。

相关主题
● google on campus 面试多久出结果+面经	● 贴一个Google面题
● 【Google字符串面试题】	● hash_map 的遍历问题
● 关于MAP REDUCE	● 三连击
● 问一个链表方面的算法问题 (转载)	● 一道a家电面题目
● google电面小结，兼问onsite的准备	● 一个set，有add(i), del(i), count(i), size。写random()。
● 两次重要的面试都fail在同一个问题上	● 分享一下面试题目
● 问个问题 (large-scale question)	● 愿意自断经脉的VMware面试经历
● O(N) sort integer array	● 一道题

相关话题的讨论汇总
话题: count话题: int话题: mode话题: number话题: modecount

进入JobHunting版参与讨论

1

(共1页)

i*********7 发帖数: 348	1 How to find if a number is present >= (n / 2) times in an array of size n? 关于这个题目，有没有时间复杂度是O(n)，空间复杂度是O(1)的解？另外还有一题：一百万个amazon product id，问过去一小时销售量top 10的id。
s********r 发帖数: 137	2 MapReduce would be a more scalable solution. For more information, please refer to the introduction to Hadoop or other MapReduce tutorials: http://developer.yahoo.com/hadoop/tutorial/module4.html --------------------------------------------------------- http://www.mitbbs.com/article_t/JobHunting/32078421.html 4.bar raiser: 一百万个amazon product id，问过去一小时销售量top 10的(map- reduce)
s******n 发帖数: 3946	3 int number=a[0]; int count=1; for (int i=1; i if (a[i]!=number) { count--; if (count==0) { number = a[i]; count = 1; } } else count++; } return number;
i*********7 发帖数: 348	4 这个只对绝对众数有效。也就是大于n/2的时候，但是等于n/2的时候，是无效的。【在 s******n 的大作中提到】 : int number=a[0]; : int count=1; : for (int i=1; i: if (a[i]!=number) { : count--; : if (count==0) { : number = a[i]; : count = 1; : } : } else count++;
S******t 发帖数: 151	5 Simple modification is to add another loop to check if there's a number which has appeared [n/2] times. 【在 i*********7 的大作中提到】 : 这个只对绝对众数有效。也就是大于n/2的时候，但是等于n/2的时候，是无效的。
w****x 发帖数: 2483	6 第一题是微软的, 这题我感觉没啥意思. 第二题因该市每个product分散存储在很多台机器上, 每台机器对自己存储的product id做hashing, hashing的结果发送到另外对应的机器上, 这样分散的product id就会只聚集在特定目标机器上, 因为只有100万个product id是不是可以把所有统计出来的 , count> pair 发送到同一台机器上, 因为只需要top 10, 所以不认为需要selection algorithm或堆, 只需要用一个大小10的数组, 每次遍历这10个id对应的count,找出最小的那个, 如果新的count大于最小的这个就替换它. 【在 i*********7 的大作中提到】 : How to find if a number is present >= (n / 2) times in an array of size : n? : 关于这个题目，有没有时间复杂度是O(n)，空间复杂度是O(1)的解？ : 另外还有一题： : 一百万个amazon product id，问过去一小时销售量top 10的id。
s******n 发帖数: 3946	7 yes，run another loop that exclude this number and also count the number. If the count>=n/2, then it's it; if count the right answer by excluding another number. 【在 S******t 的大作中提到】 : Simple modification is to add another loop to check if there's a number : which has appeared [n/2] times.
m*****k 发帖数: 731	8 take 2, if same, keep, otherwise discard both, keep doing until you can not discard anyone. 【在 i*********7 的大作中提到】 : 这个只对绝对众数有效。也就是大于n/2的时候，但是等于n/2的时候，是无效的。
l*********8 发帖数: 4642	9 As iverson1407 said, the program doesn't work for the number that appears exactly n/2 times when n is even. Another problem: probably there is no such number existing in the array. // Find the number(s) that appear // Input: // int * a, input array // size_t n, length of the array // Output: /// int mode[2], store the mode nubmers found. // return: // int, how many mode numbers found. (0, 1 or 2) // int FindModeNumber(int a, size_t n, int modeNumbers[2]) { if (n<2) return n; int count = 1; int mode = a[0]; for (int i=1; i if(a[i] != mode) if (--count == 0) mode = a[i]; } int modeCount = 0; // how many mode numbers found. if (VerifyModeNumber(a, n, mode) ) modeNumbers[ modeCount++ ] = mode; if (count == 1 && mode == a[n-1]) if (VerifyModeNumber(a, n, a[n-2]) ) modeNumbers[ modeCount++ ] = a[n-2]; return modeCount; } bool VerifyModeNumber(int a, size_t n, int mode) { int count=0; for (int i=0; i if (a[i]==mode) ++count; return count>= (n+1)/2; } 【在 s******n 的大作中提到】 : int number=a[0]; : int count=1; : for (int i=1; i: if (a[i]!=number) { : count--; : if (count==0) { : number = a[i]; : count = 1; : } : } else count++;
b***d 发帖数: 87	10 每次遍历还不如用堆，时间开销明显少很多。每次遍历这10个id对应的count,找出最小的那个, 如果新的count大于最小的这个就替换它. 【在 w****x 的大作中提到】 : : 第一题是微软的, 这题我感觉没啥意思. : 第二题因该市每个product分散存储在很多台机器上, 每台机器对自己存储的product : id做hashing, hashing的结果发送到另外对应的机器上, 这样分散的product id就会只 : 聚集在特定目标机器上, 因为只有100万个product id是不是可以把所有统计出来的: , count> pair 发送到同一台机器上, 因为只需要top 10, 所以不认为需要selection : algorithm或堆, 只需要用一个大小10的数组, 每次遍历这10个id对应的count,找出最 : 小的那个, 如果新的count大于最小的这个就替换它.

1

(共1页)

进入JobHunting版参与讨论

相关主题
● 一道题	● google电面小结，兼问onsite的准备
● map reduce word count	● 两次重要的面试都fail在同一个问题上
● 求帮忙解答一个面试算法题==	● 问个问题 (large-scale question)
● 刚phone完MS,好紧张。。。。	● O(N) sort integer array
● google on campus 面试多久出结果+面经	● 贴一个Google面题
● 【Google字符串面试题】	● hash_map 的遍历问题
● 关于MAP REDUCE	● 三连击
● 问一个链表方面的算法问题 (转载)	● 一道a家电面题目

相关话题的讨论汇总
话题: count话题: int话题: mode话题: number话题: modecount

未名新帖统计// 7月16日

#	版面	帖数(主题数)
-	全站	4871 (796)
1	Military	3777 (569)
2	Stock	341 (51)
3	Joke	117 (17)
4	History	116 (3)
5	Automobile	100 (9)
6	USANews	55 (9)
7	Midlife	45 (1)
8	Headline	41 (41)
9	Dreamer	33 (13)
10	FleaMarket	32 (20)
11	Living	30 (7)

* 这里只显示发帖超过25的版面，努力灌水吧:-)