一道google面经题 - JobHunting版 - 未名存档

本页内容为未名空间相应帖子的节选和存档，一周内的贴子最多显示50字，超过一周显示500字访问原贴

JobHunting版 - 一道google面经题

相关主题
● 一道google的面试题.	● g家如何判断一个字符串里的都是合法的UTF-8编码in java？
● 到底什么样的二叉树才是平衡二叉树？	● 请问一道G phone interview题
● 弱问题，连反转链表都看不懂了	● 问两个题，一直不知道如何做
● 请教电面试题	● 怎么把 integer 转为 multi-byte integer format？ (转载)
● Google Intern	● 判断 bst 疑问
● 有道面试题是 UTF-8 string，大家知道原题是什么吗？记得有人报过，但是不知道具体是干嘛？	● remove a node in O(1) from link list
● Google电面题写dominoChecker class	● 看到一个ｃ的面试题，求教。
● 刚结束大G面试	● 新鲜Amazon面经(附参考答案) 顺便求各种大公司refer

相关话题的讨论汇总
话题: byte话题: type话题: rune话题: return话题: int

进入JobHunting版参与讨论

1

(共1页)

s*******3 发帖数: 6	1 UTF-8 encoding scheme is described below: 0XXXXXXX = this is the entire rune 10XXXXXX = this is a continuation of the rune from the previous byte 110XXXXX = this is the start of a 2-byte rune. 1110XXXX = this is the start of a 3-byte rune. 11110XXX = this is the start of a 4-byte rune. 111110XX = this is the start of a 5-byte rune. 1111110X = this is the start of a 6-byte rune. 11111110 = this is the start of a 7-byte rune. 11111111 = this is the start of a 8-byte rune. For example, a 3-byte rune would be 1110XXXX 10XXXXXX 10XXXXXX. Write a function that decides whether a given byte array (or string) is valid UTF-8 encoded text. 求讨论求指导
x***j 发帖数: 75	2 没太看懂
l*****a 发帖数: 14598	3 那就去学习一下UTF-8的定义【在 x***j 的大作中提到】 : 没太看懂
l*****a 发帖数: 14598	4 通过这个识别每个byte的type,再check是否满足规则？ public int getType(byte input) { int target=1<<7; int type=0; while(target>0) { if(intput & target==0) return type; type++; target>=1; } } 【在 s*******3 的大作中提到】 : UTF-8 encoding scheme is described below: : 0XXXXXXX = this is the entire rune : 10XXXXXX = this is a continuation of the rune from the previous byte : 110XXXXX = this is the start of a 2-byte rune. : 1110XXXX = this is the start of a 3-byte rune. : 11110XXX = this is the start of a 4-byte rune. : 111110XX = this is the start of a 5-byte rune. : 1111110X = this is the start of a 6-byte rune. : 11111110 = this is the start of a 7-byte rune. : 11111111 = this is the start of a 8-byte rune.
l*******a 发帖数: 16	5 Nice~ public boolean validUTF8(byte [] array){ if (array == null \|\| array.length == 0){ return false; } int leftByte = 0; for (int i = 0; i < array.length; i++){ int val = getType(array[i]); if (leftByte == 0){ // new start if (val == 1){ // continue byte return false; } else if (val == 0){ continue; } else { leftByte = val-1; } } else{ // shall continue if (val != 1){ return false; } else{ leftByte--; } } } if (leftByte == 0){ return true; } else{ return false; } } public int getType(byte input) { int target=1<<7; int type=0; while(target>0) { if((input & target)==0) return type; type++; target = target>>1; } return type; } 【在 l*****a 的大作中提到】 : 通过这个识别每个byte的type,再check是否满足规则？ : public int getType(byte input) { : int target=1<<7; : int type=0; : while(target>0) { : if(intput & target==0) return type; : type++; : target>=1; : } : }
p**p 发帖数: 742	6 这样做不是最优吧。没必要把8位都循环一遍。遇到第一个0就可以返回了： private int getType(byte b) { int mask = 1 << 7; int type = 0; while((mask & b) != 0) { type++; mask >>= 1; } return type; } 【在 l*****a 的大作中提到】 : 通过这个识别每个byte的type,再check是否满足规则？ : public int getType(byte input) { : int target=1<<7; : int type=0; : while(target>0) { : if(intput & target==0) return type; : type++; : target>=1; : } : }
p**p 发帖数: 742	7 看错了。你的方法也是第一个0就返回。【在 p**p 的大作中提到】 : 这样做不是最优吧。没必要把8位都循环一遍。 : 遇到第一个0就可以返回了： : private int getType(byte b) { : int mask = 1 << 7; : int type = 0; : while((mask & b) != 0) { : type++; : mask >>= 1; : } : return type;
p*****2 发帖数: 21240	8 我回家写一下【在 l*****a 的大作中提到】 : 通过这个识别每个byte的type,再check是否满足规则？ : public int getType(byte input) { : int target=1<<7; : int type=0; : while(target>0) { : if(intput & target==0) return type; : type++; : target>=1; : } : }
p**p 发帖数: 742	9 private int getType(byte b) { int mask = 1 << 7; int type = 0; while((mask & b) != 0) { type++; mask >>= 1; } return type; } public boolean isValidUTF8(byte[] bytes) { if(bytes == null \|\| bytes.length == 0) { return false; } int bytesLeft = 0; for(byte b : bytes) { int type = getType(b); if(type == 0) { if(bytesLeft != 0) { return false; } continue; } else if (type == 1) { if(bytesLeft == 0) { return false; } bytesLeft--; } else { if(bytesLeft != 0) { return false; } bytesLeft = type-1; } } return bytesLeft == 0; } 【在 s*******3 的大作中提到】 : UTF-8 encoding scheme is described below: : 0XXXXXXX = this is the entire rune : 10XXXXXX = this is a continuation of the rune from the previous byte : 110XXXXX = this is the start of a 2-byte rune. : 1110XXXX = this is the start of a 3-byte rune. : 11110XXX = this is the start of a 4-byte rune. : 111110XX = this is the start of a 5-byte rune. : 1111110X = this is the start of a 6-byte rune. : 11111110 = this is the start of a 7-byte rune. : 11111111 = this is the start of a 8-byte rune.
n**p 发帖数: 1150	10 while(s != NULL && s != NULL) { if (s & 0x80 == 0) {s++; continue;} if (s & 0xC0 == 0x80) return false; char t = s+1; for(int bitmask = 0x40; bitmask >0 && (s & bitmask != 0); bitmask /= 2, t++) if (t == NULL \|\| (t & 0xC0 != 0x80)) return false; s = t + 1; } return true; 【在 s******3 的大作中提到】 : UTF-8 encoding scheme is described below: : 0XXXXXXX = this is the entire rune : 10XXXXXX = this is a continuation of the rune from the previous byte : 110XXXXX = this is the start of a 2-byte rune. : 1110XXXX = this is the start of a 3-byte rune. : 11110XXX = this is the start of a 4-byte rune. : 111110XX = this is the start of a 5-byte rune. : 1111110X = this is the start of a 6-byte rune. : 11111110 = this is the start of a 7-byte rune. : 11111111 = this is the start of a 8-byte rune.
p*****2 发帖数: 21240	11 def isUTF(xs: List[String]):Boolean = { xs match { case Nil => true case head::tail => head.toList match { case '0'::_ => isUTF(xs.tail) case '1'::'0'::_ => false case _ => val len = (x: String) => x.prefixLength(_ == '1') val l = len(head) val rest = tail.take(l-1) rest.size == l-1 && rest.forall(len(_)==1) && isUTF(xs.drop(l)) } } }
l*********8 发帖数: 4642	12 bool isUTF8(const string & str) { int ct = 0; for (char ch : str) { int type = 8; for (ch = ~ch; ch; ch >>= 1) --type; if (ct > 0) { if (type != 1) return false; --ct; } else { if (type == 1) return false; ct = max(0, type - 1); } } return ct == 0; }

1

(共1页)

进入JobHunting版参与讨论

相关主题
● 新鲜Amazon面经(附参考答案) 顺便求各种大公司refer	● Google Intern
● google 电面面经	● 有道面试题是 UTF-8 string，大家知道原题是什么吗？记得有人报过，但是不知道具体是干嘛？
● crack code interview 4.7 给的答案是对的么	● Google电面题写dominoChecker class
● Programming interview exposed 上面的那道NULL or Cycle的linked list题	● 刚结束大G面试
● 一道google的面试题.	● g家如何判断一个字符串里的都是合法的UTF-8编码in java？
● 到底什么样的二叉树才是平衡二叉树？	● 请问一道G phone interview题
● 弱问题，连反转链表都看不懂了	● 问两个题，一直不知道如何做
● 请教电面试题	● 怎么把 integer 转为 multi-byte integer format？ (转载)

相关话题的讨论汇总
话题: byte话题: type话题: rune话题: return话题: int

未名新帖统计// 7月16日

#	版面	帖数(主题数)
-	全站	4871 (796)
1	Military	3777 (569)
2	Stock	341 (51)
3	Joke	117 (17)
4	History	116 (3)
5	Automobile	100 (9)
6	USANews	55 (9)
7	Midlife	45 (1)
8	Headline	41 (41)
9	Dreamer	33 (13)
10	FleaMarket	32 (20)
11	Living	30 (7)

* 这里只显示发帖超过25的版面，努力灌水吧:-)