airBnb电面面经 - JobHunting版

本页内容为未名空间相应帖子的节选和存档，一周内的贴子最多显示50字，超过一周显示500字访问原贴

JobHunting版 - airBnb电面面经

相关主题
● 求两个程序的C++ code	● 求一题的完美简洁解答
● 贡献个regular expression DP解法	● aababccbc remove abc
● 讨论一道两个linked list的题目	● 重新看一道经典题
● 专家们，find the longest common substring of two strings	● 问个MSFT的题
● 问个题？	● 求教合并两数组并排除重复
● 这题谁知道答案?	● 做题做得很郁闷，求指点
● 经典面试题	● 50行code能解决addbinary 问题么
● 两个Sorted Array，找K smallest element	● 问一道题(2)

相关话题的讨论汇总
话题: string话题: int话题: len1话题: node话题: word

进入JobHunting版参与讨论

(共1页)

w******g
发帖数: 189

一上来就做题，没有废话。只做一题，但是要求直接编译运行出正确结果。
题目是给定一个word list 和一个target word，要求输出在word list 里跟target
word的edit distance 相差不大于k的所有words。我写了一个基于edit distance的解
法(见下面)，程序是要从头到尾都要写，包括main函数和test数据。写完后他问有没有
更好的解法，我说可以用trie，但是我不觉得我能在剩余时间里写得出来基于trie的解
法，就大致说了一下我认为的思路。在此也求大牛给一个基于trie解法的code。
======================
#include
#include
#include
#include
using namespace std;
//k=1
// word_list = ['dat', 'bat', 'batt', 'beetle']
// similar(query, word_list)
// similar('cat', word_list) = ['dat', 'bat']
// similar('bat', word_list) = ['bat', 'batt', 'dat']
// similar('datt', word_list) = ['dat', 'batt']
//
// To execute C++, please define "int main()"
vector similar(string word, vector &word_list, int k);
int editDist(string word1, string word2);
int main() {
string word = "datt";
vector word_list = vector({"dat", "bat", "batt", "beetle"}
);
int k=1;
vector ret = similar(word, word_list, k);
for(int i=0; i< (int) ret.size(); i++)
cout< }
int editDist(string word1, string word2){
int len1=word1.length();
int len2=word2.length();
int m[len1+1][len2+1];
//m[0][0]=0;
//init the matrix
for(int i=0; i m[i][0]=i;
}
for(int j=0; j m[0][j]=j;
//then update the matrix
for(int i=1; i<=len1; i++){
for(int j=1; j<=len2; j++){
m[i][j]=min(m[i-1][j-1]+(int)(word1[i-1]!=word2[j-1]), m[i-1][j]+1);
m[i][j]=min(m[i][j], m[i][j-1]+1);

}

}
return m[len1][len2];

}
vector similar(string word, vector &word_list, int k){
vector ret;
int n= word_list.size();
if(n<1) return ret;
for(int i=0; i if(editDist(word, word_list[i])<=k){
cout< ret.push_back(word_list[i]);
}
}

return ret;
}

k*******r
发帖数: 355

如果是简单版本，我觉得可以转化为图,用广度优先遍历走k步，就能得到所有editing
distance之内的词。每个边对应一个add/del/modify操作。trie似乎不适合这道题
如果追求更高级的版本，stackflow上有个帖子，http://stackoverflow.com/questions/12886997/how-to-find-all-strings-at-a-given-edit-distance-from-a-given-string, 涉及到一些 NLP词频分析，不过这个面试时肯定写不完代码的

j**********n
发帖数: 32

顶顶

【在 w******g 的大作中提到】

: 一上来就做题，没有废话。只做一题，但是要求直接编译运行出正确结果。
: 题目是给定一个word list 和一个target word，要求输出在word list 里跟target
: word的edit distance 相差不大于k的所有words。我写了一个基于edit distance的解
: 法(见下面)，程序是要从头到尾都要写，包括main函数和test数据。写完后他问有没有
: 更好的解法，我说可以用trie，但是我不觉得我能在剩余时间里写得出来基于trie的解
: 法，就大致说了一下我认为的思路。在此也求大牛给一个基于trie解法的code。
: ======================
: #include
: #include
: #include

m*******e
发帖数: 361

还是去年的题啊。有谁知道他家的offer给股票在什么水平？根据不同的资历？

w******g
发帖数: 189

哦？是去年的题？板上哪里有讨论么？

l******6
发帖数: 340

I agree with bfs with target word.
Trie should be solution of preparation of word list. I have a feeling that
when the question is about a set of things and a target, the set should be
prepared for repeated calls.

editing

【在 k*******r 的大作中提到】

: 如果是简单版本，我觉得可以转化为图,用广度优先遍历走k步，就能得到所有editing
: distance之内的词。每个边对应一个add/del/modify操作。trie似乎不适合这道题
: 如果追求更高级的版本，stackflow上有个帖子，http://stackoverflow.com/questions/12886997/how-to-find-all-strings-at-a-given-edit-distance-from-a-given-string, 涉及到一些 NLP词频分析，不过这个面试时肯定写不完代码的

w******g
发帖数: 189

用BFS expand这个算法，如果target word的substitution操作字符范围是256个(ASCII
码表)的话，复杂度好像很大哦。谁能贡献一个这个算法的code并分析一下复杂度吗？

G**C
发帖数: 365

onsite 拿到了吗？

ASCII

【在 w******g 的大作中提到】

: 用BFS expand这个算法，如果target word的substitution操作字符范围是256个(ASCII
: 码表)的话，复杂度好像很大哦。谁能贡献一个这个算法的code并分析一下复杂度吗？

V**********i
发帖数: 82

看来是高频题，我也挂在这题上，他让我improve的时候我有想过trie，但觉得不合适
就没说，最后问他答案，他居然说trie，吐血，当时心里就骂了丫的有本事写给我看看
。。。

【在 w******g 的大作中提到】

w****r
发帖数: 15252

我靠，面试的题目怎么这么难，还让人家怎么找工作

相关主题
● 这题谁知道答案?	● 求一题的完美简洁解答
● 经典面试题	● aababccbc remove abc
● 两个Sorted Array，找K smallest element	● 重新看一道经典题
进入JobHunting版参与讨论

b*******d
发帖数: 750

http://blog.faroo.com/2012/06/07/improved-edit-distance-based-s
long time ago I read this, and I found it very useful and easier to write
for an index.

v*****y
发帖数: 68

我看到一篇文章，如果用trie的话，可以避免一些重复的计算，例如如果我们已经计算
了target word和mit之间的距离，当我们再计算target word和mits之间的距离时，就
可以省去一些计算，这是使用trie的意义。

c********l
发帖数: 8138

问问楼上
“trie”在英文中如何发音？

v*****y
发帖数: 68

"Try"。公开课上说过这个发音...

问问楼上“trie”在英文中如何发音？

【在 c********l 的大作中提到】

: 问问楼上
: “trie”在英文中如何发音？

f***s
发帖数: 112

这题45分钟弄不出来
import java.util.HashMap;
import java.util.Map;
public class KDistance {
static class Node{
char c;
boolean isLeaf;
Node p;
Map children;
String word;
int no;
int depth;
static int counter = 0;
static Map nodemap = new HashMap();
public Node(Node parent, char c){
children = new HashMap();
p = parent;
if(p == null) depth = 0;
else depth = p.depth+1;
this.c = c;
isLeaf = false;
no = counter++;
nodemap.put(no, this);
}
public void addChild(char c){
assert(children.get(c) == null);
children.put(c, new Node(this, c));
}
public Node getChild(char c){
return children.get(c);
}
public static Node findNode(int num){
assert(num < counter);
return nodemap.get(num);
}
};

public Node constructTire(String[] dict){
Node root = new Node(null, ' ');
for(int i = 0 ; i < dict.length ; i++){
Node ptr = root;
for(int j = 0 ; j < dict[i].length() ; j++){
if(ptr.getChild(dict[i].charAt(j)) == null)
ptr.addChild(dict[i].charAt(j));
ptr = ptr.getChild(dict[i].charAt(j));
}
ptr.isLeaf = true;
ptr.word = dict[i];
}
return root;
}

public void printKDistanceElements(String target, Node ptr, int k){
int m = target.length()+1;
int n = Node.counter;
int[][] matrix = new int[m][n];
int i,j;
for(i = 0 ; i < m ; i++)
matrix[i][0] = i;
for(j = 0 ; j < n ; j++)
matrix[0][j] = Node.findNode(j).depth;

for(j = 1; j < n ;j++){
for(i = 1 ; i < m ; i++){
Node node = Node.findNode(j);
if(target.charAt(i-1) == node.c){
matrix[i][j] = matrix[i-1][node.p.no];
}else{
matrix[i][j] = Math.min(matrix[i-1][j] + 1, matrix[i][
node.p.no]+1);
matrix[i][j] = Math.min(matrix[i][j], matrix[i-1][node.p
.no]+1);
}
if(i == m - 1 && node.isLeaf == true && matrix[i][j] == k){
System.out.println(node.word);
}
}
}
}
public static void main(String[] args) {
KDistance kinst = new KDistance();
String[] input = new String[]{"baa","aaa","zz"};
String target = "a";
Node root = kinst.constructTire(input);
kinst.printKDistanceElements(target, root, 2);
}
}

a***n
发帖数: 623

用trie怎么解？
比如target字符为[blablabla1]A[blablabla2]
数组有一个字符为[blablabla1]Z[blablabla2]，编辑距离仅为1，但在trie中距离还挺
远的。

a***n
发帖数: 623

说起来这个问题确实有更好的解法，不过我不认为能在面试的时候直接把代码写出来……
FYI Aho–Corasick string matching algorithm：
http://en.wikipedia.org/wiki/Aho%E2%80%93Corasick_string_matchi

m********g
发帖数: 272

有人能在45分钟内把这个题目做出来吗？好像他们要求能run。
写个bug-free的trie都不容易呀。

j*******t
发帖数: 223

必须写出来一次编译运行通过？

Q*****a
发帖数: 33

实现了4种方法
1: 直接遍历完整计算edit distance. 285 clocks.
2: 直接遍历，计算edit distance到 >k 就返回. 149 clocks.
3: Trie+shortcut edit distance. Build trie: 606 clocks, process: 6 clocks.
http://stevehanov.ca/blog/index.php?id=114
4: Generate delete k transformation. Build dict: 17033 clocks. process: 0
clocks.
但这里不仅需要生成delete k的pattern, 还需要生成所有delete 1..k-1的pattern,
否则不能handle如(chrome brome)的case
http://blog.faroo.com/2012/06/07/improved-edit-distance-based-s
#include "common.h"
class Trie {
public:
class TrieNode {
public:
unordered_map children;
bool isEnd;
TrieNode() {
isEnd = false;
}
void AddChild(const char* s) {
if (*s == 0) {
isEnd = true;
}
else {
auto it = children.find(*s);
TrieNode* child = NULL;
if (it == children.end()) {
child = new TrieNode();
children[*s] = child;
}
else {
child = it->second;
}
child->AddChild(s+1);
}
}
void FindAllEditDistanceUnderThreshold(string path, const string& s,
const vector& curDistance, vector& results, int threshold) {
if (isEnd && curDistance.back() <= threshold) results.push_back(
path);
for (auto kv: children) {
vector newDistance = curDistance;
int topLeft = newDistance[0];
++newDistance[0];
char c = kv.first;
bool underThreshold = newDistance[0] <= threshold;
for (int len = 1; len <= s.size(); ++len) {
int t = newDistance[len];
newDistance[len] = min(topLeft, min(newDistance[len],
newDistance[len-1])) + 1;
if (c == s[len-1]) {
newDistance[len] = min(topLeft, newDistance[len]);
}
if (newDistance[len] <= threshold) {
underThreshold = true;
}
topLeft = t;
}
if (underThreshold) {
string newPath = path;
newPath.append(1, c);
kv.second->FindAllEditDistanceUnderThreshold(newPath, s,
newDistance, results, threshold);
}
}
}
};
void AddString(const char* s) {
root.AddChild(s);
}
vector FindAllEditDistanceUnderThreshold(const string& s, int
threshold) {
string path;
vector results;
vector curDistance(s.size() + 1, 0);
iota(curDistance.begin(), curDistance.end(), 0);
root.FindAllEditDistanceUnderThreshold(path, s, curDistance, results
, threshold);
return results;
}

private:
TrieNode root;
};
int EditDistance(string s1, string s2) {
vector v(s1.size() + 1);
iota(v.begin(), v.end(), 0);
for (int len = 1; len <= s2.size(); ++len) {
int topLeft = v[0];
++v[0];
for (int len1 = 1; len1 <= s1.size(); ++len1) {
int t = v[len1];
v[len1] = min(topLeft, min(v[len1], v[len1-1])) + 1;
if (s1[len1-1] == s2[len-1]) {
v[len1] = min(topLeft, v[len1]);
}
topLeft = t;
}
}
return v.back();
}
int EditDistanceWithThreshold(string s1, string s2, int threshold) {
vector v(s1.size() + 1);
iota(v.begin(), v.end(), 0);
for (int len = 1; len <= s2.size(); ++len) {
int topLeft = v[0];
++v[0];
bool underThreshold = v[0] <= threshold;
for (int len1 = 1; len1 <= s1.size(); ++len1) {
int t = v[len1];
v[len1] = min(topLeft, min(v[len1], v[len1-1])) + 1;
if (s1[len1-1] == s2[len-1]) {
v[len1] = min(topLeft, v[len1]);
}
if (v[len1] <= threshold) {
underThreshold = true;
}
topLeft = t;
}
if (!underThreshold) {
return threshold + 1;
}
}
return v.back();
}
vector correct1(string s, int d, vector& dict) {
vector results;
for (auto word: dict) {
if (EditDistance(s, word) <= d) {
results.push_back(word);
}
}
return results;
}
vector correct2(string s, int d, vector& dict) {
vector results;
for (auto word: dict) {
if (EditDistanceWithThreshold(s, word, d) <= d) {
results.push_back(word);
}
}
return results;
}
void DeleteM(string s, int start, int m, int initM, vector& results)
{
if (m != initM) {
results.push_back(s);
}
if (m == 0) return;
for (int i = start; i < s.size(); ++i) {
string s1 = s;
s1.erase(s1.begin() + i);
DeleteM(s1, i, m-1, initM, results);
}
}
// Need to generate -1 to -m string, not only -m, otherwise
// for m = 2, chrome and brome can't be treated as match
vector DeleteM(string s, int m) {
vector results;
DeleteM(s, 0, m, m, results);
return results;
}
int main() {
int threshold = 2;
const char* dictFile = "british\brit-a-z.txt";
ifstream input(dictFile);
vector wordList;
string source = "chrome";
{
Clock clock("Load word list");
for (string line; getline(input, line);) {
wordList.push_back(line);
}
}
{
Clock clock("iterate without shortcut");
auto results = correct1(source, threshold, wordList);
DumpStringVector(results);
printf("n");
}
{
Clock clock("iterate with shortcut");
auto results = correct2(source, threshold, wordList);
DumpStringVector(results);
printf("n");
}
Trie trie;
{
Clock clock("build trie");
for (string s: wordList) {
trie.AddString(s.c_str());
}
printf("n");
}
{
Clock clock("Trie with shortcut");
auto results = trie.FindAllEditDistanceUnderThreshold(source,
threshold);
DumpStringVector(results);
printf("n");
}
unordered_set origDict;
multimap deleteMDict;
{
Clock clock("Build Delete Deict");
for (auto s: wordList) {
origDict.insert(s);
for (auto del: DeleteM(s, threshold)) {
deleteMDict.insert(make_pair(del, s));
}
}
printf("n");
}
{
Clock clock("Delte dict");
auto deleteSource = DeleteM(source, threshold);
unordered_set results;
if (origDict.find(source) != origDict.end()) {
results.insert(source);
}
{
auto range = deleteMDict.equal_range(source);
auto start = range.first;
auto end = range.second;
while (start != end) {
results.insert(start->second);
++start;
}
}
for (string s: deleteSource) {
if (origDict.find(s) != origDict.end()) {
results.insert(s);
}
auto range = deleteMDict.equal_range(s);
auto start = range.first;
auto end = range.second;
while (start != end) {
if (EditDistanceWithThreshold(source, start->second,
threshold) <= threshold) {
results.insert(start->second);
}
++start;
}
}
DumpStringRange(results.begin(), results.end());
printf("n");
}
return 0;
}

【在 w******g 的大作中提到】

相关主题
● 问个MSFT的题	● 50行code能解决addbinary 问题么
● 求教合并两数组并排除重复	● 问一道题(2)
● 做题做得很郁闷，求指点	● 求教电面遇到的一道pattern match的实现
进入JobHunting版参与讨论

w******g
发帖数: 189

一上来就做题，没有废话。只做一题，但是要求直接编译运行出正确结果。
题目是给定一个word list 和一个target word，要求输出在word list 里跟target
word的edit distance 相差不大于k的所有words。我写了一个基于edit distance的解
法(见下面)，程序是要从头到尾都要写，包括main函数和test数据。写完后他问有没有
更好的解法，我说可以用trie，但是我不觉得我能在剩余时间里写得出来基于trie的解
法，就大致说了一下我认为的思路。在此也求大牛给一个基于trie解法的code。
======================
#include
#include
#include
#include
using namespace std;
//k=1
// word_list = ['dat', 'bat', 'batt', 'beetle']
// similar(query, word_list)
// similar('cat', word_list) = ['dat', 'bat']
// similar('bat', word_list) = ['bat', 'batt', 'dat']
// similar('datt', word_list) = ['dat', 'batt']
//
// To execute C++, please define "int main()"
vector similar(string word, vector &word_list, int k);
int editDist(string word1, string word2);
int main() {
string word = "datt";
vector word_list = vector({"dat", "bat", "batt", "beetle"}
);
int k=1;
vector ret = similar(word, word_list, k);
for(int i=0; i< (int) ret.size(); i++)
cout< }
int editDist(string word1, string word2){
int len1=word1.length();
int len2=word2.length();
int m[len1+1][len2+1];
//m[0][0]=0;
//init the matrix
for(int i=0; i<=len1; i++){
m[i][0]=i;
}
for(int j=0; j<=len2; j++)
m[0][j]=j;
//then update the matrix
for(int i=1; i<=len1; i++){
for(int j=1; j<=len2; j++){
m[i][j]=min(m[i-1][j-1]+(int)(word1[i-1]!=word2[j-1]), m[i-1][j]+1);
m[i][j]=min(m[i][j], m[i][j-1]+1);

}

}
return m[len1][len2];

}
vector similar(string word, vector &word_list, int k){
vector ret;
int n= word_list.size();
if(n<1) return ret;
for(int i=0; i if(editDist(word, word_list[i])<=k){
cout< ret.push_back(word_list[i]);
}
}

return ret;
}

k*******r
发帖数: 355

j**********n
发帖数: 32

顶顶

【在 w******g 的大作中提到】

m*******e
发帖数: 361

还是去年的题啊。有谁知道他家的offer给股票在什么水平？根据不同的资历？

w******g
发帖数: 189

哦？是去年的题？板上哪里有讨论么？

l******6
发帖数: 340

w******g
发帖数: 189

G**C
发帖数: 365

onsite 拿到了吗？

ASCII

【在 w******g 的大作中提到】

V**********i
发帖数: 82

w****r
发帖数: 15252

我靠，面试的题目怎么这么难，还让人家怎么找工作

相关主题
● 求教一个string match 的 dp 解法	● 贡献个regular expression DP解法
● edit distance vs. word ladder	● 讨论一道两个linked list的题目
● 求两个程序的C++ code	● 专家们，find the longest common substring of two strings
进入JobHunting版参与讨论

b*******d
发帖数: 750

http://blog.faroo.com/2012/06/07/improved-edit-distance-based-s
long time ago I read this, and I found it very useful and easier to write
for an index.

v*****y
发帖数: 68

c********l
发帖数: 8138

问问楼上
“trie”在英文中如何发音？

v*****y
发帖数: 68

"Try"。公开课上说过这个发音...

问问楼上“trie”在英文中如何发音？

【在 c********l 的大作中提到】

: 问问楼上
: “trie”在英文中如何发音？

f***s
发帖数: 112

a***n
发帖数: 623

用trie怎么解？
比如target字符为[blablabla1]A[blablabla2]
数组有一个字符为[blablabla1]Z[blablabla2]，编辑距离仅为1，但在trie中距离还挺
远的。

a***n
发帖数: 623

m********g
发帖数: 272

有人能在45分钟内把这个题目做出来吗？好像他们要求能run。
写个bug-free的trie都不容易呀。

j*******t
发帖数: 223

必须写出来一次编译运行通过？

Q*****a
发帖数: 33

相关主题
● 专家们，find the longest common substring of two strings	● 经典面试题
● 问个题？	● 两个Sorted Array，找K smallest element
● 这题谁知道答案?	● 求一题的完美简洁解答
进入JobHunting版参与讨论

j******r
发帖数: 98

qiu refer

s*********3
发帖数: 25

这个不对吧，编辑距离还包括，删除和插入字母的操作。
你的操作只有修改。

e**********6
发帖数: 78

确实是啊，我忘了。那就再多两个loop。一个是del，一个是insert，没有什么大不同。
核心想法就是穷举所有edit distance == 1的情况。我把之前的code删了，免得贻害众
人。。
如下：
// for delete.
for (int i = 0; i < curr.length(); i++)
{
curr.erase(i);
// compare and insert if in dict
}
// for insert
for (int i = 0; i <= curr.length(); i++)
{
for (char c = 'a'; c <= 'z'; c++)
{
curr = curr.substr(0, i) + c + curr.substr(i);
// compare and insert if in dict
}
}

【在 s*********3 的大作中提到】

: 这个不对吧，编辑距离还包括，删除和插入字母的操作。
: 你的操作只有修改。

n***t
发帖数: 76

如果面试官要的是最优解，那么说明他们就喜欢搞过ACM大牛。。。没办法，公司太火
，candidate都太强

(共1页)

进入JobHunting版参与讨论

相关主题
● 问一道题(2)	● 问个题？
● 求教电面遇到的一道pattern match的实现	● 这题谁知道答案?
● 求教一个string match 的 dp 解法	● 经典面试题
● edit distance vs. word ladder	● 两个Sorted Array，找K smallest element
● 求两个程序的C++ code	● 求一题的完美简洁解答
● 贡献个regular expression DP解法	● aababccbc remove abc
● 讨论一道两个linked list的题目	● 重新看一道经典题
● 专家们，find the longest common substring of two strings	● 问个MSFT的题

相关话题的讨论汇总
话题: string话题: int话题: len1话题: node话题: word

#	版面	帖数(主题数)
-	全站	4871 (796)
1	Military	3777 (569)
2	Stock	341 (51)
3	Joke	117 (17)
4	History	116 (3)
5	Automobile	100 (9)
6	USANews	55 (9)
7	Midlife	45 (1)
8	Headline	41 (41)
9	Dreamer	33 (13)
10	FleaMarket	32 (20)
11	Living	30 (7)

boards

未名新帖统计// 7月16日

历史上的今天