由买买提看人间百态

boards

本页内容为未名空间相应帖子的节选和存档,一周内的贴子最多显示50字,超过一周显示500字 访问原贴
JobHunting版 - 什么时候用SUFFIX TREE,什么时候用TRIE
相关主题
suffix tree有必要搞懂吗?新鲜onsite面经
面试中遇到suffix tree / trie这种题,需要自己实现吗?问一个boggle题的扩展
请教个prefix tree (trie)和boggle的问题finds all repeated substrings in the string --- YAHOO interview question
请教几道经典题那个 google hint words 的老题
suffix tree 和 trie单词提示是怎么实现的?
问个Longest Common Substring的问题Longest Common Fix
一道MS题google phone interview question
急问,Boggle (crossword)的解题思路?Longest common string问题
相关话题的讨论汇总
话题: trie话题: children话题: tree话题: words话题: suffix
进入JobHunting版参与讨论
1 (共1页)
t**r
发帖数: 3428
1
哪位高人能简单总结一下?
谢谢
t**r
发帖数: 3428
2
还有,TRIE的 插入和查找都是 O(1) 我的理解多么
i**********e
发帖数: 1145
3
Trie 的插入和查找是 O(N), N = 字符串的长度。
Trie 是 prefix tree 的简称。
网上有很多关于 trie 的教程,相信你可以很快就掌握。
至于 suffix tree 就比较复杂,面试一般都不太问。至少不会变态到叫你写代码。

【在 t**r 的大作中提到】
: 还有,TRIE的 插入和查找都是 O(1) 我的理解多么
i**********e
发帖数: 1145
4
Trie 的应用挺强大的,例如 auto complete (suggestion),boggle 游戏的优化,还
有这个 word rectangle 这个 puzzle 题:
http://www.mitbbs.com/article_t/JobHunting/31916637.html
s*****y
发帖数: 897
5
How is the space complexity of the trie you create in the previous post:
求一道 面世题 的解答思路

【在 i**********e 的大作中提到】
: Trie 的插入和查找是 O(N), N = 字符串的长度。
: Trie 是 prefix tree 的简称。
: 网上有很多关于 trie 的教程,相信你可以很快就掌握。
: 至于 suffix tree 就比较复杂,面试一般都不太问。至少不会变态到叫你写代码。

i**********e
发帖数: 1145
6
The space complexity depend on how many words you insert into the trie, and
also the common prefixes they all shared. In addition, it also depends on
how you represent a trie node's children. For example, representing its
children as a linked list is more space efficient than an array of children
nodes.
In all cases, the trie is more efficient in terms of space and speed
compared to hash table for the purposes of checking prefixes of a string.

【在 s*****y 的大作中提到】
: How is the space complexity of the trie you create in the previous post:
: 求一道 面世题 的解答思路

s*****y
发帖数: 897
7
Ye. That is my question. Base on the input of that puzzle, it seems that the
trie will occupy a lot of memory as you create a tire for so many words in
a 1.6MB file.
I did not check carefully how similar are their prefix though.

and
children

【在 i**********e 的大作中提到】
: The space complexity depend on how many words you insert into the trie, and
: also the common prefixes they all shared. In addition, it also depends on
: how you represent a trie node's children. For example, representing its
: children as a linked list is more space efficient than an array of children
: nodes.
: In all cases, the trie is more efficient in terms of space and speed
: compared to hash table for the purposes of checking prefixes of a string.

i**********e
发帖数: 1145
8
You don't need to create a trie that store all words.
If you're finding a rectangle of size mxn, you only need to store all words
of length m and length n only in the trie.

the
in

【在 s*****y 的大作中提到】
: Ye. That is my question. Base on the input of that puzzle, it seems that the
: trie will occupy a lot of memory as you create a tire for so many words in
: a 1.6MB file.
: I did not check carefully how similar are their prefix though.
:
: and
: children

s*****y
发帖数: 897
9
icic
I did not read your algorithm carefully.
Thanks.

words

【在 i**********e 的大作中提到】
: You don't need to create a trie that store all words.
: If you're finding a rectangle of size mxn, you only need to store all words
: of length m and length n only in the trie.
:
: the
: in

i**********e
发帖数: 1145
10
Here are some statistics on how much time and memory to create a trie with
all words in the dictionary:
Each node has 26 trie nodes. (children represented as an array)
Time: < 1sec, Memory: 45 MB
Each node has a pointer to the head of its children. (children represented
as linked list)
Time: < 1sec, Memory: 10 MB

the
in

【在 s*****y 的大作中提到】
: Ye. That is my question. Base on the input of that puzzle, it seems that the
: trie will occupy a lot of memory as you create a tire for so many words in
: a 1.6MB file.
: I did not check carefully how similar are their prefix though.
:
: and
: children

相关主题
问个Longest Common Substring的问题新鲜onsite面经
一道MS题问一个boggle题的扩展
急问,Boggle (crossword)的解题思路?finds all repeated substrings in the string --- YAHOO interview question
进入JobHunting版参与讨论
f*******t
发帖数: 7549
11
请问children是怎样存的?一个node可能有26种child,分别对应26个字母。用一个指
针存"head of its children"是什么意思?是用26个指针吗?那不是还是存了一个
array?

【在 i**********e 的大作中提到】
: Here are some statistics on how much time and memory to create a trie with
: all words in the dictionary:
: Each node has 26 trie nodes. (children represented as an array)
: Time: < 1sec, Memory: 45 MB
: Each node has a pointer to the head of its children. (children represented
: as linked list)
: Time: < 1sec, Memory: 10 MB
:
: the
: in

h**********8
发帖数: 267
12
mark一下,学习
i**********e
发帖数: 1145
13
1) array trie definition
struct Trie {
// pointer to its 26 children.
// children[i] == NULL means that particular child doesn't exist.
Trie *children[26];
bool end; // marker to indicate if current letter is end of a word.
};
2) linked list trie definition
struct Trie {
// pointer to its first child.
Trie *son;
// pointer to its sibling node.
Trie *next;
};
For 1) -- array trie definition, the insert(const char *s) function is easy
to write. For 2), is a little more tricky to write a non-recursive insert()
function. You can choose to insert in a way such that its children are
always sorted in order. If it's sorted, it's faster to search, but still
take at worst 26 iteration. Use a bitmap (a 32-bit integer) can overcome
this shortcoming.

【在 f*******t 的大作中提到】
: 请问children是怎样存的?一个node可能有26种child,分别对应26个字母。用一个指
: 针存"head of its children"是什么意思?是用26个指针吗?那不是还是存了一个
: array?

g**********y
发帖数: 14569
14
实现简单的suffix tree build还是很容易的,你可以告诉面试官,Ukkonen 95年做了
个O(n)的,现在你让兄弟写,那是没戏,但我可以给你查出来。

【在 i**********e 的大作中提到】
: Trie 的插入和查找是 O(N), N = 字符串的长度。
: Trie 是 prefix tree 的简称。
: 网上有很多关于 trie 的教程,相信你可以很快就掌握。
: 至于 suffix tree 就比较复杂,面试一般都不太问。至少不会变态到叫你写代码。

f*******t
发帖数: 7549
15
谢了。
看了前面某回帖,没想到用array存children也就一些指针,居然会多耗几倍的内存……

【在 i**********e 的大作中提到】
: 1) array trie definition
: struct Trie {
: // pointer to its 26 children.
: // children[i] == NULL means that particular child doesn't exist.
: Trie *children[26];
: bool end; // marker to indicate if current letter is end of a word.
: };
: 2) linked list trie definition
: struct Trie {
: // pointer to its first child.

1 (共1页)
进入JobHunting版参与讨论
相关主题
Longest common string问题suffix tree 和 trie
有没有遇到让当场写一个suffix tree或者automaton的?问个Longest Common Substring的问题
有没有大牛总结一下trie, suffix tree, suffix array, B+ tree各自应用在哪些问题。一道MS题
trie vs suffix tree急问,Boggle (crossword)的解题思路?
suffix tree有必要搞懂吗?新鲜onsite面经
面试中遇到suffix tree / trie这种题,需要自己实现吗?问一个boggle题的扩展
请教个prefix tree (trie)和boggle的问题finds all repeated substrings in the string --- YAHOO interview question
请教几道经典题那个 google hint words 的老题
相关话题的讨论汇总
话题: trie话题: children话题: tree话题: words话题: suffix