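g*****u posts: 298 | 1 For example, given the pattern www.i*a*c.com, the search should find www.icabc.com, www.isdfac.com, ...
How should the index be built and searched? I would like to use a suffix tree, but how exactly should the algorithm be written? | s*********l posts: 103 | 3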
Regular Expressions
Java, C#, Python, and Perl support regular expressions in the language or its standard library.
For C/C++ users, there are the POSIX C APIs for regular expressions
and the Boost.Regex library.
http://www.boost.org/doc/libs/release/libs/regex
http://onlamp.com/pub/a/onlamp/2006/04/06/boostregex.html
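For the wildcard pattern in the question, a regular expression can be built directly. This is a minimal sketch in Python, assuming that `*` means "any run of characters, possibly empty" and that every other character (including `.`) is literal. The helper names `wildcard_to_regex` and `find_matches` are made up for illustration.

```python
import re

def wildcard_to_regex(pattern):
    # Escape each literal piece so '.' is not treated as a regex metacharacter,
    # then join the pieces with '.*' in place of each '*'.
    parts = pattern.split("*")
    return re.compile(".*".join(re.escape(p) for p in parts))

def find_matches(pattern, candidates):
    # fullmatch requires the whole candidate string to match the pattern.
    regex = wildcard_to_regex(pattern)
    return [s for s in candidates if regex.fullmatch(s)]

if __name__ == "__main__":
    urls = ["www.icabc.com", "www.isdfac.com", "www.google.com"]
    print(find_matches("www.i*a*c.com", urls))
```

This scans every candidate, so it answers the matching part of the question but not the indexing part.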
Regular expression matching can be implemented using finite automata.
The page http://swtch.com/~rsc/regexp/
collects resources about implementing regular expression search efficiently.
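To make the finite-automaton idea concrete for the wildcard case, here is a minimal sketch that tracks the set of automaton states reachable after each input character. It assumes `*` is the only special character and that a state is simply an index into the pattern. The function name `wildcard_match` is hypothetical, and this is an illustration of the state-set technique, not code from any of the linked resources.

```python
def wildcard_match(pattern, text):
    accept = len(pattern)  # state reached after the whole pattern is consumed

    def closure(states):
        # A '*' may match the empty string, so being at a '*' also
        # means being at the position just after it.
        out = set(states)
        stack = list(states)
        while stack:
            s = stack.pop()
            if s < accept and pattern[s] == "*" and s + 1 not in out:
                out.add(s + 1)
                stack.append(s + 1)
        return out

    current = closure({0})
    for ch in text:
        nxt = set()
        for s in current:
            if s == accept:
                continue            # nothing left in the pattern to consume ch
            if pattern[s] == "*":
                nxt.add(s)          # '*' consumes ch and stays in place
            elif pattern[s] == ch:
                nxt.add(s + 1)      # literal character matches, advance
        current = closure(nxt)
        if not current:
            return False            # no state survives, so no match is possible
    return accept in current

if __name__ == "__main__":
    print(wildcard_match("www.i*a*c.com", "www.isdfac.com"))
```

Because all reachable states are advanced together, the work per input character is bounded by the pattern length, with no backtracking.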
| h*********e posts: 56 | 4 If a suffix tree must be used, could the problem be reduced to exact set matching?
"www.i" occurs at index X = {x1, x2...}
"a" occurs at index Y = {y1, y2...}
"c.com" occurs at index Z = {z1, z2...}
Building the suffix tree and matching can be done in linear time. Then the
pattern occurs in the text iff the following system has a solution:
y - x >= 5 (5 is the length of "www.i")
z - y >= 1 (1 is the length of "a")
with x, y, z drawn from X, Y, Z respectively.
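One possible way to test the conditions above is a greedy choice: take the smallest x, then the smallest y that is far enough to the right of it, then the smallest z that is far enough to the right of that y. The sketch below assumes a single text and assumes the occurrence lists are already available. For the demo they are produced with `str.find` as a stand-in for the suffix tree lookup. The names `occurrences` and `first_ordered_occurrence` are hypothetical.

```python
def occurrences(text, piece):
    # Stand-in for a suffix tree query: all start indices of piece in text.
    out = []
    i = text.find(piece)
    while i != -1:
        out.append(i)
        i = text.find(piece, i + 1)
    return out

def first_ordered_occurrence(X, Y, Z, len_first=5, len_second=1):
    # len_first = len("www.i"), len_second = len("a") for the example pattern.
    if not X:
        return None
    x = min(X)                                    # earliest start leaves the most room
    ys = [y for y in Y if y - x >= len_first]     # one pass over Y
    if not ys:
        return None
    y = min(ys)
    zs = [z for z in Z if z - y >= len_second]    # one pass over Z
    if not zs:
        return None
    return (x, y, min(zs))

if __name__ == "__main__":
    text = "www.isdfac.com"
    print(first_ordered_occurrence(occurrences(text, "www.i"),
                                   occurrences(text, "a"),
                                   occurrences(text, "c.com")))
```

The function returns one satisfying triple, or `None` when the system has no solution.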
I don't know whether this system of inequalities can be solved in linear time. | s*********l posts: 103 | 5 This paper
Junghoo Cho, Sridhar Rajagopalan: A Fast Regular Expression Indexing Engine.
ICDE 2002: 419-430
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.18.6659&rep=rep1&type=pdf
addresses scale and performance issues when matching regular expressions (a
regex) against a large corpus.
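The sketch below is not taken from the paper. It only illustrates one general indexing idea for this kind of search: an n-gram index (trigrams here) used to discard most of the corpus before running the full match. It assumes the corpus is a list of strings and that `*` is the only wildcard. The function names are made up for illustration.

```python
import re
from collections import defaultdict

def trigrams(s):
    # All length-3 substrings of s (empty set if s is shorter than 3).
    return {s[i:i + 3] for i in range(len(s) - 2)}

def build_index(docs):
    # Map each trigram to the set of document ids that contain it.
    index = defaultdict(set)
    for doc_id, text in enumerate(docs):
        for g in trigrams(text):
            index[g].add(doc_id)
    return index

def search(pattern, docs, index):
    literals = pattern.split("*")
    # Every trigram of every literal piece must appear in a matching document.
    required = set()
    for lit in literals:
        required |= trigrams(lit)
    candidates = set(range(len(docs)))
    for g in required:
        candidates &= index.get(g, set())
    # The index only prunes; the surviving candidates are verified with a regex.
    regex = re.compile(".*".join(re.escape(p) for p in literals))
    return [docs[i] for i in sorted(candidates) if regex.fullmatch(docs[i])]

if __name__ == "__main__":
    docs = ["www.icabc.com", "www.isdfac.com", "www.google.com", "www.ibm.com"]
    print(search("www.i*a*c.com", docs, build_index(docs)))
```

Literal pieces shorter than three characters (such as "a") contribute no trigrams, so they are checked only in the final verification step.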
| g*****u posts: 298 | 6 Thanks. I have seen this paper before, as well as the one by Yates and Gonnet. They are so long...
Could you briefly explain what its central idea is?