g*********s 发帖数: 1782 | 1 【 以下文字转载自 Programming 讨论区 】
发信人: gandjmitbbs (Nothing), 信区: Programming
标 题: implement a simple regular expression match?
发信站: BBS 未名空间站 (Tue Jan 25 02:30:51 2011, 美东)
the regular expression only includes a-z, '*', and '.'
support:
1. "x*", aka 0 or more continuous 'x'
2. ".", aka wildcard matching any letter.
it seems back-trace is inevitable because of the following ambiguity:
match("b", "b*.") == true;
match("b", "b*") == true; | l*****a 发帖数: 14598 | 2 关于正则表达式匹配,我的做法如下。
将输入pattern 转化成状态数组。
比方说 ab*d
initial state=s[0]
f(s[0],a)=s[1]
f(s[1],b)=s[2]
f(s[2],b)=s[2]
f(s[1],d)=s[3]
f(s[2],d)=s[3]
end state=s[3]
对于希望进行匹配的字符串,逐个分析当前字符,查上面的hash_map or matrix
judge whether it can start from s[start] and move to s[end]
if so,it match.
otherwise,doesn't match
【在 g*********s 的大作中提到】 : 【 以下文字转载自 Programming 讨论区 】 : 发信人: gandjmitbbs (Nothing), 信区: Programming : 标 题: implement a simple regular expression match? : 发信站: BBS 未名空间站 (Tue Jan 25 02:30:51 2011, 美东) : the regular expression only includes a-z, '*', and '.' : support: : 1. "x*", aka 0 or more continuous 'x' : 2. ".", aka wildcard matching any letter. : it seems back-trace is inevitable because of the following ambiguity: : match("b", "b*.") == true;
| g*********s 发帖数: 1782 | 3 what if input is "bb" and pattern "b*b"?
【在 l*****a 的大作中提到】 : 关于正则表达式匹配,我的做法如下。 : 将输入pattern 转化成状态数组。 : 比方说 ab*d : initial state=s[0] : f(s[0],a)=s[1] : f(s[1],b)=s[2] : f(s[2],b)=s[2] : f(s[1],d)=s[3] : f(s[2],d)=s[3] : end state=s[3]
| l*****a 发帖数: 14598 | 4 这个问题问得很好
f(s[0],'b')=s[0] //for b* we can use this (*=1..n)
f(s[0],'b')=s[1] //for b*,*=0,we can use this
对于同一状态,相同输入,可能有不同结果
就各种情况试一遍好了。
【在 g*********s 的大作中提到】 : what if input is "bb" and pattern "b*b"?
| g*********s 发帖数: 1782 | 5 i don't see any simple way to do this.
what u described here is just a finite automata. u still need to eliminate
the non-determinism of nfa.
【在 l*****a 的大作中提到】 : 这个问题问得很好 : f(s[0],'b')=s[0] //for b* we can use this (*=1..n) : f(s[0],'b')=s[1] //for b*,*=0,we can use this : 对于同一状态,相同输入,可能有不同结果 : 就各种情况试一遍好了。
| l*****a 发帖数: 14598 | 6 basically you should use b*b instead of b*
eliminate
【在 g*********s 的大作中提到】 : i don't see any simple way to do this. : what u described here is just a finite automata. u still need to eliminate : the non-determinism of nfa.
|
|