diceidea

parser


DFA and lexical analysis

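For a hand-written lexical analyzer, using NFAs and DFAs is unavoidable unless your grammar is very simple.
Once you have a regular expression for a pattern in the source program (that is, the character stream you want to process), you can construct the corresponding NFA and then convert it to a DFA. That DFA can then be run over the source program to find every lexeme that matches the pattern. Following this process for any programming language, whether a mainstream language or a scripting language, you only need to construct DFAs for all of its patterns to be able to write your own lexical analyzer.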
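The NFA-to-DFA-to-scanning flow described above can be sketched in Python. This is only a minimal illustration under stated assumptions: the pattern `(a|b)*abb`, the hand-built NFA table, and the function names `eps_closure`, `subset_construction`, and `find_lexemes` are all chosen for this example and are not taken from either article referenced in this post. The regex-to-NFA step is skipped; the NFA is written out by hand.

```python
from collections import deque

EPS = None  # label used for epsilon transitions

# Hand-built NFA for the example pattern (a|b)*abb (an assumption for this sketch).
# NFA[state][symbol] -> set of next states
NFA = {
    0: {EPS: {1, 7}},
    1: {EPS: {2, 4}},
    2: {"a": {3}},
    3: {EPS: {6}},
    4: {"b": {5}},
    5: {EPS: {6}},
    6: {EPS: {1, 7}},
    7: {"a": {8}},
    8: {"b": {9}},
    9: {"b": {10}},
    10: {},
}
NFA_START, NFA_ACCEPT = 0, {10}


def eps_closure(states):
    """All NFA states reachable from `states` using only epsilon moves."""
    stack, closure = list(states), set(states)
    while stack:
        s = stack.pop()
        for t in NFA[s].get(EPS, ()):
            if t not in closure:
                closure.add(t)
                stack.append(t)
    return frozenset(closure)


def subset_construction(alphabet):
    """Convert the NFA to a DFA whose states are frozensets of NFA states."""
    start = eps_closure({NFA_START})
    dfa, accepting = {}, set()
    queue = deque([start])
    while queue:
        d = queue.popleft()
        if d in dfa:
            continue
        dfa[d] = {}
        if d & NFA_ACCEPT:
            accepting.add(d)
        for ch in alphabet:
            moved = set()
            for s in d:
                moved |= NFA[s].get(ch, set())
            if moved:
                nxt = eps_closure(moved)
                dfa[d][ch] = nxt
                queue.append(nxt)
    return start, dfa, accepting


def find_lexemes(text, start, dfa, accepting):
    """Scan `text` and collect the longest match of the pattern at each position."""
    lexemes, i = [], 0
    while i < len(text):
        state, last_end, j = start, None, i
        while j < len(text) and text[j] in dfa[state]:
            state = dfa[state][text[j]]
            j += 1
            if state in accepting:
                last_end = j  # remember the longest accepted prefix so far
        if last_end is None:
            i += 1  # no lexeme starts here; skip one character
        else:
            lexemes.append(text[i:last_end])
            i = last_end
    return lexemes


start, dfa, accepting = subset_construction("ab")
print(find_lexemes("abb cabb aabb ab", start, dfa, accepting))
```

A real lexer would build one such DFA per pattern (or one combined DFA) and attach a token name to each accepting state; the sketch handles a single pattern to keep the idea visible.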
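Two articles on going from regular expressions to DFAs are very well written:
1. Writing own regular expression parser, by Amer Gerzic (in English)
http://www.codeproject.com/KB/recipes/OwnRegExpressionsParser.aspx
Source code is included.
2. "Constructing a Regular Expression Engine" is fresh out of the oven! (in Chinese), by vczh, South China University of Technology
http://m.shnenglu.com/vczh/archive/2008/05/22/50763.html
After reading these two articles, writing a working lexer should not be a problem.
Also attached is a passage from the Dragon Book (Compilers: Principles, Techniques, and Tools) on the difference between the terms token, pattern, and lexeme: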
1. A token is a pair consisting of a token name and an optional attribute
value. The token name is an abstract symbol representing a kind of
lexical unit(lexeme), e.g., a particular keyword, or a sequence of input characters
denoting an identifier. The token names are the input symbols that the
parser processes. In what follows, we shall generally write the name of a
token in boldface. We will often refer to a token by its token name.
2. A pattern is a description of the form that the lexemes of a token may take.
In the case of a keyword as a token, the pattern is just the sequence of
characters that form the keyword. For identifiers and some other tokens,
the pattern is a more complex structure that is matched by many strings.
3. A lexeme is a sequence of characters in the source program that matches
the pattern for a token and is identified by the lexical analyzer as an
instance of that token.
Notes:
1. More than one lexeme can match a pattern.
2. See Example 3.1 in the book.
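To make the three terms concrete, here is a small Python sketch. The token names (`IF`, `ID`, `NUMBER`, `WS`), their patterns, and the `tokenize` helper are hypothetical and chosen only for illustration; they are not the book's Example 3.1. The sketch uses Python's built-in `re` module for matching rather than a hand-built DFA.

```python
import re
from typing import NamedTuple, Optional


class Token(NamedTuple):
    name: str                 # abstract symbol the parser sees
    attribute: Optional[str]  # optional attribute value


# Each entry pairs a token name with its pattern (hypothetical examples).
PATTERNS = [
    ("IF", r"if\b"),                     # keyword: the pattern is just its characters
    ("ID", r"[A-Za-z_][A-Za-z0-9_]*"),   # matched by many different lexemes
    ("NUMBER", r"[0-9]+"),
    ("WS", r"\s+"),                      # whitespace, discarded below
]
MASTER = re.compile("|".join(f"(?P<{n}>{p})" for n, p in PATTERNS))


def tokenize(source):
    """Return (token, lexeme) pairs for the given character stream."""
    pos, out = 0, []
    while pos < len(source):
        m = MASTER.match(source, pos)
        if m is None:
            raise ValueError(f"no pattern matches at position {pos}")
        name, lexeme = m.lastgroup, m.group()
        if name != "WS":
            # Only identifiers and numbers carry an attribute in this sketch.
            attr = lexeme if name in ("ID", "NUMBER") else None
            out.append((Token(name, attr), lexeme))
        pos = m.end()
    return out


for token, lexeme in tokenize("if x1 42 iffy"):
    print(token, repr(lexeme))
```

Here `x1` and `iffy` are two different lexemes that match the same `ID` pattern and produce tokens with the same name but different attribute values, which is the point of note 1.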

Posted on 2008-05-24 13:59 by diceidea. Category: Dev log
