Matrix
Klarke's C/C++ Home
The regexp Command

The regexp command provides direct access to the regular expression matcher. Not only does it tell you whether a string matches a pattern, it can also extract one or more matching substrings. The return value is 1 if some part of the string matches the pattern; it is 0 otherwise. Its syntax is:

regexp ?flags? pattern string ?match sub1 sub2...?

The flags are described in Table 11-6:

Table 11-6. Options to the regexp command

-nocase

Lowercase characters in pattern can match either lowercase or uppercase letters in string.

-indices

The match variables each contain a pair of numbers that are the indices delimiting the match within string. Otherwise, the matching string itself is copied into the match variables.

-expanded

The pattern uses the expanded syntax, in which white space and comments inside the pattern are ignored.

-line

The same as specifying both -lineanchor and -linestop.

-lineanchor

Changes the behavior of ^ and $ so they are line-oriented: they match at the beginning and end of each line, not only at the ends of the whole string.

-linestop

Changes matching so that . and complemented sets such as [^a] do not match newlines.

-about

Useful for debugging. It returns information about the pattern instead of trying to match it against the input.

--

Signals the end of the options. You must use this if your pattern begins with -.
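
A few of these flags can be tried out with a short sketch like the one below. The sample strings and the variable names (r1, m1, whole, host, and so on) are made up for illustration and are not part of the original text.

```tcl
# -nocase: the lowercase pattern also matches uppercase text.
set r1 [regexp -nocase {hello} "Say HELLO" m1]
# r1 is 1, m1 holds the text as it appears in the string: HELLO

# -indices: the match variables get {first last} index pairs, not text.
set r2 [regexp -indices {([^:]*):} "sage:0.1" whole host]
# whole is {0 4}, host is {0 3}

# --: required here because the pattern itself begins with a dash.
set r3 [regexp -- {-v} "tclsh -v" m3]
# r3 is 1, m3 is -v
```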

The pattern argument is a regular expression as described earlier. If string matches pattern, then regexp stores the results of the match in the variables provided. These match variables are optional. If present, match is set to the part of the string that matched the pattern. The remaining variables are set to the substrings of string that matched the corresponding subpatterns in pattern. The correspondence is based on the order of left parentheses in the pattern to avoid ambiguities that can arise from nested subpatterns.
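
The left-parenthesis rule is easiest to see with nested groups. The following is a minimal sketch; the date string and the variable names are invented for the illustration.

```tcl
# Left parentheses, counted from the left:
#   1 = the whole date, 2 = year, 3 = month-day, 4 = month, 5 = day
set date "posted on 2010-09-26"
regexp {(([0-9]+)-(([0-9]+)-([0-9]+)))} $date match full year monthday month day
# match    => 2010-09-26
# full     => 2010-09-26
# year     => 2010
# monthday => 09-26
# month    => 09
# day      => 26
```

The third variable receives the outer month-day group because its left parenthesis comes before the ones for month and day, even though it closes after them.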

Example 11-2 uses regexp to pick the hostname out of the DISPLAY environment variable, which has the form:

hostname:display.screen

Example 11-2 Using regular expressions to parse a string
set env(DISPLAY) sage:0.1
regexp {([^:]*):} $env(DISPLAY) match host
=> 1
set match
=> sage:
set host
=> sage

The pattern involves a complementary set, [^:], to match anything except a colon. It uses repetition, *, to repeat that zero or more times. It groups that part into a subexpression with parentheses. The literal colon ensures that the DISPLAY value matches the format we expect. The part of the string that matches the complete pattern is stored into the match variable. The part that matches the subpattern is stored into host. The whole pattern has been grouped with braces to quote the square brackets. Without braces it would be:

regexp (\[^:\]*): $env(DISPLAY) match host

With advanced regular expressions the nongreedy quantifier *? can replace the complementary set:

regexp (.*?): $env(DISPLAY) match host

This is quite a powerful statement, and it is efficient. If we had only had the string command to work with, we would have needed to resort to the following, which takes roughly twice as long to interpret:

set i [string first : $env(DISPLAY)]
if {$i >= 0} {
    set host [string range $env(DISPLAY) 0 [expr {$i - 1}]]
}
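
To check that the two approaches agree, both can be wrapped in small procedures. This is only a sketch: hostByRegexp and hostByString are hypothetical names introduced here, and returning an empty string when there is no colon is an assumption of the example.

```tcl
# Hypothetical helper: hostname via regexp; empty string if there is no colon.
proc hostByRegexp {display} {
    if {[regexp {([^:]*):} $display match host]} {
        return $host
    }
    return ""
}

# Hypothetical helper: the same result using only string commands.
proc hostByString {display} {
    set i [string first : $display]
    if {$i >= 0} {
        return [string range $display 0 [expr {$i - 1}]]
    }
    return ""
}

set a [hostByRegexp sage:0.1]    ;# sage
set b [hostByString sage:0.1]    ;# sage
set c [hostByRegexp :0]          ;# empty: a local display has no hostname
set d [hostByString :0]          ;# empty
```

The timing claim above can be checked on a given interpreter with the built-in time command, for example time {hostByRegexp sage:0.1} 10000; the result will vary with the Tcl version.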

A Pattern to Match URLs

Example 11-3 demonstrates a pattern with several subpatterns that extract the different parts of a URL. There are lots of subpatterns, and you can determine which match variable is associated with which subpattern by counting the left parentheses. The pattern will be discussed in more detail after the example:

Example 11-3 A pattern to match URLs
set url http://www.beedub.com:80/index.html
regexp {([^:]+)://([^:/]+)(:([0-9]+))?(/.*)} $url \
match protocol server x port path
=> 1
set match
=> http://www.beedub.com:80/index.html
set protocol
=> http
set server
=> www.beedub.com
set x
=> :80
set port
=> 80
set path
=> /index.html

Let's look at the pattern one piece at a time. The first part looks for the protocol, which is separated by a colon from the rest of the URL. The first part of the pattern is one or more characters that are not a colon, followed by a colon. This matches the http: part of the URL:

[^:]+:

Using the nongreedy +? quantifier, you could also write that as:

.+?:

The next part of the pattern looks for the server name, which comes after two slashes. The server name is followed either by a colon and a port number, or by a slash. The pattern uses a complementary set that specifies one or more characters that are not a colon or a slash. This matches the //www.beedub.com part of the URL:

//[^:/]+

The port number is optional, so a subpattern is delimited with parentheses and followed by a question mark. An additional set of parentheses is added to capture the port number without the leading colon. This matches the :80 part of the URL:

(:([0-9]+))?

The last part of the pattern is everything else, starting with a slash. This matches the /index.html part of the URL:

/.*

Use subpatterns to parse strings.


To make this pattern really useful, we delimit several subpatterns with parentheses:

([^:]+)://([^:/]+)(:([0-9]+))?(/.*)

These parentheses do not change the way the pattern matches. Only the optional port number really needs the parentheses in this example. However, the regexp command gives us access to the strings that match these subpatterns. In one step regexp can test for a valid URL and divide it into the protocol part, the server, the port, and the trailing path.
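
The "one step" idea can be sketched as a small procedure that both validates and splits a URL. parseUrl is a hypothetical name, and returning an empty list for a non-URL is a choice made for this example only.

```tcl
# Hypothetical helper: returns {protocol server port path}, or an empty
# list if the string does not match the URL pattern.
proc parseUrl {url} {
    set pattern {([^:]+)://([^:/]+)(:([0-9]+))?(/.*)}
    if {![regexp $pattern $url match protocol server x port path]} {
        return {}
    }
    return [list $protocol $server $port $path]
}

set withPort    [parseUrl http://www.beedub.com:80/index.html]
set withoutPort [parseUrl http://www.beedub.com/index.html]
set notUrl      [parseUrl "no slashes here"]
# lindex $withPort 2    => 80
# lindex $withoutPort 2 => (empty string: the optional group did not match)
# llength $notUrl       => 0
```

When the optional port subpattern does not take part in the match, its match variables are set to the empty string, so the caller can test the port against "" to detect a missing port.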

The parentheses around the port number include the : before the digits. We've used a dummy variable that gets the : and the port number, and another match variable that just gets the port number. By using noncapturing parentheses in advanced regular expressions, we can eliminate the unused match variable. We can also replace both complementary character sets with a nongreedy .+? match. Example 11-4 shows this variation:

Example 11-4 An advanced regular expression to match URLs
set url http://www.beedub.com:80/book/
regexp {(.+?)://(.+?)(?::([0-9]+))?(/.*)$} $url \
match protocol server port path
=> 1
set match
=> http://www.beedub.com:80/book/
set protocol
=> http
set server
=> www.beedub.com
set port
=> 80
set path
=> /book/

Bugs When Mixing Greedy and Non-Greedy Quantifiers

If you have a regular expression pattern that uses both greedy and non-greedy quantifiers, then you can quickly run into trouble. The problem is that in complex cases there can be ambiguous ways to resolve the quantifiers. Unfortunately, what happens in practice is that Tcl tends to make all the quantifiers either greedy, or all of them non-greedy. Example 11-4 has a $ at the end to force the last greedy term to go to the end of the string. In theory, the greediness of the last subpattern should match all the characters out to the end of the string. In practice, Tcl makes all the quantifiers non-greedy, so the anchor is necessary to force the pattern to match to the end of the string.
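
The effect of the anchor can be observed by running the pattern from Example 11-4 with and without the trailing $. This is a sketch with made-up variable names; the commented results are what the behavior described above implies, and they may differ on an interpreter that resolves mixed quantifiers differently.

```tcl
set url http://www.beedub.com:80/book/

# Same pattern as Example 11-4, but without the trailing $ anchor.
regexp {(.+?)://(.+?)(?::([0-9]+))?(/.*)} $url m1 protocol1 server1 port1 path1
# m1    => http://www.beedub.com:80/
# path1 => /

# With the anchor, as in Example 11-4.
regexp {(.+?)://(.+?)(?::([0-9]+))?(/.*)$} $url m2 protocol2 server2 port2 path2
# m2    => http://www.beedub.com:80/book/
# path2 => /book/
```

Without the anchor, the shortest overall match ends at the first slash after the port, so the greedy-looking (/.*) captures only that slash.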

Sample Regular Expressions

The table in this section lists regular expressions as you would use them in Tcl commands. Most are quoted with curly braces to turn off the special meaning of square brackets and dollar signs. Other patterns are grouped with double quotes and use backslash quoting because the patterns include backslash sequences like \n and \t. In Tcl 8.0 and earlier, these must be substituted by Tcl before the regexp command is called. In these cases, the equivalent advanced regular expression is also shown.

Table 11-7. Sample regular expressions

{^[yY]}

Begins with y or Y, as in a Yes answer.

{^(yes|YES|Yes)$}

Exactly "yes", "Yes", or "YES".

"^\[^ \t:\]+:"

Begins with colon-delimited field that has no spaces or tabs.

{^\S+?:}

Same as above, using \S for "not space".

"^\[ \t]*$"

A string of all spaces or tabs.

{(?n)^\s*$}

A blank line using newline sensitive mode.

"(\n|^)\[ \t\]*(\n|$)"

A blank line, the hard way.

{^[A-Za-z]+$}

Only letters.

{^[[:alpha:]]+$}

Only letters, the Unicode way.

{[A-Za-z0-9_]+}

Letters, digits, and the underscore.

{\w+}

Letters, digits, and the underscore using \w.

{[][${}\\]}

The set of Tcl special characters: ] [ $ { } \

"\[^\n\]*\n"

Everything up to a newline.

{.*?\n}

Everything up to a newline using nongreedy *?

{\.}

A period.

{[][$^?+*()|\\]}

The set of regular expression special characters: ] [ $ ^ ? + * ( ) | \

<H1>(.*?)</H1>

An H1 HTML tag. The subpattern matches the string between the tags.

<!--.*?-->

HTML comments.

{[0-9a-fA-F][0-9a-fA-F]}

2 hex digits.

{[[:xdigit:]]{2}}

2 hex digits, using advanced regular expressions.

{\d{1,3}}

1 to 3 digits, using advanced regular expressions.
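
A few of these table entries can be exercised directly, as in the sketch below. The input strings and variable names are invented for the example, and the hex patterns have ^ and $ added so that the whole string must be exactly two hex digits.

```tcl
# Yes/no answer check.
set isYes [regexp {^[yY]} "Yes, please"]
# isYes => 1

# Pull the text out of an H1 element; *? stops at the first closing tag.
regexp {<H1>(.*?)</H1>} "<H1>Intro</H1><H1>Next</H1>" h1match title
# title => Intro

# Two hex digits, basic and advanced forms.
set hexBasic [regexp {^[0-9a-fA-F][0-9a-fA-F]$} "7f"]
set hexAdv   [regexp {^[[:xdigit:]]{2}$} "7f"]
set notHex   [regexp {^[[:xdigit:]]{2}$} "7g"]
# hexBasic => 1, hexAdv => 1, notHex => 0
```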

posted on 2010-09-26 17:22 Klarke
