
Klarke's C/C++ Home

The regexp Command

The regexp command provides direct access to the regular expression matcher. Not only does it tell you whether a string matches a pattern, it can also extract one or more matching substrings. The return value is 1 if some part of the string matches the pattern; it is 0 otherwise. Its syntax is:

regexp ?flags? pattern string ?match sub1 sub2...?

The flags are described in Table 11-6:

Table 11-6. Options to the regexp command

-nocase

Lowercase characters in pattern can match either lowercase or uppercase letters in string.

-indices

The match variables each contain a pair of numbers that are the indices delimiting the match within string. Otherwise, the matching string itself is copied into the match variables.

-expanded

The pattern uses the expanded syntax, in which whitespace and comments inside the pattern are ignored.

-line

The same as specifying both -lineanchor and -linestop.

-lineanchor

Change the behavior of ^ and $ so that they match at the beginning and end of each line, not only at the beginning and end of the whole string.

-linestop

Change matching so that . and complementary sets such as [^:] do not match newlines.

-about

Useful for debugging. It returns information about the pattern instead of trying to match it against the input.

--

Signals the end of the options. You must use this if your pattern begins with -.
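A minimal sketch of three of these flags in use. The sample strings and the variable names (text, found, range, dash) are made up for illustration:

```tcl
# Illustrative input; any string would do.
set text "Hello World"

# -nocase: lowercase letters in the pattern also match uppercase input.
set found [regexp -nocase {hello} $text match]
# found is 1, and match holds the text as it appears in the input: Hello

# -indices: the match variable receives a pair of indices
# instead of the matching substring.
regexp -indices {World} $text range
# range is {6 10}

# --: marks the end of the flags, so a pattern that starts with -
# is not mistaken for an option.
set dash [regexp -- {-v} "tclsh -v"]
# dash is 1
```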

The pattern argument is a regular expression as described earlier. If string matches pattern, then regexp stores the results of the match in the variables provided. These match variables are optional. If present, match is set to the part of the string that matched the pattern. The remaining variables are set to the substrings of string that matched the corresponding subpatterns in pattern. The correspondence is based on the order of left parentheses in the pattern to avoid ambiguities that can arise from nested subpatterns.
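The left-parenthesis rule is easiest to see with nested subpatterns. The following sketch uses a made-up input line and made-up variable names (key, value, number, unit):

```tcl
# Hypothetical input: a key=value pair whose value carries a unit suffix.
set line "timeout=30s"

# Left parentheses, counted in order of appearance:
#   1st  ([a-z]+)            -> key
#   2nd  (([0-9]+)([a-z]*))  -> value   (the outer group)
#   3rd  ([0-9]+)            -> number  (nested inside the 2nd)
#   4th  ([a-z]*)            -> unit    (nested inside the 2nd)
regexp {([a-z]+)=(([0-9]+)([a-z]*))} $line match key value number unit

puts "match=$match key=$key value=$value number=$number unit=$unit"
```

The outer group gets its variable before the groups nested inside it, because its left parenthesis comes first.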

Example 11-2 uses regexp to pick the hostname out of the DISPLAY environment variable, which has the form:

hostname:display.screen

Example 11-2 Using regular expressions to parse a string
set env(DISPLAY) sage:0.1
regexp {([^:]*):} $env(DISPLAY) match host
=> 1
set match
=> sage:
set host
=> sage

The pattern involves a complementary set, [^:], to match anything except a colon. It uses repetition, *, to repeat that zero or more times. It groups that part into a subexpression with parentheses. The literal colon ensures that the DISPLAY value matches the format we expect. The part of the string that matches the complete pattern is stored into the match variable. The part that matches the subpattern is stored into host. The whole pattern has been grouped with braces to quote the square brackets. Without braces it would be:

regexp (\[^:\]*): $env(DISPLAY) match host

With advanced regular expressions the nongreedy quantifier *? can replace the complementary set:

regexp (.*?): $env(DISPLAY) match host

This is quite a powerful statement, and it is efficient. If we had only had the string command to work with, we would have needed to resort to the following, which takes roughly twice as long to interpret:

set i [string first : $env(DISPLAY)]
if {$i >= 0} {
    set host [string range $env(DISPLAY) 0 [expr {$i - 1}]]
}
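The relative cost depends on the Tcl version, so it is worth measuring rather than assuming. One way to check, sketched here with the built-in time command, an ordinary variable named display standing in for env(DISPLAY), and an arbitrary iteration count:

```tcl
# display stands in for env(DISPLAY); n is an arbitrary repeat count.
set display sage:0.1
set n 10000

# Time the regexp version.
puts "regexp: [time {
    regexp {([^:]*):} $display match host
} $n]"

# Time the string-command version.
puts "string: [time {
    set i [string first : $display]
    if {$i >= 0} {
        set host [string range $display 0 [expr {$i - 1}]]
    }
} $n]"
```

No timings are quoted here because they vary between interpreters and machines.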

A Pattern to Match URLs

Example 11-3 demonstrates a pattern with several subpatterns that extract the different parts of a URL. There are lots of subpatterns, and you can determine which match variable is associated with which subpattern by counting the left parentheses. The pattern will be discussed in more detail after the example:

Example 11-3 A pattern to match URLs
set url http://www.beedub.com:80/index.html
regexp {([^:]+)://([^:/]+)(:([0-9]+))?(/.*)} $url \
match protocol server x port path
=> 1
set match
=> http://www.beedub.com:80/index.html
set protocol
=> http
set server
=> www.beedub.com
set x
=> :80
set port
=> 80
set path
=> /index.html

Let's look at the pattern one piece at a time. The first part looks for the protocol, which is separated by a colon from the rest of the URL. The first part of the pattern is one or more characters that are not a colon, followed by a colon. This matches the http: part of the URL:

[^:]+:

Using the nongreedy +? quantifier, you could also write that as:

.+?:

The next part of the pattern looks for the server name, which comes after two slashes. The server name is followed either by a colon and a port number, or by a slash. The pattern uses a complementary set that specifies one or more characters that are not a colon or a slash. This matches the //www.beedub.com part of the URL:

//[^:/]+

The port number is optional, so a subpattern is delimited with parentheses and followed by a question mark. An additional set of parentheses is added to capture the port number without the leading colon. This matches the :80 part of the URL:

(:([0-9]+))?

The last part of the pattern is everything else, starting with a slash. This matches the /index.html part of the URL:

/.*

Use subpatterns to parse strings.


To make this pattern really useful, we delimit several subpatterns with parentheses:

([^:]+)://([^:/]+)(:([0-9]+))?(/.*)

These parentheses do not change the way the pattern matches. Only the optional port number really needs the parentheses in this example. However, the regexp command gives us access to the strings that match these subpatterns. In one step regexp can test for a valid URL and divide it into the protocol part, the server, the port, and the trailing path.
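As a sketch of that one-step test and split, the pattern can be wrapped in a small procedure. The name parseUrl and its return convention are assumptions made for this illustration, not part of Tcl:

```tcl
# parseUrl is a hypothetical helper. It returns the list
# {protocol server port path}, or an empty list when the URL does not match.
proc parseUrl {url} {
    set pattern {([^:]+)://([^:/]+)(:([0-9]+))?(/.*)}
    if {[regexp $pattern $url match protocol server x port path]} {
        return [list $protocol $server $port $path]
    }
    return {}
}

set parts [parseUrl http://www.beedub.com:80/index.html]
# parts is {http www.beedub.com 80 /index.html}

set noPort [parseUrl http://www.beedub.com/index.html]
# The optional port group took no part in the match, so its variable
# is the empty string: {http www.beedub.com {} /index.html}

set bad [parseUrl "not a url"]
# bad is the empty list
```

Returning an empty list for a failed match keeps the test and the split in a single call. The caller only has to check llength.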

The parentheses around the port number include the : before the digits. We've used a dummy variable that gets the : and the port number, and another match variable that just gets the port number. By using noncapturing parentheses in advanced regular expressions, we can eliminate the unused match variable. We can also replace both complementary character sets with a nongreedy .+? match. Example 11-4 shows this variation:

Example 11-4 An advanced regular expression to match URLs
set url http://www.beedub.com:80/book/
regexp {(.+?)://(.+?)(?::([0-9]+))?(/.*)$} $url \
match protocol server port path
=> 1
set match
=> http://www.beedub.com:80/book/
set protocol
=> http
set server
=> www.beedub.com
set port
=> 80
set path
=> /book/

Bugs When Mixing Greedy and Non-Greedy Quantifiers

If a regular expression pattern uses both greedy and non-greedy quantifiers, you can quickly run into trouble. The problem is that in complex cases there can be ambiguous ways to resolve the quantifiers. Unfortunately, what happens in practice is that Tcl tends to treat all the quantifiers in a pattern the same way, either all greedy or all non-greedy. The re_syntax manual page gives the rule: a branch of the pattern takes its preference from the first quantified atom in it that has one. Example 11-4 has a $ at the end to force the last greedy term to go to the end of the string. In theory, the greediness of the last subpattern should be enough to match all the characters out to the end of the string. In practice, Tcl makes all the quantifiers non-greedy, so the anchor is necessary to force the pattern to match to the end of the string.
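The effect of the anchor can be seen by running the pattern from Example 11-4 with and without the trailing $. This is a sketch. The variable names ending in 1 and 2 are made up, and the unanchored result is printed rather than stated because it is the case the text warns about:

```tcl
set url http://www.beedub.com:80/book/

# Without the trailing $, nothing forces the final (/.*) to run
# to the end of the string.
regexp {(.+?)://(.+?)(?::([0-9]+))?(/.*)} $url \
    match1 protocol1 server1 port1 path1

# With the trailing $, the whole match must reach the end of the string.
regexp {(.+?)://(.+?)(?::([0-9]+))?(/.*)$} $url \
    match2 protocol2 server2 port2 path2

puts "unanchored path: $path1"
puts "anchored path:   $path2"
```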

Sample Regular Expressions

The table in this section lists regular expressions as you would use them in Tcl commands. Most are quoted with curly braces to turn off the special meaning of square brackets and dollar signs. Other patterns are grouped with double quotes and use backslash quoting because the patterns include backslash sequences like \n and \t. In Tcl 8.0 and earlier, these must be substituted by Tcl before the regexp command is called. In these cases, the equivalent advanced regular expression is also shown.
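The two quoting styles can be compared directly. In this sketch the sample text and the variable names viaQuotes and viaBraces are made up. It assumes a Tcl version with advanced regular expressions, where the matcher itself understands \n:

```tcl
set text "first line\nsecond line\n"

# Double quotes: Tcl replaces \n with a newline, and \[ and \] with
# plain brackets, before regexp sees the pattern.
regexp "\[^\n\]*\n" $text viaQuotes

# Braces: Tcl passes the pattern through untouched, and the regular
# expression matcher interprets \n itself.
regexp {[^\n]*\n} $text viaBraces

puts [string equal $viaQuotes $viaBraces]
```

Both forms match everything up to and including the first newline.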

Table 11-7. Sample regular expressions

{^[yY]}

Begins with y or Y, as in a Yes answer.

{^(yes|YES|Yes)$}

Exactly "yes", "Yes", or "YES".

"^\[^ \t:\]+:"

Begins with colon-delimited field that has no spaces or tabs.

{^\S+?:}

Same as above, using \S for "not space".

"^\[ \t]*$"

A string of all spaces or tabs.

{(?n)^\s*$}

A blank line using newline sensitive mode.

"(\n|^)\[ \t\]*(\n|$)"

A blank line, the hard way.

{^[A-Za-z]+$}

Only letters.

{^[[:alpha:]]+$}

Only letters, the Unicode way.

{[A-Za-z0-9_]+}

Letters, digits, and the underscore.

{\w+}

Letters, digits, and the underscore using \w.

{[][${}\\]}

The set of Tcl special characters: ] [ $ { } \

"\[^\n\]*\n"

Everything up to a newline.

{.*?\n}

Everything up to a newline using nongreedy *?

{\.}

A period.

{[][$^?+*()|\\]}

The set of regular expression special characters:

] [ $ ^ ? + * ( ) | \

<H1>(.*?)</H1>

An H1 HTML tag. The subpattern matches the string between the tags.

<!--.*?-->

HTML comments.

{[0-9a-fA-F][0-9a-fA-F]}

2 hex digits.

{[[:xdigit:]]{2}}

2 hex digits, using advanced regular expressions.

{\d{1,3}}

1 to 3 digits, using advanced regular expressions.
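A few of these patterns can be tried at the Tcl prompt. The inputs and variable names below are made up for illustration, and the hex-digit pattern has ^ and $ added so that it tests the whole string:

```tcl
# Each result is stored so it can be inspected afterwards.
set r1 [regexp {^[yY]} "Yes, please"]       ;# 1
set r2 [regexp {^(yes|YES|Yes)$} "yEs"]     ;# 0: not one of the three spellings
set r3 [regexp {^[[:xdigit:]]{2}$} "7f"]    ;# 1
set r4 [regexp {^[[:xdigit:]]{2}$} "7g"]    ;# 0: g is not a hex digit

# The nongreedy .*? stops at the first closing tag.
regexp {<H1>(.*?)</H1>} "<H1>Intro</H1><H1>Next</H1>" match title
# title is Intro

# \d{1,3} takes at most three digits from the first run of digits.
regexp {\d{1,3}} "port 8080" digits
# digits is 808
```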

posted on 2010-09-26 17:22 Klarke
