How to get the pinyin of Chinese characters:
http://blog.csdn.net/cpu88/archive/2004/11/26/195315.aspx
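Steps:
// Get the pinyin (including multiple readings)
A:  Use the IME generator (Win2000) "C:\Program Files\Windows NT\Accessories\Imegen.exe"
    to reverse-convert the pinyin IME file C:\WINNT\SYSTEM32\WINPY.MB.
    This generates the file C:\WINNT\SYSTEM32\WINPY.txt (WINPY.txt for short).
B:  WINPY.txt holds a character-to-pinyin list of more than 50,000 entries; with phrases removed, that leaves more than 20,000 characters (including multiple readings).
C:  Convert each character to some code. You can devise your own encoding method as long as each character maps to exactly one code (the "encoding method" for short),
    e.g. byte[] uniCode = new String(temp).getBytes("GB2312");
    Convert every character in WINPY.txt to its code to get a table mapping character codes to pinyin (the "code table" for short):
      XXXX0,a    // XXXX0 is the code of some character
      XXYX2,o    // XXYX2 is the code of some character
D:  Sort the code table by code value.
    Split the table into groups (to make lookup easier) and record a marker for each group.
E:  To look up a character's pinyin, encode the character (with your own encoding method).
    Look the code up in the code table to get the pinyin. Search only within the matching group rather than across all codes, so lookup is fast.
// Get the initials: '北京' gives 'bj'; '呆子' gives 'd[a]z' (multiple readings)
// Sorting: once you have the pinyin, you can sort with the usual sorting methods.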
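Steps C to E can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the four entries in SAMPLE_READINGS are hypothetical stand-ins for the real WINPY.txt data, the code of a character is simply its GB2312 bytes read as an integer, and a binary search over the sorted table takes the place of the grouping in step D.

```python
import bisect

# Hypothetical sample entries standing in for the real WINPY.txt contents.
SAMPLE_READINGS = {
    "北": ["bei"],
    "京": ["jing"],
    "呆": ["dai", "ai"],   # a character with more than one reading
    "子": ["zi"],
}

def encode(ch):
    """Map one character to one integer code using its GB2312 bytes."""
    return int.from_bytes(ch.encode("gb2312"), "big")

# Steps C and D: build the code table and sort it by code.
CODE_TABLE = sorted((encode(ch), readings) for ch, readings in SAMPLE_READINGS.items())
CODES = [code for code, _ in CODE_TABLE]

def lookup(ch):
    """Step E: binary-search the sorted table; return [] for unknown characters."""
    try:
        code = encode(ch)
    except UnicodeEncodeError:
        return []
    i = bisect.bisect_left(CODES, code)
    if i < len(CODES) and CODES[i] == code:
        return CODE_TABLE[i][1]
    return []

def initials(text):
    """First letter of each reading; alternative initials go in brackets."""
    parts = []
    for ch in text:
        firsts = []
        for reading in lookup(ch):
            if reading[0] not in firsts:
                firsts.append(reading[0])
        if firsts:
            extra = "[" + "".join(firsts[1:]) + "]" if len(firsts) > 1 else ""
            parts.append(firsts[0] + extra)
    return "".join(parts)

print(initials("北京"))  # bj
print(initials("呆子"))  # d[a]z
```

Binary search already gives logarithmic lookup, so the explicit groups from step D matter mostly when the table lives in a file and only one group should be loaded at a time.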
How to embed Lua scripts in C++ (LuaPlus edition):
http://ly4cn.teeta.com/blog/data/44939.html
Reportedly it cannot call virtual functions, which is a bit confusing. I'll look again when I have time.
http://luaplus.org/
Converting strings to other formats:
CString timestr = "2000年04月05日";
int a, b, c;
sscanf(timestr.GetBuffer(timestr.GetLength()), "%d年%d月%d日", &a, &b, &c);
CTime time(a, b, c, 0, 0, 0);
This is ANSI compatible; I was too ignorant to have known...
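For comparison, here is the same parse as a small Python sketch. It is not the MFC code above: parse_cn_date is a hypothetical helper name, and unlike the sscanf call it rejects input that does not match the pattern instead of leaving the numbers unset.

```python
import re
from datetime import datetime

def parse_cn_date(text):
    """Parse a date written as YYYY年MM月DD日 into a datetime at midnight."""
    m = re.fullmatch(r"(\d+)年(\d+)月(\d+)日", text)
    if m is None:
        raise ValueError(f"unrecognised date: {text!r}")
    year, month, day = (int(g) for g in m.groups())
    return datetime(year, month, day)

print(parse_cn_date("2000年04月05日"))  # 2000-04-05 00:00:00
```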
RMI for c++?:
http://www.codeproject.com/threads/RMI_For_Cpp.asp
No need for this at the moment anyway.
Sound engine audiere, cross-platform:
Seems quite good.
http://m.shnenglu.com/gogoplayer/archive/2006/11/29/15763.html
http://sourceforge.net/projects/audiere/
wxWidgets' wxSound supports Windows and Unix.
Other sound-related links:
http://www.opensound.com/oss.html
http://www.libsdl.org/
Why You Should Turn Down That Job Offer
http://www.computerworld.com/action/article.do?command=viewArticleBasic&articleId=9005602&pageNumber=1
You will be the fifth person to have held the job in the past three years.
Why is this job vacant?
Is the turnover rate high for this position?
What's typically the next career step for those with this job?
You will clash with the corporate culture.
You will be bored -- or overwhelmed -- in the role.
You will not be able to move forward.
may not be reason enough to turn down a job:
You will earn less than you did before.
You will be in the car for two hours each day.
You will receive a “demotion” in title.
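Computing the maximum and minimum int values
http://m.shnenglu.com/Winux32/archive/2006/12/01/15853.html
We usually take the largest and smallest int values from the macros INT_MAX, INT_MIN and UINT_MAX predefined in the CRT header <limits.h>. Below is a way to compute these values instead; other data types work the same way.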
unsigned int GetUnsignedIntMax()
{
    return ~0u;  // all bits set
}
signed int GetSignedIntMax()
{
    return static_cast<signed int>(~0u >> 1);  // all bits set except the sign bit
}
signed int GetSignedIntMin()
{
    signed int i = -1;
    if ((i & 3) == 3)  // only two's complement stores -1 as all ones
        return -GetSignedIntMax() - 1;
    else               // ones' complement or sign-magnitude
        return -GetSignedIntMax();
}
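A brief explanation: the first two need little comment; the last one has to consider whether the machine uses two's complement or ones' complement.
In the former case the identity ~X + 1 = -X holds, i.e. inverting all the bits of a number and adding one gives the corresponding negative number.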
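The identity can be checked with a short Python sketch. Python integers are unbounded, so the sketch assumes a 32-bit int and uses a mask to imitate the fixed width; BITS, MASK and negate are names chosen for this illustration only.

```python
BITS = 32                      # assumed width of int
MASK = (1 << BITS) - 1         # keeps only the low 32 bits

uint_max = ~0 & MASK           # all bits set
int_max = uint_max >> 1        # sign bit cleared
int_min = -int_max - 1         # two's complement has one extra negative value

def negate(x):
    """Two's-complement negation as a bit pattern: invert, then add one."""
    return (~x + 1) & MASK

print(uint_max, int_max, int_min)    # 4294967295 2147483647 -2147483648
print(negate(5) == (-5) & MASK)      # True
```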
How Skype & Co. get round firewalls: cool..
http://www.heise-security.co.uk/articles/82481
Copying it over first; will read it later.
The hole trick
How Skype & Co. get round firewalls
Peer-to-peer software applications are a
network administrator's nightmare. In order to be able
to exchange packets with their counterpart as directly
as possible they use subtle tricks to punch holes in
firewalls, which shouldn't actually be letting in
packets from the outside world.
Increasingly, computers are positioned behind firewalls
to protect systems from internet threats. Ideally, the
firewall function will be performed by a router, which
also translates the PC's local network address to the
public IP address (Network Address Translation, or
NAT). This means an attacker cannot directly address the
PC from the outside - connections have to be
established from the inside.
This is of course a problem when two computers behind
NAT firewalls need to talk directly to each other -
if, for example, their users want to call each other
using Voice over IP (VoIP). The dilemma is clear -
whichever party calls the other, the recipient's
firewall will decline the apparent attack and will
simply discard the data packets. The telephone call
doesn't happen. Or at least that's what a network
administrator would expect.
Punched
But anyone who has used the popular internet telephony
software Skype knows that it works as smoothly behind a
NAT firewall as it does if the PC is connected directly
to the internet. The reason for this is that the
inventors of Skype and similar software have come up
with a solution.
Naturally every firewall must also let packets through
into the local network - after all the user wants to
view websites, read e-mails, etc. The firewall must
therefore forward the relevant data packets from
outside, to the workstation computer on the
LAN. However it only does so, when it is convinced that
a packet represents the response to an outgoing data
packet. A NAT router therefore keeps tables of which
internal computer has communicated with which external
computer and which ports the two have used.
The trick used by VoIP software consists of persuading
the firewall that a connection has been
established, to which it should allocate subsequent
incoming data packets. The fact that audio data for
VoIP is sent using the connectionless UDP protocol
acts to Skype's advantage. In contrast to TCP, which
includes additional connection information in each
packet, with UDP, a firewall sees only the addresses
and ports of the source and destination systems. If,
for an incoming UDP packet, these match a NAT table
entry, it will pass the packet on to an internal
computer with a clear conscience.
Switching
The switching server, with which both ends of a call
are in constant contact, plays an important role when
establishing a connection using Skype. This occurs via
a TCP connection, which the clients themselves
establish. The Skype server therefore always knows
under what address a Skype user is currently available
on the internet. Where possible the actual telephone
connections do not run via the Skype server; rather,
the clients exchange data directly.
Let's assume that Alice wants to call her friend
Bob. Her Skype client tells the Skype server that she
wants to do so. The Skype server already knows a bit
about Alice. From the incoming query it sees that Alice
is currently registered at the IP address 1.1.1.1 and a
quick test reveals that her audio data always comes
from UDP port 1414. The Skype server passes this
information on to Bob's Skype client, which, according
to its database, is currently registered at the IP
address 2.2.2.2 and which, by preference uses UDP port
2828.
Bob's Skype program then punches a hole in its own
network firewall: It sends a UDP packet to 1.1.1.1 port
1414. This is discarded by Alice's firewall, but
Bob's firewall doesn't know that. It now thinks that
anything which comes from 1.1.1.1 port 1414 and is
addressed to Bob's IP address 2.2.2.2 and port 2828 is
legitimate - it must be the response to the query which
has just been sent.
Now the Skype server passes Bob's coordinates on to
Alice, whose Skype application attempts to contact Bob
at 2.2.2.2:2828. Bob's firewall sees the recognised
sender address and passes the apparent response on to
Bob's PC - and his Skype phone rings.
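The sequence above can be replayed in a small Python simulation. It is a sketch under simplifying assumptions: StatefulFirewall is a hypothetical class that only remembers outgoing (source, destination) pairs, no address translation takes place, and nothing is sent over a real network.

```python
class StatefulFirewall:
    """Admits an inbound UDP packet only if it looks like a reply to earlier outbound traffic."""

    def __init__(self):
        self.table = set()  # pairs of ((local_ip, local_port), (remote_ip, remote_port))

    def outbound(self, src, dst):
        """Record an outgoing packet; src and dst are (ip, port) tuples."""
        self.table.add((src, dst))

    def inbound(self, src, dst):
        """True if a packet from src to dst matches a recorded outbound flow."""
        return (dst, src) in self.table

alice, bob = ("1.1.1.1", 1414), ("2.2.2.2", 2828)
alice_fw, bob_fw = StatefulFirewall(), StatefulFirewall()

# Before any punching, a packet from Alice to Bob would be dropped.
before = bob_fw.inbound(alice, bob)

# Bob sends one packet towards Alice: his firewall records the flow,
# Alice's firewall drops it because she has sent nothing to Bob yet.
bob_fw.outbound(bob, alice)
punch_reaches_alice = alice_fw.inbound(bob, alice)

# Alice now sends to Bob: her firewall records the flow, Bob's lets it in.
alice_fw.outbound(alice, bob)
after = bob_fw.inbound(alice, bob)

print(before, punch_reaches_alice, after)  # False False True
```

Once Alice has sent her packet, both tables hold the flow, so traffic passes in either direction.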
Doing the rounds
This description is of course somewhat simplified - the
details depend on the specific properties of the
firewalls used. But it corresponds in principle to our
observations of the process of establishing a
connection between two Skype clients, each of which was
behind a Linux firewall. The firewalls were configured
with NAT for a LAN and permitted outgoing UDP traffic.
Linux's NAT functions have the VoIP friendly property
of, at least initially, not changing the ports of
outgoing packets. The NAT router merely replaces the
private, local IP address with its own address - the
UDP source port selected by Skype is retained. Only
when multiple clients on the local network use the same
source port does the NAT router stick its oar in and
reset the port to a previously unused value. This is
because each set of two IP addresses and ports must be
able to be unambiguously assigned to a connection
between two computers at all times. The router will
subsequently have to reconstruct the internal IP
address of the original sender from the response
packet's destination port.
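A port-preserving mapping of this kind might look like the following Python sketch. It is not how the Linux kernel implements NAT: PortPreservingNat is a hypothetical class, it keys the mapping on the internal endpoint alone rather than on the full address and port set, and the fallback range starting at 30000 is an assumption.

```python
class PortPreservingNat:
    """Keeps the client's source port unless another client already holds it."""

    def __init__(self, public_ip, fallback_start=30000):
        self.public_ip = public_ip
        self.next_free = fallback_start      # assumed start of the fallback range
        self.by_internal = {}                # (ip, port) -> public port
        self.by_public = {}                  # public port -> (ip, port)

    def translate_out(self, internal):
        """Return the public (ip, port) used for an internal (ip, port)."""
        if internal in self.by_internal:
            return (self.public_ip, self.by_internal[internal])
        port = internal[1]
        if port in self.by_public:           # collision: pick an unused port
            while self.next_free in self.by_public:
                self.next_free += 1
            port = self.next_free
        self.by_internal[internal] = port
        self.by_public[port] = internal
        return (self.public_ip, port)

    def translate_in(self, public_port):
        """Recover the internal endpoint from a reply's destination port."""
        return self.by_public.get(public_port)

nat = PortPreservingNat("2.2.2.2")
print(nat.translate_out(("192.168.0.10", 2828)))  # ('2.2.2.2', 2828)
print(nat.translate_out(("192.168.0.11", 2828)))  # ('2.2.2.2', 30000)
print(nat.translate_in(30000))                    # ('192.168.0.11', 2828)
```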
Other NAT routers will try to assign ports in a
specific range, for example ports from 30,000 onwards,
and translate UDP port 1414, if possible, to
31414. This is, of course, no problem for Skype - the
procedure described above continues to work in a
similar manner without limitations.
It becomes a little more complicated if a firewall
simply assigns ports in sequence, like Check Point's
FireWall-1: the first connection is assigned 30001,
the next 30002, etc. The Skype server knows that Bob is
talking to it from port 31234, but the connection to
Alice will run via a different port. But even here
Skype is able to outwit the firewall. It simply runs
through the ports above 31234 in sequence, hoping at
some point to stumble on the right one. But if this
doesn't work first go, Skype doesn't give up. Bob's
Skype opens a new connection to the Skype server, the
source port of which is then used for a further
sequence of probes.
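The probing loop can be pictured with a few lines of Python. This is a guess at the general shape, not Skype's actual code: probe_ports and its send_probe callback are hypothetical, and the firewall is simulated by a lambda that answers only on the port it really assigned.

```python
def probe_ports(known_port, send_probe, max_tries=20):
    """Try the ports just above known_port until send_probe reports success."""
    for offset in range(1, max_tries + 1):
        candidate = known_port + offset
        if send_probe(candidate):
            return candidate
    return None  # give up; the caller can obtain a fresh known_port and retry

# Simulated firewall: the hole for Alice was opened three allocations later.
actual_port = 31237
found = probe_ports(31234, lambda port: port == actual_port)
print(found)  # 31237
```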
Nevertheless, in very active networks Alice may not
find the correct, open port. The same also applies for
a particular type of firewall, which assigns every new
connection to a random source port. The Skype server is
then unable to tell Alice where to look for a suitable
hole in Bob's firewall.
However, even then, Skype doesn't give up. In such
cases a Skype server is then used as a relay. It
accepts incoming connections from both Alice and Bob
and relays the packets onwards. This solution is always
possible, as long as the firewall permits outgoing UDP
traffic. It involves, however, an additional load on
the infrastructure, because all audio data has to run
through Skype's servers. The extended packet transmission
times can also result in an unpleasant delay.
Use of the procedure described above is not limited to
Skype and is known as "UDP hole punching". Other
network services such as the Hamachi gaming VPN
application, which relies on peer-to-peer communication
between computers behind firewalls, use similar
procedures. A more developed form has even made it to
the rank of a standard - RFC 3489 "Simple Traversal of UDP
through NAT" (STUN) describes a protocol with which two
STUN clients can get around the restrictions of NAT
with the help of a STUN server in many cases. The
draft Traversal Using Relay NAT (TURN) protocol describes a possible
standard for relay servers.
DIY hole punching
With a few small utilities, you can try out UDP hole
punching for yourself. The tools required, hping2 and
netcat, can be found in most Linux
distributions. Local is a computer behind a
Linux firewall (local-fw) with a stateful firewall
which only permits outgoing (UDP) connections. For
simplicity, in our test the test computer
remote was connected directly to the internet
with no firewall.
Firstly start a UDP listener on UDP port 14141 on the
local/1 console behind the firewall:
local/1# nc -u -l -p 14141
An external computer "remote" then attempts to contact it.
remote# echo "hello" | nc -p 53 -u local-fw 14141
However, as expected nothing is received on
local/1 and, thanks to the firewall, nothing
is returned to remote. Now on a second
console, local/2, hping2, our universal tool
for generating IP packets, punches a hole in the
firewall:
local/2# hping2 -c 1 -2 -s 14141 -p 53 remote
As long as remote is behaving itself, it will
send back a "port unreachable" response via ICMP -
however this is of no consequence. On the second
attempt
remote# echo "hello" | nc -p 53 -u local-fw 14141
the netcat listener on console local/1 then
coughs up a "hello" - the UDP packet from outside has
passed through the firewall and arrived at the computer
behind it.
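The same packet sequence can be written with Python's socket module. The sketch below runs on the loopback interface only, where no firewall is involved, so it shows the order of the packets and the reuse of one source port rather than a real traversal; open_udp is a helper name chosen here. Across a real firewall, local would bind port 14141 and send to port 53 on remote, as in the commands above.

```python
import socket

def open_udp(port=0, host="127.0.0.1"):
    """Bind a UDP socket; port 0 lets the OS choose a free port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    sock.settimeout(2.0)
    return sock

local = open_udp()     # plays both local/1 and local/2 with a single socket
remote = open_udp()    # plays remote

# The "punch": an outgoing datagram from the port we will later listen on.
local.sendto(b"punch", remote.getsockname())
_, seen_from = remote.recvfrom(1024)

# remote answers to exactly the address and port the punch came from.
remote.sendto(b"hello", seen_from)
data, _ = local.recvfrom(1024)
print(data)  # b'hello'

local.close()
remote.close()
```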
Network administrators who do not appreciate this sort
of hole in their firewall and are worried about abuse,
are left with only one option - they have to block
outgoing UDP traffic, or limit it to essential
individual cases. UDP is not required for normal
internet communication anyway - the web, e-mail and
suchlike all use TCP. Streaming protocols may, however,
encounter problems, as they often use UDP because of
the reduced overhead.
Astonishingly, hole punching also works with TCP. After
an outgoing SYN packet the firewall / NAT router will
forward incoming packets with suitable IP addresses and
ports to the LAN even if they fail to confirm, or
confirm the wrong sequence number (ACK). Linux
firewalls at least, clearly fail to evaluate this
information consistently. Establishing a TCP connection
in this way is, however, not quite so simple, because
Alice does not have the sequence number sent in Bob's
first packet. The packet containing this information
was discarded by her firewall.
Recommended hashCode algorithm
From Effective Java:
1. int result = 17;
2. For each significant field f (one used in equals), compute an int c:
boolean : c = f?0:1;
byte,int,char,short : c = (int)f
long : c = (int)(f^(f>>>32));
float : c = Float.floatToIntBits(f);
double : long l = Double.doubleToLongBits(f); c = (int)(l^(l>>>32));
other references : c = f.hashCode();
3. result = 37*result+c;
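To see the arithmetic at work, here is a Python sketch of the recipe. Python has no fixed-width integers, so to_int32 imitates Java's int overflow; all function names are chosen for this illustration, the double case goes through the raw 64-bit pattern (what Java's Double.doubleToLongBits returns, apart from NaN canonicalisation, which the sketch does not reproduce), and the float case likewise mirrors Float.floatToIntBits.

```python
import struct

def to_int32(x):
    """Wrap an integer to Java's signed 32-bit int range."""
    x &= 0xFFFFFFFF
    return x - 0x100000000 if x >= 0x80000000 else x

def hash_boolean(f):
    return 0 if f else 1                     # as written in the recipe above

def hash_long(f):
    u = f & 0xFFFFFFFFFFFFFFFF               # view as unsigned so >> acts like >>>
    return to_int32(u ^ (u >> 32))

def hash_float(f):
    return struct.unpack(">i", struct.pack(">f", f))[0]   # raw 32-bit pattern

def hash_double(f):
    bits = struct.unpack(">q", struct.pack(">d", f))[0]   # raw 64-bit pattern
    return hash_long(bits)

def combine(field_hashes):
    """Steps 1 and 3: start at 17, then fold in each field's c."""
    result = 17
    for c in field_hashes:
        result = to_int32(37 * result + c)
    return result

print(combine([5, hash_boolean(True)]))  # 23458
print(hash_double(1.0))                  # 1072693248
```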
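Regular expression syntax
I combed through every article on regular expressions in MSDN 2002 and still could not find the regular expression syntax. Infuriating.
Pasting it here. Reposted from someone's repost of a repost of a repost....
A regular expression describes a pattern for matching strings. It can be used to check whether a string contains a certain substring, to replace matched substrings, or to extract substrings that satisfy some condition.
When listing a directory, the *.txt in dir *.txt or ls *.txt is not a regular expression, because the * there means something different from the * of a regular expression.
To make things easier to understand and remember, we start with some concepts. A full table of all special characters and character combinations follows later, and some examples at the end illustrate the concepts.
Regular expression
A text pattern made up of ordinary characters (for example the characters a to z) and special characters (called metacharacters). A regular expression acts as a template that matches a character pattern against the string being searched.
A regular expression can be constructed by placing the components of the pattern between a pair of delimiters, i.e. /expression/
Ordinary characters
All printing and non-printing characters that are not explicitly designated as metacharacters. This includes all uppercase and lowercase letters, all digits, all punctuation marks and some symbols.
Non-printing characters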
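Character | Meaning |
\cx | Matches the control character indicated by x. For example, \cM matches a Control-M or carriage return. The value of x must be in the range A-Z or a-z. Otherwise, c is treated as a literal 'c' character. |
\f | Matches a form feed. Equivalent to \x0c and \cL. |
\n | Matches a newline. Equivalent to \x0a and \cJ. |
\r | Matches a carriage return. Equivalent to \x0d and \cM. |
\s | Matches any whitespace character, including space, tab, form feed and so on. Equivalent to [ \f\n\r\t\v]. |
\S | Matches any non-whitespace character. Equivalent to [^ \f\n\r\t\v]. |
\t | Matches a tab. Equivalent to \x09 and \cI. |
\v | Matches a vertical tab. Equivalent to \x0b and \cK. |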
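Special characters
Special characters are characters with a special meaning, such as the * in "*.txt" above, which roughly means "any string". To find files whose names contain a literal *, the * has to be escaped by putting a \ in front of it: ls \*.txt. Regular expressions have the following special characters.
Special character | Description |
$ | Matches the position at the end of the input string. If the RegExp object's Multiline property is set, $ also matches the position before '\n' or '\r'. To match the $ character itself, use \$. |
( ) | Mark the start and end of a subexpression. The subexpression can be captured for later use. To match these characters, use \( and \). |
* | Matches the preceding subexpression zero or more times. To match the * character, use \*. |
+ | Matches the preceding subexpression one or more times. To match the + character, use \+. |
. | Matches any single character except the newline \n. To match ., use \.. |
[ | Marks the start of a bracket expression. To match [, use \[. |
? | Matches the preceding subexpression zero or one time, or marks a quantifier as non-greedy. To match the ? character, use \?. |
\ | Marks the next character as a special character, a literal, a backreference or an octal escape. For example, 'n' matches the character "n" and '\n' matches a newline. The sequence '\\' matches "\" and '\(' matches "(". |
^ | Matches the position at the start of the input string, unless it is used inside a bracket expression, where it negates the character set. To match the ^ character itself, use \^. |
{ | Marks the start of a quantifier expression. To match {, use \{. |
| | Indicates a choice between two items. To match |, use \|. |
Regular expressions are built the same way as arithmetic expressions: smaller expressions are combined with metacharacters and operators to create larger ones. The components of a regular expression can be single characters, character sets, character ranges, choices between characters, or any combination of these.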
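Quantifiers
A quantifier specifies how many times a given component of a regular expression must occur for the match to succeed. There are six kinds: *, +, ?, {n}, {n,} and {n,m}.
The *, + and ? quantifiers are greedy: they match as much text as possible. Putting a ? after them makes them non-greedy, so that they match as little as possible.
The quantifiers are:
Character | Description |
* | Matches the preceding subexpression zero or more times. For example, zo* matches "z" as well as "zoo". * is equivalent to {0,}. |
+ | Matches the preceding subexpression one or more times. For example, 'zo+' matches "zo" and "zoo" but not "z". + is equivalent to {1,}. |
? | Matches the preceding subexpression zero or one time. For example, 'do(es)?' matches "do" or the "do" in "does". ? is equivalent to {0,1}. |
{n} | n is a non-negative integer. Matches exactly n times. For example, 'o{2}' does not match the 'o' in "Bob" but matches the two o's in "food". |
{n,} | n is a non-negative integer. Matches at least n times. For example, 'o{2,}' does not match the 'o' in "Bob" but matches all the o's in "foooood". 'o{1,}' is equivalent to 'o+'. 'o{0,}' is equivalent to 'o*'. |
{n,m} | m and n are non-negative integers, where n <= m. Matches at least n and at most m times. For example, "o{1,3}" matches the first three o's in "fooooood". 'o{0,1}' is equivalent to 'o?'. Note that there must be no space between the comma and the numbers. |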
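The difference between bounded, greedy and non-greedy quantifiers is easy to try out. The sketch below uses Python's re module, which is an assumption on my part: the table describes the Microsoft script engines, but these particular quantifiers behave the same way in Python.

```python
import re

text = "f" + "o" * 6 + "d"     # an f, six o's, a d

bounded = re.search(r"o{1,3}", text).group()   # at most three o's
greedy = re.search(r"o+", text).group()        # as many o's as possible
lazy = re.search(r"o+?", text).group()         # as few o's as possible

print(bounded, greedy, lazy)                   # ooo oooooo o

# The same contrast on markup-like text.
print(re.search(r"<.+>", "<a><b>").group())    # <a><b>
print(re.search(r"<.+?>", "<a><b>").group())   # <a>

# o{2} needs two consecutive o's, so "Bob" does not match.
print(re.search(r"o{2}", "Bob"))               # None
```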
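Anchors
Anchors describe the boundary of a string or a word: ^ and $ refer to the start and end of the string, \b describes the leading or trailing boundary of a word, and \B denotes a non-word boundary.
Quantifiers cannot be applied to anchors.
Alternation
Enclose all the alternatives in parentheses and separate adjacent alternatives with |. Parentheses have a side effect, though: the corresponding match is captured. Putting ?: before the first alternative removes this side effect.
?: is one of the non-capturing elements. The other two are ?= and ?!, which carry additional meaning: the former is a positive lookahead, which matches the search string at any position where the pattern inside the parentheses begins to match; the latter is a negative lookahead, which matches at any position where that pattern does not begin to match.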
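A short Python sketch shows the three non-capturing forms in action. Python's re module is used here as an assumption; it supports the same (?:...), (?=...) and (?!...) syntax, and the sample strings are made up for the illustration.

```python
import re

# (?:...) groups the alternatives without capturing, so groups() stays empty.
m = re.search(r"industr(?:y|ies)", "heavy industries")
print(m.group(), m.groups())               # industries ()

# (?=...) positive lookahead: match only when a listed version follows.
pos = re.compile(r"Windows (?=95|98|NT|2000)")
print(bool(pos.search("Windows 2000")))    # True
print(bool(pos.search("Windows 3.1")))     # False

# The lookahead text is checked but not consumed by the match.
print(repr(pos.search("Windows 2000").group()))   # 'Windows '

# (?!...) negative lookahead: match only when none of them follows.
neg = re.compile(r"Windows (?!95|98|NT|2000)")
print(bool(neg.search("Windows 3.1")))     # True
print(bool(neg.search("Windows 2000")))    # False
```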
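Backreferences
Putting parentheses around a regular expression pattern, or part of one, causes the corresponding match to be stored in a temporary buffer. Each captured submatch is stored in the order in which it is encountered from left to right in the pattern. The buffers are numbered starting from 1, up to a maximum of 99 subexpressions. Each buffer can be accessed with '\n', where n is a one- or two-digit decimal number identifying a particular buffer.
The non-capturing metacharacters '?:', '?=' and '?!' can be used to skip saving the corresponding match.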
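As a quick illustration, the following Python sketch uses \1 to find and collapse doubled words. The sample sentence is made up, and Python's re module stands in for the script engines the text describes.

```python
import re

# \1 refers back to whatever group 1 captured.
doubled = re.compile(r"\b([a-z]+) \1\b", re.IGNORECASE)
sentence = "Is is the cost of of gasoline going up?"

print(doubled.search(sentence).group())   # Is is
print(doubled.findall(sentence))          # ['Is', 'of']

# In the replacement string, \1 inserts the captured word once.
print(doubled.sub(r"\1", sentence))       # Is the cost of gasoline going up?
```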
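Operator precedence
Operators of equal precedence are evaluated from left to right; operators of different precedence are evaluated from highest to lowest. From highest to lowest, the precedence is:
Operator | Description |
\ | Escape |
(), (?:), (?=), [] | Parentheses and brackets |
*, +, ?, {n}, {n,}, {n,m} | Quantifiers |
^, $, \anymetacharacter | Anchors and sequences |
| | "Or" operation |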
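Full list of symbols
Character | Description |
\ | Marks the next character as a special character, a literal, a backreference or an octal escape. For example, 'n' matches the character "n". '\n' matches a newline. The sequence '\\' matches "\" and '\(' matches "(". |
^ | Matches the position at the start of the input string. If the RegExp object's Multiline property is set, ^ also matches the position after '\n' or '\r'. |
$ | Matches the position at the end of the input string. If the RegExp object's Multiline property is set, $ also matches the position before '\n' or '\r'. |
* | Matches the preceding subexpression zero or more times. For example, zo* matches "z" as well as "zoo". * is equivalent to {0,}. |
+ | Matches the preceding subexpression one or more times. For example, 'zo+' matches "zo" and "zoo" but not "z". + is equivalent to {1,}. |
? | Matches the preceding subexpression zero or one time. For example, 'do(es)?' matches "do" or the "do" in "does". ? is equivalent to {0,1}. |
{n} | n is a non-negative integer. Matches exactly n times. For example, 'o{2}' does not match the 'o' in "Bob" but matches the two o's in "food". |
{n,} | n is a non-negative integer. Matches at least n times. For example, 'o{2,}' does not match the 'o' in "Bob" but matches all the o's in "foooood". 'o{1,}' is equivalent to 'o+'. 'o{0,}' is equivalent to 'o*'. |
{n,m} | m and n are non-negative integers, where n <= m. Matches at least n and at most m times. For example, "o{1,3}" matches the first three o's in "fooooood". 'o{0,1}' is equivalent to 'o?'. Note that there must be no space between the comma and the numbers. |
? | When this character immediately follows any other quantifier (*, +, ?, {n}, {n,}, {n,m}), the match is non-greedy. A non-greedy pattern matches as little of the searched string as possible, whereas the default greedy pattern matches as much as possible. For example, on the string "oooo", 'o+?' matches a single "o", while 'o+' matches all the 'o's. |
. | Matches any single character except the newline \n. |
(pattern) | Matches pattern and captures the match. The captured match can be retrieved from the resulting Matches collection, using the SubMatches collection in VBScript or the $0…$9 properties in JScript. To match parenthesis characters, use '\(' or '\)'. |
(?:pattern) | Matches pattern but does not capture the match; that is, it is a non-capturing match and is not stored for later use. This is useful for combining parts of a pattern with the "or" character (|). For example, 'industr(?:y|ies)' is a more economical expression than 'industry|industries'. |
(?=pattern) | Positive lookahead: matches the search string at any point where a string matching pattern begins. This is a non-capturing match; that is, the match is not captured for later use. For example, 'Windows (?=95|98|NT|2000)' matches "Windows" in "Windows 2000" but not "Windows" in "Windows 3.1". Lookaheads do not consume characters: after a match occurs, the search for the next match begins immediately after the last match, not after the characters that made up the lookahead. |
(?!pattern) | Negative lookahead: matches the search string at any point where a string not matching pattern begins. This is a non-capturing match; that is, the match is not captured for later use. For example, 'Windows (?!95|98|NT|2000)' matches "Windows" in "Windows 3.1" but not "Windows" in "Windows 2000". Lookaheads do not consume characters: after a match occurs, the search for the next match begins immediately after the last match, not after the characters that made up the lookahead. |
x|y | Matches x or y. For example, 'z|food' matches "z" or "food". '(z|f)ood' matches "zood" or "food". |
[xyz] | A character set. Matches any one of the enclosed characters. For example, '[abc]' matches the "a" in "plain". |
[^xyz] | A negated character set. Matches any character not enclosed. For example, '[^abc]' matches the "p" in "plain". |
[a-z] | A character range. Matches any character in the specified range. For example, '[a-z]' matches any lowercase letter from "a" to "z". |
[^a-z] | A negated character range. Matches any character not in the specified range. For example, '[^a-z]' matches any character that is not a lowercase letter from "a" to "z". |
\b | Matches a word boundary, that is, the position between a word and a space. For example, 'er\b' matches the "er" in "never" but not the "er" in "verb". |
\B | Matches a non-word boundary. For example, 'er\B' matches the "er" in "verb" but not the "er" in "never". |
\cx | Matches the control character indicated by x. For example, \cM matches a Control-M or carriage return. The value of x must be in the range A-Z or a-z. Otherwise, c is treated as a literal 'c' character. |
\d | Matches a digit character. Equivalent to [0-9]. |
\D | Matches a non-digit character. Equivalent to [^0-9]. |
\f | Matches a form feed. Equivalent to \x0c and \cL. |
\n | Matches a newline. Equivalent to \x0a and \cJ. |
\r | Matches a carriage return. Equivalent to \x0d and \cM. |
\s | Matches any whitespace character, including space, tab, form feed and so on. Equivalent to [ \f\n\r\t\v]. |
\S | Matches any non-whitespace character. Equivalent to [^ \f\n\r\t\v]. |
\t | Matches a tab. Equivalent to \x09 and \cI. |
\v | Matches a vertical tab. Equivalent to \x0b and \cK. |
\w | Matches any word character, including the underscore. Equivalent to '[A-Za-z0-9_]'. |
\W | Matches any non-word character. Equivalent to '[^A-Za-z0-9_]'. |
\xn | Matches n, where n is a hexadecimal escape value. Hexadecimal escape values must be exactly two digits long. For example, '\x41' matches "A". '\x041' is equivalent to '\x04' followed by "1". This allows ASCII codes to be used in regular expressions. |
\num | Matches num, where num is a positive integer. A reference back to a captured match. For example, '(.)\1' matches two consecutive identical characters. |
\n | Identifies either an octal escape value or a backreference. If \n is preceded by at least n captured subexpressions, n is a backreference. Otherwise, n is an octal escape value if n is an octal digit (0-7). |
\nm | Identifies either an octal escape value or a backreference. If \nm is preceded by at least nm captured subexpressions, nm is a backreference. If \nm is preceded by at least n captures, n is a backreference followed by the literal m. If neither condition holds and n and m are both octal digits (0-7), \nm matches the octal escape value nm. |
\nml | Matches the octal escape value nml when n is an octal digit (0-3) and m and l are both octal digits (0-7). |
\un | Matches n, where n is a Unicode character expressed as four hexadecimal digits. For example, \u00A9 matches the copyright symbol (©). |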
Some examples
Regular expression | Description |
/\b([a-z]+) \1\b/gi | Finds places where a word occurs twice in a row |
/(\w+):\/\/([^/:]+)(:\d*)?([^# ]*)/ | Breaks a URL into protocol, domain, port and relative path |
/^(?:Chapter|Section) [1-9][0-9]{0,1}$/ | Locates chapter headings |
/[-a-z]/ | The 26 letters a to z plus the - sign |
/ter\b/ | Matches chapter but not terminal |
/\Bapt/ | Matches chapter but not aptitude |
/Windows(?=95 |98 |NT )/ | Matches the Windows in Windows95, Windows98 or WindowsNT (each followed by a space); after a match is found, the search for the next match resumes right after Windows. |
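Several of these patterns can be tried directly in Python. This is a sketch with a few assumptions: Python's re module replaces the /.../ literal syntax (so the slashes need no escaping), and the sample URL and headings are made up.

```python
import re

# Break a URL into protocol, domain, port and relative path.
url_re = re.compile(r"(\w+)://([^/:]+)(:\d*)?([^# ]*)")
parts = url_re.match("http://www.example.com:80/docs/index.html#top").groups()
print(parts)  # ('http', 'www.example.com', ':80', '/docs/index.html')

# Chapter or Section followed by a number from 1 to 99.
chapter_re = re.compile(r"^(?:Chapter|Section) [1-9][0-9]{0,1}$")
print(bool(chapter_re.match("Chapter 12")))    # True
print(bool(chapter_re.match("Chapter 123")))   # False

# \b and \B: word boundary versus non-word boundary.
print(bool(re.search(r"ter\b", "chapter")), bool(re.search(r"ter\b", "terminal")))   # True False
print(bool(re.search(r"\Bapt", "chapter")), bool(re.search(r"\Bapt", "aptitude")))   # True False
```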