Book: Advanced Python Programming: Writing More Efficient and Elegant Python Code (Python進階編程：編寫更高效、優雅的Python代碼)
Authors: 劉宇宙, 謝東, 劉艷
Chapter length: 908 characters
Updated: 2021-04-30 12:39:39

2.3.3 String Matching and Searching

In practice, we sometimes need to search text for a specific pattern.

If the pattern is a literal string, the basic string methods are usually enough, such as str.find(), str.endswith(), str.startswith(), and similar methods. For example:

text_val = 'life is short, I use python, what about you'
print(text_val == 'life')
print(text_val.startswith('life'))
print(text_val.endswith('what'))
print(text_val.find('python'))
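To make the return values concrete, here is a minimal sketch of what these calls produce on the sample string. The `in` operator and the lookup of 'java' are additions for illustration and are not part of the original listing.

```python
text_val = 'life is short, I use python, what about you'

# == compares the whole string, so a prefix alone is not equal.
is_equal = (text_val == 'life')
starts = text_val.startswith('life')
# The string ends with 'you', so this check fails.
ends = text_val.endswith('what')
# find() returns the index of the first occurrence, or -1 if absent.
pos = text_val.find('python')
missing = text_val.find('java')
# The in operator is enough when only a yes/no answer is needed.
contains = 'python' in text_val

print(is_equal, starts, ends, pos, missing, contains)
```

Because find() signals "not found" with -1 rather than with None or an exception, compare its result against -1 explicitly instead of testing its truthiness.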

For more complex matching, use regular expressions and the re module. For example, to match a numeric date string such as 04/20/2020:

date_text_1 = '04/20/2020'
date_text_2 = 'April 20, 2020'

import re
if re.match(r'\d+/\d+/\d+', date_text_1):
    print('yes,the date type is match')
else:
    print('no,it is not match')

if re.match(r'\d+/\d+/\d+', date_text_2):
    print('yes,it match')
else:
    print('no,not match')

To match the same pattern many times, first precompile the pattern string into a pattern object. For example:

date_pat = re.compile(r'\d+/\d+/\d+')
if date_pat.match(date_text_1):
    print('yes,the date type is match')
else:
    print('no,it is not match')

if date_pat.match(date_text_2):
    print('yes,it match')
else:
    print('no,not match')
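Reusing one compiled pattern across many inputs might look like the following sketch. The `candidates` list is hypothetical sample data, not taken from the book.

```python
import re

# Compile once, then reuse the pattern object for every input.
date_pat = re.compile(r'\d+/\d+/\d+')

# Hypothetical inputs for illustration only.
candidates = ['04/20/2020', 'April 20, 2020', '3/13/2013']

# match() returns a match object (truthy) or None (falsy).
matched = [s for s in candidates if date_pat.match(s)]
print(matched)
```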

The match() method always tries to match at the start of the string. To find every occurrence of the pattern anywhere in the string, use findall() instead. For example:

date_text = 'Today is 11/27/2012. PyCon starts 3/13/2013.'
print(date_pat.findall(date_text))
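When only the first occurrence is needed, the pattern object's search() method is another option. It is not used in the original listing, so treat this as a supplementary sketch contrasting it with match().

```python
import re

date_pat = re.compile(r'\d+/\d+/\d+')
date_text = 'Today is 11/27/2012. PyCon starts 3/13/2013.'

# match() anchors at position 0, where the text starts with 'Today'.
at_start = date_pat.match(date_text)
# search() scans forward and stops at the first occurrence.
first = date_pat.search(date_text)

print(at_start)
print(first.group(), first.span())
```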

When defining a regular expression, parentheses are commonly used to create capture groups. For example:

date_pat_1 = re.compile(r'(\d+)/(\d+)/(\d+)')

Capture groups simplify later processing because the content of each group can be extracted separately. Sample code (str_match_search.py):

group_result = date_pat_1.match('04/20/2020')
print(f'group result is:{group_result}')
print(f'group 0 is:{group_result.group(0)}')
print(f'group 1 is:{group_result.group(1)}')
print(f'group 2 is:{group_result.group(2)}')
print(f'group 3 is:{group_result.group(3)}')

print(f'groups is:{group_result.groups()}')

month, date, year = group_result.groups()
print(f'month is {month}, date is {date}, year is {year}')

print(date_pat_1.findall(date_text))

for month, day, year in date_pat_1.findall(date_text):
    print(f'{year}-{month}-{day}')

Running the .py file produces output similar to the following:

group result is:<re.Match object; span=(0, 10), match='04/20/2020'>
group 0 is:04/20/2020
group 1 is:04
group 2 is:20
group 3 is:2020
groups is:('04', '20', '2020')
month is 04, date is 20, year is 2020
[('11', '27', '2012'), ('3', '13', '2013')]
2012-11-27
2013-3-13
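Note that the last line of the output, 2013-3-13, is not zero-padded, because the captured groups are plain strings. If a uniform YYYY-MM-DD form is wanted, one possible approach is to convert each group to an integer before formatting. The helper name `to_iso` is hypothetical.

```python
import re

date_pat_1 = re.compile(r'(\d+)/(\d+)/(\d+)')
date_text = 'Today is 11/27/2012. PyCon starts 3/13/2013.'


def to_iso(month, day, year):
    """Hypothetical helper: format captured strings as YYYY-MM-DD."""
    return f'{int(year):04d}-{int(month):02d}-{int(day):02d}'


# findall() yields one (month, day, year) tuple per match.
iso_dates = [to_iso(m, d, y) for m, d, y in date_pat_1.findall(date_text)]
print(iso_dates)
```

This only normalizes the formatting. It does not check that the numbers form a valid calendar date.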

The findall() method searches the text and returns all matches as a list. To get the matches iteratively instead, use finditer(). Sample code (str_match_search.py):

for m_val in date_pat_1.finditer(date_text):
    print(m_val.groups())
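Unlike findall(), which returns only the group strings, finditer() yields full match objects, so position information is available as well. A small sketch, assuming the same pattern and sample text as above:

```python
import re

date_pat_1 = re.compile(r'(\d+)/(\d+)/(\d+)')
date_text = 'Today is 11/27/2012. PyCon starts 3/13/2013.'

found = []
for m_val in date_pat_1.finditer(date_text):
    # start() is the offset of the match; group(0) is the whole matched text.
    found.append((m_val.start(), m_val.group(0), m_val.groups()))

for start, whole, parts in found:
    print(start, whole, parts)
```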

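This covers the most basic ways to match and search text with the re module. The core steps are to compile the regular expression string with re.compile(), then match with methods such as match(), findall(), or finditer().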

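When writing regular expression strings, the common practice is to use raw strings, such as r'(\d+)/(\d+)/(\d+)'. In a raw string, Python does not interpret backslashes as escape characters, which is very useful in regular expressions. Without a raw string, every backslash must be doubled, as in '(\\d+)/(\\d+)/(\\d+)'.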
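The equivalence of the two spellings can be checked directly. The `\b` case is an extra illustration, not from the original text, of how a forgotten `r` prefix can silently change a pattern.

```python
import re

raw_pat = r'(\d+)/(\d+)/(\d+)'
escaped_pat = '(\\d+)/(\\d+)/(\\d+)'
# Both literals produce exactly the same pattern string.
same = (raw_pat == escaped_pat)

# In a raw string, \b reaches the regex engine as a word boundary.
word_raw = r'\bpython\b'
# In a normal string, '\b' is the backspace character (0x08).
word_plain = '\bpython\b'

hit_raw = re.search(word_raw, 'I use python') is not None
hit_plain = re.search(word_plain, 'I use python') is not None
print(same, hit_raw, hit_plain)
```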

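Note: match() checks only the beginning of the string, so it can succeed even when unwanted text follows the matched part. The result may not be what you expect. For example: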

group_result = date_pat_1.match('04/20/2020abcdef')
print(group_result)
print(group_result.group())

For an exact match, make sure the regular expression ends with $. For example:

date_pat_2 = re.compile(r'(\d+)/(\d+)/(\d+)$')
print(date_pat_2.match('04/20/2020abcdef'))
print(date_pat_2.match('04/20/2020'))
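One caveat worth knowing: by default, $ also matches just before a trailing newline. The pattern object's fullmatch() method, which is not used in the original listing, requires the whole string to match and needs no anchor. A brief sketch comparing the two:

```python
import re

date_pat_1 = re.compile(r'(\d+)/(\d+)/(\d+)')
date_pat_2 = re.compile(r'(\d+)/(\d+)/(\d+)$')

# $ tolerates a single trailing newline, so this still matches.
with_newline = date_pat_2.match('04/20/2020\n')
# fullmatch() must consume the entire string, including the newline.
strict = date_pat_1.fullmatch('04/20/2020\n')
exact = date_pat_1.fullmatch('04/20/2020')

print(with_newline is not None, strict, exact.groups())
```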

For a single simple match or search, you can skip the compile step and call the module-level functions of re directly. For example:

print(re.findall(r'(\d+)/(\d+)/(\d+)', date_text))

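Note: if you plan to do a lot of matching and searching, it is best to compile the regular expression first and then reuse it. The module-level functions cache recently compiled patterns, so calling them repeatedly does not cost much performance. A precompiled pattern, however, still saves the cache lookup and some extra processing.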
