
Repeated occurrences

So far, we have seen how to match fixed characters and numeric patterns. Most often, you will also want to handle repetition within a pattern. For example, to match four a's, I can write /aaaa/, but what if I want a pattern that matches any number of a's?

Regular expressions provide a variety of repetition quantifiers, which let us specify how many times a particular pattern can occur. We can specify fixed values (a character should appear exactly n times) and variable values (a character should appear at least n times and at most m times). The following list shows the various repetition quantifiers:

  • ?: Either 0 or 1 occurrence (marks the occurrence as optional)
  • *: 0 or more occurrences
  • +: 1 or more occurrences
  • {n}: Exactly n occurrences
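  • {n,m}: Between n and m occurrences (inclusive)
  • {n,}: At least n occurrences
  • {0,n}: 0 to n occurrences (JavaScript has no {,n} shorthand; without a lower bound, the braces are not treated as a quantifier)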
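
As a quick check of the list above, the following sketch tests each quantifier against a short string. The ^ and $ anchors and the sample strings are additions of this example, not part of the text above; the anchors force the pattern to cover the whole string so that the occurrence counts actually matter.

```javascript
// Anchors (^ and $) force the whole string to match, so the counts matter.
const results = {
  optional: /^colou?r$/.test("color"), // ?     : u is absent, still matches
  star: /^go*gle$/.test("ggle"),       // *     : zero o's is allowed
  plus: /^go+gle$/.test("ggle"),       // +     : needs at least one o
  exact: /^a{3}$/.test("aaa"),         // {n}   : exactly three a's
  range: /^a{2,4}$/.test("aaaaa"),     // {n,m} : five a's is too many
  atLeast: /^a{2,}$/.test("aaaaa"),    // {n,}  : two or more a's
  upTo: /^a{0,2}$/.test("aa"),         // {0,n} : explicit lower bound of zero
};

console.log(results);
// optional, star, exact, atLeast, and upTo are true; plus and range are false
```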

In the following example, we create a pattern where the character u is optional (has 0 or 1 occurrence):

// The variable holds a regular expression, so name it accordingly.
var pattern = /behaviou?r/;
console.log(pattern.test("behaviour"));
// true
console.log(pattern.test("behavior"));
// true

It helps to read the /behaviou?r/ expression as 0 or 1 occurrences of the character u. The repetition quantifier follows the character that we want to repeat. Let's try another example:

console.log(/'\d+'/.test("'123'")); // true

You should read the /'\d+'/ expression as follows: ' is a literal character match, \d matches the characters [0-9], the + quantifier allows one or more occurrences, and the closing ' is a literal character match.

You can also group character expressions using (). Observe the following example:

var heartyLaugh = /Ha+(Ha+)+/i;
console.log(heartyLaugh.test("HaHaHaHaHaHaHaaaaaaaaaaa"));
//true

Let's break the preceding expression into smaller chunks to understand what is going on here (the trailing i flag makes the match case-insensitive):

  • H: literal character match
  • a+: 1 or more occurrences of character a
  • (: start of the expression group
  • H: literal character match
  • a+: 1 or more occurrences of character a
  • ): end of expression group
  • +: 1 or more occurrences of expression group (Ha+)

Now it is easier to see how the grouping is done. When you have to interpret an expression, it often helps to read it out piece by piece, as in the preceding breakdown.
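
To see what the group changes, the following sketch compares a quantifier applied to a single character with one applied to a group. The ^ and $ anchors and the sample strings are assumptions of this example, added so that the whole string has to match.

```javascript
// Without a group, + applies only to the character a.
const singleChar = /^Ha+$/;
// With a group, + applies to the whole sequence Ha.
const grouped = /^(Ha)+$/;

console.log(singleChar.test("Haaaa"));  // true
console.log(singleChar.test("HaHaHa")); // false
console.log(grouped.test("HaHaHa"));    // true
console.log(grouped.test("Haaaa"));     // false
```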

Often, you want to match a sequence of letters or numbers on its own and not just as a substring. This is a fairly common use case when you want to match whole words rather than fragments of longer words. We can specify word boundaries using the \b pattern. \b matches a position where one side is a word character (letter, digit, or underscore) and the other side is not; the start and end of the string count as non-word sides. Consider the following examples.

The following is a simple literal match. This match will also be successful if cat is part of a longer word:

console.log(/cat/.test('a black cat')); //true

However, in the following example, we place \b before cat, which means that we want a match only where cat begins a word. There is a word boundary before cat in the text a black cat, and hence a match is found:

console.log(/\bcat/.test('a black cat')); //true

When we use the same boundary with the word tomcat, we get a failed match because there is no word boundary before cat in the word tomcat:

console.log(/\bcat/.test('tomcat')); //false

There is a word boundary after the string cat in the word tomcat, hence the following is a successful match:

console.log(/cat\b/.test('tomcat')); //true

In the following example, we define the word boundary before and after the word cat to indicate that we want cat to be a standalone word with boundaries before and after:

console.log(/\bcat\b/.test('a black cat')); //true

Based on the same logic, the following match fails because there are no boundaries before and after cat in the word concatenate:

console.log(/\bcat\b/.test("concatenate")); //false
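
A common use of \b is extracting or counting whole words. The following sketch relies on the global flag g and on String.prototype.match, neither of which is covered above, and the sample sentence is made up for illustration. Note that cat_food is not counted as a standalone cat, because the underscore is a word character.

```javascript
const text = "The cat sat near a tomcat; concatenate cat_food and cat.";

// Without boundaries, every occurrence of the letters c-a-t is found.
const anywhere = text.match(/cat/g);
// With \b on both sides, only standalone occurrences are found.
const wholeWords = text.match(/\bcat\b/g);

console.log(anywhere.length);   // 5
console.log(wholeWords.length); // 2
```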

The exec() method is useful for getting information about a match. When a match is found, it returns an array whose first element is the matched text and which has an index property that tells us where the match begins in the string; when no match is found, it returns null:

var match = /\d+/.exec("There are 100 ways to do this");
console.log(match);
// ["100"]
console.log(match.index);
// 10
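
Because exec() returns null when nothing matches, it is safer to check the result before reading index. The following sketch wraps that check in a hypothetical helper, findFirstNumber, which is introduced here for illustration only.

```javascript
// Hypothetical helper: returns the first run of digits and its position,
// or null when the text contains no digits.
function findFirstNumber(text) {
  const match = /\d+/.exec(text);
  if (match === null) {
    return null;
  }
  // match[0] is the matched text; match.index is where it starts.
  return { value: match[0], index: match.index };
}

console.log(findFirstNumber("There are 100 ways to do this"));
// { value: '100', index: 10 }
console.log(findFirstNumber("No digits here"));
// null
```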

Alternatives – OR

Alternatives can be expressed using the | (pipe) character. For example, /a|b/ matches either the a or b character, and /(ab)+|(cd)+/ matches one or more occurrences of either ab or cd.
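
The | operator splits the entire pattern unless a group limits its reach, which is easy to overlook once anchors are involved. The following is a minimal sketch of that behavior; the patterns, the ^ and $ anchors, and the sample strings are assumptions chosen for illustration.

```javascript
// Without a group, | splits the whole pattern: "^cat" OR "dog$".
const loose = /^cat|dog$/;
// With a group, the alternation is confined: the whole string is cat or dog.
const strict = /^(cat|dog)$/;

console.log(loose.test("catalog"));  // true (the string starts with cat)
console.log(strict.test("catalog")); // false
console.log(strict.test("dog"));     // true

// Anchored version of the pattern above: only ab's or only cd's.
const repeated = /^(ab)+$|^(cd)+$/;
console.log(repeated.test("abab")); // true
console.log(repeated.test("abcd")); // false
```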
