RegExp and regular expressions (part 1)

Maria Cristina Di Termine

JavaScript has native support for regular expressions through the RegExp object. A regular expression in JavaScript is therefore an object, with properties and methods that allow us to work with text: locating strings within other strings and, if needed, replacing them.

Create a regular expression

There are two approaches to creating a regular expression: referring explicitly to the RegExp object or using a special literal notation. Example:

var x = new RegExp("apple");
var y = /apple/;

Both statements get the same result: a regular expression for searching for instances of the string “apple” within other strings.
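As a quick check of this equivalence, the following sketch applies both objects to the same text. The sample sentence and the variable name text are made up for illustration; test() and source are standard members of RegExp objects.

```javascript
// Both forms build an equivalent pattern.
var x = new RegExp("apple");
var y = /apple/;

// Hypothetical sample text, used only for this illustration.
var text = "I ate an apple today";

// test() returns true when the pattern occurs somewhere in the string.
console.log(x.test(text)); // true
console.log(y.test(text)); // true

// The source property holds the pattern text, which is the same in both cases.
console.log(x.source === y.source); // true
```

The literal form is usually the shorter choice when the pattern is known in advance, while the RegExp constructor is useful when the pattern is built from a string at run time.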

Syntax and functionality

The syntax and functionality of regular expressions supported by JavaScript are based on the model used by Perl 5 (Perl is an open source, general-purpose, interpreted language). A regular expression is a pattern composed of a sequence of alphanumeric characters and, optionally, special characters.

An expression made up only of alphanumeric characters directly indicates the string to be searched for within another string. For example, the regular expression /apple/ can be used to find or replace the substring apple within a string.

Normally, the search for a pattern within a string is case-sensitive, and the search or replacement stops as soon as the first occurrence is found. This default behavior can be changed with the following modifiers:

i = performs a case-insensitive search
g = performs a global search, that is, it identifies all the occurrences of a pattern
m = performs a multiline search, so that ^ and $ match at the start and end of each line rather than only of the whole string

Modifiers are specified differently depending on the approach used to define the regular expression. With a literal, the modifiers are written immediately after the literal itself, while with the RegExp object they are passed as an additional parameter. The following examples show the use of modifiers in both cases:

var x = new RegExp("apple", "i");
var y = /apple/i;

var x = new RegExp("apple", "ig");
var y = /apple/ig;
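To see what each modifier changes, here is a minimal sketch that runs the same pattern with different modifiers against one string. The two-line sample text, the replacement word and the variable names are assumptions chosen for illustration; replace() and match() are standard String methods.

```javascript
// Hypothetical two-line sample text.
var text = "Apple pie\napple tart";

// No modifier: case-sensitive, and only the first occurrence is replaced.
var plain = text.replace(/apple/, "pear");
console.log(plain); // "Apple pie\npear tart"

// i: case is ignored, but still only the first occurrence is replaced.
var ignoreCase = text.replace(/apple/i, "pear");
console.log(ignoreCase); // "pear pie\napple tart"

// ig: every occurrence is replaced, regardless of case.
var all = text.replace(/apple/ig, "pear");
console.log(all); // "pear pie\npear tart"

// m: with ^ (start of input), the m modifier lets the pattern match
// at the start of every line instead of only at the start of the string.
var withoutM = text.match(/^apple/ig).length;
var withM = text.match(/^apple/igm).length;
console.log(withoutM); // 1
console.log(withM);    // 2
```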

Special characters

Special characters in a regular expression let you create patterns that identify not just one string but sets of strings.
Among these characters are the square brackets, which let you specify a set of characters. For example, the following regular expression matches the strings that begin with a vowel and end with ‘pple’:

var y = /[aeiou]pple/i;

We also have the ability to specify a range of characters indicating the start and end elements, such as [a-z] or [0-9].
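The following sketch tries a character set and a range against a few strings. The test strings and the variable names are invented for illustration only.

```javascript
// A set: any single vowel, followed by "pple" (case is ignored).
var vowelPple = /[aeiou]pple/i;
console.log(vowelPple.test("apple")); // true
console.log(vowelPple.test("Upple")); // true
console.log(vowelPple.test("xpple")); // false: "x" is not in the set

// A range: any single digit from 0 to 9.
var digit = /[0-9]/;
console.log(digit.test("room 7"));    // true
console.log(digit.test("no digits")); // false
```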

Another category of special characters is metacharacters, that is, characters that stand for other characters. Some examples:

. = indicates any character (except a line break)
\w = indicates an alphanumeric character or the underscore
\d = indicates a numerical digit

To prevent w and d from being interpreted as ordinary alphabetic characters, we place a backslash in front of them. So within a regular expression, these metacharacters are written as the sequences \w and \d.
As a result, the following regular expression matches a two-digit number, followed by the string aa, then by any character, and finally by two alphanumeric characters:

var y = /\d\daa.\w\w/i;
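As an illustration, the sketch below runs this pattern against a few made-up strings. Note that the pattern is not tied to the start or end of the input, so it also matches when the sequence appears inside a longer string.

```javascript
var y = /\d\daa.\w\w/i;

// "42" + "aa" + "-" (any character) + "b7" (two alphanumeric characters)
console.log(y.test("42aa-b7")); // true

// Only one digit before "aa", so the pattern does not match.
console.log(y.test("4aa-b7")); // false

// The i modifier accepts "AA", and the sequence may sit inside a longer string.
console.log(y.test("order 42AA_x9 shipped")); // true
```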

Another important category of special characters is quantifiers, that is, characters that indicate how many times a character can appear in a string. Let’s look at some:

+ = placed after a character or a metacharacter, indicates that one or more occurrences of it are expected
* = indicates zero or more occurrences
? = indicates zero or one occurrence
{n} = indicates exactly n occurrences

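For example, the following regular expression describes a simplified form of JavaScript variable names, that is, sequences of variable length that start with an alphabetic character and continue with alphanumeric characters. (Real JavaScript identifiers may also begin with $ or _, and because the pattern is not anchored it also matches such a sequence anywhere inside a longer string.)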

var y = /[a-z]+\w*/i;
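The sketch below exercises each quantifier and then the variable-name pattern. The test strings are invented for illustration. The last part uses the ^ and $ anchors (start and end of input), which are standard but not covered in this part of the article; they are added here as an assumption about what a whole-string check would need.

```javascript
// + : one or more "p"
console.log(/ap+le/.test("apple")); // true
console.log(/ap+le/.test("ale"));   // false

// * : zero or more "p"
console.log(/ap*le/.test("ale"));   // true

// ? : zero or one "s"
console.log(/apples?/.test("apple")); // true

// {n} : exactly n digits in a row
console.log(/\d{4}/.test("2024")); // true
console.log(/\d{4}/.test("123"));  // false

// The variable-name pattern from the article.
var y = /[a-z]+\w*/i;
console.log(y.test("total_2")); // true
console.log(y.test("9lives"));  // true: the match is the inner "lives"

// Assumption: to check the whole string, anchor the pattern with ^ and $.
var wholeName = /^[a-z]+\w*$/i;
console.log(wholeName.test("total_2")); // true
console.log(wholeName.test("9lives"));  // false
```

Without the anchors the pattern only says that such a sequence occurs somewhere in the string, which is why "9lives" is accepted by the first version.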

