JavaScript - RegExp

Overview

Estimated time: 40–50 minutes

Regular expressions (RegExp) let you search, extract, and transform text with patterns. Learn literals vs constructor, flags, capturing groups, lookarounds, and Unicode-safe matching.

Learning Objectives

Create regex patterns with literals and the RegExp constructor.
Use test, exec, match, and matchAll effectively.
Capture with numbered and named groups and use them in replace.
Apply anchors, boundaries, and lookaheads/lookbehinds for precise matches.
Work with Unicode: u flag, property escapes, and pitfalls.

Prerequisites

JavaScript - Strings
JavaScript - Arrays (recommended)

Creating regexes and flags

// Literal syntax
const rx1 = /cat/;              // simple pattern
const rx2 = /c.at/i;            // dot matches any char except line terminators; i = case-insensitive

// Constructor (useful for dynamic patterns; note escaping of backslashes)
const word = "hello";
const rx3 = new RegExp(`^${word}\\d+$`, 'gm');

// Common flags:
// g - global (find all matches)
// i - ignore case
// m - multiline (^ and $ match per line)
// s - dotAll (dot matches newlines)
// u - unicode (matches by code point; enables \p{...} and \u{...} escapes)
// y - sticky (match at lastIndex only)
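
When the dynamic part of a constructor pattern comes from user input, any regex metacharacters in it need escaping before interpolation. A minimal sketch, assuming a hypothetical helper named escapeRegExp and made-up sample input (neither is part of the lesson's own code):

```javascript
// Hypothetical helper: backslash-escape characters that are special in a pattern.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

const userInput = 'a.b';
const unsafe = new RegExp(`^${userInput}$`);              // the dot still matches any char
const safe = new RegExp(`^${escapeRegExp(userInput)}$`);  // the dot is matched literally

console.log(unsafe.test('axb')); // true
console.log(safe.test('axb'));   // false
console.log(safe.test('a.b'));   // true
```

Neither regex uses the g or y flag, so repeated test calls are not affected by lastIndex.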

Testing and matching

const str = 'Cat catalog concatenation';
/cat/i.test(str);          // true

// match vs matchAll
const res1 = str.match(/cat/gi);     // ['Cat','cat','cat'] (Cat, catalog, concatenation)

// matchAll requires the g flag and returns an iterator; spread it into an array
const matches = [...str.matchAll(/c(at)/gi)];
// each item is an array like ["Cat", "at"] with extra index, input, and groups properties
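
To make the shape of each matchAll result concrete, here is a small sketch that picks out the whole match, the first group, and the position. The variable names (text, found) are illustrative only:

```javascript
const text = 'Cat catalog concatenation';

// Each result is an array (full match, then groups) with an index property.
const found = [...text.matchAll(/c(at)/gi)].map(m => ({
  full: m[0],     // whole match
  group1: m[1],   // first capturing group
  index: m.index, // position of the match in the input
}));

console.log(found);
// [ { full: 'Cat', group1: 'at', index: 0 },
//   { full: 'cat', group1: 'at', index: 4 },
//   { full: 'cat', group1: 'at', index: 15 } ]
```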

exec loops and lastIndex

// exec with global or sticky keeps state via lastIndex
const rx = /a./g;
const s = 'a1 a2 a3';
let m;
while ((m = rx.exec(s)) !== null) {
  console.log(m[0], 'at', m.index);
}
// Beware: reusing a global or sticky regex across different strings can give surprising results,
// because the next search starts at lastIndex. Prefer creating a fresh regex or resetting lastIndex to 0.
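
The lastIndex caveat is easy to reproduce with test(). A small sketch, using illustrative variable names that are not from the lesson:

```javascript
const stateful = /a\d/g;

const first = stateful.test('a1');   // true; lastIndex is now 2
const second = stateful.test('a2');  // false; the search started at index 2, then lastIndex reset to 0
const third = stateful.test('a3');   // true again

// Two ways to avoid the surprise:
stateful.lastIndex = 0;              // reset before reusing the regex
const fresh = /a\d/.test('a2');      // or use a regex without g for simple tests

console.log(first, second, third, fresh); // true false true true
```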

Capturing groups and replace

// Reorder YYYY-MM-DD to DD/MM/YYYY
'2025-09-05'.replace(/(\d{4})-(\d{2})-(\d{2})/, '$3/$2/$1'); // '05/09/2025'

// Named groups (modern engines)
const m2 = '2025-09-05'.match(/(?<y>\d{4})-(?<m>\d{2})-(?<d>\d{2})/);
// m2.groups.y === '2025', m2.groups.m === '09', m2.groups.d === '05'
// Access by name in replace
'2025-09-05'.replace(/(?<y>\d{4})-(?<m>\d{2})-(?<d>\d{2})/, '$<d>/$<m>/$<y>'); // '05/09/2025'

// Replace with a function for flexible transformations
'foo-12 bar-34'.replace(/(\w+)-(\d+)/g, (m, name, num) => `${name}:${Number(num)*2}`);
// 'foo:24 bar:68'
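
Named captures can also be read from the groups object, both on a match result and inside a replacer function. A minimal sketch, with group names (year, month, day) chosen here for illustration:

```javascript
const isoDate = /(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/;

// match() without the g flag exposes named captures on the groups property.
const { year, month, day } = '2025-09-05'.match(isoDate).groups;
console.log(year, month, day); // 2025 09 05

// When the pattern has named groups, the replacer receives the groups object as its last argument.
const formatted = '2025-09-05'.replace(isoDate, (...args) => {
  const g = args[args.length - 1];
  return `${g.day}/${g.month}/${g.year}`;
});
console.log(formatted); // 05/09/2025
```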

Anchors, boundaries, and lookarounds

// Anchors and boundaries
/^\w+$/m.test('hello');      // true; ^ and $ anchor start and end (per line with the m flag)
/\bcat\b/i.test('a cat!');  // true; \b is a word boundary, so 'concatenate' would not match

// Lookaheads and lookbehinds
const s2 = 'Item: A-12, B-07';
// Match code letters followed by a hyphen and digits (the lookahead is checked but not included in the match)
const ahead = /[A-Z]+(?=-\d+)/g; // positive lookahead
[...s2.matchAll(ahead)].map(m => m[0]); // ['A','B']

// Extract digits preceded by letters and hyphen (lookbehind)
const behind = /(?<=[A-Z]+-)\d+/g;  // positive lookbehind
[...s2.matchAll(behind)].map(m => m[0]); // ['12','07']
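
Lookarounds also come in negative forms, (?!...) and (?<!...), which succeed when the inner pattern does not match. A short sketch with made-up input strings; the \b boundaries keep the engine from matching only part of a number:

```javascript
// Negative lookahead: whole numbers that are NOT followed by a percent sign.
const notPercent = '50% of 200 items'.match(/\b\d+\b(?!%)/g);
console.log(notPercent); // ['200']

// Negative lookbehind: whole numbers that are NOT preceded by a letter and a hyphen.
const bareNumbers = 'Item: A-12, B-07, 99'.match(/(?<![A-Z]-)\b\d+/g);
console.log(bareNumbers); // ['99']
```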

Unicode and property escapes

// Use the 'u' flag for full code point support and property escapes
const emoji = 'A😀B';
emoji.length;                // 4 (the emoji is a surrogate pair: two UTF-16 code units)
/^.{3}$/.test(emoji);        // false (without u, the dot counts code units: 4)
/^.{3}$/u.test(emoji);       // true (with u, the dot counts code points: 3)

// Unicode properties (requires 'u')
const words = 'über España 東京 123';
const rxWords = /\p{L}+/gu;   // one or more letters from any script
[...words.matchAll(rxWords)].map(m => m[0]);
// ['über','España','東京']
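
For comparison, the sketch below runs the ASCII-only \w class and a few property escapes over the same kind of text. The normalize('NFC') call is an added precaution so that accented letters are single code points; the variable names are illustrative:

```javascript
const sample = 'über España 東京 123'.normalize('NFC');

// \w covers only ASCII letters, digits, and underscore, so accented words get split.
const asciiWords = sample.match(/\w+/g);
console.log(asciiWords); // ['ber', 'Espa', 'a', '123']

// \p{L} matches letters in any script; \p{N} matches numbers.
const letters = sample.match(/\p{L}+/gu);
const numbers = sample.match(/\p{N}+/gu);
console.log(letters.length, numbers); // 3 ['123']

// Script properties narrow the match to one writing system.
const han = sample.match(/\p{Script=Han}+/gu);
console.log(han); // ['東京']
```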

Common Pitfalls

Escaping: When you pass a pattern to the RegExp constructor as a string, write each backslash twice (e.g., "\\d"), because the string literal consumes one level of escaping.
Global state: /g and /y modify lastIndex. Don’t reuse a stateful regex across unrelated strings.
ASCII classes: \w, \d, and \b are ASCII-centric. For international text, use the u flag with Unicode property escapes such as \p{L}.
Serialization: Regexes aren’t JSON-serializable. Store the pattern and flags separately if needed.
Lookbehind support: Older environments may lack lookbehind; feature-detect or provide fallbacks.
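
The last two pitfalls can be handled in a few lines. This is a sketch under stated assumptions: the variable names and the supportsLookbehind helper are hypothetical, and the detection relies on the RegExp constructor throwing a SyntaxError for syntax the engine does not support.

```javascript
// Serialization: store source and flags, then rebuild with the constructor.
const original = /(\d+)-(\d+)/gi;
const json = JSON.stringify({ source: original.source, flags: original.flags });
const { source, flags } = JSON.parse(json);
const restored = new RegExp(source, flags);
console.log(restored.source === original.source, restored.flags); // true 'gi'

// Feature detection: build the pattern from a string so an unsupported syntax
// throws at run time instead of breaking the whole script at parse time.
function supportsLookbehind() {
  try {
    new RegExp('(?<=a)b');
    return true;
  } catch (err) {
    return false;
  }
}
console.log(supportsLookbehind());
```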

Try it

Run to extract codes using groups and lookarounds:
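
A minimal sketch of such an exercise, assuming a made-up inventory string and hypothetical variable names:

```javascript
const inventory = 'Item: A-12, B-07, ZX-305';

// Named groups capture both parts of each code.
const codeRx = /(?<letters>[A-Z]+)-(?<digits>\d+)/g;
const codes = [...inventory.matchAll(codeRx)].map(m => ({
  letters: m.groups.letters,
  number: Number(m.groups.digits),
}));
console.log(codes);
// [ { letters: 'A', number: 12 }, { letters: 'B', number: 7 }, { letters: 'ZX', number: 305 } ]

// Lookarounds match one part while only checking for the other.
const lettersOnly = inventory.match(/[A-Z]+(?=-\d+)/g);
const digitsOnly = inventory.match(/(?<=[A-Z]+-)\d+/g);
console.log(lettersOnly); // ['A', 'B', 'ZX']
console.log(digitsOnly);  // ['12', '07', '305']
```

Try changing the input string or swapping a positive lookaround for a negative one to see how the results change.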
