This guide assumes PowerShell 5.1 or 7+ and basic familiarity with regular expressions. The syntax is .NET regex (slightly different from POSIX or PCRE).

-match and -replace are the entry points and that's where most posts stop. The .NET regex engine underneath is much more powerful and much faster if you let it.

The Operators, Quickly

'hello world' -match    'world'              # True, populates $matches
'hello world' -replace  'world','everyone'   # 'hello everyone'
'a,b,,c'      -split    ','                  # ['a','b','','c']
'a,b,,c'      -split    ',', 0, 'RegexMatch' # explicit regex split
@('foo','bar','baz') -match '^b'             # ['bar','baz'] - array mode

When the left operand of -match is an array, the operator filters: every element that matches survives, and the result is the filtered array rather than a Boolean. Useful, often missed.
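A small sketch of array mode, using made-up host names. With a collection on the left, $matches is not populated, and -notmatch gives the complement of the filter.

```powershell
# Hypothetical sample data, for illustration only
$names = @('web-01', 'web-02', 'db-01', 'cache-01')

$web    = $names -match    '^web-'   # elements that match
$others = $names -notmatch '^web-'   # elements that do not

$web      # web-01, web-02
$others   # db-01, cache-01

# An empty result is falsy, so the filter still works as a condition
$queueCheck = if ($names -match '^queue-') { 'found a queue host' } else { 'no queue hosts' }
$queueCheck   # no queue hosts
```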

$matches: The Auto-Populated Hashtable

Every successful -match populates $matches with the capture groups:

'2025-04-13' -match '(\d{4})-(\d{2})-(\d{2})'
$matches[0]          # full match: '2025-04-13'
$matches[1]          # '2025'
$matches[2]          # '04'
$matches[3]          # '13'

$matches is overwritten on every successful match. If you do -match twice in a row, the first capture is gone. Copy it ($m = $matches.Clone()) before the next match.
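A minimal sketch of the snapshot pattern, assuming two unrelated matches whose captures are both needed afterwards. The variable names $date and $time are illustrative.

```powershell
$null = '2025-04-13' -match '(\d{4})-(\d{2})-(\d{2})'
$date = $matches.Clone()          # snapshot before the next match overwrites it

$null = '14:30' -match '(\d{2}):(\d{2})'
$time = $matches.Clone()

# Both sets of captures are still available
$summary = '{0}/{1} at {2}h' -f $date[3], $date[2], $time[1]
$summary   # 13/04 at 14h
```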

Named Captures: Stop Counting

Numeric groups break the moment you reorder the regex. Named captures don't:

'2025-04-13' -match '(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})'
$matches.year        # '2025'
$matches.month       # '04'
$matches.day         # '13'

Use (?<name>...) for every group you'll reference. Groups written as (?:...) are non-capturing; use them for grouping that doesn't need to be extracted.

# the log|audit alternation is grouped but not captured; 'level' and 'msg' are
'log[INFO] hello' -match '(?:log|audit)\[(?<level>[A-Z]+)\]\s+(?<msg>.*)'

Multiple Matches: [regex]::Matches

-match returns at most one match. For all matches, drop to the underlying class:

$text = 'order #4521, ref #998, batch #12'
[regex]::Matches($text, '#(\d+)') | ForEach-Object {
    [pscustomobject]@{
        Whole = $_.Value
        Number = [int]$_.Groups[1].Value
    }
}

Or use Select-String -AllMatches:

Select-String -InputObject $text -Pattern '#(\d+)' -AllMatches |
    ForEach-Object { $_.Matches.Value }
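Named groups work with [regex]::Matches too. The sketch below reuses the sample string from above; the group names kind and id are my own choice for illustration.

```powershell
$text = 'order #4521, ref #998, batch #12'

$found = [regex]::Matches($text, '(?<kind>\w+)\s+#(?<id>\d+)') | ForEach-Object {
    [pscustomobject]@{
        Kind  = $_.Groups['kind'].Value
        Id    = [int]$_.Groups['id'].Value
        Index = $_.Index              # offset of the match inside $text
    }
}

$found | Sort-Object Id   # sorted numerically, because Id is an [int]
```

Each Match object also carries Index and Length, which helps when you need to report where in the input a hit occurred.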

Compiled Regex: Reuse Wins

Anything you'll run more than a few thousand times deserves a compiled, reusable [regex]:

$rx = [regex]::new('^(?<level>[A-Z]+)\s+(?<msg>.*)', 'Compiled, Multiline')

Get-Content ./big.log | ForEach-Object {
    $m = $rx.Match($_)
    if ($m.Success)
    {
        [pscustomobject]@{
            Level = $m.Groups['level'].Value
            Msg   = $m.Groups['msg'].Value
        }
    }
}

Approach                                           1M lines
-match operator (pattern resolved on every call)   ~3,800 ms
Cached [regex]                                     ~2,900 ms
Cached [regex] with Compiled flag                  ~1,500 ms

The Compiled flag has a one-time JIT cost (~50 ms). For anything in a tight loop, it pays for itself before the first second.
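To check these numbers on your own machine, a rough benchmark sketch follows. It assumes synthetic input generated in memory (100,000 lines rather than 1M, to keep it quick), so absolute timings will differ from the figures above and from real log files.

```powershell
# Synthetic input; real logs will behave differently
$lines   = 1..100000 | ForEach-Object { "INFO message number $_" }
$pattern = '^(?<level>[A-Z]+)\s+(?<msg>.*)'

$cached   = [regex]::new($pattern)
$compiled = [regex]::new($pattern, 'Compiled')

$results = [ordered]@{
    'operator' = Measure-Command { foreach ($l in $lines) { $null = $l -match $pattern } }
    'cached'   = Measure-Command { foreach ($l in $lines) { $null = $cached.IsMatch($l) } }
    'compiled' = Measure-Command { foreach ($l in $lines) { $null = $compiled.IsMatch($l) } }
}

# Print one line per approach with the elapsed milliseconds
$results.GetEnumerator() | ForEach-Object {
    '{0,-10} {1,8:N0} ms' -f $_.Key, $_.Value.TotalMilliseconds
}
```

Run it a few times and ignore the first pass; the first measurement includes warm-up costs unrelated to the regex.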

Multiline Mode: ^ and $ Per Line

Default mode treats ^ and $ as start/end of the whole string. With Multiline, they match per line:

$rx = [regex]::new('^ERROR.*', 'Multiline')
$rx.Matches($logText).Value

Combined with Singleline (where . matches newlines), you get the four mode combinations:

Singleline      Multiline   . matches \n   ^/$ per line
off (default)   off         no             no
off             on          no             yes
on              off         yes            no
on              on          yes            yes

Pick deliberately. The default is the most surprising for log parsing.
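The four combinations are easy to see on a three-line string. A minimal sketch, with a made-up $text value:

```powershell
$text = "first line`nERROR second`nthird"

# Default: ^ only matches at the very start of the string
$default = [regex]::Matches($text, '^ERROR.*').Count                       # 0

# Multiline: ^ matches after each newline
$multi = [regex]::Matches($text, '^ERROR.*', 'Multiline').Value            # ERROR second

# Default: . stops at the newline, so the pattern cannot span lines
$dotDefault = [regex]::Match($text, 'first.*third').Success                # False

# Singleline: . crosses newlines
$dotSingle = [regex]::Match($text, 'first.*third', 'Singleline').Success   # True

# Both: ^ is per line and .* runs to the end of the string
$both = [regex]::Match($text, '^ERROR.*', 'Multiline, Singleline').Value
```

With both flags set, $both holds "ERROR second", the newline, and "third", which is rarely what a log parser wants.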

-replace With a Callback

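The -replace operator usually takes a pattern and a replacement string. In PowerShell 6 and later the replacement can also be a script block, with the current match available as $_. In Windows PowerShell 5.1, call [regex]::Replace directly and pass a script block, which PowerShell converts to a MatchEvaluator delegate: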

$text = 'temperatures: 23C, 18C, 31C'
[regex]::Replace($text, '(\d+)C', {
    param($m)
    $c = [int]$m.Groups[1].Value
    $f = [math]::Round($c * 9 / 5 + 32)
    "$f F"
})
# 'temperatures: 73 F, 64 F, 88 F'

Anything you can compute, you can substitute. Most scripted regex use cases that "needed Python" actually fit here.
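If the same callback is used in several places, it can be stored as a delegate and paired with a cached [regex]. The sketch below shows that form next to the operator form; the operator form is guarded by a version check because it only works in PowerShell 6 and later. The variable names are illustrative.

```powershell
$text = 'temperatures: 23C, 18C, 31C'

# Explicit delegate cast; works in 5.1 and 7+
$toF = [System.Text.RegularExpressions.MatchEvaluator] {
    param($m)
    $f = [math]::Round([int]$m.Groups[1].Value * 9 / 5 + 32)
    "$f F"
}
$rx = [regex]::new('(\d+)C')
$viaClass = $rx.Replace($text, $toF)

# Operator form, PowerShell 6+ only: the current match is $_
if ($PSVersionTable.PSVersion.Major -ge 6) {
    $viaOperator = $text -replace '(\d+)C', {
        $f = [math]::Round([int]$_.Groups[1].Value * 9 / 5 + 32)
        "$f F"
    }
}

$viaClass   # temperatures: 73 F, 64 F, 88 F
```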

Backreferences in the Replacement

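Substitution strings use $1, $2, and ${name} for back-references, not \1. Keep the replacement in single quotes so PowerShell doesn't expand $1 as a variable before the regex engine sees it: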

'jane.doe@example.com' -replace '^(\w+)\.(\w+)@', '$2.$1@'
# 'doe.jane@example.com'

'2025-04-13' -replace '(?<y>\d{4})-(?<m>\d{2})-(?<d>\d{2})', '${d}/${m}/${y}'
# '13/04/2025'

A literal $ in the replacement is $$. Forgetting this is the classic "where did my dollar sign go" bug.
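A short sketch of the dollar-sign cases, using a made-up input string. The last line shows what happens with double quotes, assuming no variable named 1 is defined in the session.

```powershell
# Single quotes: the regex engine sees $1 and substitutes group 1
$plain   = 'cost 50' -replace '(\d+)', 'USD $1'    # cost USD 50

# $$ emits one literal dollar sign, then $1 is the group
$literal = 'cost 50' -replace '(\d+)', '$$$1'      # cost $50

# ${1} separates the group number from a digit that follows it
$braced  = 'cost 50' -replace '(\d+)', '${1}0'     # cost 500

# Double quotes: PowerShell expands $1 first, so the group reference is gone
$expanded = 'cost 50' -replace '(\d+)', "USD $1"
```

In the last case the number disappears from the result, because the replacement string the regex engine receives no longer contains a group reference.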

Useful RegexOptions Flags

$opts = [System.Text.RegularExpressions.RegexOptions]::Compiled -bor
        [System.Text.RegularExpressions.RegexOptions]::IgnoreCase -bor
        [System.Text.RegularExpressions.RegexOptions]::CultureInvariant
[regex]::new($pattern, $opts)

The flags worth knowing:

  • IgnoreCase: case-insensitive matching.
  • Compiled: JIT the regex for speed.
  • Multiline: per-line ^/$.
  • Singleline: . matches \n.
  • IgnorePatternWhitespace: allow whitespace and # comments inside the pattern. Lifesaver for complex regex.
  • CultureInvariant: disable culture-specific casing (Turkish "I" / "ı" being the famous trap).
  • ExplicitCapture: (...) becomes non-capturing; only (?<name>...) captures. Cleans up $matches.

$rx = [regex]::new(@'
    ^                       # start of line
    (?<ts>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})
    \s+
    \[(?<level>[A-Z]+)\]
    \s+
    (?<msg>.*)
    $
'@, 'Compiled, Multiline, IgnorePatternWhitespace')

Verbose, readable, fast. Pick this style for anything beyond a one-liner.
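Two smaller points are worth a sketch: what ExplicitCapture does to the group list, and how to get the same flags with the operators. .NET supports inline option letters such as (?m), (?i), (?s), (?x), and (?n) at the start of a pattern, which is the only way to pass Multiline and friends to -match. The sample strings are made up.

```powershell
# ExplicitCapture: the unnamed (\d{4}) no longer captures
$rx = [regex]::new('(\d{4})-(?<month>\d{2})', 'ExplicitCapture')
$m  = $rx.Match('2025-04')

$groupCount = $m.Groups.Count            # 2: the whole match plus 'month'
$month      = $m.Groups['month'].Value   # 04
$groupNames = $rx.GetGroupNames()        # 0, month

# Inline option: (?m) turns on Multiline for the -match operator
$hit = "info`nERROR disk full" -match '(?m)^ERROR\s+(?<msg>.*)$'
$msg = $matches.msg                      # disk full
```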

Anchors and Word Boundaries

'\bword\b'        # word boundary
'\Aword\Z'        # start/end of string regardless of Multiline (\Z allows a final newline; \z does not)
'(?=foo)'         # lookahead - "followed by foo"
'(?<=foo)'        # lookbehind - "preceded by foo"
'(?!foo)'         # negative lookahead
'(?<!foo)'        # negative lookbehind

Lookarounds don't consume input. Useful for "match X but only when surrounded by Y" without including Y in the result.

# Match a number not preceded by a $
'price 100, gain $50, count 12' | Select-String -Pattern '(?<!\$)\b\d+\b' -AllMatches |
    ForEach-Object { $_.Matches.Value }
# 100, 12
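A few more lookaround sketches on made-up strings, showing that the surrounding text constrains the match without becoming part of it:

```powershell
# Lookbehind: the digits after 'id=', without the key itself
$id = [regex]::Match('user=alice id=42', '(?<=id=)\d+').Value          # 42

# Lookahead: numbers that are followed by 'px'
$px = [regex]::Matches('10px 20em 30px', '\d+(?=px)').Value            # 10, 30

# Negative lookahead: numbers NOT followed by 'px'.
# The extra |\d stops the engine from backtracking to a partial number like '1'.
$notPx = [regex]::Matches('10px 20em 30px', '\d+(?!px|\d)').Value      # 20
```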

Common Pitfalls

  • -replace is regex by default. 'a.b' -replace '.', 'X' returns 'XXX'. Use [regex]::Escape($literal) if your input is user-controlled and meant as literal text.
  • Greedy by default. <(.+)> on <a><b> captures a><b. Add ? for lazy: <(.+?)> captures a.
  • $matches only populates on success; after a failed match it still holds the previous result. Check the Boolean result of -match before reading.
  • Line endings differ by platform. Files written on Windows usually end lines with \r\n. The $ anchor matches before \n but not before \r, so the \r either ends up inside your match or prevents it. If you're matching at the end of a line, allow for it explicitly, for example with (?:\r?\n|\Z).
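Each pitfall can be reproduced in a line or two. The sketch below uses made-up inputs; for the line-ending case it uses a lookahead, (?=\r?$) with Multiline, which is a variation on the pattern suggested above that leaves the line break out of the match.

```powershell
# 1. Escape literal input before using it as a pattern
$needle  = 'a.b'
$escaped = 'a.b axb' -replace [regex]::Escape($needle), 'X'             # X axb

# 2. Greedy versus lazy
$greedy = [regex]::Match('<a><b>', '<(.+)>').Groups[1].Value            # a><b
$lazy   = [regex]::Match('<a><b>', '<(.+?)>').Groups[1].Value           # a

# 3. A failed -match leaves the previous $matches in place
$null   = 'abc 123' -match '\d+'
$failed = 'no digits' -match '\d+'                                      # False
$stale  = $matches[0]                                                   # still 123

# 4. CRLF input: stop before an optional \r at the end of the line
$crlf = "ERROR one`r`nINFO two`r`n"
$line = [regex]::Matches($crlf, '(?m)^ERROR.*?(?=\r?$)').Value          # ERROR one
```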

A Real-World Example

Parse Nginx access logs into structured objects:

$rx = [regex]::new(@'
    ^(?<ip>\S+)\s+
    \S+\s+
    (?<user>\S+)\s+
    \[(?<ts>[^\]]+)\]\s+
    "(?<method>\S+)\s+(?<path>\S+)\s+(?<proto>[^"]+)"\s+
    (?<status>\d+)\s+
    (?<size>\d+|-)\s+
    "(?<ref>[^"]*)"\s+
    "(?<ua>[^"]*)"
'@, 'Compiled, IgnorePatternWhitespace')

Get-Content ./access.log | ForEach-Object {
    $m = $rx.Match($_)
    if ($m.Success)
    {
        [pscustomobject]@{
            Ip     = $m.Groups['ip'].Value
            Time   = [DateTimeOffset]::ParseExact($m.Groups['ts'].Value,
                       'dd/MMM/yyyy:HH:mm:ss zzz',
                       [Globalization.CultureInfo]::InvariantCulture)
            Method = $m.Groups['method'].Value
            Path   = $m.Groups['path'].Value
            Status = [int]$m.Groups['status'].Value
            Size   = if ($m.Groups['size'].Value -eq '-') { 0 } else { [int]$m.Groups['size'].Value }
            UA     = $m.Groups['ua'].Value
        }
    }
} | Where-Object Status -ge 500 |
    Select-Object Time, Ip, Status, Path -First 50

One regex, one pipeline, structured objects out the other side. From here it's Group-Object, Export-Csv, or pipe into a SIEM.

What to Do Next

-match and -replace are convenient. The .NET regex engine underneath is industrial-strength: named captures, compiled reuse, multiline mode, lookarounds, callback substitution. Reach for [regex]::new() whenever you'd otherwise call the operator more than a few times in a row, name your captures so the script survives a refactor, and use IgnorePatternWhitespace to keep complex patterns readable.

Three concrete moves to upgrade your regex code today:

  1. Audit the most frequently run script in your pipeline. Search for -match inside loops, where every iteration pays the operator's overhead again. Lift the pattern to a [regex]::new($pattern, 'Compiled') outside the loop and benchmark with Measure-Command. In the table above the gain was roughly 2.5x over 1M lines; below about 10k iterations the one-time JIT cost can cancel it out.
  2. Find a regex with three or more numeric groups ($matches[1], $matches[2]) in your codebase. Convert each to a named capture ((?<year>\d{4})). The first time someone reorders the pattern, the named version doesn't break.
  3. Take any complex regex longer than 50 characters and rewrite it as a verbose multi-line here-string with IgnorePatternWhitespace and # comments. Future-you (and any reviewer) will read it instead of running it through regex101 first.

Pairs naturally with the hashtables vs PSCustomObject post (a [regex]::Matches result is a structured object you'll want to project with Select-Object) and the logging post (script-block log parsing in DFIR is regex applied at scale, and the verbose-pattern style above pays off).