
Regular Expressions

Regular expressions (regex) let you describe text patterns, then search, validate, or transform strings that match them. Java supports regex in the standard library through the java.util.regex package.

What Is a Regular Expression?

A regular expression is a sequence of characters that defines a search pattern. You write a pattern like \d{3}-\d{4} and Java checks whether a string (say, a phone number) fits that shape.
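As a minimal sketch of that idea, the snippet below checks whether a string has the \d{3}-\d{4} shape. The class and method names are hypothetical and exist only for illustration.

```java
import java.util.regex.Pattern;

class PhoneShape {
    // Hypothetical helper: true only when the whole input looks like 555-1234.
    static boolean looksLikePhone(String input) {
        return Pattern.matches("\\d{3}-\\d{4}", input);
    }

    public static void main(String[] args) {
        System.out.println(looksLikePhone("555-1234"));  // true
        System.out.println(looksLikePhone("5551234"));   // false: the hyphen is missing
    }
}
```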

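Java’s regex engine is a backtracking NFA (non-deterministic finite automaton) implementation with Perl-style syntax, not a POSIX engine. Perl and Python take the same general approach, so experience in those languages transfers almost directly, although a few constructs differ between flavours.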

The Core Classes

Three classes do almost all the work:

| Class | Role |
| --- | --- |
| Pattern | Compiles a regex string into an efficient internal form |
| Matcher | Applies a compiled Pattern against a specific input string |
| PatternSyntaxException | Thrown when your regex syntax is invalid |

You never instantiate Pattern with new. Instead, use the factory method Pattern.compile(regex).

import java.util.regex.Pattern;
import java.util.regex.Matcher;

public class RegexBasics {
    public static void main(String[] args) {
        Pattern pattern = Pattern.compile("\\d+");   // one or more digits
        Matcher matcher = pattern.matcher("Order 42 ships in 3 days");

        while (matcher.find()) {
            System.out.println("Found: " + matcher.group()
                + " at index " + matcher.start());
        }
    }
}

Output:

Found: 42 at index 6
Found: 3 at index 18
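PatternSyntaxException is unchecked, so the compiler will not force you to handle it. One possible way to deal with an invalid pattern is sketched below; the helper name compileOrNull is an assumption made for this example, and returning null is just one of several reasonable designs.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

class SafeCompile {
    // Hypothetical helper: returns null instead of throwing on a bad pattern.
    static Pattern compileOrNull(String regex) {
        try {
            return Pattern.compile(regex);
        } catch (PatternSyntaxException e) {
            // getDescription() and getIndex() say what went wrong and where.
            System.out.println("Invalid regex: " + e.getDescription()
                + " near index " + e.getIndex());
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(compileOrNull("\\d+") != null);   // true
        System.out.println(compileOrNull("(\\d+") != null);  // false: the group is never closed
    }
}
```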

Quick Pattern Syntax Reference

Character Classes

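| Syntax | Matches |
| --- | --- |
| [abc] | a, b, or c |
| [^abc] | anything except a, b, or c |
| [a-z] | any lowercase ASCII letter |
| [a-zA-Z0-9] | ASCII alphanumeric |
| . | any character except a line terminator (unless DOTALL is set) |
| \d | digit (0–9) |
| \D | non-digit |
| \w | word character ([a-zA-Z0-9_]) |
| \W | non-word character |
| \s | whitespace (space, tab, newline…) |
| \S | non-whitespace |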

Note: In a Java string literal you must escape the backslash, so \d in a regex is written "\\d" in Java source code.
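A short sketch of character classes in Java source follows, with the doubled backslash visible in practice. The class and method names are hypothetical, and the {6} quantifier it uses is covered in the next section.

```java
class CharClassDemo {
    // Removes every non-digit character (\D) from the input.
    static String digitsOnly(String s) {
        return s.replaceAll("\\D", "");
    }

    // True for strings such as #1a2B3c: a hash followed by six hex digits.
    static boolean isHexColor(String s) {
        return s.matches("#[0-9a-fA-F]{6}");
    }

    public static void main(String[] args) {
        System.out.println(digitsOnly("(555) 010-9999"));  // 5550109999
        System.out.println(isHexColor("#1a2B3c"));         // true
        System.out.println(isHexColor("#12345g"));         // false: g is not a hex digit
    }
}
```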

Quantifiers

| Syntax | Meaning |
| --- | --- |
| ? | 0 or 1 times |
| * | 0 or more |
| + | 1 or more |
| {n} | exactly n times |
| {n,} | at least n times |
| {n,m} | between n and m times (inclusive) |

Append ? to make a quantifier lazy (match as few characters as possible): +?, *?, {n,m}?.
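The difference between greedy and lazy matching is easiest to see side by side. In this sketch, firstMatch is a hypothetical helper that returns the first match or null.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GreedyVsLazy {
    // Hypothetical helper: returns the first match of regex in input, or null.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String html = "<b>one</b> and <b>two</b>";

        // Greedy: .+ grabs as much as it can, so the match runs to the last </b>.
        System.out.println(firstMatch("<b>.+</b>", html));   // <b>one</b> and <b>two</b>

        // Lazy: .+? stops at the first </b> that lets the pattern succeed.
        System.out.println(firstMatch("<b>.+?</b>", html));  // <b>one</b>
    }
}
```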

Anchors and Boundaries

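| Syntax | Matches |
| --- | --- |
| ^ | start of input (or start of each line with MULTILINE) |
| $ | end of input, or just before a final line terminator (end of each line with MULTILINE) |
| \b | word boundary |
| \B | non-word boundary |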
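The following sketch shows \b and ^ in action. The count helper is hypothetical and simply tallies how many times a pattern is found in the input.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AnchorDemo {
    // Hypothetical helper: counts non-overlapping matches of p in input.
    static int count(Pattern p, String input) {
        Matcher m = p.matcher(input);
        int n = 0;
        while (m.find()) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        String words = "concatenate cat catalog";
        System.out.println(count(Pattern.compile("cat"), words));        // 3
        System.out.println(count(Pattern.compile("\\bcat\\b"), words));  // 1: only the standalone word

        String lines = "one\ntwo\nthree";
        System.out.println(count(Pattern.compile("^\\w+"), lines));                     // 1
        System.out.println(count(Pattern.compile("^\\w+", Pattern.MULTILINE), lines));  // 3
    }
}
```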

Checking for a Full Match

matches() on Matcher (or the shorthand String.matches()) checks whether the entire string matches the pattern — useful for input validation.

public class EmailValidator {
    public static void main(String[] args) {
        String emailRegex = "^[\\w.+-]+@[\\w-]+\\.[a-zA-Z]{2,}$";

        // Illustrative addresses on reserved example domains
        String[] emails = {"user@example.com", "bad@", "first.last+tag@example.org"};

        for (String email : emails) {
            boolean valid = email.matches(emailRegex);
            System.out.println(email + " -> " + (valid ? "valid" : "invalid"));
        }
    }
}

Output:

user@example.com -> valid
bad@ -> invalid
first.last+tag@example.org -> valid

Tip: String.matches(regex) is equivalent to Pattern.matches(regex, input). Both succeed only when the entire input matches, as if the pattern were anchored at both ends, so an explicit ^ and $ are optional with matches(). Use Matcher.find() when you only want to locate a match anywhere in the string.
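A small sketch of the difference between the two calls, using the same pattern for both. The class and method names are hypothetical.

```java
import java.util.regex.Pattern;

class MatchVsFind {
    private static final Pattern DIGITS = Pattern.compile("\\d+");

    // Whole input must consist of digits.
    static boolean isAllDigits(String s) {
        return DIGITS.matcher(s).matches();
    }

    // Digits may appear anywhere in the input.
    static boolean containsDigits(String s) {
        return DIGITS.matcher(s).find();
    }

    public static void main(String[] args) {
        System.out.println(isAllDigits("abc123"));     // false
        System.out.println(containsDigits("abc123"));  // true
        System.out.println(isAllDigits("123"));        // true
    }
}
```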

Capturing Groups

Wrap part of a pattern in ( ) to capture that portion separately. Groups are numbered left to right by their opening parenthesis.

import java.util.regex.*;

public class DateParser {
    public static void main(String[] args) {
        Pattern p = Pattern.compile("(\\d{4})-(\\d{2})-(\\d{2})");
        Matcher m = p.matcher("Invoice date: 2024-03-15");

        if (m.find()) {
            System.out.println("Year:  " + m.group(1));
            System.out.println("Month: " + m.group(2));
            System.out.println("Day:   " + m.group(3));
        }
    }
}

Output:

Year:  2024
Month: 03
Day:   15

Named Groups (Java 7+)

Use (?<name>...) to give a group a readable name instead of a number:

Pattern p = Pattern.compile("(?<year>\\d{4})-(?<month>\\d{2})-(?<day>\\d{2})");
Matcher m = p.matcher("2024-03-15");

if (m.matches()) {
    System.out.println("Year: " + m.group("year"));
}

Find and Replace

Matcher.replaceAll() and replaceFirst() let you swap matched text:

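import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Censor {
    public static void main(String[] args) {
        String text = "The price is 100 and 200 dollars.";

        // Replace every run of digits with ***
        Matcher matcher = Pattern.compile("\\d+").matcher(text);
        String censored = matcher.replaceAll("***");
        System.out.println(censored);
    }
}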

Output:

The price is *** and *** dollars.

You can also use String.replaceAll() directly — it compiles the pattern internally every call, so prefer Pattern.compile() when you reuse the same pattern.
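Replacement strings can also refer back to captured groups: $1 refers to group 1, and ${name} refers to a named group. The sketch below rewrites ISO dates into day/month/year order; the class name, method names, and sample text are made up for illustration.

```java
import java.util.regex.Pattern;

class DateRewrite {
    private static final Pattern ISO_DATE =
        Pattern.compile("(?<year>\\d{4})-(?<month>\\d{2})-(?<day>\\d{2})");

    // Numbered references: named groups still receive numbers, left to right.
    static String byNumber(String text) {
        return ISO_DATE.matcher(text).replaceAll("$3/$2/$1");
    }

    // Named references use the ${name} form.
    static String byName(String text) {
        return ISO_DATE.matcher(text).replaceAll("${day}/${month}/${year}");
    }

    public static void main(String[] args) {
        String text = "Due 2024-03-15, paid 2024-04-01";
        System.out.println(byNumber(text));  // Due 15/03/2024, paid 01/04/2024
        System.out.println(byName(text));    // Due 15/03/2024, paid 01/04/2024
    }
}
```

Because $ and \ are special in replacement strings, a literal replacement that may contain them should go through Matcher.quoteReplacement() first.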

Splitting Strings

Pattern.split() (or String.split()) divides a string wherever the pattern matches:

public class SplitDemo {
    public static void main(String[] args) {
        String csv = "apple , banana,  cherry , date";

        // Split on comma + optional surrounding whitespace
        String[] fruits = csv.split("\\s*,\\s*");

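        for (String fruit : fruits) {
            // The regex already consumes whitespace around commas;
            // trim() only guards against whitespace at the very start or end of the input
            System.out.println(fruit.trim());
        }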
    }
}

Output:

apple
banana
cherry
date
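When the same delimiter is used repeatedly, a precompiled Pattern can do the splitting itself. This sketch assumes a constant named COMMA and also shows the optional limit argument, which caps the number of resulting pieces.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class SplitWithPattern {
    // Comma with optional surrounding whitespace, compiled once.
    private static final Pattern COMMA = Pattern.compile("\\s*,\\s*");

    static String[] splitAll(String csv) {
        return COMMA.split(csv);
    }

    // With a limit of n, the result has at most n elements;
    // the last element keeps the unsplit remainder.
    static String[] splitFirst(String csv, int limit) {
        return COMMA.split(csv, limit);
    }

    public static void main(String[] args) {
        String csv = "apple , banana,  cherry , date";
        System.out.println(Arrays.toString(splitAll(csv)));       // [apple, banana, cherry, date]
        System.out.println(Arrays.toString(splitFirst(csv, 2)));  // [apple, banana,  cherry , date]
    }
}
```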

Pattern Flags

Pass flags as a second argument to Pattern.compile() to change matching behaviour:

| Flag constant | Shorthand | Effect |
| --- | --- | --- |
| Pattern.CASE_INSENSITIVE | (?i) | Ignore letter case |
| Pattern.MULTILINE | (?m) | ^/$ match line boundaries |
| Pattern.DOTALL | (?s) | . matches newline too |
| Pattern.COMMENTS | (?x) | Ignore whitespace and # comments in pattern |

Pattern p = Pattern.compile("hello", Pattern.CASE_INSENSITIVE);
System.out.println(p.matcher("Hello World").find()); // true

You can also embed flags inline: "(?i)hello" is equivalent.
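Flags are bit masks, so several can be combined with the | operator, or written together inline as (?im). The sketch below pulls error lines out of a made-up log string; the class, constants, and findAll helper are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FlagDemo {
    // Two ways to request case-insensitive, line-by-line matching.
    static final Pattern ERROR_LINE =
        Pattern.compile("^error:.*$", Pattern.CASE_INSENSITIVE | Pattern.MULTILINE);
    static final Pattern ERROR_LINE_INLINE = Pattern.compile("(?im)^error:.*$");

    // Hypothetical helper: collects every match of p in input.
    static List<String> findAll(Pattern p, String input) {
        List<String> out = new ArrayList<>();
        Matcher m = p.matcher(input);
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        String log = "INFO: ok\nError: disk full\nERROR: timeout";
        System.out.println(findAll(ERROR_LINE, log));         // [Error: disk full, ERROR: timeout]
        System.out.println(findAll(ERROR_LINE_INLINE, log));  // same result
    }
}
```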

Non-Capturing Groups and Lookaheads

Sometimes you want to group without capturing. Use (?:...):

// Match "color" or "colour"; (?:ou|o) groups the alternatives without creating a capture group
Pattern p = Pattern.compile("col(?:ou|o)r");

Lookaheads and lookbehinds (together called lookarounds) let you assert context without consuming characters:

| Syntax | Meaning |
| --- | --- |
| (?=...) | Positive lookahead: must be followed by |
| (?!...) | Negative lookahead: must NOT be followed by |
| (?<=...) | Positive lookbehind: must be preceded by |
| (?<!...) | Negative lookbehind: must NOT be preceded by |

// Find "Java" only when followed by " 21"
Pattern p = Pattern.compile("Java(?= 21)");
Matcher m = p.matcher("Java 8 and Java 21 are LTS");
while (m.find()) {
    System.out.println("Matched at: " + m.start());  // prints 11: only the Java 21 occurrence
}
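The other three lookaround forms work the same way. This sketch uses a hypothetical findAll helper and made-up sample strings to show a positive lookbehind, a negative lookbehind, and a negative lookahead.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LookaroundDemo {
    // Hypothetical helper: collects every match of regex in input.
    static List<String> findAll(String regex, String input) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        String order = "Cost: $45, qty 3";

        // Positive lookbehind: digits that come right after a dollar sign.
        System.out.println(findAll("(?<=\\$)\\d+", order));      // [45]

        // Negative lookbehind: whole numbers NOT preceded by a dollar sign.
        System.out.println(findAll("(?<!\\$)\\b\\d+", order));   // [3]

        // Negative lookahead: "Java" not followed by " 21".
        System.out.println(findAll("Java(?! 21)", "Java 8 and Java 21 are LTS").size());  // 1
    }
}
```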

Under the Hood

How Pattern.compile() Works

Pattern.compile(regex) tokenises the regex string and constructs an internal NFA graph. Each node in the graph represents one regex “state” and holds a reference to the next possible states. This compilation step is relatively expensive — that is why you should store Pattern instances in a static field when the same pattern is used repeatedly rather than recompiling on every method call.

// Good practice — compile once
private static final Pattern PHONE =
    Pattern.compile("\\+?\\d[\\d\\s()-]{7,14}\\d");

Backtracking and Catastrophic Backtracking

Java’s NFA engine uses backtracking: when a path fails, it rewinds and tries an alternative. Most patterns are fine, but nested quantifiers such as (a+)+ can trigger exponential backtracking on input that almost matches and then fails, for example a long run of a followed by a different character. Exploiting this is known as ReDoS (regular expression denial of service). The fix is to use possessive quantifiers (++, *+) or atomic groups ((?>...)), both of which Java supports, or to restructure the pattern.

Warning: Never pass user-supplied regex strings to Pattern.compile() without validation. A malicious pattern, or crafted input run against a vulnerable pattern, can cause catastrophic backtracking and hang your application.
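As a rough sketch of the possessive and atomic fixes, the patterns below are hardened variants of the vulnerable shape (a+)+b. The vulnerable pattern itself is deliberately never executed here, and the class and method names are hypothetical. String.repeat requires Java 11 or later.

```java
import java.util.regex.Pattern;

class PossessiveDemo {
    // Vulnerable shape, shown for comparison only: (a+)+b
    // Possessive a++ never gives characters back once it has matched them.
    private static final Pattern POSSESSIVE = Pattern.compile("(a++)+b");
    // The atomic group (?>a+) has the same no-backtracking effect.
    private static final Pattern ATOMIC = Pattern.compile("(?>a+)+b");

    static boolean matchesPossessive(String s) {
        return POSSESSIVE.matcher(s).matches();
    }

    static boolean matchesAtomic(String s) {
        return ATOMIC.matcher(s).matches();
    }

    public static void main(String[] args) {
        // A run of a's followed by a character that forces the match to fail.
        String hostile = "a".repeat(30) + "!";

        System.out.println(matchesPossessive("aaab"));   // true
        System.out.println(matchesPossessive(hostile));  // false, without exponential retries
        System.out.println(matchesAtomic("aaab"));       // true
        System.out.println(matchesAtomic(hostile));      // false
    }
}
```

Possessive quantifiers can change what a pattern accepts, not just how fast it fails (a++a can never match, for instance), so hardened patterns are worth re-testing against known-good inputs.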

Thread Safety

Pattern objects are immutable and thread-safe — share them freely across threads. Matcher objects are not thread-safe — create a new Matcher per thread (or per call) via pattern.matcher(input).
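One way to follow this rule is sketched below: a single static Pattern is shared by all worker threads of a parallel stream, while each call creates its own short-lived Matcher. The class and method names are assumptions for the example, and List.of requires Java 9 or later.

```java
import java.util.List;
import java.util.regex.Pattern;

class SharedPattern {
    // Safe to share: Pattern is immutable.
    private static final Pattern DIGITS = Pattern.compile("\\d+");

    static long countNumeric(List<String> inputs) {
        return inputs.parallelStream()
            // matcher(s) creates a fresh Matcher for each element,
            // so no Matcher is ever shared between threads.
            .filter(s -> DIGITS.matcher(s).matches())
            .count();
    }

    public static void main(String[] args) {
        System.out.println(countNumeric(List.of("12", "ab", "345", "6x")));  // 2
    }
}
```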

Practical Example: Password Strength Validator

import java.util.regex.*;

public class PasswordValidator {
    // At least 8 chars, one uppercase, one lowercase, one digit, one special char
    private static final Pattern STRONG =
        Pattern.compile("^(?=.*[A-Z])(?=.*[a-z])(?=.*\\d)(?=.*[@#$%^&+=!]).{8,}$");

    public static boolean isStrong(String password) {
        return STRONG.matcher(password).matches();
    }

    public static void main(String[] args) {
        System.out.println(isStrong("Weak1"));          // false
        System.out.println(isStrong("Str0ng@Pass!"));   // true
    }
}

Output:

false
true

Related Topics

  • Strings: understand Java’s String class before applying regex to it
  • String Methods: matches(), replaceAll(), and split() are the bridge between String and regex
  • StringTokenizer: a simpler (but less flexible) alternative for splitting strings by delimiter
  • Pattern Matching: Java 16+ instanceof pattern matching, a different but related “matching” concept
  • Stream API: combine regex with streams to filter and transform collections of strings elegantly
  • Custom Exceptions: handle PatternSyntaxException gracefully in user-facing validation code

Last updated June 13, 2026