Regular Expressions
The Pattern class (in java.util.regex) compiles a regex string into a reusable, immutable pattern object.
Key Methods:
compile(String regex): Compiles a regex.
matcher(CharSequence input): Creates a matcher to search a string.
matches(String regex, CharSequence input): Static shortcut that compiles the regex and checks whether the entire input matches.
split(CharSequence input): Splits input based on the pattern.
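The Pattern methods above can be sketched in a small example. This is a minimal illustration, not a canonical usage: the class name PatternMethodsDemo and the helpers splitCsv and isAllDigits are hypothetical names chosen for this sketch.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative only.
class PatternMethodsDemo {
    // Compile once and reuse: a compiled Pattern can be shared freely.
    private static final Pattern COMMA = Pattern.compile("\\s*,\\s*");

    // Splits a comma-separated line, dropping whitespace around each comma.
    static List<String> splitCsv(String line) {
        return Arrays.asList(COMMA.split(line));
    }

    // Full-string check via the static convenience method.
    static boolean isAllDigits(String s) {
        return Pattern.matches("\\d+", s);
    }

    public static void main(String[] args) {
        System.out.println(splitCsv("a , b,c"));  // [a, b, c]
        System.out.println(isAllDigits("1234"));  // true
        System.out.println(isAllDigits("12a4"));  // false
    }
}
```

Compiling the pattern once into a constant avoids recompiling it on every call, which is what the static Pattern.matches shortcut does each time it is invoked.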
The Matcher class performs matching operations for input strings.
Key Methods:
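find(): Finds the next subsequence that matches the pattern; returns false when no further match exists.
start() / end(): Return the start index of the current match and the index just past its last character.
group() / groupCount(): group() returns the text of the current match; groupCount() returns the number of capturing groups in the pattern.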
matches(): Checks if the entire input matches the pattern.
replaceAll(String replacement): Replaces all matches.
replaceFirst(String replacement): Replaces first match.
results(): Returns a Stream<MatchResult> for all matches (Java 9 and later).
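A short sketch of the Matcher methods listed above, assuming a simple ISO-style date pattern. The class MatcherMethodsDemo, its helper names, and the sample text are hypothetical and only serve the illustration; results() requires Java 9 or later.

```java
import java.util.List;
import java.util.regex.MatchResult;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical demo class; the names are illustrative only.
class MatcherMethodsDemo {
    // Three capturing groups: year, month, day.
    private static final Pattern DATE =
            Pattern.compile("(\\d{4})-(\\d{2})-(\\d{2})");

    // Returns capturing group 1 (the year) of the first date, or null if none.
    static String firstYear(String text) {
        Matcher m = DATE.matcher(text);
        return m.find() ? m.group(1) : null;
    }

    // Replaces every date with a fixed placeholder.
    static String redactDates(String text) {
        return DATE.matcher(text).replaceAll("<date>");
    }

    // Collects the full text of every match using results() (Java 9+).
    static List<String> allDates(String text) {
        return DATE.matcher(text).results()
                .map(MatchResult::group)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        String text = "Released 2023-04-01, patched 2023-05-15.";
        System.out.println(firstYear(text));    // 2023
        System.out.println(redactDates(text));  // Released <date>, patched <date>.
        System.out.println(allDates(text));     // [2023-04-01, 2023-05-15]
    }
}
```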
String regex = "test.*";
// returns true
Pattern.matches(regex, "test is ok");
// returns false
Pattern.matches(regex, "this test is ok");
String regex = "test[0-9]+";
// returns true
Pattern.matches(regex, "test12");
// returns false
Pattern.matches(regex, "test12s");
String text = "Email me at example@test.com or visit test.com.";
String regex = "\\b\\w+@\\w+\\.\\w+\\b"; // simple email pattern
Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(text);
while (matcher.find()) {
System.out.println(matcher.group());
}
// Prints: example@test.com
String text = "testisok12";
String regex = "ok";
Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(text);
while (matcher.find()) {
System.out.println("Pattern found from " + matcher.start() + " to " + (matcher.end() - 1));
}
// Prints: Pattern found from 6 to 7
Pattern.CASE_INSENSITIVE: Makes matching case-insensitive.
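Pattern.MULTILINE: ^ and $ match at the start and end of each line, not only of the whole input.
Pattern.DOTALL: . also matches line terminators such as \n.
Pattern.UNICODE_CHARACTER_CLASS: Makes predefined classes such as \w, \d, and \s follow Unicode definitions.
Pattern.COMMENTS: Ignores whitespace in the pattern and allows # comments that run to the end of the line.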
Example: Pattern p = Pattern.compile("hello", Pattern.CASE_INSENSITIVE);
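Flags can be combined with the bitwise OR operator. The following sketch shows CASE_INSENSITIVE with MULTILINE, and the effect of DOTALL; the class FlagsDemo, its method names, and the sample strings are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative only.
class FlagsDemo {
    // Counts lines that begin with "error", ignoring case.
    // MULTILINE lets ^ match at the start of every line.
    static int countErrorLines(String log) {
        Pattern p = Pattern.compile("^error",
                Pattern.CASE_INSENSITIVE | Pattern.MULTILINE);
        Matcher m = p.matcher(log);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    // Checks whether "start.*end" matches the whole text.
    // With dotAll = true, '.' also matches line terminators.
    static boolean matchesAcrossLines(String text, boolean dotAll) {
        int flags = dotAll ? Pattern.DOTALL : 0;
        return Pattern.compile("start.*end", flags).matcher(text).matches();
    }

    public static void main(String[] args) {
        String log = "Error: disk\ninfo: ok\nERROR: net";
        System.out.println(countErrorLines(log));                             // 2
        System.out.println(matchesAcrossLines("start\nmiddle\nend", true));   // true
        System.out.println(matchesAcrossLines("start\nmiddle\nend", false));  // false
    }
}
```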
. : Any character (line terminators only when Pattern.DOTALL is set)
\d : Digit [0-9]
\D : Non-digit
\s : Whitespace
\S : Non-whitespace
\w : Word character [a-zA-Z0-9_]
\W : Non-word character
\b : Word boundary
\B : Non-word boundary
Example: Pattern.matches("\\d+", "1234"); // returns true
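Two of the predefined classes above, \s and \b, are sketched below. The class PredefinedClassesDemo and its helpers are hypothetical; Pattern.quote is used so that the searched word is treated as literal text rather than as regex syntax.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative only.
class PredefinedClassesDemo {
    // Trims the string and collapses each run of whitespace into one space.
    static String normalizeSpaces(String s) {
        return s.trim().replaceAll("\\s+", " ");
    }

    // True if `word` occurs in `text` delimited by word boundaries.
    static boolean containsWord(String text, String word) {
        Pattern p = Pattern.compile("\\b" + Pattern.quote(word) + "\\b");
        return p.matcher(text).find();
    }

    public static void main(String[] args) {
        System.out.println(normalizeSpaces("  a \t b\n c "));     // a b c
        System.out.println(containsWord("a test case", "test"));  // true
        System.out.println(containsWord("testing", "test"));      // false
    }
}
```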
[xyz]: Matches x, y, or z
[^xyz]: Matches any character except x, y, or z
[a-zA-Z]: Matches any character in the specified range
[a-f[m-t]]: Union of ranges a–f and m–t
[a-z&&[^m-p]]: Intersection: any character in a–z except m–p
Example: Pattern.matches("[a-z]", "g"); // returns true
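Union and intersection of character classes can be checked with a few single-character inputs. This is a minimal sketch with hypothetical names (CharClassDemo, inUnion, inIntersection). The intersection is written without spaces because whitespace inside a pattern is not ignored unless Pattern.COMMENTS is set.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative only.
class CharClassDemo {
    // Union: a single character in a-f or in m-t.
    static boolean inUnion(String s) {
        return Pattern.matches("[a-f[m-t]]", s);
    }

    // Intersection: a single character in a-z that is not in m-p.
    static boolean inIntersection(String s) {
        return Pattern.matches("[a-z&&[^m-p]]", s);
    }

    public static void main(String[] args) {
        System.out.println(inUnion("c"));         // true
        System.out.println(inUnion("h"));         // false
        System.out.println(inIntersection("a"));  // true
        System.out.println(inIntersection("n"));  // false
    }
}
```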
X?: X appears 0 or 1 time
Example:
Pattern p = Pattern.compile("colou?r");
"u" is optional
Matches: "color", "colour"
X+: X appears 1 or more times
X*: X appears 0 or more times
Example:
Pattern p = Pattern.compile("go*gle");
Matches: "ggle", "gogle", "google", "gooogle"
The "o" may appear zero or more times (optional and repeatable).
X{n}: X appears exactly n times
X{n,}: X appears n or more times
X{n,m}: X appears at least n and at most m times
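The counted quantifiers can be sketched with a few small validators. The formats below (a five-digit code with an optional four-digit suffix, a 3 to 16 character username) are assumptions made for illustration, and the class QuantifiersDemo and its method names are hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; the formats and names are illustrative only.
class QuantifiersDemo {
    // Exactly 5 digits, optionally followed by "-" and exactly 4 digits.
    static boolean isZip(String s) {
        return Pattern.matches("\\d{5}(-\\d{4})?", s);
    }

    // Between 3 and 16 word characters.
    static boolean isUsername(String s) {
        return Pattern.matches("\\w{3,16}", s);
    }

    // "g", then two or more "o", then "gle".
    static boolean hasAtLeastTwoOs(String s) {
        return Pattern.matches("go{2,}gle", s);
    }

    public static void main(String[] args) {
        System.out.println(isZip("12345"));            // true
        System.out.println(isZip("12345-678"));        // false
        System.out.println(isUsername("ab"));          // false
        System.out.println(hasAtLeastTwoOs("gogle"));  // false
        System.out.println(hasAtLeastTwoOs("gooogle")); // true
    }
}
```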