Regex Character Classes Part 1 Tutorial

Regex "character classes" are used to create a compiled Pattern object that instructs our Matcher object to match only one out of several possible characters. The term "classes" in "character classes" is somewhat confusing, as it refers not to a Java class but to a regex class. A regex character class is not a compiled .class file; the term "class" should be thought of as a simple classification or category. A character class defines several ways to search for or exclude single characters using square brackets [].

Simple character classes · single characters enclosed in square brackets

Pattern         Matcher                             .find() result
[aeiou]         "Java Tutorials"                    true · six finds
[bxy]           "Java Tutorials"                    false · no find
f[oa]x          "fax machine"                       true · one find
f[oa]x          "the quick brown fox"               true · one find
sim[ia]l[iae]r  "catch misspellings like simaler"   true · one find
col[ou]r        "coloring book"                     true · one find
col[ou]r        "colouring book"                    false · no find
[ou]            "colouring book"                    true · five finds
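
The find counts in the table above can be checked with a small sketch like the one below. The class name SimpleClassCount and the helper countFinds are made up for this illustration; only Pattern.compile(), Pattern.matcher(), and Matcher.find() come from java.util.regex.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SimpleClassCount {
    // Hypothetical helper: counts how many times find() succeeds for the regex.
    static int countFinds(String regex, String searchMe) {
        Matcher m = Pattern.compile(regex).matcher(searchMe);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countFinds("[aeiou]", "Java Tutorials"));  // prints 6
        System.out.println(countFinds("[ou]", "colouring book"));     // prints 5
        System.out.println(countFinds("col[ou]r", "colouring book")); // prints 0
    }
}
```

The last call returns 0 because [ou] stands for exactly one character, so col[ou]r cannot span the two letters "ou" in "colouring".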

Negation · "^" character followed by single characters enclosed in square brackets

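The negation "^" character, or metacharacter if you want to be technical, is placed immediately after the opening bracket and causes the .find() method to match any single character except the ones inside the brackets. A negated class must still match one character; it never matches "nothing".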

Pattern             Matcher                             .find() result
[^aeiou]            "Java Tutorials"                    true · eight finds
[^bxy]              "Java Tutorials"                    true · fourteen finds
f[^oa]x             "fax machine"                       false · no find
f[^oa]x             "the quick brown fox"               false · no find
sim[^a]l[^ie]r      "catch misspellings like simaler"   false · no find
col[^ou]r           "coloring book"                     false · no find
col[^ou]r           "colouring book"                    false · no find
[^ou]               "colouring book"                    true · nine finds
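
A negated class still has to consume exactly one character, which is easy to overlook. The sketch below illustrates this; the class name NegationDemo, the helper finds, and the test strings "fix" and "fx" are assumptions made for this example and are not part of the tutorial program.

```java
import java.util.regex.Pattern;

class NegationDemo {
    // Hypothetical helper: true if the regex is found anywhere in searchMe.
    static boolean finds(String regex, String searchMe) {
        return Pattern.compile(regex).matcher(searchMe).find();
    }

    public static void main(String[] args) {
        // 'i' is neither 'o' nor 'a', so [^oa] accepts it.
        System.out.println(finds("f[^oa]x", "fix")); // prints true
        // 'a' is excluded by the negated class.
        System.out.println(finds("f[^oa]x", "fax")); // prints false
        // There is no character between 'f' and 'x' for [^oa] to match.
        System.out.println(finds("f[^oa]x", "fx"));  // prints false
    }
}
```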

Ranges · characters separated with a "-" enclosed in square brackets

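The dash "-" character, or metacharacter if you want to be technical, is placed between two characters inside the brackets and causes the .find() method to match any single character in that range, endpoints included. Ranges are case sensitive, so [a-d] does not match "A".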

Pattern             Matcher                             .find() result
[a-d]               "Java Tutorials"                    true · three finds
[A-M]               "Java Tutorials"                    true · one find
[1-5]               "5 limes for $1"                    true · two finds
[^a-d]              "Java Tutorials"                    true · eleven finds
[^1-5]              "5 limes for $1"                    true · twelve finds
col[o-u]r           "coloring book"                     true · one find
col[o-u]r           "colouring book"                    false · no find
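
To see which characters a range actually picks out, a sketch along these lines may help. The class name RangeDemo and the helper matchedChars are hypothetical, and the [A-D] pattern is an extra case added here to show case sensitivity.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RangeDemo {
    // Hypothetical helper: joins every match of the regex into one string.
    static String matchedChars(String regex, String searchMe) {
        Matcher m = Pattern.compile(regex).matcher(searchMe);
        StringBuilder found = new StringBuilder();
        while (m.find()) {
            found.append(m.group());
        }
        return found.toString();
    }

    public static void main(String[] args) {
        System.out.println(matchedChars("[a-d]", "Java Tutorials")); // prints aaa
        System.out.println(matchedChars("[A-M]", "Java Tutorials")); // prints J
        System.out.println(matchedChars("[1-5]", "5 limes for $1")); // prints 51
        // Upper-case range: the lower-case letters in the string are not matched,
        // so this prints an empty line.
        System.out.println(matchedChars("[A-D]", "Java Tutorials"));
    }
}
```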



Open the command prompt (CMD; see Getting Started) and type in the following commands.

C:\Windows\System32>cd \
C:\>md Java
C:\>cd Java
C:\Java>
C:\Java>md RegexCharacters
C:\Java>cd RegexCharacters
C:\Java\RegexCharacters>Notepad RegexCharacters.java

Copy and paste, or type, the following code into Notepad, and be sure to save the file when you are done.


import java.util.regex.*;
class RegexCharacters {
    public static void main(String[] args) {
        // Simple character classes
        displayFind("[aeiou]","Java Tutorials");
        displayFind("[bxy]","Java Tutorials");
        displayFind("f[oa]x","fax machine");
        displayFind("f[oa]x","the quick brown fox");
        displayFind("sim[ia]l[iae]r","catch misspellings like simaler");
        displayFind("col[ou]r","coloring book");
        displayFind("col[ou]r","colouring book");
        displayFind("[ou]","colouring book");

        // Negation
        displayFind("[^aeiou]","Java Tutorials");
        displayFind("[^bxy]","Java Tutorials");
        displayFind("f[^oa]x","fax machine");
        displayFind("f[^oa]x","the quick brown fox");
        displayFind("sim[^a]l[^ie]r","catch misspellings like simaler");
        displayFind("col[^ou]r","coloring book");
        displayFind("col[^ou]r","colouring book");
        displayFind("[^ou]","colouring book");

        // Ranges
        displayFind("[a-d]","Java Tutorials");
        displayFind("[A-M]","Java Tutorials");
        displayFind("[1-5]","5 limes for $1");
        displayFind("[^a-d]","Java Tutorials");
        displayFind("[^1-5]","5 limes for $1");
        displayFind("col[o-u]r","coloring book");
        displayFind("col[o-u]r","colouring book");
    }

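    // Prints every match of regex in searchMe, or a "no matches" line if there are none.
    static void displayFind(String regex, String searchMe) {
        boolean foundIt = false;
        Pattern p = Pattern.compile(regex);
        Matcher m = p.matcher(searchMe);
        // Each call to find() moves on to the next match; it returns false once none remain.
        while(m.find()){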
            System.out.println("Matcher found " + m.group() + " at index "+ m.start() + " for regex " + regex + " in string \"" + searchMe +"\"" );
            foundIt = true;
        }
        if(!foundIt){
            System.out.println("No matches found for " + regex + " in string \"" + searchMe + "\"");
        }
        System.out.println();
    }
}

Now switch back to the command prompt (CMD), type in javac RegexCharacters.java, and press Enter.
Then type in java RegexCharacters and press Enter.


C:\Java\RegexCharacters>javac RegexCharacters.java
C:\Java\RegexCharacters>java RegexCharacters
Matcher found a at index 1 for regex [aeiou] in string "Java Tutorials"
Matcher found a at index 3 for regex [aeiou] in string "Java Tutorials"
Matcher found u at index 6 for regex [aeiou] in string "Java Tutorials"
Matcher found o at index 8 for regex [aeiou] in string "Java Tutorials"
Matcher found i at index 10 for regex [aeiou] in string "Java Tutorials"
Matcher found a at index 11 for regex [aeiou] in string "Java Tutorials"

No matches found for [bxy] in string "Java Tutorials"

Matcher found fax at index 0 for regex f[oa]x in string "fax machine"

Matcher found fox at index 16 for regex f[oa]x in string "the quick brown fox"

Matcher found simaler at index 24 for regex sim[ia]l[iae]r in string "catch misspellings like simaler"

Matcher found color at index 0 for regex col[ou]r in string "coloring book"

No matches found for col[ou]r in string "colouring book"

Matcher found o at index 1 for regex [ou] in string "colouring book"
Matcher found o at index 3 for regex [ou] in string "colouring book"
Matcher found u at index 4 for regex [ou] in string "colouring book"
Matcher found o at index 11 for regex [ou] in string "colouring book"
Matcher found o at index 12 for regex [ou] in string "colouring book"

Matcher found J at index 0 for regex [^aeiou] in string "Java Tutorials"
Matcher found v at index 2 for regex [^aeiou] in string "Java Tutorials"
Matcher found   at index 4 for regex [^aeiou] in string "Java Tutorials"
Matcher found T at index 5 for regex [^aeiou] in string "Java Tutorials"
Matcher found t at index 7 for regex [^aeiou] in string "Java Tutorials"
Matcher found r at index 9 for regex [^aeiou] in string "Java Tutorials"
Matcher found l at index 12 for regex [^aeiou] in string "Java Tutorials"
Matcher found s at index 13 for regex [^aeiou] in string "Java Tutorials"

Matcher found J at index 0 for regex [^bxy] in string "Java Tutorials"
Matcher found a at index 1 for regex [^bxy] in string "Java Tutorials"
Matcher found v at index 2 for regex [^bxy] in string "Java Tutorials"
Matcher found a at index 3 for regex [^bxy] in string "Java Tutorials"
Matcher found   at index 4 for regex [^bxy] in string "Java Tutorials"
Matcher found T at index 5 for regex [^bxy] in string "Java Tutorials"
Matcher found u at index 6 for regex [^bxy] in string "Java Tutorials"
Matcher found t at index 7 for regex [^bxy] in string "Java Tutorials"
Matcher found o at index 8 for regex [^bxy] in string "Java Tutorials"
Matcher found r at index 9 for regex [^bxy] in string "Java Tutorials"
Matcher found i at index 10 for regex [^bxy] in string "Java Tutorials"
Matcher found a at index 11 for regex [^bxy] in string "Java Tutorials"
Matcher found l at index 12 for regex [^bxy] in string "Java Tutorials"
Matcher found s at index 13 for regex [^bxy] in string "Java Tutorials"

No matches found for f[^oa]x in string "fax machine"

No matches found for f[^oa]x in string "the quick brown fox"

No matches found for sim[^a]l[^ie]r in string "catch misspellings like simaler"

No matches found for col[^ou]r in string "coloring book"

No matches found for col[^ou]r in string "colouring book"

Matcher found c at index 0 for regex [^ou] in string "colouring book"
Matcher found l at index 2 for regex [^ou] in string "colouring book"
Matcher found r at index 5 for regex [^ou] in string "colouring book"
Matcher found i at index 6 for regex [^ou] in string "colouring book"
Matcher found n at index 7 for regex [^ou] in string "colouring book"
Matcher found g at index 8 for regex [^ou] in string "colouring book"
Matcher found   at index 9 for regex [^ou] in string "colouring book"
Matcher found b at index 10 for regex [^ou] in string "colouring book"
Matcher found k at index 13 for regex [^ou] in string "colouring book"

Matcher found a at index 1 for regex [a-d] in string "Java Tutorials"
Matcher found a at index 3 for regex [a-d] in string "Java Tutorials"
Matcher found a at index 11 for regex [a-d] in string "Java Tutorials"

Matcher found J at index 0 for regex [A-M] in string "Java Tutorials"

Matcher found 5 at index 0 for regex [1-5] in string "5 limes for $1"
Matcher found 1 at index 13 for regex [1-5] in string "5 limes for $1"

Matcher found J at index 0 for regex [^a-d] in string "Java Tutorials"
Matcher found v at index 2 for regex [^a-d] in string "Java Tutorials"
Matcher found   at index 4 for regex [^a-d] in string "Java Tutorials"
Matcher found T at index 5 for regex [^a-d] in string "Java Tutorials"
Matcher found u at index 6 for regex [^a-d] in string "Java Tutorials"
Matcher found t at index 7 for regex [^a-d] in string "Java Tutorials"
Matcher found o at index 8 for regex [^a-d] in string "Java Tutorials"
Matcher found r at index 9 for regex [^a-d] in string "Java Tutorials"
Matcher found i at index 10 for regex [^a-d] in string "Java Tutorials"
Matcher found l at index 12 for regex [^a-d] in string "Java Tutorials"
Matcher found s at index 13 for regex [^a-d] in string "Java Tutorials"

Matcher found   at index 1 for regex [^1-5] in string "5 limes for $1"
Matcher found l at index 2 for regex [^1-5] in string "5 limes for $1"
Matcher found i at index 3 for regex [^1-5] in string "5 limes for $1"
Matcher found m at index 4 for regex [^1-5] in string "5 limes for $1"
Matcher found e at index 5 for regex [^1-5] in string "5 limes for $1"
Matcher found s at index 6 for regex [^1-5] in string "5 limes for $1"
Matcher found   at index 7 for regex [^1-5] in string "5 limes for $1"
Matcher found f at index 8 for regex [^1-5] in string "5 limes for $1"
Matcher found o at index 9 for regex [^1-5] in string "5 limes for $1"
Matcher found r at index 10 for regex [^1-5] in string "5 limes for $1"
Matcher found   at index 11 for regex [^1-5] in string "5 limes for $1"
Matcher found $ at index 12 for regex [^1-5] in string "5 limes for $1"

Matcher found color at index 0 for regex col[o-u]r in string "coloring book"

No matches found for col[o-u]r in string "colouring book"


Final thoughts

Stay tuned for part two of my character class tutorial where I will demonstrate unions, intersections, and subtraction.

