Regex Capturing Groups Introduction Tutorial

Capturing groups are built by enclosing a regular expression in a pair of parentheses. A simple example is a regex string literal enclosed in parentheses, like this: "(Lizard)". On the surface, capturing groups look very similar to plain regex string literals, and "(Lizard)" matches exactly the same text as "Lizard". In more complicated expressions such as "(([01]?\\d\\d?|2[0-4]\\d|25[0-5])\\.){3}([01]?\\d\\d?|2[0-4]\\d|25[0-5])", capturing groups not only make the expression more readable, they also group sub-expressions into units known as tokens, so that an operator such as {3} applies to the whole group rather than to a single character.
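As a rough illustration of the paragraph above, the following sketch checks that "(Lizard)" and "Lizard" match the same text, and that the grouped expression repeats "number followed by a dot" as one unit. The class name GroupBasicsSketch and the helper methods firstMatch and looksLikeDottedQuad are made up for this example, and the sample inputs are my own assumptions.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupBasicsSketch {
    // Returns the first match of regex in text, or null when nothing matches.
    static String firstMatch(String regex, String text) {
        Matcher m = Pattern.compile(regex).matcher(text);
        return m.find() ? m.group() : null;
    }

    // True when the whole input matches the grouped expression from the text above.
    static boolean looksLikeDottedQuad(String input) {
        String number = "([01]?\\d\\d?|2[0-4]\\d|25[0-5])";
        // The outer parentheses make {3} repeat "number plus dot" as a single unit.
        String regex = "(" + number + "\\.){3}" + number;
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        String text = "Happy Lizards";
        // With or without the parentheses, the same text is matched.
        System.out.println(firstMatch("Lizard", text));    // prints Lizard
        System.out.println(firstMatch("(Lizard)", text));  // prints Lizard

        System.out.println(looksLikeDottedQuad("192.168.1.10")); // prints true
        System.out.println(looksLikeDottedQuad("192.168.1"));    // prints false
    }
}
```

Without the outer parentheses, {3} would apply only to the escaped dot, so the grouping is what makes the repetition work here.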

Many operations can be performed on capturing groups: they are automatically numbered, they can be assigned a name or alias, and they can be used as backreferences. This tutorial will not discuss capturing group numbering, naming, or backreferencing. Its purpose is simply to demonstrate a basic usage of capturing groups so that I can introduce you to quantifiers in my next tutorial. Once I have discussed quantifiers and a few other topics, I will return to capturing group numbering, naming, and backreferencing so I can demonstrate more meaningful usage examples. This tutorial builds on concepts from my other regex tutorials, so I highly recommend watching them all before continuing.



Open the command prompt (CMD; see Getting Started) and type in the following commands.

C:\Windows\System32>cd \
C:\>md Java
C:\>cd Java
C:\Java>
C:\Java>md RegexGroupInto
C:\Java>cd RegexGroupInto
C:\Java\RegexGroupInto>Notepad RegexGroupInto.java

Copy and paste, or type, the following code into Notepad, and be sure to save the file when you are done.


import java.util.regex.*;

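// The class name matches the file name (RegexGroupInto.java) so that
// "java RegexGroupInto" can find and run it.
class RegexGroupInto {
    public static void main(String[] args) {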
        displayFind("(Lizard)[sz]","Should I name my new pet store Happy Lizards or Happy Lizardz?");
        displayFind("(lizard)[sz]","Is a snake a lizard?");
        displayFind("Lizard[sz]","Should I name my new pet store Happy Lizards or Happy Lizardz?");
        displayFind("(Lizards)|(Lizardz)","Should I name my new pet store Happy Lizards or Happy Lizardz?");
        displayFind("(Lizard)s|(Lizard)z","Should I name my new pet store Happy Lizards or Happy Lizardz?");
        displayFind("(Lizard[s])|(Lizard[z])","Should I name my new pet store Happy Lizards or Happy Lizardz?");
        displayFind("((Lizard)[s])|((Lizard)[z])","Should I name my new pet store Happy Lizards or Happy Lizardz?");
        displayFind("(Lizard)[sz]|(my)","Should I name my new pet store Happy Lizards or Happy Lizardz?");
    }

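    // Prints every match of regex in searchMe, or a message when there are none.
    static void displayFind(String regex, String searchMe) {
        boolean foundIt = false;
        // Compile the pattern and create the matcher in one statement; equivalent to:
        //   Pattern p = Pattern.compile(regex);
        //   Matcher m = p.matcher(searchMe);
        Matcher m = Pattern.compile(regex).matcher(searchMe);
        // Each call to find() advances to the next match, if there is one.
        while (m.find()) {
            System.out.println("Matcher found " + m.group() + " at index " + m.start() + " for regex " + regex + " in string \"" + searchMe + "\"");
            foundIt = true;
        }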
        if(!foundIt){
            System.out.println("No matches found for " + regex + " in string \"" + searchMe + "\"");
        }
        System.out.println();
    }
}

Now switch back to the command prompt (CMD), type javac RegexGroupInto.java, and press Enter.
Then type java RegexGroupInto and press Enter.


C:\Java\RegexGroupInto>javac RegexGroupInto.java
C:\Java\RegexGroupInto>java RegexGroupInto
Matcher found Lizards at index 37 for regex (Lizard)[sz] in string "Should I name my new pet store Happy Lizards or Happy Lizardz?"
Matcher found Lizardz at index 54 for regex (Lizard)[sz] in string "Should I name my new pet store Happy Lizards or Happy Lizardz?"

No matches found for (lizard)[sz] in string "Is a snake a lizard?"

Matcher found Lizards at index 37 for regex Lizard[sz] in string "Should I name my new pet store Happy Lizards or Happy Lizardz?"
Matcher found Lizardz at index 54 for regex Lizard[sz] in string "Should I name my new pet store Happy Lizards or Happy Lizardz?"

Matcher found Lizards at index 37 for regex (Lizards)|(Lizardz) in string "Should I name my new pet store Happy Lizards or Happy Lizardz?"
Matcher found Lizardz at index 54 for regex (Lizards)|(Lizardz) in string "Should I name my new pet store Happy Lizards or Happy Lizardz?"

Matcher found Lizards at index 37 for regex (Lizard)s|(Lizard)z in string "Should I name my new pet store Happy Lizards or Happy Lizardz?"
Matcher found Lizardz at index 54 for regex (Lizard)s|(Lizard)z in string "Should I name my new pet store Happy Lizards or Happy Lizardz?"

Matcher found Lizards at index 37 for regex (Lizard[s])|(Lizard[z]) in string "Should I name my new pet store Happy Lizards or Happy Lizardz?"
Matcher found Lizardz at index 54 for regex (Lizard[s])|(Lizard[z]) in string "Should I name my new pet store Happy Lizards or Happy Lizardz?"

Matcher found Lizards at index 37 for regex ((Lizard)[s])|((Lizard)[z]) in string "Should I name my new pet store Happy Lizards or Happy Lizardz?"
Matcher found Lizardz at index 54 for regex ((Lizard)[s])|((Lizard)[z]) in string "Should I name my new pet store Happy Lizards or Happy Lizardz?"

Matcher found my at index 14 for regex (Lizard)[sz]|(my) in string "Should I name my new pet store Happy Lizards or Happy Lizardz?"
Matcher found Lizards at index 37 for regex (Lizard)[sz]|(my) in string "Should I name my new pet store Happy Lizards or Happy Lizardz?"
Matcher found Lizardz at index 54 for regex (Lizard)[sz]|(my) in string "Should I name my new pet store Happy Lizards or Happy Lizardz?"
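
The output above suggests that every grouped variation finds the same two matches as the ungrouped Lizard[sz], and that the alternative (my) is reported first simply because it occurs earlier in the string. The sketch below checks this by collecting matches into lists and comparing them. The class name CompareMatchesSketch and the helper findAll are made up for this example and are not part of the program above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CompareMatchesSketch {
    // Collects every match as "text@index" so that two patterns can be compared.
    static List<String> findAll(String regex, String searchMe) {
        List<String> found = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(searchMe);
        while (m.find()) {
            found.add(m.group() + "@" + m.start());
        }
        return found;
    }

    public static void main(String[] args) {
        String text = "Should I name my new pet store Happy Lizards or Happy Lizardz?";
        List<String> baseline = findAll("Lizard[sz]", text);

        // The grouped variations used in the tutorial program.
        String[] variants = {
            "(Lizard)[sz]",
            "(Lizards)|(Lizardz)",
            "(Lizard)s|(Lizard)z",
            "(Lizard[s])|(Lizard[z])",
            "((Lizard)[s])|((Lizard)[z])"
        };
        for (String regex : variants) {
            boolean same = findAll(regex, text).equals(baseline);
            System.out.println(regex + " finds the same matches: " + same);
        }

        // Matches are reported in the order they occur in the string.
        System.out.println(findAll("(Lizard)[sz]|(my)", text));
        // prints [my@14, Lizards@37, Lizardz@54]
    }
}
```

Each variation should report true, which agrees with the indexes 37 and 54 shown in the output above.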




Final thoughts

None

