Regex Non-Capturing Groups Tutorial

A non-capturing group lets you group part of a pattern (a token) without the regex engine assigning it a group number. Common reasons for using non-capturing groups include slightly less overhead (the engine does not have to store the matched text), applying inline modifiers (also called embedded flag expressions) to part of a pattern, and keeping a group out of the numbered .group(...) results. Note that the ? metacharacter is one of the most versatile metacharacters, and where it is placed in a regex can dramatically change the search results. Non-capturing groups use the same syntax as capturing groups, except that you must include ?: just inside the opening parenthesis.
   (?:regex)
   (lizard[^s])|(lizard[s]) = two regular capturing groups, one matching lizard followed by any character other than s and the other matching the plural lizards.
   (?:lizard|lizards) = non-capturing group, won't be assigned a group number.
   (?:lizard[s]*) = non-capturing group matching lizard followed by zero or more s characters (the * quantifier); use ? instead of * to allow at most one s.
   (?:lizards*) = the same pattern without the character class; the * applies only to the s.
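
To see the effect on group numbering, here is a minimal sketch. The class name GroupNumberingSketch, its helper methods, and the sample text are made up for illustration; only Pattern, Matcher, groupCount() and group(int) come from java.util.regex.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupNumberingSketch {
    // Returns how many numbered (capturing) groups the pattern defines.
    static int countGroups(String regex) {
        return Pattern.compile(regex).matcher("").groupCount();
    }

    // Returns group 1 of the first match, or null if nothing matches.
    static String firstGroup(String regex, String text) {
        Matcher m = Pattern.compile(regex).matcher(text);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String text = "two lizards"; // hypothetical sample text

        System.out.println(countGroups("(lizard)(s?)"));        // 2
        System.out.println(countGroups("(?:lizard)(s?)"));      // 1

        // With the first group made non-capturing, (s?) becomes group 1.
        System.out.println(firstGroup("(lizard)(s?)", text));   // lizard
        System.out.println(firstGroup("(?:lizard)(s?)", text)); // s
    }
}
```

Both patterns match the same text; the only difference is which parts are retrievable by number afterwards.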

Inline Modifiers, aka Embedded Flag Expressions

If you have been watching my tutorial series thus far, then you have seen me demonstrate the case-insensitive inline modifier (?i). Consider the following:
   Pattern p = Pattern.compile("the", Pattern.CASE_INSENSITIVE);
   Pattern p = Pattern.compile("(?i)(the)");
These two statements match the same text: every instance of the is found regardless of its case. (The second also captures the match as group 1 because of the parentheses.) What if we want only a portion of the search pattern to be case-insensitive? There are a few ways to accomplish this, but non-capturing groups make the task simple.
   Pattern p = Pattern.compile("<xml>(?i:yes)</xml>");
In the example above, I placed the i inline modifier between the ? and the : of the non-capturing group syntax, giving (?i:yes), so the token yes can be any mix of uppercase and lowercase characters while the surrounding <xml> tags must still match exactly.
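
As a rough illustration of that scoping, the sketch below compares the group-scoped modifier with a (?i) placed at the start of the pattern. The class name ScopedFlagSketch, the helper method, and the test strings are assumptions made for this example.

```java
import java.util.regex.Pattern;

class ScopedFlagSketch {
    // True if the whole text matches the regex.
    static boolean matches(String regex, String text) {
        return Pattern.matches(regex, text);
    }

    public static void main(String[] args) {
        String scoped = "<xml>(?i:yes)</xml>"; // only yes ignores case
        String global = "(?i)<xml>yes</xml>";  // the whole pattern ignores case

        System.out.println(matches(scoped, "<xml>YeS</xml>")); // true
        System.out.println(matches(scoped, "<XML>yes</XML>")); // false
        System.out.println(matches(global, "<XML>yes</XML>")); // true
    }
}
```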

Here is a list of the inline modifiers and their equivalent Pattern class constants:
   (?i) = Pattern.CASE_INSENSITIVE
   (?x) = Pattern.COMMENTS
   (?m) = Pattern.MULTILINE
   (?s) = Pattern.DOTALL
   (?u) = Pattern.UNICODE_CASE
   (?d) = Pattern.UNIX_LINES
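
The equivalence can be checked with a short sketch like the one below, which tries two of the modifiers both ways. The class name InlineFlagSketch, the helper method, and the two-line input are hypothetical; the flag constants are the ones listed above.

```java
import java.util.regex.Pattern;

class InlineFlagSketch {
    // True if the regex, compiled with the given flags, finds a match in text.
    static boolean finds(String regex, int flags, String text) {
        return Pattern.compile(regex, flags).matcher(text).find();
    }

    public static void main(String[] args) {
        String text = "one\ntwo"; // hypothetical two-line input

        // (?s) and Pattern.DOTALL both let . match the line terminator.
        System.out.println(finds("one.two", 0, text));              // false
        System.out.println(finds("(?s)one.two", 0, text));          // true
        System.out.println(finds("one.two", Pattern.DOTALL, text)); // true

        // (?m) and Pattern.MULTILINE both let ^ match at the start of each line.
        System.out.println(finds("^two", 0, text));                 // false
        System.out.println(finds("(?m)^two", 0, text));             // true
        System.out.println(finds("^two", Pattern.MULTILINE, text)); // true
    }
}
```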


Open the command prompt (CMD; see Getting Started) and type in the following commands.

C:\Windows\System32>cd \
C:\>md Java
C:\>cd Java
C:\Java>
C:\Java>md RegexNonCapturingGroups
C:\Java>cd RegexNonCapturingGroups
C:\Java\RegexNonCapturingGroups>Notepad RegexNonCapturingGroups.java

Copy and Paste, or type the following code into Notepad and be sure to save the file when you are done.


import java.util.regex.*;
class RegexNonCapturingGroups {
    public static void main(String args[]) {
        Matcher m = null;

        m = Pattern.compile("<xml>(?:yes)</xml>").matcher("<xml>YES</xml>");
        while(m.find()) {
            System.out.println(m.group());
        }

        m = Pattern.compile("<xml>(?:yes)</xml>").matcher("<xml>yes</xml>");
        while(m.find()) {
            System.out.println(m.group());
        }
        System.out.println("First two tests\n");

        m = Pattern.compile("<xml>(?i:yes)</xml>").matcher("<xml>YES</xml>");
        while(m.find()) {
            System.out.println(m.group());
        }

        m = Pattern.compile("<xml>(?i:yes)</xml>").matcher("<xml>yes</xml>");
        while(m.find()) {
            System.out.println(m.group());
        }

        m = Pattern.compile("<xml>(?i:yes)</xml>").matcher("<xml>Yes</xml>");
        while(m.find()) {
            System.out.println(m.group());
        }

        m = Pattern.compile("<xml>(?i:yes)</xml>").matcher("<xml>yEs</xml>");
        while(m.find()) {
            System.out.println(m.group());
        }
        System.out.println("Four tests on non-capturing groups with inline modifiers.\n");

        m = Pattern.compile("<(xml)>(?i:yes)</\\1>").matcher("<xml>Yes</xml>");
        while(m.find()) {
            System.out.println(m.group());
        }

        m = Pattern.compile("<(xml>)(?i:yes)</\\1").matcher("<xml>Yes</xml>");
        while(m.find()) {
            System.out.println(m.group());
        }

        m = Pattern.compile("(</*xml>)(?i:yes)\\1").matcher("<xml>Yes</xml>"); 
        while(m.find()) {
            System.out.println(m.group()); // never reached: \1 must repeat the captured "<xml>" exactly, but the text ends with "</xml>" (see video)
        }
        System.out.println("Capturing groups, backreferences, non-capturing groups, and inline modifiers.\n");

        m = Pattern.compile("<xml>(?i)(yes)(?-i)</xml>").matcher("<xml>Yes</xml>");
        while(m.find()) {
            System.out.println(m.group());
        }

        m = Pattern.compile("<xml>(?i)(yes)(?-i)</xml>").matcher("<xml>yES</xml>");
        while(m.find()) {
            System.out.println(m.group());
        }
    }
}
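
The backreference tests in the program can be isolated in a smaller sketch, shown here under the assumption that the point of the third one is that \1 repeats the captured text literally. The class name BackreferenceSketch, the helper method, and the extra input strings are made up for illustration.

```java
import java.util.regex.Pattern;

class BackreferenceSketch {
    // True if the regex finds a match anywhere in text.
    static boolean finds(String regex, String text) {
        return Pattern.compile(regex).matcher(text).find();
    }

    public static void main(String[] args) {
        String text = "<xml>Yes</xml>";

        // Group 1 captures "<xml>", so \1 needs a second "<xml>", not "</xml>".
        System.out.println(finds("(</*xml>)(?i:yes)\\1", text));            // false
        System.out.println(finds("(</*xml>)(?i:yes)\\1", "<xml>Yes<xml>")); // true

        // Capturing only the tag name keeps the closing slash outside the group.
        System.out.println(finds("<(xml)>(?i:yes)</\\1>", text));           // true

        // The (?i: ) scope ends with its group, so \1 is compared case-sensitively.
        System.out.println(finds("<(xml)>(?i:yes)</\\1>", "<xml>Yes</XML>")); // false
    }
}
```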

Now switch back to the command prompt (CMD) and type in javac RegexNonCapturingGroups.java and press Enter.
Now type in java RegexNonCapturingGroups and press Enter.


C:\Java\RegexNonCapturingGroups>javac RegexNonCapturingGroups.java
C:\Java\RegexNonCapturingGroups>java RegexNonCapturingGroups
See video for results


Final thoughts

None

