Regex Boundaries Tutorial

Regex boundaries allow you to assert things about where a token can exist in a search string. Let's imagine for a moment that we want to make sure that the first three characters of a string are digits followed by a dash (-). We would do this using the caret (^) metacharacter, which instructs the regex engine to make sure the token to its right is the first thing in the input. By default that means the very beginning of the string; with the Pattern.MULTILINE flag it also means the beginning of each line.
     "^(\\d){3}-" · "867-5309" will return TRUE when invoking the .find() method.

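Here is a minimal sketch of the caret in action, in case you want to try it before building the full program. The class name CaretDemo and the helper found() are my own made-up names for illustration, not part of the tutorial's program. It also shows the difference the Pattern.MULTILINE flag makes when the digits start on a second line.

```java
import java.util.regex.Pattern;

class CaretDemo {
    // Hypothetical helper: true when regex is found somewhere in text,
    // compiled with the given flags (0 means no flags).
    static boolean found(String regex, int flags, String text) {
        return Pattern.compile(regex, flags).matcher(text).find();
    }

    public static void main(String[] args) {
        String regex = "^(\\d){3}-";

        // The three digits and the dash open the string, so ^ is satisfied.
        System.out.println(found(regex, 0, "867-5309"));        // true

        // The same digits appear later in the string, so ^ rejects them.
        System.out.println(found(regex, 0, "Call 867-5309"));   // false

        // Without MULTILINE, ^ only matches at the start of the whole input.
        System.out.println(found(regex, 0, "Jenny\n867-5309")); // false

        // With MULTILINE, ^ also matches right after a line terminator.
        System.out.println(found(regex, Pattern.MULTILINE, "Jenny\n867-5309")); // true
    }
}
```
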
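We can also use boundaries to make sure that a token is the last thing to exist in a line. We do this by using the dollar sign ($) metacharacter, which makes sure that the token to its left is the last thing at the end of the input (or at the end of each line when Pattern.MULTILINE is set). Using both ^ and $ in one regex forces the whole string to fit the pattern.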
     "(\\d){4}$" · "867-5309" will return TRUE when invoking the .find() method.
     "^(\\d){3}-(\\d){4}$" · "867-5309" will return TRUE when invoking the .find() method.

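The short sketch below shows what the dollar sign accepts and rejects. DollarDemo and found() are hypothetical names I picked for the example. The last two lines use Pattern.matches(), which only succeeds when the entire string fits the regex, so for a single-line string like these it gives the same answer as wrapping the pattern in ^ and $.

```java
import java.util.regex.Pattern;

class DollarDemo {
    // Hypothetical helper: true when regex is found somewhere in text.
    static boolean found(String regex, String text) {
        return Pattern.compile(regex).matcher(text).find();
    }

    public static void main(String[] args) {
        // Four digits sit at the very end of the string.
        System.out.println(found("(\\d){4}$", "867-5309"));           // true

        // "5309" is followed by more text and "12" is too short.
        System.out.println(found("(\\d){4}$", "867-5309 ext. 12"));   // false

        // Anchoring both ends rejects a string with the dash in the wrong place.
        System.out.println(found("^(\\d){3}-(\\d){4}$", "4867-530")); // false

        // matches() checks the whole string, so no anchors are needed.
        System.out.println(Pattern.matches("(\\d){3}-(\\d){4}", "867-5309")); // true
        System.out.println(Pattern.matches("(\\d){3}-(\\d){4}", "4867-530")); // false
    }
}
```
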
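Another useful regex boundary is \\b, which you can use to define a word boundary. A word boundary is a position where a word character (a letter, digit, or underscore) sits next to a non-word character or next to the start or end of the string.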
     "\\bcar\\b" · "Your car is being towed!" will return TRUE when invoking the .find() method.
     "\\bcar\\b" · "Goldfish are members of the carp family." will return FALSE when invoking the .find() method.
     "\\bcar\\b" · "I have a scar on my leg." will return FALSE when invoking the .find() method.
     "\\bcar\\b" · "That scarf looks warm." will return FALSE when invoking the .find() method.

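To see exactly where \\b lets a match happen, the sketch below collects the starting index of every match instead of just printing true or false. WordBoundaryDemo and matchStarts() are hypothetical names used only for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class WordBoundaryDemo {
    // Hypothetical helper: returns the start index of every match of regex in text.
    static List<Integer> matchStarts(String regex, String text) {
        List<Integer> starts = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(text);
        while (m.find()) {
            starts.add(m.start());
        }
        return starts;
    }

    public static void main(String[] args) {
        String text = "car carp scar scarf car";

        // Without boundaries, "car" is found inside every word.
        System.out.println(matchStarts("car", text));         // [0, 4, 10, 15, 20]

        // With \b on both sides, only the two standalone words survive.
        System.out.println(matchStarts("\\bcar\\b", text));   // [0, 20]

        // A hyphen is not a word character, so "car" still counts as a whole word here.
        System.out.println(matchStarts("\\bcar\\b", "car-wash")); // [0]
    }
}
```
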
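Finally, another commonly used regex boundary is \\B, which you can use to define a non-word boundary. It matches at every position where \\b does not, for example between two letters in the middle of a word.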
     "\\bcar\\B" · "Goldfish are members of the carp family." will return TRUE when invoking the .find() method.
     "\\Bcar\\b" · "I have a scar on my leg." will return TRUE when invoking the .find() method.
     "\\Bcar\\B" · "That scarf looks warm." will return TRUE when invoking the .find() method.

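The same index-collecting idea makes it easy to see which word each \\B combination picks out. This is only a sketch; NonWordBoundaryDemo and matchStarts() are hypothetical names, and the test sentence is one I made up so that all four cases sit in a single string.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NonWordBoundaryDemo {
    // Hypothetical helper: returns the start index of every match of regex in text.
    static List<Integer> matchStarts(String regex, String text) {
        List<Integer> starts = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(text);
        while (m.find()) {
            starts.add(m.start());
        }
        return starts;
    }

    public static void main(String[] args) {
        // Index of each "car": car=0, carp=4, scar=10, scarf=15, car=20
        String text = "car carp scar scarf car";

        // Word starts with "car" but keeps going: only "carp".
        System.out.println(matchStarts("\\bcar\\B", text)); // [4]

        // Word ends with "car" but has something in front: only "scar".
        System.out.println(matchStarts("\\Bcar\\b", text)); // [10]

        // "car" is buried inside a word on both sides: only "scarf".
        System.out.println(matchStarts("\\Bcar\\B", text)); // [15]
    }
}
```
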
There are other rarely-if-ever-used regex boundaries that I am going to leave out of this tutorial. I really just want to introduce you to the boundaries that you will see and use on a regular basis. There are many thick books that go into the fine details of the regex language, but in my experience only about 25% of the regex language gets used 95% of the time in real-world code.



Open the command prompt (CMD - see Getting Started) and type in the following commands.

C:\Windows\System32>cd \
C:\>md Java
C:\>cd Java
C:\Java>
C:\Java>md RegexBoundaries
C:\Java>cd RegexBoundaries
C:\Java\RegexBoundaries>Notepad RegexBoundaries.java

Copy and paste, or type, the following code into Notepad, and be sure to save the file when you are done.


import java.util.regex.*;

class RegexBoundaries {
    public static void main(String[] args) {
        displayFind("^(\\d){3}-", "867-5309");
        displayFind("(\\d){4}$", "867-5309");
        displayFind("^(\\d){3}-(\\d){4}$", "867-5309");
        displayFind("^(\\d){3}-(\\d){4}$", "4867-530");
        System.out.println();

        displayFind("\\bcar\\b", "Your car is being towed!");
        displayFind("\\bcar\\b", "That was your car");
        displayFind("\\bcar\\b", "Car spelled backwards is rac.");
        displayFind("\\bcar\\b", "Goldfish are members of the carp family.");
        displayFind("\\bcar\\b", "I have a scar on my leg.");
        displayFind("\\bcar\\b", "That scarf looks warm.");
        displayFind("\\b(9){2}\\b", "99 bottles of beer on the wall...");
        displayFind("\\b1(9){3}\\b", "tonight we're going to party like it's 1999");
        System.out.println();

        displayFind("\\bcar\\B", "Goldfish are members of the carp family.");
        displayFind("\\Bcar\\b", "I have a scar on my leg.");
        displayFind("\\Bcar\\B", "That scarf looks warm.");
    }

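    // Prints every match of regex in searchMe, or a notice when there is none.
    static void displayFind(String regex, String searchMe) {
        boolean foundIt = false;
        // CASE_INSENSITIVE lets "car" also match "Car".
        Pattern p = Pattern.compile(regex, Pattern.CASE_INSENSITIVE);
        Matcher m = p.matcher(searchMe);
        // Each call to find() moves on to the next match, so loop until it returns false.
        while (m.find()) {
            System.out.println("Regex " + regex + " found " + m.group() + " in \"" + searchMe + "\"");
            foundIt = true;
        }
        if (!foundIt) {
            System.out.println("No matches found for " + regex + " in string \"" + searchMe + "\"");
        }
    }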
}

Now switch back to the command prompt (CMD) and type in javac RegexBoundaries.java and press Enter.
Now type in java RegexBoundaries and press Enter.


C:\Java\RegexBoundaries>javac RegexBoundaries.java
C:\Java\RegexBoundaries>java RegexBoundaries
See video for explanation



