Introduction To Regex - Regular Expressions

Regular expressions, or regex for short, provide a shorthand way of searching, formatting, or editing character data. The syntax looks somewhat cryptic at first, but you will soon get the hang of how it all works. The easiest way to learn regex is to jump right in with some simple examples. The java.util.regex package contains several regex classes, but the most commonly used ones are Pattern and Matcher. I will give you a brief overview of both of these classes and how they work in conjunction.

The Pattern Class

A Pattern object is a compiled representation of a regular expression, which we write as a string literal. We use the static Pattern.compile() method to create the Pattern object. Let's say we want to search for all occurrences of mop or top in this string literal: "The top of the mop is called a handle." We could simply call the .indexOf() method in the String class once for each word and check for a return value > -1; that will work just fine. However, let's demonstrate doing the same thing with regex.
Pattern p = Pattern.compile("[mt]op");
The [mt] in brackets is a regex character class: it matches exactly one character, either m or t. I won't go into too much detail in this introduction, but our regular expression matches either mop or top.
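
To make the character class concrete, here is a minimal sketch that tests a few words against the same pattern. It uses Pattern.matches(), a static convenience method that this tutorial does not otherwise use; it reports whether the whole input matches the regex. The class name CharClassDemo and the helper isMopOrTop are made up for this illustration.

```java
import java.util.regex.Pattern;

class CharClassDemo {
    // Hypothetical helper: true only when the entire word matches [mt]op.
    static boolean isMopOrTop(String word) {
        return Pattern.matches("[mt]op", word);
    }

    public static void main(String[] args) {
        String[] candidates = {"mop", "top", "hop", "mtop"};
        for (String word : candidates) {
            // [mt] stands for exactly one character, so "mtop" is rejected.
            System.out.println(word + " -> " + isMopOrTop(word));
        }
    }
}
```

Running it prints true for mop and top and false for hop and mtop, since the brackets stand for a single character and not for the sequence mt.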

The Matcher Class

We use the Matcher class methods to perform various operations on a character sequence like a String instance. There are a bunch of methods in the Matcher class, but for this tutorial we are going to use the .find(), .group(), and .start() methods. We create a Matcher object by invoking the .matcher() method on our Pattern object.
Matcher m = p.matcher("The top of the mop is called a handle.");
Now that we have our Matcher object, we can invoke the .find() method to locate the next subsequence that matches the regex pattern. The .find() method returns a boolean value, so I am going to use it to control the iterations of a while statement.
while(m.find()){
}

Now let's display which words were found by invoking the .group() method.
while(m.find()){
     System.out.println("Found "+m.group());
}

Let's build on that a little more by displaying the index at which each match was found, using the .start() method.
while(m.find()){
     System.out.println("Found "+m.group() + " at index "+ m.start());
}
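
The loop above can be wrapped into a small self-contained sketch. As an addition that the tutorial itself does not cover, it also calls the Matcher .end() method, which returns the index just past the last matched character, so each match is reported as a half-open range. The class name MatchSpanDemo and the helper describeMatches are hypothetical names chosen for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MatchSpanDemo {
    // Hypothetical helper: describes each match as "text[start,end)".
    static List<String> describeMatches(String regex, String text) {
        List<String> spans = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(text);
        while (m.find()) {
            // start() is the index of the first matched character,
            // end() is the index one past the last matched character.
            spans.add(m.group() + "[" + m.start() + "," + m.end() + ")");
        }
        return spans;
    }

    public static void main(String[] args) {
        String searchMe = "The top of the mop is called a handle.";
        for (String span : describeMatches("[mt]op", searchMe)) {
            System.out.println(span);
        }
    }
}
```

For the sample sentence this prints top[4,7) and mop[15,18). For every match, searchMe.substring(m.start(), m.end()) is the same text that m.group() returns.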



Open the command prompt (CMD - see Getting Started) and type in the following commands.

C:\Windows\System32>cd \
C:\>md Java
C:\>cd Java
C:\Java>
C:\Java>md RegexIntro
C:\Java>cd RegexIntro
C:\Java\RegexIntro>Notepad RegexIntro.java

Copy and paste, or type, the following code into Notepad and be sure to save the file when you are done.


import java.util.regex.*;
class RegexIntro {
    public static void main(String[] args) {
        String searchMe = "The top of the mop is called a handle.";

        // Plain String search: one indexOf() call per word.
        System.out.println("String indexOf top: "+ searchMe.indexOf("top"));
        System.out.println("String indexOf mop: "+ searchMe.indexOf("mop"));

        // Regex search: [mt]op matches either "mop" or "top".
        Pattern p = Pattern.compile("[mt]op");
        Matcher m = p.matcher(searchMe);
        while(m.find()){
            System.out.println("Matcher found "+m.group() + " at index "+ m.start());
        }

        System.out.println();

        // l{2} matches exactly two consecutive l characters.
        p = Pattern.compile("l{2}");
        m = p.matcher(searchMe);
        while(m.find()){
            System.out.println("Matcher found "+m.group() + " at index "+ m.start());
        }
    }
}

Now switch back to the command prompt (CMD) and type in javac RegexIntro.java and press Enter.
Now type in java RegexIntro and press Enter.


C:\Java\RegexIntro>javac RegexIntro.java
C:\Java\RegexIntro>java RegexIntro
String indexOf top: 4
String indexOf mop: 15
Matcher found top at index 4
Matcher found mop at index 15

Matcher found ll at index 24
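
The second pattern in the program, l{2}, uses a quantifier: the {2} means the preceding l must occur exactly twice in a row, which is why it finds the ll in "called". The sketch below tries the same pattern on a few other inputs. The class name QuantifierDemo and the helper matchStarts are hypothetical names used only for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuantifierDemo {
    // Hypothetical helper: returns the start index of every match of regex in text.
    static List<Integer> matchStarts(String regex, String text) {
        List<Integer> starts = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(text);
        while (m.find()) {
            starts.add(m.start());
        }
        return starts;
    }

    public static void main(String[] args) {
        // One l is not enough for l{2}, so "handle" yields no match.
        System.out.println(matchStarts("l{2}", "handle"));
        // Matches do not overlap: "llll" yields matches at 0 and 2.
        System.out.println(matchStarts("l{2}", "llll"));
        // The sentence from the tutorial: one match, inside "called".
        System.out.println(matchStarts("l{2}", "The top of the mop is called a handle."));
    }
}
```

This prints [], then [0, 2], then [24]. Each call to .find() resumes after the previous match, so the four l characters give two separate matches and not three overlapping ones.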


Final thoughts

My tutorials on regex will be mostly mixed throughout other tutorial topics. While regular expressions are a valuable tool to have in your arsenal, the things they do can typically be done using various methods from other classes like String or StringBuilder. Still, you will come across regex from time to time, and you should be familiar with the syntax so you can understand what is going on.
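
As a rough illustration of the non-regex route, here is a sketch that finds every occurrence of a single word using only String.indexOf(), including the two-argument overload that takes a starting index. The class name IndexOfDemo and the helper allIndexesOf are made-up names for this example. It searches for one literal word per call, so covering both mop and top takes two calls where the regex needed one pattern.

```java
import java.util.ArrayList;
import java.util.List;

class IndexOfDemo {
    // Hypothetical helper: returns the index of every occurrence of word in text.
    static List<Integer> allIndexesOf(String text, String word) {
        List<Integer> found = new ArrayList<>();
        int from = text.indexOf(word);
        while (from != -1) {
            found.add(from);
            // Continue searching one character past the previous hit.
            from = text.indexOf(word, from + 1);
        }
        return found;
    }

    public static void main(String[] args) {
        String searchMe = "The top of the mop is called a handle.";
        System.out.println("top: " + allIndexesOf(searchMe, "top"));
        System.out.println("mop: " + allIndexesOf(searchMe, "mop"));
    }
}
```

For the sample sentence this prints top: [4] and mop: [15], the same indexes the Matcher reported.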

