Regular Expression Usage

Regular expressions (regex) are generally used for searching, modifying, or extracting text. The main building blocks of a regex are:

  • Character classes: They define the content of the pattern, that is, what the text being matched should look like.
\d A digit
\D A non-digit
\w A word character: [a-zA-Z_0-9]
\W A non-word character
\s A whitespace character: [ \t\n\x0B\f\r]
\S A non-whitespace character

Note: In Java, a backslash inside a string literal must itself be escaped, so these classes are written as \\d \\D \\s \\S \\w \\W in the pattern string.
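As a small illustration of the character classes and the Java escaping rule above, here is a minimal sketch. The class and method names (CharacterClassDemo, isAllDigits, and so on) are hypothetical and chosen only for this example; the calls themselves (String.matches and String.replaceAll) are standard Java.

```java
class CharacterClassDemo {

    // True when the whole input is made of digits only.
    // "\\d+" in Java source code is the regex \d+
    static boolean isAllDigits(String input) {
        return input.matches("\\d+");
    }

    // Replaces every run of whitespace (\s) with a single space.
    static String collapseWhitespace(String input) {
        return input.replaceAll("\\s+", " ");
    }

    // Removes every non-word character (\W); underscores count as word characters.
    static String keepWordChars(String input) {
        return input.replaceAll("\\W", "");
    }

    public static void main(String[] args) {
        System.out.println(isAllDigits("12345"));               // true
        System.out.println(isAllDigits("12a45"));               // false
        System.out.println(collapseWhitespace("a \t b\n\nc"));  // a b c
        System.out.println(keepWordChars("user_name-42!"));     // user_name42
    }
}
```

Note that matches() only succeeds when the entire string fits the pattern, while replaceAll() acts on every matching piece inside the string.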

  • Quantifiers: They specify how many times the preceding part of a pattern should match or repeat.
* Match 0 or more times
+ Match 1 or more times
? Match 0 or 1 time
{n} Match exactly n times
{n,} Match at least n times
{n,m} Match at least n times but not more than m times

  • Meta characters: They are used to group, divide, and perform special operations with patterns.
\ Escape the next meta character so that it is treated as a normal character/literal
^ Match the beginning of the line
. Match any character except a newline
$ Match the end of the line, or just before a newline at the end
| Alternation (OR)
() Grouping
[] Custom character class
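The quantifiers and meta characters listed above can be tried out with a few one-line checks. This is only a sketch: the class name QuantifierDemo, its method names, and the sample patterns are made up for illustration.

```java
class QuantifierDemo {

    // {n,m}: between 2 and 4 digits and nothing else.
    static boolean isTwoToFourDigits(String input) {
        return input.matches("\\d{2,4}");
    }

    // ? makes the preceding "u" optional, so both spellings are accepted.
    static boolean isColorWord(String input) {
        return input.matches("colou?r");
    }

    // \\. escapes the dot so it means a literal '.' instead of "any character".
    static boolean isSimpleVersion(String input) {
        return input.matches("\\d+\\.\\d+");
    }

    // () groups the alternatives separated by |, and [0-9] is a custom class.
    static boolean isPetWithDigit(String input) {
        return input.matches("(cat|dog)[0-9]");
    }

    public static void main(String[] args) {
        System.out.println(isTwoToFourDigits("123"));   // true
        System.out.println(isTwoToFourDigits("12345")); // false
        System.out.println(isColorWord("color"));       // true
        System.out.println(isColorWord("colour"));      // true
        System.out.println(isSimpleVersion("1.10"));    // true
        System.out.println(isSimpleVersion("1x10"));    // false
        System.out.println(isPetWithDigit("dog7"));     // true
        System.out.println(isPetWithDigit("cow7"));     // false
    }
}
```

Without the escape, the pattern \d+.\d+ would also accept 1x10, because an unescaped dot matches any character.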

Example Set 1: BASIC EXPRESSION

To search for sentences that start with "I lost my" followed by a word, a basic regex would be:

Case 1: "I lost my \\w+"

\\w matches a word character [character class]

+ matches 1 or more times [quantifier]

This pattern matches sentences such as I lost my pen or I lost my wallet, but it cannot match I lost my: wallet because the : is not part of the regex.

Case 2: "I lost my:? \\w+"

? makes the preceding : optional, so the pattern matches with or without it [quantifier]

\\w matches a word character [character class]

+ matches 1 or more times [quantifier]

This pattern matches both I lost my: pen and I lost my pen.
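The two cases can be checked with a short program. This is a minimal sketch in which the optional colon is written as :? directly after "my"; the class name LostItemDemo and its members are hypothetical.

```java
import java.util.regex.Pattern;

class LostItemDemo {

    // Case 1: "I lost my", one space, then one or more word characters.
    static final Pattern WITHOUT_COLON = Pattern.compile("I lost my \\w+");

    // Case 2: same as Case 1, but a colon may follow "my".
    static final Pattern OPTIONAL_COLON = Pattern.compile("I lost my:? \\w+");

    static boolean matchesCase1(String sentence) {
        return WITHOUT_COLON.matcher(sentence).matches();
    }

    static boolean matchesCase2(String sentence) {
        return OPTIONAL_COLON.matcher(sentence).matches();
    }

    public static void main(String[] args) {
        System.out.println(matchesCase1("I lost my pen"));      // true
        System.out.println(matchesCase1("I lost my: wallet"));  // false
        System.out.println(matchesCase2("I lost my pen"));      // true
        System.out.println(matchesCase2("I lost my: pen"));     // true
    }
}
```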

Example Set 2: BASIC GROUPING

An important feature of regular expressions is the ability to group sections of a pattern and provide alternate matches.

| => Alternation / OR statement

() => Grouping

Case 1: If we need to search for three specific words, for instance wallet, pen and book, then the regex would look like:

"I lost my (wallet|pen|book)"

This would match only three sentences: I lost my wallet, I lost my pen and I lost my book.

Case 2: If we need to search for the same three words but with the possibility of a : after my, then:

"I lost my:? (wallet|pen|book)"

This would match: I lost my wallet, I lost my: wallet

I lost my pen, I lost my: pen

I lost my book, I lost my: book

When the whole string has to match (as with String.matches()), it would not match the following, for instance:

I lost my wallets (extra s at the end) or I lost my- pens (- is not part of the regex)
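Grouping also lets the code read back which alternative was found. The sketch below assumes the colon-optional pattern with the three words wallet, pen and book; the class name LostItemGroupDemo and the method lostItem are hypothetical, while Matcher.matches() and Matcher.group(int) are standard java.util.regex calls.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LostItemGroupDemo {

    // The parentheses form group 1, which holds whichever alternative matched.
    static final Pattern LOST_ITEM = Pattern.compile("I lost my:? (wallet|pen|book)");

    // Returns the lost item, or null when the whole sentence does not match.
    static String lostItem(String sentence) {
        Matcher m = LOST_ITEM.matcher(sentence);
        return m.matches() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(lostItem("I lost my wallet"));   // wallet
        System.out.println(lostItem("I lost my: book"));    // book
        System.out.println(lostItem("I lost my wallets"));  // null
        System.out.println(lostItem("I lost my- pens"));    // null
    }
}
```

If find() were used instead of matches(), I lost my wallets would be reported as a hit, because the pattern is then allowed to match just the beginning of the sentence.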

Example Set 3: MATCHING/VALIDATING

Regex makes it possible to find all instances of text that match a certain pattern and to return a Boolean value indicating whether the pattern was found. This can be used, for instance, to validate SSNs, phone numbers, etc.

Sample Code 1: Validation

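import java.util.ArrayList;
import java.util.List;

public class ValidateSSN {

    public static void main(String[] args) {
        List<String> input = new ArrayList<String>();
        input.add("123-45-3456");         // good
        input.add("9876-5-4321");         // wrong: digits are grouped 4-1-4 instead of 3-2-4
        input.add("987-86-3221 (shshs)"); // wrong: extra text after the number
        input.add("123-45-3456 ");        // wrong because of the space after 6
        input.add("123-345-3444");        // wrong: the middle group has 3 digits instead of 2

        for (String ssn : input) {
            // matches() succeeds only if the entire string fits the pattern.
            // Each -? makes a hyphen optional, so 123453456 would also pass.
            if (ssn.matches("^(\\d{3}-?\\d{2}-?\\d{4})$")) {
                System.out.println("found good ssn " + ssn);
            }
        }
    }
}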
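Validation checks one whole string at a time. To find all instances inside a longer text, as mentioned at the start of this example set, a Matcher with find() can be used instead. The following is a sketch only: the class name ExtractSsnDemo and the sample text are made up, hyphens are required here, and \b (a word boundary, not covered in the tables above) is added so that the pattern does not match inside a longer run of digits.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ExtractSsnDemo {

    // \b marks a word boundary; the digits must be grouped 3-2-4 with hyphens.
    static final Pattern SSN = Pattern.compile("\\b\\d{3}-\\d{2}-\\d{4}\\b");

    // Collects every SSN-shaped piece of text, in the order it appears.
    static List<String> findAll(String text) {
        List<String> found = new ArrayList<>();
        Matcher m = SSN.matcher(text);
        while (m.find()) {          // find() moves on to the next match each time
            found.add(m.group());   // group() returns the text of the current match
        }
        return found;
    }

    public static void main(String[] args) {
        String text = "Ids: 123-45-3456, 9876-5-4321 and 987-86-3221 (old)";
        System.out.println(findAll(text)); // [123-45-3456, 987-86-3221]
    }
}
```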

Sample Code 2: Splitting

This Pattern.split() example splits the text in the input variable into 6 separate strings. Each of these strings is included in the String array returned by the split() method. The parts of the text that matched as delimiters are not included in the returned array. The match is case-sensitive, so the One in HistoryOne is not treated as a delimiter.

import java.util.regex.Pattern;

public class PatternSplitExample {

    public void testPatternSplit() {
        String input = "This one Is one Good one Book one Of one HistoryOne";
        String pattern = "one";

        // Compile the delimiter regex, then split the input around every match
        Pattern p = Pattern.compile(pattern);
        String[] returnedValue = p.split(input);

        System.out.println("Length of returned set is:" + returnedValue.length);
        for (String element : returnedValue) {
            System.out.println(element);
        }
    }
}

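Output is:

Length of returned set is:6

followed by the six elements, one per line: This, Is, Good, Book, Of and HistoryOne (each element keeps the spaces that surrounded the delimiter).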
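Because the delimiter is itself a regex, the split can be made more forgiving. The variation below is a sketch, not part of the original sample: it uses the Pattern.CASE_INSENSITIVE flag so that One also counts as a delimiter, and \\s* so that the spaces around each delimiter are dropped. The class name CleanSplitDemo is hypothetical.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class CleanSplitDemo {

    // Delimiter: "one" in any letter case, plus any whitespace on either side.
    static final Pattern DELIMITER =
            Pattern.compile("\\s*one\\s*", Pattern.CASE_INSENSITIVE);

    static String[] splitOnOne(String input) {
        // split() discards trailing empty strings, so a delimiter at the very
        // end of the input does not add an empty element.
        return DELIMITER.split(input);
    }

    public static void main(String[] args) {
        String input = "This one Is one Good one Book one Of one HistoryOne";
        System.out.println(Arrays.toString(splitOnOne(input)));
        // [This, Is, Good, Book, Of, History]
    }
}
```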

