Regular expressions

You can use regular expressions (“regexes”) in content rules to match highly specific content in instant messages, such as Social Security numbers, credit card numbers, phone numbers, ZIP codes, or any other well-defined text or number pattern.

To use regular expressions, enable the “Regular Expression” option in the content rule.

You can provide multiple regexes in a single content rule document because the “Content contains” field is a multi-item list field. Separate entries with a new line or a blank line. Paste each long regular expression in full, as a single unbroken string of text (it may wrap visually in the field).

If multiple regexes in a single content rule do not work, the most likely cause is a regular expression string that was incorrectly broken up into several entries.

To verify the regular expression list items, inspect the field “rule.content” in the document properties (via the document properties infobox): the data type should be “Text List” and each regex should be enclosed in double quotes.
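One way to spot an incorrectly broken-up regex before saving the rule is to compile each entry on its own. The following is a minimal TypeScript sketch, not part of Instant IMtegrity: the helper names `splitEntries` and `findBrokenEntries` are hypothetical, and a JavaScript engine stands in for the product's regex engine on the assumption that both accept the same ECMAScript syntax.

```typescript
// Hypothetical helper: split the pasted field text into one entry per
// non-blank line, mirroring the "new line or blank line" separator rule.
function splitEntries(fieldText: string): string[] {
  return fieldText.split(/\r?\n/).filter((line) => line.trim().length > 0);
}

// Hypothetical helper: return the entries that do not compile as
// ECMAScript regexes in this JavaScript engine.
function findBrokenEntries(entries: string[]): string[] {
  return entries.filter((entry) => {
    try {
      new RegExp(entry);
      return false; // compiled fine
    } catch {
      return true; // syntax error, likely a fragment of a longer regex
    }
  });
}

// An SSN pattern, followed by a credit card pattern that was accidentally
// split across two lines.
const pastedField =
  String.raw`\d{3}-\d{2}-\d{4}` + "\n\n" +
  String.raw`\b(?:\d[ -]*?` + "\n" +
  String.raw`){13,16}\b`;

const brokenEntries = findBrokenEntries(splitEntries(pastedField));
console.log(brokenEntries.length); // 2: both halves of the split pattern
```

This check only catches fragments that are syntactically invalid on their own. A regex split at a point where both halves still compile would pass unnoticed, so inspecting “rule.content” remains the more reliable check.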

Instant IMtegrity uses the Microsoft C++ TR1 regex specification and operates in standard ECMAScript mode. Regex flavors differ (Java, Perl, PHP, Python, and others each have their own), so make sure that any regex you write or test uses the JavaScript/ECMAScript flavor. Enter only the pattern itself, without the enclosing slashes or trailing flags of a JavaScript regex literal:

Valid

\d{3}-\d{2}-\d{4}

Invalid

/\d{3}-\d{2}-\d{4}/

Invalid

\d{3}-\d{2}-\d{4}/gmi
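To illustrate why the delimited and flagged forms fail, the sketch below feeds all three strings to a JavaScript engine as plain pattern text. This is an illustration only: `matchesMessage` is a hypothetical helper, and how Instant IMtegrity itself reacts to the invalid forms (error or silent non-match) is not covered here. In JavaScript the slashes and flag letters are simply treated as literal characters that the message would have to contain.

```typescript
// The pattern exactly as it should be entered in the content rule.
const barePattern = String.raw`\d{3}-\d{2}-\d{4}`;
// The same pattern wrapped in JavaScript literal delimiters.
const delimitedPattern = String.raw`/\d{3}-\d{2}-\d{4}/`;
// The same pattern followed by a slash and flag letters.
const flaggedPattern = String.raw`\d{3}-\d{2}-\d{4}/gmi`;

// Hypothetical helper: treat the whole string as pattern text and search
// for it anywhere in the message.
function matchesMessage(patternSource: string, message: string): boolean {
  return new RegExp(patternSource).test(message);
}

const sampleMessage = "My SSN is 011-23-1245";
console.log(matchesMessage(barePattern, sampleMessage));      // true
console.log(matchesMessage(delimitedPattern, sampleMessage)); // false
console.log(matchesMessage(flaggedPattern, sampleMessage));   // false
```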

Sample regular expressions

Match a URL or domain name, optionally starting with http or https

(https?:\/\/)?([\da-z\.-]+)\.([a-z\.]{2,6})([\/\w \.-]*)*?

Matches any of these:

Visit http://www.imtegrity.com/ for details
Visit http://imtegrity.com for details
Visit https://imtegrity.com for details
Visit www.imtegrity.com for details
Visit imtegrity.com for details
imtegrity.com

Match an email address

[^\s@]+@([\w\-]+\.)+[\w\-]{2,}

Matches any of these:

Email us at support@instant-tech.com
sales@instant-tech.com is the right address!

Match an IPv4 address

([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3})

Matches any of these:

The localhost address is 127.0.0.1
127.0.0.1 is the localhost address
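Note that the pattern above only checks for four dot-separated groups of one to three digits, so it also matches strings that are not valid addresses, such as 999.999.999.999. The TypeScript sketch below compares it with a stricter variant that limits each group to 0-255. The stricter pattern and the `isFound` helper are illustrative assumptions, checked here in a JavaScript engine only and not tested in Instant IMtegrity.

```typescript
// Sample pattern from this page: any four groups of 1-3 digits.
const loosePattern = String.raw`([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3})`;

// Hypothetical stricter variant: each group must be in the range 0-255,
// and the address must start and end on a word boundary.
const strictPattern =
  String.raw`\b(?:(?:25[0-5]|2[0-4]\d|1?\d?\d)\.){3}(?:25[0-5]|2[0-4]\d|1?\d?\d)\b`;

// Hypothetical helper: search for the pattern anywhere in the message.
function isFound(patternSource: string, message: string): boolean {
  return new RegExp(patternSource).test(message);
}

console.log(isFound(loosePattern, "Ping 999.999.999.999 now"));            // true
console.log(isFound(strictPattern, "Ping 999.999.999.999 now"));           // false
console.log(isFound(strictPattern, "The localhost address is 127.0.0.1")); // true
```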

Match a US Social Security Number (SSN)

\d{3}-\d{2}-\d{4}

Matches any of these:

My SSN is 011-23-1245
011-23-1245 is the SSN

Match a typical credit card number

\b(?:\d[ -]*?){13,16}\b

Matches any of these:

My AMEX credit card number is 3782-822463-10005
My MasterCard number is 5555555555554444
4012 8888 8888 1881 is my Visa card number
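The sample patterns can be re-checked against their sample messages outside the product before they are pasted into a content rule. The following TypeScript sketch is one way to do that. The `SampleCase` type, the labels, and the `findMisses` helper are hypothetical, and the JavaScript engine used here is a stand-in for the TR1 engine, so a pass is a sanity check rather than a guarantee.

```typescript
interface SampleCase {
  label: string;      // hypothetical label, used only for reporting
  pattern: string;    // pattern text as entered in the content rule
  messages: string[]; // sample messages listed on this page
}

const sampleCases: SampleCase[] = [
  {
    label: "URL",
    pattern: String.raw`(https?:\/\/)?([\da-z\.-]+)\.([a-z\.]{2,6})([\/\w \.-]*)*?`,
    messages: [
      "Visit http://www.imtegrity.com/ for details",
      "Visit https://imtegrity.com for details",
      "Visit www.imtegrity.com for details",
      "imtegrity.com",
    ],
  },
  {
    label: "Email",
    pattern: String.raw`[^\s@]+@([\w\-]+\.)+[\w\-]{2,}`,
    messages: ["Email us at support@instant-tech.com"],
  },
  {
    label: "SSN",
    pattern: String.raw`\d{3}-\d{2}-\d{4}`,
    messages: ["My SSN is 011-23-1245", "011-23-1245 is the SSN"],
  },
  {
    label: "Credit card",
    pattern: String.raw`\b(?:\d[ -]*?){13,16}\b`,
    messages: [
      "My AMEX credit card number is 3782-822463-10005",
      "My MasterCard number is 5555555555554444",
      "4012 8888 8888 1881 is my Visa card number",
    ],
  },
];

// Return "label: message" for every sample message its pattern fails to find.
function findMisses(cases: SampleCase[]): string[] {
  const misses: string[] = [];
  for (const sample of cases) {
    const regex = new RegExp(sample.pattern); // no flags, as in the content rule
    for (const message of sample.messages) {
      if (!regex.test(message)) {
        misses.push(`${sample.label}: ${message}`);
      }
    }
  }
  return misses;
}

console.log(findMisses(sampleCases).length); // 0: every sample message is matched
```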

Support for regular expressions

The regular expression examples above have been tested and found to work for a variety of test cases and samples, including the ones listed above. They are not guaranteed to work for all possible cases; specific requirements call for more specialized regexes.

Instant Technologies does not provide support for individual regular expressions.

Testing and debugging regular expressions

A great online resource for testing and debugging regular expressions in the ECMAScript/JavaScript flavor is: https://regex101.com/#javascript

For more details on the specifics of the Microsoft C++ TR1 implementation, see: https://msdn.microsoft.com/en-us/library/bb982727(v=vs.100).aspx