YuriyGuts/regex-builder

RegexBuilder

Just another day at the office, you write a .NET Regex like a boss, and suddenly realize that you need to declare, say, a non-capturing group. It's (?:pattern), right? Wait, or was it (?=pattern)? No no, (?=pattern) must be a positive lookahead or something. But if (?<=pattern) is a positive lookbehind, then maybe positive lookahead would be (?>=pattern)?

"Aaargh! Now where's that Regex cheat sheet?.." And make sure to share it with your five colleagues who might be maintaining this code later. Also, remember to use comments inside the Regex pattern, and maybe a few third-party tools to be sure what the expression does.

"How did it come to this?"

Inspired by the Expression Trees feature in .NET, the RegexBuilder library provides a more verbose but more human-readable way of declaring regular expressions, using a language friendly to the .NET world instead of two lines of cryptic mess.

When it might be useful:

  • When the expressions are complex and likely to change frequently.
  • When you can tolerate 20 lines of readable code instead of one barely readable line.
  • When you can spare a bit of CPU time and memory for constructing the Regex object for the sake of readability.

Example

Let's say you want to make a simple HTML parser and capture the value of every href attribute from hyperlinks, as shown in the MSDN example.

The usual way:

Regex hrefRegex = new Regex("href\\s*=\\s*(?:[\"'](?<Target>[^\"']*)[\"']|(?<Target>\\S+))", RegexOptions.IgnoreCase);

With RegexBuilder:

const string quotationMark = "\"";
Regex hrefRegex = RegexBuilder.Build
(
    RegexOptions.IgnoreCase,
    // Regex structure declaration
    RegexBuilder.Literal("href"),
    RegexBuilder.MetaCharacter(RegexMetaChars.WhiteSpace, RegexQuantifier.ZeroOrMore),
    RegexBuilder.Literal("="),
    RegexBuilder.MetaCharacter(RegexMetaChars.WhiteSpace, RegexQuantifier.ZeroOrMore),
    RegexBuilder.Alternate
    (
        RegexBuilder.Concatenate
        (
            RegexBuilder.NonEscapedLiteral(quotationMark),
            RegexBuilder.Group
            (
                "Target",
                RegexBuilder.NegativeCharacterSet(quotationMark, RegexQuantifier.ZeroOrMore)
            ),
            RegexBuilder.NonEscapedLiteral(quotationMark)
        ),
        RegexBuilder.Group
        (
            "Target",
            RegexBuilder.MetaCharacter(RegexMetaChars.NonwhiteSpace, RegexQuantifier.OneOrMore)
        )
    )
);

See CustomRegexTests.cs for more examples.
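
Both declarations above are assigned to an ordinary System.Text.RegularExpressions.Regex, so the result is used like any other .NET regex. Here is a minimal sketch of reading the Target group from each match. It uses the plain-string version of the pattern so that it compiles without the library; the HrefExtractor class and ExtractTargets method are hypothetical names made up for this illustration.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration; not part of RegexBuilder.
public static class HrefExtractor
{
    // Same pattern as in "the usual way" above.
    private static readonly Regex HrefRegex = new Regex(
        "href\\s*=\\s*(?:[\"'](?<Target>[^\"']*)[\"']|(?<Target>\\S+))",
        RegexOptions.IgnoreCase);

    // Returns the value of the Target group for every href found in the input.
    public static List<string> ExtractTargets(string html)
    {
        var targets = new List<string>();
        foreach (Match match in HrefRegex.Matches(html))
        {
            // Both alternatives share the group name, so one lookup covers both.
            targets.Add(match.Groups["Target"].Value);
        }
        return targets;
    }
}
```

Calling HrefExtractor.ExtractTargets on the input `<a href="https://example.com">one</a> <A HREF='page.html'>two</A>` returns two targets: https://example.com and page.html.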

Feature Support

RegexBuilder currently supports all regular expression language elements except substitution/replacement patterns.

The following elements are supported:

How to Integrate RegexBuilder

Add a reference to YuriyGuts.RegexBuilder.dll in your project manually, or use NuGet Package Manager:

PM> Install-Package RegexBuilder 

Usage Guide

There are 3 classes you'll need. They all expose their functionality via static members and work statelessly.

  1. RegexBuilder: a factory class that produces and glues together different parts of a regular expression.
  2. RegexQuantifier: produces quantifiers (?, +, {4,}, etc.) for regex parts that support them.
  3. RegexMetaChars: named constants for metacharacters such as character classes and anchors (word boundary, whitespace, tab, etc.).

Start with var regex = RegexBuilder.Build(...); and replace ... with the parts of your regular expression by calling the corresponding methods of RegexBuilder.
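
As a rough sketch of that workflow, the snippet below builds a small "key = value" matcher. It uses only the RegexBuilder, RegexQuantifier and RegexMetaChars members that appear in the href example in this README. The namespace in the using directive is an assumption inferred from the assembly name, and AssignmentRegexSketch is a hypothetical class name; the snippet needs the RegexBuilder package to compile.

```csharp
using System.Text.RegularExpressions;
using YuriyGuts.RegexBuilder; // assumed namespace, inferred from the assembly name

// Hypothetical wrapper class for illustration; not part of RegexBuilder.
public static class AssignmentRegexSketch
{
    // Intended to be roughly equivalent to: (?<Key>\S+)\s*=\s*(?<Value>\S+)
    public static Regex Create()
    {
        return RegexBuilder.Build
        (
            RegexOptions.None,
            // One or more non-whitespace characters, captured as "Key"
            RegexBuilder.Group
            (
                "Key",
                RegexBuilder.MetaCharacter(RegexMetaChars.NonwhiteSpace, RegexQuantifier.OneOrMore)
            ),
            // Optional whitespace around the equals sign
            RegexBuilder.MetaCharacter(RegexMetaChars.WhiteSpace, RegexQuantifier.ZeroOrMore),
            RegexBuilder.Literal("="),
            RegexBuilder.MetaCharacter(RegexMetaChars.WhiteSpace, RegexQuantifier.ZeroOrMore),
            // One or more non-whitespace characters, captured as "Value"
            RegexBuilder.Group
            (
                "Value",
                RegexBuilder.MetaCharacter(RegexMetaChars.NonwhiteSpace, RegexQuantifier.OneOrMore)
            )
        );
    }
}
```

If the library behaves as the href example suggests, AssignmentRegexSketch.Create().Match("name = value") should expose the two parts through Groups["Key"] and Groups["Value"].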

Testing

RegexBuilder uses MSTest for unit testing. To run or add unit tests in Visual Studio, please see the YuriyGuts.RegexBuilder.Tests project.

The YuriyGuts.RegexBuilder.TestApp project is a console application that can be used as a temporary testing workbench.

License

The source code is licensed under The MIT License.
