Dynamically Emitting Compiled Regular Expressions
You can emit a compiled regular expression to an assembly. A regular expression emitted this way loads more slowly, but it runs faster once the compiled assembly is loaded. With this capability, you can let users select the expressions they run frequently, and then compile and emit those expressions. Let's quickly review Reflection and emitting assemblies.
In the simplest terms possible, Reflection is a .NET technology that supports dynamic discovery and use of code. It's analogous to Run-time Type Information (RTTI), but Reflection goes much further: .NET Reflection allows programmers to write code that writes code. This is referred to as emitting. In short, your programs can write programs. This is precisely what CompileToAssembly does; it writes code at runtime, after you've compiled your application.
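To make "dynamic discovery and use of code" concrete, here is a minimal sketch that looks up Regex.IsMatch by name at runtime and invokes it through Reflection. The class name ReflectionSketch, the helper FindStaticRegexMethod, and the sample input string are hypothetical, chosen only for illustration; the sketch shows discovery and invocation, not emitting.

```csharp
using System;
using System.Reflection;
using System.Text.RegularExpressions;

public static class ReflectionSketch
{
    // Hypothetical helper: finds a public static method on Regex by name
    // and exact parameter types; returns null if no such method exists.
    public static MethodInfo FindStaticRegexMethod(string name, params Type[] parameterTypes)
    {
        return typeof(Regex).GetMethod(
            name,
            BindingFlags.Public | BindingFlags.Static,
            null,            // default binder
            parameterTypes,
            null);           // no parameter modifiers
    }

    public static void Main()
    {
        // Discover Regex.IsMatch(string input, string pattern) by name.
        MethodInfo isMatch = FindStaticRegexMethod("IsMatch", typeof(string), typeof(string));

        // Invoke the discovered method; null target because it is static.
        object result = isMatch.Invoke(
            null,
            new object[] { "mailto:jane@example.senate.gov", @"^mailto:\w+@" });

        Console.WriteLine("{0} returned {1}", isMatch.Name, result);
    }
}
```

The calling code never names IsMatch in a compiled method call; the method is located from a string, which is the same mechanism that lets a program work with types it did not know about when it was compiled.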
NOTE
For more on reflection, see my upcoming book The Visual Basic .NET Developer's Book (Addison-Wesley, scheduled for publication Fall 2002, ISBN 0-672-32407-5).
Listing 2 demonstrates the brief code necessary to emit a regular expression to an assembly at runtime.
Listing 2 Emitting a Regular Expression to an Assembly
// Requires: using System.Reflection; using System.Text.RegularExpressions;
// Verbatim string (@"...") so \w reaches the regex engine; \. matches a literal dot.
const string Expression = @"mailto:\w+@\w+\.senate\.gov";
RegexCompilationInfo[] info = new RegexCompilationInfo[] {
    new RegexCompilationInfo(Expression, RegexOptions.Compiled,
        "SenateMail", "CompiledExpressions", true)
};
AssemblyName assemblyName = new AssemblyName();
assemblyName.Name = "Regex";
Regex.CompileToAssembly(info, assemblyName);
The code defines a regular expression that could just as easily come from dynamic user input. The Regex.CompileToAssembly method requires an array of RegexCompilationInfo objects. A RegexCompilationInfo object holds everything needed to define a custom Regex class:
The first argument (Expression) is the expression string.
The second argument (RegexOptions.Compiled) is the RegexOptions value to apply to the expression.
The third argument ("SenateMail") is the name of the Regex derivative class.
The fourth argument ("CompiledExpressions") is the namespace in which the new class is emitted.
The fifth argument (true) determines whether the new class is publicly visible; true makes it public.
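Because the expression can come from user input, the five arguments above can be filled in at runtime. The following is a minimal sketch of that idea, not a definitive implementation: the class CompilationInfoSketch, the helper BuildInfo, and the sample dictionary are hypothetical names. It assumes each user-selected expression arrives as a class name paired with a pattern, and it skips any pattern the Regex constructor rejects.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class CompilationInfoSketch
{
    // Hypothetical helper: builds one RegexCompilationInfo per
    // (class name, pattern) pair, skipping patterns that fail to parse.
    public static RegexCompilationInfo[] BuildInfo(
        IDictionary<string, string> patternsByClassName, string targetNamespace)
    {
        List<RegexCompilationInfo> result = new List<RegexCompilationInfo>();
        foreach (KeyValuePair<string, string> entry in patternsByClassName)
        {
            try
            {
                // The Regex constructor throws ArgumentException for an invalid pattern.
                new Regex(entry.Value);
            }
            catch (ArgumentException)
            {
                continue; // ignore expressions that cannot be parsed
            }

            result.Add(new RegexCompilationInfo(
                entry.Value,          // expression string
                RegexOptions.None,    // RegexOptions value
                entry.Key,            // name of the Regex derivative class
                targetNamespace,      // namespace to emit
                true));               // make the class public
        }
        return result.ToArray();
    }

    public static void Main()
    {
        // Stand-in for user input; "(" is deliberately invalid.
        Dictionary<string, string> selected = new Dictionary<string, string>();
        selected.Add("SenateMail", @"mailto:\w+@\w+\.senate\.gov");
        selected.Add("Broken", "(");

        foreach (RegexCompilationInfo item in BuildInfo(selected, "CompiledExpressions"))
        {
            Console.WriteLine("{0}.{1} public={2} pattern={3}",
                item.Namespace, item.Name, item.IsPublic, item.Pattern);
        }
    }
}
```

The Pattern, Options, Name, Namespace, and IsPublic properties read back the same five values passed to the constructor, which is a convenient way to check what will be emitted before calling Regex.CompileToAssembly.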
Finally, we need an AssemblyName object. We pass the RegexCompilationInfo array and the AssemblyName object to the Regex.CompileToAssembly method. When the last statement runs, an assembly named Regex.dll is written to disk, containing a module with the namespace CompiledExpressions and one class, SenateMail. SenateMail derives from System.Text.RegularExpressions.Regex.
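Once Regex.dll exists, the emitted class can be used like any other Regex. The sketch below loads it through Reflection rather than a compile-time reference, so it builds even before the assembly has been emitted. It rests on several assumptions: the code runs on the .NET Framework (on .NET Core and later, Regex.CompileToAssembly throws PlatformNotSupportedException, so no Regex.dll would be produced), Regex.dll sits in the current directory, and the names EmittedRegexLoader and LoadEmittedRegex are hypothetical.

```csharp
using System;
using System.IO;
using System.Reflection;
using System.Text.RegularExpressions;

public static class EmittedRegexLoader
{
    // Hypothetical helper: loads an emitted assembly and instantiates
    // the named Regex-derived class through its parameterless constructor.
    public static Regex LoadEmittedRegex(string assemblyPath, string fullTypeName)
    {
        Assembly assembly = Assembly.LoadFrom(assemblyPath);
        Type type = assembly.GetType(fullTypeName, true); // true: throw if the type is missing
        return (Regex)Activator.CreateInstance(type);
    }

    public static void Main()
    {
        // Assumption: Listing 2 has already written Regex.dll to the current directory.
        string path = Path.GetFullPath("Regex.dll");
        if (!File.Exists(path))
        {
            Console.WriteLine("Regex.dll not found; emit it first.");
            return;
        }

        Regex senateMail = LoadEmittedRegex(path, "CompiledExpressions.SenateMail");
        Console.WriteLine(senateMail.IsMatch("mailto:jane@example.senate.gov"));
    }
}
```

If you add Regex.dll as a project reference instead, the same class can be created directly with new CompiledExpressions.SenateMail(), which avoids the Reflection calls at the cost of requiring the assembly at compile time.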