.NET Common Language Runtime Components
The Type System
The CLR has its own set of concepts and techniques for packaging, deploying, and discovering component code. These concepts and techniques are fundamentally different from those used by technologies such as COM, Java, or Win32. The difference is best understood by looking closely at the CLR loader, but first one must look at how code and metadata are actually packaged.
Modules Defined
Programs written for the CLR reside in modules. A CLR module is a byte stream, typically stored as a file in the local file system or on a Web server.
As shown in Figure 2.1, a CLR module uses an extended version of the PE/COFF executable file format used by Windows NT. Because the format extends PE/COFF rather than starting from scratch, CLR modules are also valid Win32 modules that can be loaded using the LoadLibrary system call. However, a CLR module uses very little PE/COFF functionality. Rather, the majority of a CLR module's contents are stored as opaque data in the .text section of the PE/COFF file.
Figure 2.1: CLR Module Format
CLR modules contain code, metadata, and resources. The code is typically stored in common intermediate language (CIL) format, although it may also be stored as processor-specific machine instructions. The module's metadata describes the types defined in the module, including names, inheritance relationships, method signatures, and dependency information. The module's resources consist of static read-only data such as strings, bitmaps, and other aspects of the program that are not stored as executable code.
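The split between the PE/COFF wrapper and the CLR metadata can be observed from code. The following is a minimal sketch, not something taken from the text: it assumes the System.Reflection.Metadata library (part of current .NET releases, a separate package on the .NET Framework), and the ModuleInspector class and its method names are hypothetical. It opens a file as a PE image, checks whether it carries CLR metadata, lists the types that metadata defines, and reports whether the module also holds an assembly manifest.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Reflection.Metadata;
using System.Reflection.PortableExecutable;

// Hypothetical helper for illustration only.
static class ModuleInspector
{
    // Returns the names of the types defined in the module at 'path' as
    // recorded in its metadata, or an empty list if the PE file carries
    // no CLR metadata.
    public static List<string> DefinedTypes(string path)
    {
        var names = new List<string>();
        using (FileStream stream = File.OpenRead(path))
        using (var pe = new PEReader(stream))
        {
            if (!pe.HasMetadata)
            {
                return names; // an ordinary Win32 module
            }

            MetadataReader md = pe.GetMetadataReader();
            foreach (TypeDefinitionHandle handle in md.TypeDefinitions)
            {
                TypeDefinition type = md.GetTypeDefinition(handle);
                string ns = md.GetString(type.Namespace);
                string name = md.GetString(type.Name);
                names.Add(ns.Length == 0 ? name : ns + "." + name);
            }
        }
        return names;
    }

    // True if the module also contains an assembly manifest.
    public static bool HasAssemblyManifest(string path)
    {
        using (FileStream stream = File.OpenRead(path))
        using (var pe = new PEReader(stream))
        {
            return pe.HasMetadata && pe.GetMetadataReader().IsAssembly;
        }
    }
}
```

Calling ModuleInspector.DefinedTypes on the path of any compiled CLR module should return the types declared in that module. A file that is not a PE image at all will cause the reader to throw rather than return an empty list.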
The file format used by CLR modules is fairly well documented; however, few developers will ever encounter the format in the raw. Even developers who need to generate programs on-the-fly will typically use one of the two facilities provided by the CLR for programmatically generating modules. The IMetaDataEmit interface is a low-level COM interface that can be used to generate module metadata programmatically from classic C++. The System.Reflection.Emit namespace is a higher-level library that can be used to generate metadata and CIL programmatically from any CLR-friendly language (e.g., C#, VB.NET). The CodeDOM works at an even higher layer of abstraction, removing the need to know or understand CIL. However, for the vast majority of developers, who simply need to generate code during development and not at runtime, a CLR-friendly compiler will suffice.
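To make the System.Reflection.Emit option concrete, here is a minimal sketch under stated assumptions. The names EmitSketch, Generated, and Answer are hypothetical, and the code uses the static AssemblyBuilder.DefineDynamicAssembly method found in current .NET releases (early versions of the CLR exposed the equivalent call on AppDomain). It defines an in-memory module containing one type with one static method, emits two CIL instructions for the method body, and then invokes the result.

```csharp
using System;
using System.Reflection;
using System.Reflection.Emit;

// Hypothetical example class for illustration only.
static class EmitSketch
{
    public static int BuildAndInvoke()
    {
        // A dynamic module must live inside a dynamic assembly.
        var name = new AssemblyName("EmitSketchAssembly");
        AssemblyBuilder assembly =
            AssemblyBuilder.DefineDynamicAssembly(name, AssemblyBuilderAccess.Run);
        ModuleBuilder module = assembly.DefineDynamicModule("EmitSketchModule");

        // Metadata: one public class with one public static method
        // that takes no parameters and returns an int.
        TypeBuilder type = module.DefineType(
            "Generated", TypeAttributes.Public | TypeAttributes.Class);
        MethodBuilder method = type.DefineMethod(
            "Answer",
            MethodAttributes.Public | MethodAttributes.Static,
            typeof(int),
            Type.EmptyTypes);

        // CIL: push the constant 42 and return it.
        ILGenerator il = method.GetILGenerator();
        il.Emit(OpCodes.Ldc_I4, 42);
        il.Emit(OpCodes.Ret);

        // Finish the type, then call the generated method via reflection.
        Type created = type.CreateType();
        return (int)created.GetMethod("Answer").Invoke(null, null);
    }
}
```

The sketch keeps the generated module in memory only. It is meant to show the shape of the API (assembly, module, type, method, CIL body) rather than a recommended way to structure generated code.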
The C# compiler (CSC.EXE), the VB.NET compiler (VBC.EXE), and the C++ compiler (CL.EXE) all translate source code into CLR modules. Each of the compilers uses command-line switches to control which kind of module to produce. As shown in Table 2.1, there are four possible options. In C# and VB.NET, one uses the /target command-line switch (or its shortcut, /t) to select which option to use. The C++ compiler uses a combination of several switches; however, one always uses the /CLR switch to force the C++ compiler to generate CLR-compliant modules. The remainder of this discussion will refer to the C# and VB.NET switches, given their somewhat simpler format.
Table 2.1 Module Output Options
C#/VB.NET | C++ | Directly Loadable? | Runnable from Shell? | Access to Console?
/t:exe | /CLR | Yes | Yes | Always
/t:winexe | /CLR /link /subsystem:windows | Yes | Yes | Never
/t:library | /CLR /LD | Yes | No | Host-dependent
/t:module | /CLR:NOASSEMBLY /LD | No | No | Host-dependent
The /t:module option produces a "raw" module that by default will use the .netmodule file extension. Modules in this format cannot be deployed by themselves as stand-alone code, nor can the CLR load them directly. Rather, developers must associate raw modules with a full-fledged component (called an assembly) prior to deployment. In contrast, compiling with the /t:library option produces a module that contains additional metadata that allows developers to deploy it as stand-alone code. A module produced by compiling with /t:library will have a .DLL file extension by default.
Modules compiled with /t:library can be loaded directly by the CLR but cannot be launched as an executable program from a command shell or the Windows Explorer. To produce a module that can be launched in this way, you must compile using either the /t:exe or the /t:winexe option. Both options produce a file whose extension is .EXE. The only difference between these two options is that the former assumes the use of the console UI subsystem, whereas the latter assumes the GUI subsystem. If no /t option is specified, the default is /t:exe.
Modules produced using either the /t:exe or the /t:winexe option must have an initial entry point defined. The initial entry point is the method that the CLR will execute automatically when the program is launched. Programmers must declare this method static, and, in C# or VB.NET, they must name it Main. Programmers can declare the entry point method to return no value or to return an int as its exit code. They can also declare it to accept no parameters or to accept an array of strings, which will contain the parsed command-line arguments from the shell. The following are four legal implementations for the Main method in C#:
static void Main() { }
static void Main(string[] argv) { }
static int Main() { return 0; }
static int Main(string[] argv) { return 0; }
These correspond to the following in VB.NET:
shared sub Main() : end sub
shared sub Main(argv as string()) : end sub
shared function Main() as integer : return 0 : end function
shared function Main(argv as string()) as integer : return 0 : end function
Note that these methods do not need to be declared public. Programmers must, however, declare the Main method inside a type definition, although the name of the type is immaterial.
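As an illustration of these rules, the following sketch combines them in one program. The class name EchoArgs and the choice of exit codes are hypothetical. The entry point carries no access modifier (so it is private), accepts the command-line arguments as a string array, and returns an int that becomes the process exit code.

```csharp
using System;

// Hypothetical example; the name of the enclosing type is immaterial.
class EchoArgs
{
    // No access modifier, so Main is private. It is still a valid entry point.
    static int Main(string[] argv)
    {
        // argv holds the arguments that follow the program name.
        for (int i = 0; i < argv.Length; i++)
        {
            Console.WriteLine("argv[{0}] = {1}", i, argv[i]);
        }

        // The return value becomes the process exit code:
        // 1 if no arguments were supplied, 0 otherwise.
        return argv.Length == 0 ? 1 : 0;
    }
}
```

Compiled with /t:exe, a program like this can be launched from a command shell, and the shell can inspect the returned exit code in the usual way.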
The following is a minimal C# program that does nothing but print the string Hello, World to the console:
class myapp {
  static void Main() {
    System.Console.WriteLine("Hello, World");
  }
}
In this example, there is exactly one class that has a static method called Main. It would be ambiguous (and therefore an error) to present the C# or VB.NET compiler with source files containing more than one type having a static method called Main. To resolve this ambiguity, programmers can use the /main command-line switch to tell the C# or VB.NET compiler which type to use for the program's initial entry point.