The CLR Loader
The CLR loader is responsible for loading and initializing assemblies, modules, resources, and types. The CLR loader loads and initializes as little as it can get away with. Unlike the Win32 loader, the CLR loader does not resolve and automatically load the subordinate modules (or assemblies). Rather, the subordinate pieces are loaded on demand only if they are actually needed (as with Visual C++ 6.0's delay-load feature). This not only speeds up program initialization time but also reduces the amount of resources consumed by a running program.
In the CLR, loading is typically triggered by the just-in-time (JIT) compiler based on types. When the JIT compiler tries to convert a method body from CIL to machine code, it needs access to the type definition of the declaring type as well as the type definitions for the type's fields. Moreover, the JIT compiler also needs access to the type definitions used by any local variables or parameters of the method being JIT-compiled. Loading a type implies loading both the assembly and the module that contain the type definition.
This policy of loading types (and assemblies and modules) on demand means that parts of a program that are not used are never brought into memory. It also means that a running application will often see new assemblies and modules loaded over time as the types contained in those files are needed during execution. If this is not the behavior you want, you have two options. One is to simply declare hidden static fields of the types you want to guarantee are loaded when your type is loaded. The other is to interact with the loader explicitly.
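One way to observe this on-demand behavior is to subscribe to the application domain's AssemblyLoad event. The following is a minimal sketch rather than a recommended diagnostic tool: the LoadWatcher class and its member names are hypothetical, and the lambda syntax assumes a C# 3.0 or later compiler.

```csharp
using System;

public static class LoadWatcher
{
    // Number of assemblies currently loaded into this application domain.
    public static int LoadedCount()
    {
        return AppDomain.CurrentDomain.GetAssemblies().Length;
    }

    // Logs each assembly the loader brings in from this point on.
    public static void Start()
    {
        AppDomain.CurrentDomain.AssemblyLoad +=
            (sender, args) =>
                Console.WriteLine("Loaded: " + args.LoadedAssembly.FullName);
    }
}
```

Calling LoadWatcher.Start() early in Main and then exercising different parts of a program should show assemblies arriving only as their types are first needed.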
The loader typically does its work implicitly on your behalf, but developers can also interact with it explicitly via the assembly loader. The assembly loader is exposed to developers via the LoadFrom static method on the System.Reflection.Assembly class. This method accepts a CODEBASE string, which can be either a file system path or a uniform resource locator (URL) that identifies the module containing the assembly manifest. If the specified file cannot be found, the loader throws a System.FileNotFoundException. If the specified file can be found but is not a CLR module containing an assembly manifest, the loader throws a System.BadImageFormatException. Finally, if the CODEBASE is a URL that uses a scheme other than file:, the caller must have WebPermission access rights; otherwise, a System.SecurityException is thrown. Additionally, assemblies at URLs with schemes other than file: are downloaded to the download cache before being loaded.
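The three failure modes just described can be handled separately by the caller. The sketch below assumes a hypothetical helper named SafeLoader.TryLoadFrom; it catches only the exceptions named above and lets anything else propagate.

```csharp
using System;
using System.IO;
using System.Reflection;
using System.Security;

public static class SafeLoader
{
    // Returns the loaded assembly, or null if the CODEBASE could not be used.
    public static Assembly TryLoadFrom(string codebase)
    {
        try
        {
            return Assembly.LoadFrom(codebase);
        }
        catch (FileNotFoundException)
        {
            // No file exists at the given path or URL.
            Console.Error.WriteLine("Not found: " + codebase);
        }
        catch (BadImageFormatException)
        {
            // The file exists but is not a CLR module with an assembly manifest.
            Console.Error.WriteLine("Not an assembly: " + codebase);
        }
        catch (SecurityException)
        {
            // The caller lacks the permission required for this CODEBASE.
            Console.Error.WriteLine("Access denied: " + codebase);
        }
        return null;
    }
}
```

Returning null keeps the example short; production code would more likely surface the specific failure to its caller.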
Listing 2.2 shows a simple C# program that loads an assembly located at file://C:/usr/bin/xyzzy.dll and then creates an instance of the contained type named AcmeCorp.LOB.Customer. In this example, all that is provided by the caller is the physical location of the assembly. When a program uses the assembly loader in this fashion, the CLR ignores the four-part name of the assembly, including its version number.
Listing 2.2: Loading an Assembly with an Explicit CODEBASE
using System;
using System.Reflection;

public class Utilities {
  public static Object LoadCustomerType() {
    // Load by physical location (CODEBASE) only
    Assembly a = Assembly.LoadFrom(
                  "file://C:/usr/bin/xyzzy.dll");
    // CreateInstance returns null if the named type is not found
    return a.CreateInstance("AcmeCorp.LOB.Customer");
  }
}
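Even when an assembly is loaded by location, the name recorded in its manifest can still be read back after the load. A small sketch follows; LoadedAssemblyInfo and Describe are hypothetical names introduced for illustration.

```csharp
using System;
using System.Reflection;

public static class LoadedAssemblyInfo
{
    // Formats the identity recorded in the manifest plus the file it came from.
    public static string Describe(Assembly a)
    {
        AssemblyName n = a.GetName();
        return n.FullName + " (loaded from " + a.Location + ")";
    }
}
```

Passing the result of Assembly.LoadFrom to Describe shows which name and version actually arrived from the given CODEBASE.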
Although loading assemblies by location is somewhat interesting, most assemblies are loaded by name using the assembly resolver. The assembly resolver uses the four-part assembly name to determine which underlying file to load into memory using the assembly loader. As shown in Figure 2.9, this name-to-location resolution process takes into account a variety of factors, including the directory the application is hosted in, versioning policies, and other configuration details (all of which are discussed later in this chapter).
Figure 2.9: Assembly Resolution and Loading
The assembly resolver is exposed to developers via the Load method of the System.Reflection.Assembly class. As shown in Listing 2.3, this method accepts a four-part assembly name (either as a string or as an AssemblyName reference) and superficially appears to be similar to the LoadFrom method exposed by the assembly loader. The similarity is only skin deep because the Load method first uses the assembly resolver to find a suitable file using a fairly complex series of operations. The first of these operations is to apply a version policy to determine exactly which version of the desired assembly should be loaded.
Listing 2.3: Loading an Assembly Using the Assembly Resolver
using System;
using System.Reflection;

public class Utilities {
  public static Object LoadCustomerType() {
    // Load by four-part name; the assembly resolver picks the file
    Assembly a = Assembly.Load(
      "xyzzy, Version=1.2.3.4, " +
      "Culture=neutral, PublicKeyToken=9a33f27632997fcc");
    // CreateInstance returns null if the named type is not found
    return a.CreateInstance("AcmeCorp.LOB.Customer");
  }
}
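The same request can be made with an AssemblyName reference instead of a string. The sketch below builds the four-part name from Listing 2.3 one facet at a time; ResolverExample and BuildName are hypothetical names, and the byte array is simply the public key token from the listing written out as bytes.

```csharp
using System;
using System.Globalization;
using System.Reflection;

public static class ResolverExample
{
    // Builds the four-part name used in Listing 2.3.
    public static AssemblyName BuildName()
    {
        AssemblyName name = new AssemblyName();
        name.Name = "xyzzy";
        name.Version = new Version(1, 2, 3, 4);
        name.CultureInfo = CultureInfo.InvariantCulture; // Culture=neutral
        name.SetPublicKeyToken(new byte[] {
            0x9a, 0x33, 0xf2, 0x76, 0x32, 0x99, 0x7f, 0xcc });
        return name;
    }

    public static object LoadCustomerType()
    {
        // Assembly.Load goes through the assembly resolver.
        Assembly a = Assembly.Load(BuildName());
        return a.CreateInstance("AcmeCorp.LOB.Customer");
    }
}
```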
The assembly resolver begins its work by applying any version policies that may be in effect. Version policies are used to redirect the assembly resolver to load an alternate version of the requested assembly. A version policy can map one or more versions of a given assembly to a different version; however, a version policy cannot redirect the resolver to an assembly whose name differs by any facet other than version number (for example, an assembly named Acme.HealthCare cannot be redirected to an assembly named Acme.Mortuary). It is critical to note that version policies are applied only to assemblies that are fully specified by their four-part assembly name. If the assembly name is only partially specified (e.g., the public key token, version, or culture is missing), then no version policy will be applied. Also, no version policies are applied if the assembly resolver is bypassed by a direct call to Assembly.LoadFrom, because in that case you are specifying only a physical path and not an assembly name.
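Because version policy depends on the name being fully specified, it can be useful to check a display name before handing it to Assembly.Load. The following is a rough sketch, not the resolver's own test: NameCheck.IsFullySpecified is a hypothetical helper, it assumes the AssemblyName(string) constructor is available (.NET Framework 2.0 or later), and it treats a missing or empty public key token as "not fully specified."

```csharp
using System;
using System.Reflection;

public static class NameCheck
{
    // True only if version, culture, and public key token are all present.
    public static bool IsFullySpecified(string displayName)
    {
        AssemblyName n = new AssemblyName(displayName);
        byte[] token = n.GetPublicKeyToken();
        return n.Version != null
            && n.CultureInfo != null
            && token != null
            && token.Length > 0;
    }
}
```

Under these assumptions, the name from Listing 2.3 passes the check, whereas the bare name "xyzzy" does not.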
Version policies are specified via configuration files. These include a machine-wide configuration file and an application-specific configuration file. The machine-wide configuration file is always named machine.config and is located in the %SystemRoot%\Microsoft.Net\Framework\V1.0.nnnn\CONFIG directory. The application-specific configuration file is always located at the APPBASE for the application. For CLR-based .EXE programs, the APPBASE is the base URI (or directory) for the location the main executable was loaded from. For ASP.NET applications, the APPBASE is the root of the Web application's virtual directory. The name of the configuration file for CLR-based .EXE programs is the same as the executable name with an additional ".config" suffix. For example, if the launching CLR program is in C:\myapp\app.exe, the corresponding configuration file would be C:\myapp\app.exe.config. For ASP.NET applications, the configuration file is always named web.config.
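The naming rule for .EXE programs is mechanical enough to express directly. In this sketch, ConfigLocations and its methods are hypothetical names; AppDomain.CurrentDomain.BaseDirectory is used here as the running program's base directory.

```csharp
using System;

public static class ConfigLocations
{
    // The configuration file name is the executable name plus ".config".
    public static string ForExecutable(string exePath)
    {
        return exePath + ".config";
    }

    // Base directory of the current application domain.
    public static string CurrentAppBase()
    {
        return AppDomain.CurrentDomain.BaseDirectory;
    }
}
```

For the example above, ConfigLocations.ForExecutable(@"C:\myapp\app.exe") yields C:\myapp\app.exe.config.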
Configuration files are based on the Extensible Markup Language (XML) and always have a root element named configuration. Configuration files are used by the assembly resolver, the remoting infrastructure, and by ASP.NET. Figure 2.10 shows the basic schema for the elements used to configure the assembly resolver. All relevant elements are under the assemblyBinding element in the urn:schemas-microsoft-com:asm.v1 namespace. There are application-wide settings to control probe paths and publisher version policy mode (both of which are described later in this chapter). Additionally, the dependentAssembly elements are used to specify version and location settings for each dependent assembly.
Figure 2.10: Assembly Resolver Configuration File Format
Listing 2.4 shows a simple configuration file containing two version policies for one assembly. The first policy redirects version 1.2.3.4 of the specified assembly (Acme.HealthCare) to version 1.3.0.0. The second policy redirects versions 1.0.0.0 through 1.2.3.399 of that assembly to version 1.2.3.7.
Listing 2.4: Setting the Version Policy
<?xml version="1.0" ?>
<configuration
   xmlns:asm="urn:schemas-microsoft-com:asm.v1"
>
  <runtime>
    <asm:assemblyBinding>
      <!-- one dependentAssembly per unique assembly name -->
      <asm:dependentAssembly>
        <asm:assemblyIdentity
           name="Acme.HealthCare"
           publicKeyToken="38218fe715288aac" />
        <!-- one bindingRedirect per redirection -->
        <asm:bindingRedirect oldVersion="1.2.3.4"
           newVersion="1.3.0.0" />
        <asm:bindingRedirect oldVersion="1.0.0.0-1.2.3.399"
           newVersion="1.2.3.7" />
      </asm:dependentAssembly>
    </asm:assemblyBinding>
  </runtime>
</configuration>
Version policy can be specified at three levels: per application, per component, and per machine. Each of these levels gets an opportunity to process the version number, with the results of one level acting as input to the level below it. This is illustrated in Figure 2.11. Note that if both the application's and the machine's configuration files have a version policy for a given assembly, the application's policy is run first, and the resultant version number is then run through the machine-wide policy to get the actual version number used to locate the assembly. In this example, if the machine-wide configuration file redirected version 1.3.0.0 of Acme.HealthCare to version 2.0.0.0, the assembly resolver would use version 2.0.0.0 when version 1.2.3.4 was requested because the application's version policy maps version 1.2.3.4 to 1.3.0.0.
Figure 2.11: Version Policy
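The chaining of levels can be modeled in a few lines. This is a toy model of the description above, not the CLR's implementation: the Redirect and PolicyModel types are hypothetical, the order application, component, machine is assumed from the order in which the levels are listed, and "first matching redirect wins" within a level is an assumption of the model.

```csharp
using System;
using System.Collections.Generic;

public sealed class Redirect
{
    public readonly Version Low, High, Target;

    public Redirect(string low, string high, string target)
    {
        Low = new Version(low);
        High = new Version(high);
        Target = new Version(target);
    }

    // True if v falls in the inclusive range [Low, High].
    public bool Matches(Version v)
    {
        return v.CompareTo(Low) >= 0 && v.CompareTo(High) <= 0;
    }
}

public static class PolicyModel
{
    // One level: the first matching redirect wins (model assumption);
    // otherwise the version passes through unchanged.
    public static Version ApplyLevel(IEnumerable<Redirect> level, Version v)
    {
        foreach (Redirect r in level)
            if (r.Matches(v)) return r.Target;
        return v;
    }

    // The output of each level is the input to the next.
    public static Version Resolve(Version requested,
                                  IEnumerable<Redirect> application,
                                  IEnumerable<Redirect> component,
                                  IEnumerable<Redirect> machine)
    {
        Version v = ApplyLevel(application, requested);
        v = ApplyLevel(component, v);
        return ApplyLevel(machine, v);
    }
}
```

Using the redirects from Listing 2.4 at the application level and the machine-wide redirect from the example in the text:

```csharp
Redirect[] app = {
    new Redirect("1.2.3.4", "1.2.3.4", "1.3.0.0"),
    new Redirect("1.0.0.0", "1.2.3.399", "1.2.3.7")
};
Redirect[] component = { };
Redirect[] machine = { new Redirect("1.3.0.0", "1.3.0.0", "2.0.0.0") };

Version result = PolicyModel.Resolve(
    new Version("1.2.3.4"), app, component, machine);
Console.WriteLine(result); // 2.0.0.0
```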
In addition to application-specific and machine-wide configuration settings, a given assembly can also have a publisher policy. A publisher policy is a statement from the component developer indicating which versions of a given component are compatible with one another.
Publisher policies are stored as configuration files in the machine-wide global assembly cache. The structure of these files is identical to that of the application and machine configuration files. However, to be installed on the user's machine, the publisher policy configuration file must be wrapped in a surrounding assembly DLL as a custom resource. Assuming that the file foo.config contains the publisher's configuration policy, the following command line would invoke the assembly linker (AL.EXE) and create a suitable publisher policy assembly for AcmeCorp.Code version 2.0:
al.exe /link:foo.config /out:policy.2.0.AcmeCorp.Code.dll /keyf:pubpriv.snk /v:2.0.0.0
The name of the publisher policy file follows the form policy.major.minor.assmname.dll. Because of this naming convention, a given assembly can have only one publisher policy file per major.minor version. In this example, all requests for AcmeCorp.Code whose major.minor version is 2.0 will be routed through the policy file linked with policy.2.0.AcmeCorp.Code.dll. If no such assembly exists in the global assembly cache (GAC), then there is no publisher policy. As shown in Figure 2.11, publisher policies are applied after the application-specific version policy but before the machine-wide version policy stored in machine.config.
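The naming convention can be written as a one-line function, which also makes the "one file per major.minor" consequence easy to see. PublisherPolicy.FileName is a hypothetical helper, not part of the framework.

```csharp
using System;

public static class PublisherPolicy
{
    // Builds policy.major.minor.assmname.dll; build and revision are ignored.
    public static string FileName(string assemblyName, Version requested)
    {
        return string.Format("policy.{0}.{1}.{2}.dll",
            requested.Major, requested.Minor, assemblyName);
    }
}
```

Requests for versions 2.0.0.0 and 2.0.5.1 of AcmeCorp.Code both map to policy.2.0.AcmeCorp.Code.dll, since only the major and minor numbers take part in the name.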
Given the fragility inherent in versioning component software, the CLR allows programmers to turn off publisher version policies on an application-wide basis. To do this, programmers use the publisherPolicy element in the application's configuration file. Listing 2.5 shows this element in a simple configuration file. When this element has the attribute apply="no", the publisher policies will be ignored for this application. When this attribute is set to apply="yes" (or is not specified at all), the publisher policies will be used as just described. As shown in Figure 2.10, the publisherPolicy element can enable or disable publisher policy on an application-wide or an assembly-by-assembly basis.
Listing 2.5: Setting the Application to Safe Mode
<?xml version="1.0" ?>
<configuration
   xmlns:rt="urn:schemas-microsoft-com:asm.v1">
  <runtime>
    <rt:assemblyBinding>
      <rt:publisherPolicy apply="no" />
    </rt:assemblyBinding>
  </runtime>
</configuration>