- ADO.NET Defined
- System.Data Architecture
- Application Scenario
- Summary
System.Data Architecture
You can think of the System.Data namespace as consisting of two primary parts. The first part is the managed providers that allow you to connect to a data source, issue commands against the data source, and read data directly from the data store or into a DataSet. The managed providers contain classes analogous to the Connection, Command, and Parameter objects in classic ADO as well as adding support for iterating through a result set in a forward-only manner. The second part is the DataSet and its various supporting classes that allow you to manipulate data in a disconnected fashion. As mentioned previously the DataSet is most like a disconnected Recordset, although much more powerful and flexible.
Managed Providers
To manipulate data in a data store you obviously first need to open a connection to it and pass it commands to execute. In ADO.NET this is accomplished by using a managed provider. ADO.NET ships with two managed providers, the OleDb managed provider and the SqlClient managed provider. The former is contained in the System.Data.OleDb namespace, which like classic ADO, allows you to access any data source for which an OLE DB provider is available. The latter is in System.Data.Sql and provides native access to SQL Server by writing directly to the tabular data stream protocol (TDS) used by SQL Server. In other words, the OleDb managed provider simply offers a data access model that sits on top of the existing OLE DB infrastructure and requires data source specific OLE DB providers or ODBC drivers. Conversely, the SqlClient managed provider takes the place of the existing OLE DB provider for SQL Server (SQLOLEDB) because it writes directly to SQL Server without the assistance of any other software. A diagram of these components and their relation to OLE DB and ADO can be seen in Figure 7.1.
Figure 7.1 ADO.NET Data Access Architecture. This diagram depicts the data access components used in a VB.NET application. Note that OLE DB and ODBC are still a part of this data access model.
Each of the managed providers includes a set of classes that implement interfaces and derive from classes found in System.Data. The common types of classes included in a managed provider are as follows:
ConnectionA class that implements IDbConnection and includes members to connect to a data source, handle transactional behavior and connection pooling, and receive notifications when the state of the connection changes.
CommandA class that implements IDbCommand and includes members used to execute queries against a Connection. It can execute queries that do not return results and queries that return an IDataReader object for iterating through a result set. Also includes events for responding to schema changes in the underlying data source.
DataAdapterA class derived from DbDataAdapter that implements IDbDataAdapter. It includes a set of Command objects and a Connection used to populate a DataSet and update the underlying data source. It also maps data source tables and columns to tables and columns in a DataSet. Note that only relational providers will inherit directly from DbDataAdapter.
ParameterA class that implements IDataParameter used to pass parameters to Command objects. Parameters are tracked through a ParameterCollection object that implements the IDataParameter collection interface.
TransactionA class that implements IDbTransaction used to represent a transaction in the data source.
DataReaderA class that implements the IDataReader interface used to read a forward-only stream of records from the data source. Note that this is analogous to a firehose cursor in SQL Server.
The managed providers can also include their own classes to represent properties, events, exceptions, command builders for use with a DataAdapter, and errors. The naming convention used is to prefix the name of the base class or interface with the namespace for the managed provider. For example, for the SqlClient managed provider the classes shown in Table 7.1 are available.
Table 7.1 Classes found in the System.Data.Sql namespace for the Sql Server managed provider.
Class |
Description |
SqlConnection |
Implements IDbConnection and represents the connection to the SQL Server database. Includes its own syntax for the connection string. |
SqlCommand |
Implements IDbCommand and is used to execute Transact-SQL commands such as stored procedures. |
SqlDataAdapter |
Derived from DbDataAdaper and used to map data from SQL Server to a DataSet for select, insert, update, and delete. |
SqlDataReader |
Implements IDataReader as a means of returning a forward-only, read-only cursor on a result set. |
SqlParameter |
Represents a parameter to a SqlCommand object. |
SqlParameterCollection |
Represents a collection of SqlParameter objects. |
SqlTransaction |
Represents a transaction in SQL Server. |
SqlError |
Represents an error returned by the SqlDataAdapter. |
SqlException |
Thrown by the SqlDataAdapter when a SQL Server error occurs. |
Obviously, by abstracting the basic functionality for managed providers into base classes and interfaces it is possible for database vendors and corporate developers to write their own managed providers. Although this is possible, it is probably not recommended for the majority of developers because the managed providers shipped with .NET handle all the connectivity required for most data sources. However, it might be warranted for companies that have large-scale systems and have developed a proprietary data access interface to their backend.
In one respect the availability of multiple managed providers represents a decision point for developers using SQL Server (or a backend for which a managed provider is available) as their backend database. Because the SqlClient managed provider handles TDS natively it performs better than incurring the extra translation layer when using the OleDb managed provider against SQL Server as noted in Figure 7.1. As a result, most SQL Server developers will likely want to use the SqlClient managed provider. The downside of this choice is that if you change your backend database you'll have to change and recompile your code to use the OleDb provider as a result. To avoid this, where it is likely that the backend database may change, you may want to create your own set of generic classes that abstract the instantiation of objects from the managed providers. Obviously these classes could run the gamut from statically invoking a particular managed provider such as SqlClient to dynamically invoking the correct one based on a property setting. The cost of creating and maintaining this extra layer may be justified if your target database changes.
That being said, for most projects a better approach is to be sure the data access code is abstracted from the user and business logic in custom classes so that if necessary the code will be fairly easy to change and yet not incur the overhead of creating and maintaining a separate set of generic data access classes. This is the approach used in this chapter.
Using a Managed Provider
The basics of using a managed provider are analogous to using the connection and command objects in classic ADO. However, a managed provider also includes a DataAdapter used to populate and control DataSet objects that have no equivalent in classic ADO.
To illustrate the use of the connection and command objects, the code snippet in Listing 7.1 shows a simple example of connecting to SQL Server using the SqlClient managed provider and executing a simple stored procedure to return the contents of the Courses table using the SqlCommand object. The resulting rows are then iterated upon using the SqlDataReader class.
Listing 7.1 Using a managed provider. This listing shows the simplest example of executing a stored procedure in SQL Server using the SqlClient managed provider and streaming through the resulting rows.
Imports System.Data Imports System.Data.SqlClient ... Dim cnSQL As SqlConnection Dim cmSQL As SqlCommand Dim drCourses As SqlDataReader cnSQL = New SqlConnection("server=ssosa\sql2k;trusted_connection=yes;database=enrollment") cmSQL = New SqlCommand("usp_ListCourses", cnSQL) cmSQL.CommandType = CommandType.StoredProcedure Try cnSQL.Open() drCourses = cmSQL.ExecuteReader() Do While drCourses.Read() Console.WriteLine(drCourses("Description").ToString()) Loop Catch e As Exception Console.WriteLine(e.ToString()) Finally cmSQL.Close() cnSQL.Close() End Try
There are several items to note in Listing 7.1 including the fact that constructor strings are used with both the SqlConnection and SqlCommand objects to pass the connection string and the command text with a reference to the connection respectively. Furthermore, you'll notice that the syntax of the connection string passed to SqlConnection is similar to that used with the SQLOLEDB provider. In addition, the ExecuteReader method is called, which returns a SqlDataReader object used to simply iterate through the rows using the Read method. The SqlDataReader implements a forward-only cursor that allows access to the data through a simple default Item property. For optimum performance the DataReader provides only the basic constructs for reading data.
Note
While much of the code in this chapter uses the SqlClient managed provider, the code for using the OleDb managed provider is almost exactly analogous.
Using a DataReader is the fastest way to get data from SQL Server and involves no overhead on the client because data is not persisted. However, as with a "firehose" cursor in classic ADO, the database connection stays open until the Close method of the SqlDataReader or SqlConnection is called or until the object goes out of scope.
Although it may at first feel strange to access forward-only result sets in this manner, moving the functionality provided by the SqlDataReader into the managed provider and away from the DataSet (where you might expect to find it because it is analogous to the Recordset in most respects) is a good thing. This is because the implementation must clearly be managed by the data source and it allows the DataSet to be used strictly as a client-side data cache.
Tip
For more information on creating firehose cursors in classic ADO see Chapter 14 of Pure Visual Basic.
The interesting aspect of the SqlCommand is that it also implements other methods to execute commands against SQL Server as shown in Table 7.2.
Table 7.2 Execute methods of SqlCommand. These methods are used to retrieve other types of results from SQL Server.
Class |
Description |
ExecuteNonQuery |
Executes the command and does not return a result. Useful for executing statements that do not return result sets. |
ExecuteScalar |
Executes the command and returns the first column and first row of the result set only. Discards all other data. |
ExecuteResultset |
Reserved for future use. |
ExecuteStream |
Executes the command and returns the results as an XML stream. |
Several of the methods shown in Table 7.2 such as ExecuteNonQuery and ExecuteScalar also exist in the OleDb managed provider and will be discussed in more detail later in this chapter. However, as an example of the customization possible when using managed providers for a particular data source consider the code in Listing 7.2.
Listing 7.2 Returning a stream. This listing shows how a managed provider can implement specific features of the data source, in this case the ability to return XML as a stream.
Dim cnSQL As SqlConnection Dim cmSQL As SqlCommand Dim sCourses As Stream Dim objSr As StreamReader Dim strXML As String cnSQL = New SqlConnection("server=ssosa\sql2k;trusted_connection=yes;database=enrollment") cmSQL = New SqlCommand("usp_ListCourses", cnSQL) cmSQL.CommandType = CommandType.StoredProcedure Try cnSQL.Open() sCourses = cmSQL.ExecuteStream() objSr = New StreamReader(sCourses) strXML = objSr.ReadToEnd() Catch e As Exception Console.WriteLine(e.ToString()) Finally cmSQL.Close() cnSQL.Close() End Try
In this example the same stored procedure is executed as in Listing 7.1, however, rather than execute the procedure using the ExecuteReader method, the ExecuteStream method is invoked, which is unique to the SqlClient managed provider. In this case, instead of returning a result set that you can iterate on using a DataReader, an XML stream is returned and stored in a string variable using a StreamReader (discussed in Chapter 11). This is possible because SQL Server 2000 has the ability to generate XML at the server using the FOR XML clause. To generate the XML the usp_ListCourses stored procedure was modified as shown in this Transact-SQL snippet:.
CREATE PROCEDURE usp_ListCourses AS SELECT * FROM Course ORDER BY CourseNum FOR XML AUTO
Several other techniques for using command objects of the managed providers will be shown later in the chapter.
The other key aspect of using a managed provider is to use the DataAdapter to populate a DataSet and control how changes to the DataSet are sent to the underlying data store. To do this the DataAdapter references select, insert, update, and delete command objects that are invoked when the DataSet is manipulated. The code snippet shown in Listing 7.3 creates a new SqlDataAdapter and sets the SelectCommand and UpdateCommand properties.
Listing 7.3 Using SqlDataAdapter. This simple example creates a DataAdapter and uses it to specify a stored procedure for updating a DataSet.
Dim cnSQL As SqlConnection Dim daSQL As SqlDataAdapter Dim parmWork As SqlParameter Dim dsCourses As New DataSet("Offerings") Dim intRows as Integer cnSQL = New SqlConnection("server=ssosa\sql2k;trusted_connection=yes;database=enrollment") ' Create the adapter and set the SelectCommand through the constructor daSQL = New SqlDataAdapter("usp_ListCourses", cnSQL) daSQL.SelectCommand.CommandType = CommandType.StoredProcedure ' Create the update command daSQL.UpdateCommand = New SqlCommand("usp_UpdateCourse", cnSQL) daSQL.UpdateCommand.CommandType = CommandType.StoredProcedure ' Set the parameters for the update command parmWork = daSQL.UpdateCommand.Parameters.Add(New SqlParameter("@Days", SqlDbType.TinyInt)) parmWork.SourceColumn = "Days" parmWork.SourceVersion = DataRowVersion.Current parmWork = daSQL.UpdateCommand.Parameters.Add(New SqlParameter("@CourseID", SqlDbType.Int)) parmWork.SourceColumn = "CourseID" parmWork.SourceVersion = DataRowVersion.Original parmWork = daSQL.UpdateCommand.Parameters.Add(New SqlParameter("@Cost", SqlDbType.Money)) parmWork.SourceColumn = "CourseWareCost" parmWork.SourceVersion = DataRowVersion.Current ' Populate the DataSet intRows = daSQL.Fill(dsCourses, "Courses") ' Make a change to the underlying data dsCourses.Tables("Courses").Rows(0).Item("Days") = 0 Try ' Save the changes daSQL.Update(dsCourses,"Courses") Catch e As Exception ' Report the exception End Try
Note that like any command object in classic ADO the SqlCommand object referenced in the UpdateCommand property accepts parameters stored in a Parameters collection. The constructor of SqlParameter accepts the name of the parameter and the data type using the SqlDbType enumeration. The SourceColumn and SourceVersion properties are used to instruct the SqlDataAdapter as to which column and which version of the column (Current, Default, Original, Proposed) should be used from the DataSet to populate the parameter. In this case the original version of the CourseID is used because it is the primary key while the current versions of Days and CourseWareCost are used because they are updateable.
Tip
Unlike in classic ADO when using parameters with a SqlCommand or OleDbCommand, the parameters needn't be added to the parameters collection in the order in which they are defined in the stored procedure because the procedure will be executed using named arguments. For example in Listing 7.3 the @Days parameter is defined prior to the @CourseID although in the stored procedure the reverse is true. Of course, this means that you must use the exact name of the parameter when adding it to the collection.
The SqlDataAdapter can then populate the underlying DataSet using the Fill method, which returns the number of rows added to the underlying DataTable. Note that if the SqlConnection object is not yet open, the Fill method will first open it before executing the query used to retrieve the data before closing it. If the connection is already open, the Fill method simply uses it and does not close it. This behavior is efficient because holding on to an expensive resource such as database connections is not desirable in distributed applications.
A simple change is then made to the Days column of the first row and the Update method of the SqlDataAdapter is then called. The end result is a single call to the usp_UpdateCourse stored procedure with the appropriate parameters.
In both the Fill and Update methods, the second argument is used to specify the DataTable object within the DataSet that is to be populated (more about this follows).
As mentioned previously, instead of specifying the insert, update, and delete commands executed by the DataAdapter in code as shown in Listing 7.3, it is possible to instruct the managed provider to create them on-the-fly. This can be done with SqlCommandBuilder and OleDbCommandBuilder classes. In fact by simply instantiating a command builder and passing it the DataAdapter in the constructor, the commands will be built on-the-fly by the CommandBuilder when they are needed. The code in Listing 7.4 shows how this works using the SqlCommandBuilder class.
Listing 7.4 Using a CommandBuilder. This listing shows how to use the SqlCommandBuilder to autogenerate commands for data manipulation through the SqlDataAdapter.
Dim cnSQL As SqlConnection Dim daSQL As SqlDataAdapter Dim cmdSql As New SqlCommandBuilder(daSQL) Dim cmUpd As SqlCommand Dim dsCourses As New DataSet("Offerings") cnSQL = New SqlConnection("server=ssosa\sql2k;trusted_connection=yes;database=enrollment") ' Create the adapter and set the SelectCommand through the constructor daSQL = New SqlDataAdapter("SELECT * FROM fn_ListCourses()", cnSQL) ' Create the command builder cmdSql = New SqlCommandBuilder(daSQL) ' Populate the DataSet daSQL.Fill(dsCourses, "Courses") ' Make a change to the underlying data dsCourses.Tables("Courses").Rows(0).Item("Days") = 0 daSQL.Update(dsCourses,"Courses") ' Get the command that was used to update the database cmUpd = cmdSql.GetUpdateCommand Console.WriteLine(cmUpd.CommandText)
Notice that rather than create UpdateCommand explicitly, the SqlCommandBuilder is instantiated and passed a reference to daSQL. At this point no further intervention is required. When the Update method of the SqlDataAdapter is invoked, the command builder sends a request to SQL Server to discover the column names and data types of the base table used in the SelectCommand. In this way the command builder only incurs the overhead of the extra roundtrip to the server when it is required. It then populates the underlying SqlDataAdapter command objects, which are then immediately used to execute the insert, delete, or update (as in this case) using the sp_executesql system stored procedure. The commands are then cached for the lifetime of the SqlDataAdapter. Note that the code in Listing 7.4 also uses the GetUpdateCommand method of the SqlCommandBuilder to get a reference to the SqlCommand used for the update and simply prints the CommandText that was generated.
Before you rely on the command builders, there are a couple of caveats to be aware of. First, command builders only work when the statement used for the SelectCommand pulls data from a single base table. In this example, the SelectCommand uses a call to an in-line table function in SQL Server 2000 that queries only the Course table and so it is successful. However, if the function were to use a SELECT statement with aggregate functions or a JOIN clause, the command builder would return an exception when trying to generate the commands because either no base table or multiple base tables would be returned. As shown however, with SQL Server you can use functions and stored procedures that return data from a single table.
Tip
A second type of function supported in SQL Server 2000 is the multistatement table function. This function returns a result set built on-the-fly by Transact-SQL inside the function. While you can use multistatement table functions in a SelectCommand by simply referencing them in the FROM clause, the SqlCommandBuilder cannot create commands from such a statement because base table information is missing. This is the case even if the function returns data from a single table.
The second point to note is that the SQL syntax that gets built when using a command builder is not necessarily optimum. For example, all the columns returned in the SelectCommand are included inside the WHERE clause for the update and delete commands even if the primary key of the underlying DataTable is set. This obviously wastes network bandwidth and causes the SQL Server to have to parse the statement and create an optimized query plan before it can be executed. An example of the update CommandText is shown here. Note that CourseID is the primary key.
UPDATE Course SET CourseNum = @p1 , Description = @p2 , ProductID = @p3 , LOBID = @p4 , Days = @p5 , CourseWareCost = @p6 WHERE ( CourseID = @p7 AND CourseNum = @p8 AND Description = @p9 AND ProductID = @p10 AND LOBID = @p11 AND Days = @p12 AND CourseWareCost = @p13 )
In addition, the insert command is built based on only the columns that are set when populating the new row. In other words, you must ensure that you populate all the columns that are required and that do not have defaults if the insert is to succeed.
The end result is that for most sophisticated applications you'll want to populate the data modification commands yourself using more efficient techniques such as stored procedures.
Using Events
Just as the Connection and Command objects in classic ADO support events so do the managed providers. The connection and DataAdapter classes for both OleDb and SqlClient include events that your code can capture. For example, both the SqlConnection and the OleDbConnection classes support the InfoMessage and StateChanged events to capture informational messages from the data source and monitor when the connection changes its state. SqlDataAdapter and OleDbDataAdapter support RowUpdating and RowUpdated events that fire before and after the UpdateCommand of the adapter is executed.
To handle one of these events you can declare the object using the WithEvents keyword, use a dynamic event handler, or the Delegate class explicitly as discussed in Chapter 4. As an example, consider a client that wants to capture messages produced with the PRINT statement in Transact-SQL. To do this the InfoMessage event must be handled. Assuming that the SqlConnection object is declared as cnSQL, a dynamic event handler can be constructed using the AddHandler statement.
AddHandler cnSQL.InfoMessage, AddressOf SqlConnInfoMessage
The private SqlConnInfoMessage method must then take as its second argument the type SqlInfoMessageEventArgs, which exposes a collection of SqlError objects. The code to handle the event, construct a message string, and pass it to a method used to log the message to a file is shown here:.
Private Sub SqlConnInfoMessage(ByVal sender As Object, ByVal e As SqlInfoMessageEventArgs) Dim i As Integer Dim strMessage As String For i = 0 to e.Errors.Count - 1 strMessage = "Source: " & e.Errors(i).Source & vbCrlf & _ "Number: " & e.Errors(i).Number.ToString() & vbCrlf & _ "State: " & e.Errors(i).State.ToString() & vbCrlf & _ "Class: " & e.Errors(i).Class.ToString() & vbCrlf & _ "Server: " & e.Errors(i).Server & vbCrlf & _ "Message: " & e.Errors(i).Message & vbCrlf & _ "Procedure: " & e.Errors(i).Procedure & vbCrlf & _ "LineNumber: " & e.Errors(i).LineNumber.ToString() Next LogToFile(strMessage) End Sub
Connection Pooling
Developers interested in developing scalable distributed applications for use by hundreds or thousands of concurrent users must concern themselves with using expensive resources such as database connections. As a result you'll want to be sure your applications do not hold connections for an extended period of time and that multiple users can reuse connections no longer in use without having to incur the overhead of rebuilding the connection.
In classic ADO, pooling database connections was handled either through the session pooling mechanism of OLE DB or the connection pooling code implemented by the ODBC Driver manager when using the MSDASQL provider.
Tip
For more information on the differences and specifics of these two methods see Chapter 14 of Pure Visual Basic.
When using the OleDb managed provider both of these features are still available and are handled automatically by OLE DB and ODBC depending on which OLE DB provider you use. However, because the SqlClient managed provider communicates directly with SQL Server using TDS, it must implement its own form of connection pooling and does so by relying on Windows 2000 Component Services. As discussed in chapter 8, COM+ in Windows 2000 supports pooled components whereby a predefined number of instantiated object instances are hosted in a pool managed by COM+. If an object instance is available, one is handed out to a client application when it requests a new instance and subsequently returned to the pool when dereferenced for use by another client. This scheme allows clients to reuse objects that are expensive to create thereby reducing the resources required on the server. The class that represents a connection to SQL Server, SqlConnection, is a perfect candidate for a pooled component.
To support creating and configuring a connection pool several additional properties are included in the ConnectionString for the SqlConnection. For example, by simply instantiating and opening a SqlConnection object with the Pooled property set to True the connection will be added to a new or existing pool. As with OLE DB session pooling, connections are grouped into pools based on the distinct text of the ConnectionString property. Any differences at all in connection strings results in the creation of a new pool.
As an example consider this code snippet:
cnSQL = New SqlConnection("server=ssosa\sql2k;uid=dfox;pwd=dlf;database=enrollment; pooling=true") cnSQL.Open cnSQL2 = New SqlConnection("server=ssosa\sql2k;trusted_connection=yes;database=enrollment; pooling=true") cnSQL2.Open cnSQL3 = New SqlConnection("server=ssosa\sql2k;trusted_connection=yes;database=enrollment; pooling=true") cnSQL3.Open cnSQL3.Close cnSQL4 = New SqlConnection("server=ssosa\sql2k;trusted_connection=yes;database=enrollment; pooling=true") cnSQL4.Open
In this example, four SqlConnection objects are instantiated and all are left open with the exception of cnSQL3. In this case because all have the Pooling attribute set to True two pools are created: one containing cnSQL1; and the other containing cnSQL2 and cnSQL3 because the ConnectionString for cnSQL1 differs from that of cnSQL2 and cnSQL3. However, when the Close method of cnSQL3 is called the connection is not torn down but returned to the pool. As a result when the Open method of cnSQL4 is called the connection previously used by cnSQL3 will be pulled from the pool and assigned to cnSQL4 rather than a new connection.
Obviously, this contrived example concerns a single client application but if you assume that each connection represents a separate user that attempts to create a database connection nearly simultaneously and you begin to see the usefulness of connection pooling.
Unlike OLE DB session pooling, by default the pooled SqlConnection objects are not destroyed until the process that created them ends or the connection is somehow broken. In addition, the pools created are also subdivided based on whether the connection is enlisted in a particular transaction context. This allows the client application to use distributed transactions with SqlConnection objects and be sure that all work done by the transaction is committed together.
Finally the syntax of the ConnectionString has been augmented with additional properties that control the size and behavior of the pool as shown in Table 7.3. Note that because these are included in the ConnectionString they are set when the pool is first created because pools are created on the basis of the syntax of the ConnectionString.
Table 7.3 SqlClient Connection pooling properties. These properties are used in the ConnectionString of the SqlConnection object to configure connection pooling.
Property |
Description |
Connect Timeout |
Length of time to wait for a connection to succeed. Default is 15 seconds. |
Connection Lifetime |
Specifies the maximum amount of time in seconds that a connection can live. Default is 0, meaning for the duration of the process. Checked when the connection is returned to the pool. |
Connection Reset |
Specifies whether the connection is reset when returned to the pool by undoing any session level (SET) statements issued during the user session. Defaults to True but incurs an extra roundtrip when the connection is returned to the pool. |
Enlist |
Defaults to True and automatically enlists the connection in the current transaction context. |
Min Pool Size, Max Pool Size |
Defaults to 0 and 100, respectively, and determines how many connections may live in the pool. |
Disconnected Data
As previewed in the code in the previous section, the second major partition of the System. Data namespace is the DataSet and its related classes. As previously mentioned the DataSet is fundamental to ADO.NET because it represents data from one or more underlying data sources in a disconnected fashion. In addition it includes metadata such as relationships and constraints that provide structure for the data and allow code to navigate through the data. And of course the DataSet can be defined using an XSD schema and stored as XML so that it is a great vehicle for passing data between tiers of a distributed application.
Note
The DataSet class is derived from the System.ComponentModel. MarshalByValueComponent class, which allows it to be marshaled by value across process boundaries whereby a copy of the serialized object is passed rather than a reference to it. In this way it is analogous to the disconnected Recordset in classic ADO.
In addition to scalar properties used to configure the DataSet, the DataSet contains references to the major classes shown in Figure 7.2 and discussed in Table 7.4.
Figure 7.2 The DataSet object model. The primary classes referenced by the DataSet.
Table 7.4 DataSet classes. The primary classes referenced by the DataSet.
Class |
Description |
DataRelationsCollection |
Exposed through the Relations property and contains a collection of DataRelation objects that map all the parent/child relationships between DataTable objects in a DataSet. |
PropertyCollection |
Exposed through the ExtendedProperties property and contains a collection of custom properties based on the Hashtable class. |
DataTablesCollection |
Exposed through the Tables property and contains a collection of DataTable objects contained in the DataSet. |
DataViewManager |
Exposed through the DefaultViewManager property and used to create custom sort and filter settings (DataView) for each DataTable in the DataSet. Includes a reference to a DataViewSettingsCollection where settings for each DataTable are stored. |
As you'll notice from Table 7.4, the DataSet contains references to objects that expose collections containing the data stored in the DataSet. From that perspective you can also think of the DataSet as simply a container for tables, properties, and relations.
As depicted in Figure 7.2 the DataTableCollection exposes the collection of DataTable objects that are used to represent the actual data. Figure 7.3 and Table 7.5 expand the DataTable and show its constituent classes.
Figure 7.3 The DataTable model. This diagram depicts the relationships between the DataTable and other System.Data classes.
Table 7.5 DataTable classes. The primary classes referenced by the DataTable.
Class |
Description |
DataColumnCollection |
Exposed through the Columns property and contains a collection of DataColumn objects that represent each column in the DataTable. |
DataRowCollection |
Exposed through the Rows property and contains a collection of DataRow objects in the DataTable. |
DataView |
Exposed through the DefaultView property and contains a reference to a DataView object that contains sort and filter settings for this DataTable object. |
DataRelationCollection |
Exposed through the ChildRealtions and ParentRelations properties and contains a collection of DataRelation objects contained in the DataSet. |
PropertyCollection |
Exposed through the ExtendedProperties property and contains a collection of custom properties based on the Hashtable object. |
ConstraintCollection |
Exposed through the Constraints property and references a collection of objects derived from Constraint such as ForeignKeyConstraint and UniqueConstraint. |
DataColumn |
In addition to being referenced in ColumnsCollection, also exposed in the PrimaryKey property as an array of objects to enforce the primary key. |
The DataTable is the heart of the DataSet and contains the representation of the actual data. You'll notice that like the DataSet, it contains a series of collection objects that reference the columns, rows, properties, constraints, and relations. From this perspective the DataTable is analogous to the classic ADO Recordset.
It also contains a reference to a default DataView object that controls how the data is sorted and filtered. DataView objects can also be created independently to provide multiple views of the same data, for example, in the case where changed rows need to be viewed in one grid control while new rows are viewed in another.
As an example of using the DataSet and its associated collections consider the code in Listing 7.5.
Listing 7.5 Manipulating a DataSet. This code example populates a DataSet using a SqlDataAdapter and then exercises the various collections of the DataSet and DataTable.
Dim cnSQL As SqlConnection Dim daSql As SqlDataAdapter Dim daSql1 As SqlDataAdapter Dim drProdCourse As DataRelation Dim objCon As Constraint Dim objUn As UniqueConstraint Dim rowProd As DataRow Dim rowCourse As DataRow Dim CourseRows() As DataRow Dim dtProducts As DataTable Dim dtCourses As DataTable Dim pk() As DataColumn Dim dsCourses As New DataSet("Offerings") cnSQL = New SqlConnection("server=ssosa\sql2k;trusted_connection=yes;database=enrollment") ' Create the adapters and set the SelectCommand through the constructor daSql = New SqlDataAdapter("usp_ListCourses", cnSQL) daSql.SelectCommand.CommandType = CommandType.StoredProcedure daSql1 = New SqlDataAdapter("usp_ListProducts", cnSQL) daSql1.SelectCommand.CommandType = CommandType.StoredProcedure ' Populate the DataSet daSql.Fill(dsCourses, "Courses") daSql1.Fill(dsCourses, "Products") dtProducts = dsCourses.Tables("Products") dtCourses = dsCourses.Tables("Courses") ' Set the primary keys ReDim pk(1) pk(0) = dtProducts.Columns("ProductID") dtProducts.PrimaryKey = pk ReDim pk(1) pk(0) = dtCourses.Columns("CourseID") dtCourses.PrimaryKey = pk ' Add a relationship drProdCourse = new DataRelation("ProdCourse", _ dtProducts.Columns("ProductID"), _ dtCourses.Columns("ProductID")) drProdCourse.Nested = True dsCourses.Relations.Add(drProdCourse) ' Look at each row in Products For Each rowProd in dtProducts.Rows CourseRows = rowProd.GetChildRows("ProdCourse") Console.WriteLine(rowProd("Name").ToString() & " has " & _ CourseRows.Length.ToString() & " Courses") ' Loop through each correlated row for the Courses For Each rowCourse in CourseRows Console.WriteLine(vbTab & rowCourse("CourseNum").ToString() _ & ":" & rowCourse("Description")) Next Console.WriteLine() Next ' Print out the constraints For Each objCon in dtProducts.Constraints If TypeOf objCon Is UniqueConstraint Then objUn = objCon Console.WriteLine("UniqueConstraint on " & objUn.Columns(0).ColumnName) If objUn.IsPrimaryKey Then Console.WriteLine("It is the primary key") End If End If Next
In this example, two SqlDataAdapter objects use stored procedures to fill two DataTable objects within the DataSet dsCourses with data for products and their associated courses.
Tip
Although this example illustrates the use of several DataAdapters being used to populate a single DataSet, multiple resultsets can be returned from a single stored procedure or SQL batch to the same effect. Doing so will increase performance because only a single roundtrip to the server will be incurred. Keep in mind, however, that you can only associate a single insert, update, and delete command to a DataAdapter. This means that if you plan on updating multiple DataTables within a DataSet with a single DataAdapter you'll need to programmatically switch the insert, update, and delete commands before invoking Update or exceptions will likely result.
The remainder of this section discusses several features of Listing 7.5 that illustrate the structure and uses of the DataSet and DataTable classes.
Mapping Data to the DataSet
Note that the second argument of the overloaded Fill method specifies the name of the DataTableMapping to populate. In this case because no mapping was created before the Fill method was invoked, the SqlDataAdapter creates DataTable objects using the provided mapping names and maps each DataColumn in the DataSet to a column from the underlying database. However, you can programmatically create a mapping layer before populating the DataSet using the TableMappings collection. This is important for situations where the schema of the DataSet will not exactly match the names of the columns in the database. For example, instead of simply using the database column names in Listing 7.5 the following code could be inserted before the Fill method is invoked to map some of the columns of the Course table to new values.
daSql.TableMappings.Add("Table","Courses") daSql.TableMappings(0).ColumnMappings.Add("CourseNum","CourseNumber") daSql.TableMappings(0).ColumnMappings.Add("Description","CourseDesc") daSql.TableMappings(0).ColumnMappings.Add("ProductID","CourseProdID") daSql.TableMappings(0).ColumnMappings.Add("LOBID","CourseLineOfBusinessID") daSql.TableMappings(0).ColumnMappings.Add("Days","CourseDays") daSql.TableMappings(0).ColumnMappings.Add("CourseWareCost","Cost") daSQL.Fill(dsCourses)
In this example a DataTableMapping object is added to the collection exposed by the DataAdapter and called "Table." In this case "Table" has a special connotation as it represents the default table mapping. In other words, when a DataSet is filled the SqlDataAdapter looks for a table mapping with the name of "Table" and uses it to map the column information in the DataTable. Optionally, another name for the mapping could be used and then passed as the second argument to the Fill method. Because of this default behavior the second argument to the Fill method is omitted.
Once the mapping has been created, columns can be added to it using the ColumnMappings collection by simply passing the name of the database column followed by the name it will have in the resulting DataTable. In this, all the columns except CourseID are being mapped, if the database column does not exist in the ColumnMappings collection it will by default automatically be added to the resulting DataTable (although this can be changed by setting the MissingSchemaAction property of the DataAdapter). After the Fill method is invoked a DataTable is created called "Courses" and populated with the columns from the ColumnsMapping collection. A snippet of the resulting XML from the DataSet returned using the Xml or ReadXml properties of the DataSet follows (note that the Nested property is set to False for this example as discussed later).
<Courses> <CourseID>12</CourseID> <CourseNumber>SYB </CourseNumber> <CourseDesc>FastTrack to Adaptive Server</CourseDesc> <CourseProdID>8</CourseProdID> <CourseLineOfBusinessID>2</CourseLineOfBusinessID> <CourseDays>5</CourseDays> <Cost>300</Cost> </Courses> <Courses> <CourseID>15</CourseID> <CourseNumber>SYBA</CourseNumber> <CourseDesc>Administering Adaptive Server</CourseDesc> <CourseProdID>8</CourseProdID> <CourseLineOfBusinessID>2</CourseLineOfBusinessID> <CourseDays>5</CourseDays> <Cost>300</Cost> </Courses>Tip
Optionally, you can programmatically create the DataTable and add new columns and rows to it with the Add method of the Columns collection and the NewRow method, respectively.
After the DataSet has been populated, the DataTablesCollection can then be accessed through the Tables property of the DataSet, in Listing 7.5 it is used to assign the tables to object variables for easier access. Note that because the DataAdapters are separate objects the data that they use to populate the tables of the DataSet could come from heterogeneous sources. You also can set the PrimaryKey property of the DataTable to an array of DataColumn objects. This also has the effect of automatically creating a UniqueConstraint object and placing it in the ConstraintCollection exposed through the Constraints property.
In situations such as that shown in Listing 7.5 where you want to populate the schema of the DataSet directly from a database table, you can use the FillSchema method of a DataAdapter. This method accepts the DataSet to populate in addition to the SchemaType and optionally arguments such as the table mapping to populate. In addition to the column information, FillSchema retrieves both the primary key and unique constraints. For example, the following line of code will prepopulate the schema of the Courses table mapping used in Listing 7.5.
daSql.FillSchema(dsCourses, SchemaType.Mapping, "Courses")
Note that this method returns the DataTable object that was populated.
Tip
A more convenient way to retrieve the constraint information automatically is to set the MissingSchemaAction property of the DataSet equal to MissingSchemaAction.AddWithKey before invoking the Fill method.
Relating Data
To associate the rows in the Courses table with their parent rows in the Products table you can create a DataRelation object as shown in the listing. In this case the constructor of the DataRelation accepts the name of the relation in addition to the parent column and child column within their respective tables. The DataRelation is then added to the appropriate DataRelationCollection for each table and exposed through the ChildRelations and ParentRelations properties. For example, in this case the relation is added to the ChildRelations collection of dtProducts and the ParentRelations collection of dtCourses.
You'll also notice that the Nested property of the DataRelation is set to True. Setting this property changes the XSD schema produced by the DataSet from one that produces a union like view of the data with one that is hierarchical. For example, changing the Nested property from the defaulted value of False to True changes the resulting XML from the following:
<Offerings> <Courses> ... </Courses> <Courses> ... </Courses> <Products> ... </Products> <Products> ... </Products> </Offerings>
where all the Products elements follow all the Courses elements, to this:
<Offerings> <Products> <Courses> ... </Courses> <Courses> ... </Courses> </Products> <Products> <Courses> ... </Courses> </Products> </Offerings>
where all the Courses for a particular product are nested within the appropriate Products element. Of course, the underlying XSD also changes to reflect the new hierarchical nature of the DataSet and can be queried using the XmlSchema property.
Tip
Keep in mind that even though all the database columns are mapped by default to elements in the XML representation of the DataSet, you can override this behavior by altering the schema used to produce the XML. One technique for doing so is to change the ColumnMapping property of the DataColumn object to a value from the MappingType enumeration. For example, to change the ProductID column to an attribute you would use the syntax: dtProducts.Columns("ProductID"). ColumnMapping = MappingType.Attribute in Listing 7.5.
Regardless of which way the XML is nested for display, once the relationship is in place the code can navigate through the DataSet. In this example, each DataRow in the dtProducts table is iterated on using a For Each loop. The GetChildRows method of the DataRow object can then be called to return an array of DataRow objects, in this case CourseRows, found via the relationships. In turn, the rows in the array are iterated to print out various columns.
Using Constraints
Finally, the code in Listing 7.5 iterates through each Constraint object in the ConstraintCollection exposed by the Constraints property of the dtProducts DataTable. Because Constraint is the base class, the TypeOf statement is used to determine if the constraint is of type UniqueConstraint. If so, the first column of the constraint is printed to the console. In this case the act of creating the relation automatically puts a unique constraint on the ProductID column of the dtProducts DataTable. However, unless the primary key is explicitly created as in this example, the IsPrimaryKey property of the UniqueConstraint object will be defaulted to False.
An excerpt of the sample output for Listing 7.5 is shown here:
SQL Server has 3 Courses 1140:SQL Server 7.0 Intro 2072:System Administration for SQL Server 2000 2073:Implementing a Database Design on SQL Server 2000 Visual Basic has 2 Courses 1013:Mastering Visual Basic 1016:Enterprise Development with Visual Basic Visual Interdev has 1 Courses 1017:Mastering Web Site Development
Using Select
The final important concept that you should be familiar with when using the DataTable class is the capability to select rows with the Select method. This overloaded method returns an array of DataRow objects corresponding to the criteria provided. For example, in the case of the Courses DataTable shown in Listing 7.5 you can use the Select method to return only those rows for a particular product.
Dim drRows() As DataRow drRows = dtCourses.Select("ProductID = 1","CourseNum ASC") For Each rowProd in drRows StrCourseNum = rowProd("CourseNum")) Next
The second argument (exposed in an overloaded signature) also can include a sort criteria, in this case sorting by CourseNum in ascending order. Further, a third overloaded method allows you to select rows based on the values of the DataViewRowState enumeration, which includes CurrentRows, Deleted, ModifiedCurrent, ModifiedOriginal, New, None, OriginalRows, and Unchanged.
Note
Other techniques can also be used to return a subset of data from a DataSet or DataTable. These include the GetErrors method of the DataTable, which returns an array of rows that contains errors after a call to Update, and the GetChanges method of the DataSet which returns a new DataSet that contains only data that has been modified.
Although this discussion highlights the constituent pieces and mechanics of a DataSet and using the managed providers, it does not place them in the context of an actual application. The remainder of the chapter is devoted to using ADO.NET in a distributed application.
Using Events
Just as the Recordset object in classic ADO supports events, both the DataSet and DataTable classes support a variety of events that you can use to respond to changes in the structure and the data in the application. For example, the DataSet supports the MergeFailed and PropertyChanged events that fire when errors occur as two DataSet objects are merged with the Merge method and when any property of the DataSet changes respectively. The DataTable supports seven events, six of which are found in pairs as shown in Table 7.6.
Table 7.6 DataTable events. This table lists the events for the DataTable class.
Events |
Event Args |
Description |
ColumnChanging, ColumnChanged |
DataColumnChange EventArgs |
Fired immediately before and immediately after a column value is modified in the DataTable. |
RowChanging, RowChanged |
DataRowChange EventArgs |
Fired immediately before and immediately after a row has been successfully edited in a DataTable. |
RowDeleting, RowDeleted |
DataRowChange EventArgs |
Fired immediately before and immediately after a row has been successfully deleted in a DataTable. |
PropertyChanged |
PropertyChanged EventArgs |
Fired whenever a property of the DataTable has changed. |
These events can be used to add functionality to your applications such as creating a client-side audit trail by capturing the RowChanged event and inspecting the DataRowAction enumeration exposed through the Action property of the DataRowChangeEventArgs. For each row marked as Delete, Add, Change, and Commit, your application could log the fact to a file. For example, the RowChanged event of the dtCourses table created in Listing 7.5 can be handled by using the AddHandler statement shown here:
AddHandler dtCourses.RowChanged, AddressOf CoursesRowChanged
The CoursesRowChanged method is then implemented to handle the event, extract the Action and the primary key of the row from the event arguments, and log them to a log file using a private LogToFile method.
Private Sub CoursesRowChanged(ByVal sender As Object, ByVal e As DataRowChangeEventArgs) Dim strAction As String Select Case e.Action Case DataRowAction.Add strAction = "Action=Add" Case DataRowAction.Change strAction = "Action=Change" Case DataRowAction.Commit strAction = "Action=Commit" Case DataRowAction.Delete strAction = "Action=Delete" Case Else ' Do Nothing Return End select LogToFile(strAction & ", CourseID = " & e.Row.Item("CourseID") & _ " at " & DateTime.Now.ToShortTimeString() & " " & DateTime.Now.ToShortDateString()) End Sub