Background on the DataSet
When I worked with ADO and VB6, I often used disconnected ADO Recordsets. They were nice, especially regarding marshaling, because they used custom marshaling and could "travel" between processes. But there were caveats, of course. One was that you needed to send several resultsets, for example, when you had used several SELECT statements within one stored procedure; you couldn't disconnect a Recordset with several resultsets. The solution I most often used was to move the resultsets from the first Recordset into an array of Recordsets. It was kind of a hack, but it was useful.
Another important occurrence in the dark ages before .NET was that Microsoft showed the in-memory database (IMDB) in the beta versions of COM+ 1.0. The idea was to have an in-memory cache of data to work with so that the ordinary database server wasn't touched as often and, consequently, performance and scalability would increase. For different reasons, IMDB was withdrawn before the final release of COM+ 1.0.
Rumors say that IMDB was withdrawn for one or more of the following reasons:
In tests, IMDB-based solutions gave worse performance than when SQL Server was used as a cache server.
Beta testers were disappointed when they found that the IMDB didn't understand SQL but instead had a more ISAM-ish programming model.
According to Microsoft, beta testers showed very little interest.
With the release of ADO.NET, we got the DataSet class, which addresses both the need for a disconnected Recordset with several resultsets and the need for an in-memory cache. In Figure 1, you can see the different classes from which the DataSet is aggregated.
Figure 1 Class model of the DataSet.
A DataSet can have one or more DataTables. (A DataTable can be thought of as a resultset.) Each DataTable can have DataRows, DataColumns, and Constraints. The DataSet also can have DataRelations between the different DataTables. With this model, you can build a complete representation of a database, but in memory.
When I discussed the DataReader, I said that concrete DataReadersfor example, SqlDataReader and OleDbDataReaderare called. That is not the case with the DataSet; it is independent of providers and data sources. You can create and fill a DataSet completely with simple code if you want to, without touching a database at all.
As a matter of fact, there is much to say about the DataSetprobably enough for complete books. Before moving to the next section, I'd like to summarize some of the other key built-in features of the DataSet:
Support for sorting, filtering, and searchingThe DataSet comes out of the package with a wealth of built-in functionality that will boost your productivity. It will take you some time to get used to all the features of the DataSet, but it's not that hard. Not everything works exactly the way you want it to, of course, but not have everything exactly your way that's the price you pay by reusing built-in functionality.
Control of what commands to useIf you worked with disconnected Recordsets in ADO, you probably remember that you had little control over the UpdateBatch() method. With the DataSet (or, rather, the DataAdapter), you can decide what command to use for updates, inserts, deletes, and selects. That's a huge improvement.
Support for concurrency controlThe DataSet has built-in support for optimistic concurrency control, too. Because the DataSet is about working with disconnected data, it automatically adds to the WHERE clause for DELETE and UPDATE so that no conflicts affect the specific row(s) between SELECT and UPDATE/DELETE. This default model often works well enough.
Support for events when the DataSet is affectedThanks to a wealth of events that are fired when the DataSet is affected somehow, you have good control over the DataSet.
XML integrationOne of the benefits of the strong XML integration with the DataSet is that you can work with the data in a DataSet both relationally as tables and hierarchically as XML.