DataSet Code Examples
Now let's take a look at some code samples, from both the server side and the client side. First we'll address the server side. In Listing 1, you find some code for fetching data from the database with the help of a stored procedure.
Listing 1: Code for Filling a DataSet
' Set up the command for the stored procedure.
Dim aCommand As New SqlCommand _
    (SprocOrder_FetchWithLines, _GetClosedConnection)
aCommand.CommandType = _
    CommandType.StoredProcedure
aCommand.Parameters.Add _
    ("@id", SqlDbType.Int).Value = id

' Name the two resultsets.
Dim anAdapter As New SqlDataAdapter(aCommand)
anAdapter.TableMappings.Add("Table", "Orders")
anAdapter.TableMappings.Add _
    ("Table1", "OrderLines")

' Rename the columns that differ between the stored procedure and the DataSet.
anAdapter.TableMappings _
    (OrderTables.Orders).ColumnMappings.Add _
    ("Customer_Id", "CustomerId")
anAdapter.TableMappings _
    (OrderTables.OrderLines).ColumnMappings.Add _
    ("Orders_Id", "OrderId")
anAdapter.TableMappings _
    (OrderTables.OrderLines).ColumnMappings.Add _
    ("Product_Id", "ProductId")

anAdapter.Fill(dataSet)
Note the basic pattern shown in Listing 1. First a Command is set up. Then comes a DataAdapter, and, finally, Fill is called on the DataAdapter.
NOTE
The code in Listing 1 doesn't show how the DataSet is instantiated. This is because the code in Listing 1 is from a utility method that can be used for filling both typed and untyped DataSets. For that to work, the DataSet is instantiated as a DataSet or an OrderDataSet, for example, outside of the utility method and is sent as a parameter.
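To make the idea in this note concrete, here is a minimal sketch of what such a utility method could look like. The names FillSketch and FillDataSet are my own placeholders for illustration, not the actual names from the test code, and the mapping code from Listing 1 is left out.

```vb
Imports System.Data
Imports System.Data.SqlClient

Public Module FillSketch
    ' Hypothetical helper, not the article's actual utility method.
    ' The caller creates the DataSet (typed or untyped) and passes it in,
    ' so the helper never needs to know the concrete type.
    Public Sub FillDataSet(ByVal aCommand As SqlCommand, _
            ByVal target As DataSet)
        Dim anAdapter As New SqlDataAdapter(aCommand)
        ' Fill opens the connection if it is closed and closes it again afterward.
        anAdapter.Fill(target)
    End Sub
End Module
```

A caller could pass a New DataSet() for the untyped case or a New OrderDataSet() for the typed case. Both work because a typed DataSet inherits from DataSet.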
Quite a lot of the code in Listing 1 relates to mappings. First is some mapping code for giving the first DataTable the Orders name and then for giving the second DataTable the OrderLines name. For typed DataSets, this is important: Without this, you will end up with four DataTables in the DataSet instead of two. For untyped DataSets, this is important only for creating meaningful names for the DataTables.
The second mapping section is for changing some of the column names used in the stored procedure. Again, for the typed DataSet, this is important, but for the untyped DataSet, this is merely for convenience.
Now let's look at some code from the client side. To browse the information in the DataSet, we could use the code in Listing 2. Note that here I'm browsing a DataSet with two resultsets (or, rather, DataTables).
Listing 2: Code for Browsing a DataSet
Dim anOrderDS As DataSet = _
    _service.FetchOrderAndLines(_GetRandomId())

Dim anOrder As DataRow = _
    anOrderDS.Tables(OrderTables.Orders).Rows(0)

_id = DirectCast(anOrder(OrderColumns.Id), _
    Integer)
_customerId = DirectCast _
    (anOrder(OrderColumns.CustomerId), Integer)
_orderDate = DirectCast _
    (anOrder(OrderColumns.OrderDate), Date)

Dim anOrderLine As DataRow
For Each anOrderLine In anOrderDS.Tables _
    (OrderTables.OrderLines).Rows
    _productId = DirectCast(anOrderLine _
        (OrderLineColumns.ProductId), Integer)
    _priceForEach = CType(anOrderLine _
        (OrderLineColumns.PriceForEach), Decimal)
    _noOfItems = DirectCast(anOrderLine _
        (OrderLineColumns.NoOfItems), Integer)
    _comment = DirectCast(anOrderLine _
        (OrderLineColumns.Comment), String)
Next
NOTE
You might wonder about the idea of running a loop for the order lines and then just pushing the value of each column of each order line to a private variable, such as _productId. I do this so that the test runs end to end, all the way from the database to variables in the client. Therefore, I want to touch all columns in all rows of the data container.
Note in Listing 2 that I am referring to DataTables and DataColumns with enumerations. This is to make the code more readable than when magic integers are used and more efficient than when strings are used.
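The enumerations themselves aren't shown in the listings, so the following is only a sketch of how they could be declared and used. The member values are my assumptions. What matters is that each value mirrors the ordinal position of the table or column, because an enumeration member is converted to its Integer value when used as an index.

```vb
Imports System
Imports System.Data

' Hypothetical declarations; the values must match the ordinal positions.
Public Enum OrderTables
    Orders = 0
    OrderLines = 1
End Enum

Public Enum OrderColumns
    Id = 0
    CustomerId = 1
    OrderDate = 2
End Enum

Public Module EnumIndexSketch
    Public Sub Main()
        ' Build a small in-memory DataSet so the sketch needs no database.
        Dim ds As New DataSet()
        Dim orders As DataTable = ds.Tables.Add("Orders")
        orders.Columns.Add("Id", GetType(Integer))
        orders.Columns.Add("CustomerId", GetType(Integer))
        orders.Columns.Add("OrderDate", GetType(Date))
        orders.Rows.Add(New Object() {1, 42, #1/15/2003#})

        ' The enumeration members resolve to the Integer indexers.
        Dim anOrder As DataRow = _
            ds.Tables(OrderTables.Orders).Rows(0)
        Dim customerId As Integer = DirectCast _
            (anOrder(OrderColumns.CustomerId), Integer)
        Console.WriteLine(customerId)
    End Sub
End Module
```

The drawback of this design is that the enumerations have to be kept in sync with the order of the resultsets and columns by hand.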
Let's compare the browse code for an untyped DataSet (just shown) with similar code for a typed DataSet. The version for the typed DataSet is found in Listing 3.
Listing 3: Code for Browsing a Typed DataSet
Dim anOrderDs As OrderDs = _
    _service.FetchOrderAndLines(_GetRandomId())

Dim anOrder As OrderDs.OrdersRow = _
    anOrderDs.Orders(0)

_id = anOrder.Id
_customerId = anOrder.CustomerId
_orderDate = anOrder.OrderDate

Dim anOrderLine As OrderDs.OrderLinesRow
For Each anOrderLine In anOrderDs.OrderLines
    _productId = anOrderLine.ProductId
    _priceForEach = anOrderLine.PriceForEach
    _noOfItems = anOrderLine.NoOfItems
    _comment = anOrderLine.Comment
Next
The code in Listing 3 is clearer and much shorter than the "same" code in Listing 2. This is because the schema is created at compile time, so you don't have to describe it over and over again in your code. Instead of referring to, for example, the generic DataRow class in Listing 2, I'm programming against specific types. I also can skip all the casting and conversions because all columns are in the "correct" data type already. That's definitely a way of reducing code bloat.
DataSet Tests
Time to discuss the test results. As with all the other test cases, there is a service-layer class for each test case. The service-layer classes for the DataSet test cases are shown in Figure 3.
Figure 3 One example of a service-layer class.
The service-layer classes inherit, as usual, from MarshalByRefObject. They should be suitable as root classes when used via remoting.
NOTE
Note that the second method in the class for the typed DataSet returns OrderDs2. That typed DataSet class has only an OrderLines DataTable. Otherwise, I would have had to use a workaround to avoid getting a constraint error when fetching only OrderLines from the database.
You might think that it would be more appropriate to send just a DataTable instead of a complete DataSet in this case. I will discuss that further in Part 5 of this series.
Result of the Tests
In the first part of this article series, I gave you a sneak peek at the throughput test results of the untyped DataSet. Now it's time to show you the results for all test cases discussed so far.
Once again, I will use DataReader as a baseline. Therefore, I have recalculated all the values so that I get value 1 for DataReader; the rest of the data containers will have a value that is relative to the DataReader value, for easy comparison. The higher the value, the better.
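The recalculation is plain division, as the small sketch below shows. Note that the raw numbers in it are made up purely for illustration; they are not measurements from my tests, and the function name is a placeholder.

```vb
Imports System

Public Module RelativeThroughputSketch
    ' Divides a container's raw throughput by the DataReader's raw throughput.
    Public Function Relative(ByVal raw As Double, _
            ByVal baseline As Double) As Double
        Return raw / baseline
    End Function

    Public Sub Main()
        ' Invented raw values (say, requests per second), for illustration only.
        Dim dataReaderRaw As Double = 100.0
        Dim otherContainerRaw As Double = 50.0

        ' The baseline divided by itself is always 1.
        Console.WriteLine(Relative(dataReaderRaw, dataReaderRaw))
        ' A value below 1 means lower throughput than the DataReader.
        Console.WriteLine(Relative(otherContainerRaw, dataReaderRaw))
    End Sub
End Module
```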
Table 1: Results for the First Test Case: Reading One Row
| | 1 User, in AppDomain | 5 Users, in AppDomain | 1 User, Cross-Machines | 5 Users, Cross-Machines |
|---|---|---|---|---|
| DataReader | 1 | 1 | 1 | 1 |
| Untyped DataSet | 0.6 | 0.6 | 1.4 | 1.7 |
| Typed DataSet | 0.4 | 0.5 | 1 | 1.1 |
Table 2: Results for the Second Test Case: Reading Many Rows
| | 1 User, in AppDomain | 5 Users, in AppDomain | 1 User, Cross-Machines | 5 Users, Cross-Machines |
|---|---|---|---|---|
| DataReader | 1 | 1 | 1 | 1 |
| Untyped DataSet | 0.6 | 0.6 | 6.9 | 9.7 |
| Typed DataSet | 0.5 | 0.5 | 6 | 8.6 |
Table 3: Results for the Third Test Case: Reading One Master Row and Many Detail Rows
| | 1 User, in AppDomain | 5 Users, in AppDomain | 1 User, Cross-Machines | 5 Users, Cross-Machines |
|---|---|---|---|---|
| DataReader | 1 | 1 | 1 | 1 |
| Untyped DataSet | 0.5 | 0.5 | 6.1 | 8.5 |
| Typed DataSet | 0.4 | 0.4 | 5.1 | 6.9 |
As you might guess, the five-users test uses 100% of the CPU because I'm not using any think time. That goes for both the AppDomain test and the cross-machines test.
In the cross-machines test, I should switch to several client machines, but I haven't done that yet. Perhaps I will rerun the tests in Part 5. On the other hand, the server in the five-users, cross-machines test uses approximately 80% of the CPU, so that would be the bottleneck.
This reminds me that I need to mention the test equipment. Because my company is a small one (it's just me), I don't have a full-blown lab. Therefore, I have used three ordinary machines:
1.8GHz, 512MB RAM. This serves as everything except the database server in the AppDomain tests. It's the client for the cross-machines tests.
1.7GHz, 512MB RAM. This is the server for the cross-machines tests.
750MHz, 255MB RAM. This is the database server for all the tests.
As you learned earlier, both the untyped DataSet and the typed DataSet have more overhead than the DataReader in the AppDomain. On the other hand, they perform better than the DataReader in the cross-machines test, especially when several rows are fetched. This is just as expected. It's also expected that the typed DataSet carries more overhead than the untyped DataSet.
But some forthcoming results aren't what you might expect. I'll whet your appetite a bit by telling you that with custom classes for the third test (1 user, cross-machines), I get 16! (That is, it's 16 times more efficient to use custom classes than a DataReader for that specific test.) That is probably not what you expect from all the talk about how efficient DataSets are. The untyped DataSet performs almost three times as poorly as custom classes when serialized across machines because DataSets are serialized as XML, even with a binary formatter. Try the code snippet in Listing 4, and open the resulting file in Notepad to see for yourself.
Listing 4: Code for Serializing a DataSet to a File
Dim fs As IO.FileStream = _
    New IO.FileStream("c:\temp\ds.txt", IO.FileMode.Create)

Dim bf As New _
    System.Runtime.Serialization.Formatters.Binary.BinaryFormatter _
    (Nothing, New Runtime.Serialization.StreamingContext _
    (Runtime.Serialization.StreamingContextStates.Remoting))

bf.Serialize(fs, anOrderDS)
fs.Close()
NOTE
You can read more about serialization aspects of DataSets in Dino Esposito's article "Binary Serialization of ADO.NET Objects" and in his book Applied XML Programming for Microsoft .NET (Microsoft Press, 2002). There Dino also discusses some workarounds to this problem. I will discuss the test result involved when using a workaround in Part 5 of this series.
Highly Subjective Results
It's time to add some grades for untyped and typed DataSets to my list of "highly subjective results." In Table 4, you will find that I have assigned some grades according to the qualities discussed at the beginning of the article. A score of 5 is excellent, and a score of 1 is poor.
Table 4: Grades According to Qualities
| | Performance in AppDomain/Cross-Machines | Scalability in AppDomain/Cross-Machines | Productivity | Maintainability | Interoperability |
|---|---|---|---|---|---|
| DataReader | 5/1 | 4/1 | 2 | 1 | 1 |
| DataSet | 3/3 | 3/3 | 4 | 3 | 4 |
| Typed DataSet | 2/2 | 2/2 | 5 | 4 | 5 |
I'd like to say a few words about each quality grade next.
Performance
Unlike the DataReader, both types of DataSets are marshalled by value. Therefore, performance is okay cross-machines, too.
Scalability
As I said last time, in this specific test I think performance and scalability go hand in hand, as those qualities were defined for this series of articles. It's important to note that DataSets won't hold open connections against the database, so using them entails less risk of killing scalability from holding on to connections too long.
Productivity
DataSets are great for productivity because you get a lot of functionality built in, debugged, and ready to use. Productivity is especially good for typed DataSets because there is a lot of design-time support for them in Visual Studio .NET.
In my opinion, the DataSet is very much about rapid application development (RAD) and does a good job regarding that.
Maintainability
I believe that maintainability will be pretty good for both types of DataSets. It's especially good for the typed DataSet because you have a strong contract with the code accessing it. On the other hand, I really like the idea of keeping the behavior together with the data, as with classic object-oriented solutions, thereby making it possible to get a very high degree of encapsulation. DataSets are useful for a more data-centric or document-centric approach so that you can let the behavior act on the data in the DataSets. This works very well, of course, but, in my opinion, in many situations long-term maintainability suffers.
Also worth mentioning is the loosely coupled model that typed DataSets use. That is, with an event-based model, you can use a specific typed DataSet in many situations, using different rules for each situation. You put the rules in event procedures in other classes instead of within the typed DataSet itself.
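As a rough sketch of that event-based model, the class below keeps a rule outside the data container and hooks it up through the ColumnChanging event of a DataTable. The class name, the rule itself, and the use of a plain DataTable instead of a generated typed DataSet are my assumptions, chosen to keep the example self-contained.

```vb
Imports System
Imports System.Data

' Hypothetical rule class; the rule lives outside the data container.
Public Class OrderLineRules
    Public Sub Attach(ByVal orderLines As DataTable)
        AddHandler orderLines.ColumnChanging, _
            AddressOf OnColumnChanging
    End Sub

    ' Rejects a non-positive NoOfItems by throwing before the value is stored.
    Private Sub OnColumnChanging(ByVal sender As Object, _
            ByVal e As DataColumnChangeEventArgs)
        If e.Column.ColumnName = "NoOfItems" _
                AndAlso Not e.ProposedValue Is DBNull.Value Then
            If CInt(e.ProposedValue) <= 0 Then
                Throw New ArgumentException _
                    ("NoOfItems must be positive.")
            End If
        End If
    End Sub
End Class

Public Module RulesDemo
    Public Sub Main()
        Dim orderLines As New DataTable("OrderLines")
        orderLines.Columns.Add("NoOfItems", GetType(Integer))

        Dim rules As New OrderLineRules()
        rules.Attach(orderLines)

        Dim line As DataRow = orderLines.Rows.Add(New Object() {3})
        Try
            line("NoOfItems") = 0
        Catch ex As ArgumentException
            Console.WriteLine(ex.Message)
        End Try
    End Sub
End Module
```

A different situation could attach a different rule class to the same table, which is the point of the loose coupling.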
Interoperability
Finally, interoperability is pretty good for both types of DataSets. They serialize themselves to XML, and the DataSet also has a built-in WriteXml() method that can be used to get a format other than the diffgram format that you get from the ordinary serialization of DataSets.
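If you want to compare the formats yourself, a minimal sketch along the following lines should do. It builds a tiny in-memory DataSet (the names and the single row are made up) and writes it twice with WriteXml(), once as plain XML without schema and once as a diffgram.

```vb
Imports System
Imports System.Data
Imports System.IO

Public Module WriteXmlSketch
    Public Sub Main()
        ' A tiny in-memory DataSet, so no database is needed.
        Dim ds As New DataSet("OrderDs")
        Dim orders As DataTable = ds.Tables.Add("Orders")
        orders.Columns.Add("Id", GetType(Integer))
        orders.Rows.Add(New Object() {1})

        ' Plain XML: just the data, without an inline schema.
        Dim plain As New StringWriter()
        ds.WriteXml(plain, XmlWriteMode.IgnoreSchema)

        ' Diffgram: the format that also carries row state information.
        Dim diffgram As New StringWriter()
        ds.WriteXml(diffgram, XmlWriteMode.DiffGram)

        Console.WriteLine(plain.ToString())
        Console.WriteLine(diffgram.ToString())
    End Sub
End Module
```

XmlWriteMode.WriteSchema is the third option; it writes the data together with an inline XSD schema.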
I decided to give the typed DataSet a score of 5 instead of 4 for interoperability because the XSD means a stronger contract with the client. In my opinion, that is desirable when it comes to interoperability.