asp tutorials, asp.net tutorials, sample code, and Microsoft news from 15Seconds
Data Access  |   Troubleshooting  |   Security  |   Performance  |   ADSI  |   Upload  |   Email  |   Control Building  |   Component Building  |   Forms  |   XML  |   Web Services  |   ASP.NET  |   .NET Features  |   .NET 2.0  |   App Development  |   App Architecture  |   IIS  |   Wireless
 
Pioneering Active Server
 Power Search








Active News
15 Seconds Weekly Newsletter
• Complete Coverage
• Site Updates
• Upcoming Features

More Free Newsletters
Reference
News
Articles
Archive
Writers
Code Samples
Components
Tools
FAQ
Feedback
Books
Links
DL Archives
Community
Messageboard
List Servers
Mailing List
WebHosts
Consultants
Tech Jobs
15 Seconds
Home
Site Map
Press
Legal
Privacy Policy
internet.commerce














internet.com
IT
Developer
Internet News
Small Business
Personal Technology
International

Search internet.com
Advertise
Corporate Info
Newsletters
Tech Jobs
E-mail Offers

HardwareCentral
Compare products, prices, and stores at Hardware Central!

The DataSet Grows Up in ADO.NET 2.0 - Part 1
By Alex Homer
Rating: 3.5 out of 5
Rate this article


  • email this article to a colleague
  • suggest an article

    Table of Contents

    • Part 1 - Fundamental Enhancements to the Primary Data Store
      • Page 1
        • Loading a DataSet, and the new LoadOption enumeration.
        • Using stand-alone DataTable instances.
      • Page 2
        • Streaming data into and out of DataSet tables with a DataTableReader.
        • Performing batched updates from a DataSet.
        • Performance and scalability improvements.
    • Part 2 - XML and UDT Integration with SQL Server 2005
      • Page 1
        • Using the XML data type in a DataSet.
      • Page 2
        • How the DataSet supports user-defined types.
    • Download Sample Code

    Part 1 - Fundamental Enhancements to the Primary Data Store

     

    Note: the code described in this article is based on the Beta 1 release of the .NET Framework v 2.0, with notes as to the upcoming changes in the Beta 2 release. All the examples can be downloaded from our Web site at http://www.daveandal.net/articles/datasetv2/, and you can also run most of them online from the same location.

     

    Version 1.0 of the .NET Framework introduced a completely new approach to handling relational data, by separating out the two basics functions of all data access application requirements into separate classes that are optimized for each specific task. For applications that just need to access rowsets in a stream fashion, the DataReader classes are lightweight and give great performance in a connected scenario.

     

    On the other hand, the DataSet class is designed for applications that require disconnected access to data, or which need to persist data without maintaining an open connection to the database. The DataSet is rather like the Recordset object from classic ADO, but has several useful advantages:

     

    • The DataSet can store more than one table or rowset.
    • The DataSet can store the relationships between these rowsets.
    • The DataSet provides excellent integration with XML, through a range of methods and properties that control the format and serialization capabilities of the stored data when reading and writing XML.

     

    Although the DataSet class in version 1.x of the .NET Framework performs well, and can provide a suitable data store for all kinds of disconnected applications, Web Services, etc., there are some shortcomings. These have been addressed in version 2.0, where the DataSet grows up into a fully-fledged and extremely powerful data container that supports much better serialization, update capabilities and performance than in version 1.x.

     

    In this and a follow-up article, we look at the changes between the version 1.x and version 2.0 DataSet and their associated classes, showing you how you can take advantage of the new features to improve your applications' capabilities and performance. The topics we'll be covering in this part are:

     

    • Loading a DataSet, and the new LoadOption enumeration.
    • Using stand-alone DataTable instances.
    • Streaming data into and out of DataSet tables with a DataTableReader.
    • Performing batched updates from a DataSet.
    • Performance and scalability improvements.

     

    Then, in a follow-up article, we'll look in more detail at:

     

    • Manipulating the XML data type with a DataSet.
    • How the DataSet supports user-defined data types (UDTs).

    Loading a DataSet, and the New LoadOption Enumeration

    One of the issues developers came across when using version 1.x of the DataSet was the awkward way that you have to use the GetChanges and Merge methods of the DataSet to support updates where you want to allow user-reconciliation of update errors. You must keep a copy of the DataSet containing the user's updates, so that the current values (the values they entered) are not lost when the Fill method of the DataAdapter refreshes the rows after attempting an update (or if the stored procedures you use for the update return the values from the database after the update has been executed).

     

    In version 1.x, this could result in a six-step process, where you had to:

     

    • Fill the DataSet with the rows for display to the user
    • Allow the user to edit the rows
    • Call GetChanges on the DataSet to create a copy of the edited rows
    • Execute the Update method to push the changes into the database
    • Call Fill again to refresh the rows in the DataSet
    • Use the Merge method to get the user's original edited values into the current values of the DataSet for rows where the update failed.

     

     

    In version 2.0, Microsoft has addressed this issue by providing an enumeration called LoadOption, which allows you to control precisely how the values in existing rows in a DataSet are updated when incoming rows (from the Fill or the new Load methods) match the existing rows on the primary key - and hence effectively "replace" these existing rows. And they have also filled out the API for the DataSet, DataAdapter and DataView classes by adding a new property called AcceptChangesDuringUpdate, implementing methods that allow you (under certain circumstances) to change the RowState property value for rows in the DataSet tables, and adding a method to convert a DataView instance into a DataTable instance. We'll look at all of these topics in the following sections of this article.

    The LoadOption Enumeration

    The LoadOption enumeration is used to specify how the values in incoming rows replace (update) the values in existing rows in the DataSet tables. Remember that this only applies if the table has a primary key defined (so that incoming rows can be matched with existing rows), and where the incoming row actually does match (and hence replace or refresh) an existing row. The names of the enumeration members differ between the Beta 1 and Beta 2 releases, as shown in Table 1. The Beta 2 names should be the ones used in the release version of ADO.NET 2.0.

     

    Table 1 - The Values in the LoadOption Enumeration for Beta 1 and Beta 2

    Beta 1 Enumeration Member

    Beta 2 Enumeration Member

    Description

    PreserveCurrentValues

    PreserveChanges

    Synchronize the rows and keep the modifications to rows in the DataSet. Effectively, only the original values in the existing rows are replaced with incoming values.

    UpdateCurrentValues

    Upsert

    Aggregate the rows in the DataSet with the new rows. Effectively, only the current values in the existing rows are replaced with incoming values.

    OverwriteRow

    OverwriteChanges

    Synchronize the rows and abandon modifications to rows in the DataSet. Effectively, both the current and original values in the existing rows are replaced with incoming values.

     

    Table 2 shows the effects of each of the LoadOption values in more detail. You can see, for example, that the PreserveCurrentValues option only updates the original value in matching rows in the DataSet tables, while keeping the user's edited value as the current value. This is the ideal situation where you want to attempt to push the user's changes to the rows into the database, but allow them to reconcile any failed updates afterwards.

     

    Usually an update will fail because another user has performed an update on the row in the database while the first user is holding a disconnected copy of the row in their DataSet (the application should, of course, validate values they enter and prevent most of the other common update errors). The application will be able to display the value that is in the database now (the underlying value) and the proposed value that was entered by the user (the current value), so that they can make an informed decision about how to resolve the failed update.

     

    Table 2 - The Effects of the LoadOption Enumeration in More Detail

    RowState of Existing Row

    PreserveCurrentValues

    PreserveChanges

    UpdateCurrentValues

    Upsert

    OverwriteRow (default)

    OverwriteChanges

     

    Added

    Current = Existing

    Original = Incoming

    RowState = Modified

    Current = Incoming

    Original = Existing

    RowState = Added

    Current = Incoming

    Original = Incoming

    RowState = Unchanged

    Modified

    Current = Existing

    Original = Incoming

    RowState = Modified

    Current = Incoming

    Original = Existing

    RowState = Modified

    Current = Incoming

    Original = Incoming

    RowState = Unchanged

    Deleted

    Current = Existing

    Original = Incoming

    RowState = Deleted

    In Beta 1:

    Undo Delete

    Current = Incoming

    Original = Existing

    RowState = Modified

    See note below for Beta 2 behavior.

    Undo Delete

    Current = Incoming

    Original = Incoming

    RowState = Unchanged

    Unchanged

    Current = Incoming

    Original = Incoming

    RowState = Unchanged

    Current = Incoming

    Original = Existing

    if new value = existing:

      RowState = Unchanged

    else:

      RowState = Modified

    Current = Incoming

    Original = Incoming

    RowState = Unchanged

    No existing matching row in table

    Current = Incoming

    Original = Incoming

    RowState = Unchanged

    Current = Incoming

    Original = [n/a]

    RowState = Added

    Current = Incoming

    Original = Incoming

    RowState = Unchanged

     

    Using the LoadOption Enumeration Values

    The values in the LoadOption enumeration are used when populating a DataSet table with the Fill method or the new Load method. To use them in the Fill method, you set the FillLoadOption property of the DataAdapter that is performing the Fill:

     

    myDataAdapter.FillLoadOption = LoadOption.PreserveCurrentValues   // Beta 1

    myDataAdapter.FillLoadOption = LoadOption.PreserveChanges         // Beta 2

     

    The new Load method takes a LoadOption value as its second parameter, for example:

     

    myDataSet.Load(data-reader, LoadOption.PreserveCurrentValues, table-name array)  

     

    We look at the Load method in more detail later in this article. Meanwhile, to see the effects of the LoadOption enumeration values, the next example uses them to control how rows are updated when using the Fill method of the DataAdapter. Figure 1 shows the example page we provide that demonstrates some of the effects of the LoadOption enumeration. At the top are a set of option (radio) buttons that you can use to specify the LoadOption value to use, followed by a list of three rows from the Northwind database. The page shows the current and original values, and the value of the RowState property, for the three stages that each row goes through as the page is executed.

     

    Figure 1 - The Example Page that Demonstrates the Effects of the LoadOption Enumeration

     

    The three stages of processing in the example page are:

     

    • Fill a DataSet table with two rows from the database.
    • Edit these rows within the code by modifying the first one, deleting the second one, and adding a new row to the table.
    • Refresh the table by calling the Fill method again, using the selected value from the LoadOption enumeration in the FillLoadOption property of the DataAdapter.

     

    There is quite a lot of code in the page, because it has to keep copies of the DataSet at each stage to be able to display the values afterwards. It also has to create a primary key on the table in the DataSet before making copies, so that the Fill method will match incoming rows with the existing rows (and hence update them). All these techniques are the same as you would use in version 1.x, and so we haven’t listed all the code here. You can use the [view source] link at the foot of the page to see it, or download the examples and play with the code yourself.

     

    Examining the Effects of the LoadOption Enumeration Values

    To see the effects of the LoadOption enumeration values, you can select each one and view the results in the example page. The following three screenshots (Figures 2, 3 and 4) show these results. When the LoadOption value is PreserveCurrentValues (PreserveChanges in Beta 2), the current value of the customer name in the modified row is maintained and the RowState remains set to Modified (see Figure 2). For the deleted row, there are no current values and the Fill method does not re-instance them - the RowState remains set to Deleted.

     

    In the row that was added to the table, there is no change to the current values or the RowState because there is no matching incoming row (this row does not exist in the database). If another user had added a row to the database with the same primary key, however, the original values in this row would be set to the incoming values, and the row would be marked as Modified.

     

    The overall effect is to synchronize the original values in the rows with the values in the database and keep the modifications to rows in the DataSet.  

     

    Figure 2 - The Effects of the PreserveCurrentValues LoadOption value

     

    When the LoadOption is UpdateCurrentValues (Upsert in Beta 2), the modified customer name in the current value of first row is replaced by the incoming row value, and the RowState remains set to Modified (see Figure 3). For the deleted row, where there are no current values, the incoming row re-instates these with the values in the database. It also changes the RowState from Deleted to Modified, because the row exists in the database and so an UPDATE command is required for the next update, and not an INSERT command.

     

     

    The third row, which was added to the DataSet, remains as an Added row because no row in the database matches it. If another user had added a row with the same primary key, however, their values would replace the current value. But even if another user had added rows to the database that match any of the rows in the DataSet, these values would only replace the current values - while the original values would be retained.

     

    The overall effect, therefore, is to aggregate the rows in the DataSet with the new rows, without losing the original values.

     

    Figure 3 - The Effects of the UpdateCurrentValues LoadOption value

     

    Finally, when the LoadOption value is OverwriteRow (OverwriteChanges in Beta 2), the current and the original values in each matching row (the first and second rows) are replaced by the values in the incoming rows (see Figure 4). And, because these match the values in the database, the rows are marked as Unchanged because there is no need to push the values back into the database when the next update is attempted. The Update method will process them and update the database only if you edit the rows again.

     

    The only exception is the row that was added to the DataSet, which remains marked as an Added row ready to be inserted into the database when the next Update takes place. However, if another user had added a row to the database with the same primary key, their values would replace both the current and original values in the DataSet and the RowState would be set to Unchanged.

     

    The overall effect is to synchronize the rows, and abandon modifications to rows in the DataSet.

     

    Figure 4 - The Effects of the OverwriteRow LoadOption value

    Updating the RowState Property

    Notice how, in Table 2, the LoadOption values affect the setting of the RowState property for each row that an update is attempted on. A common problem in version 1.x of ADO.NET is that the RowState values are read-only, and cannot be changed. While this is fine for common update operations on the rows, it does make it hard to perform custom management of the data - perhaps when you want to manipulate the contents of the DataSet so that updates and/or inserts are attempted on rows that are currently marked as Unchanged (and so the Update method will ignore them).

     

    In version 2.0, you can change the values of the RowState property using two new methods that are added to the DataRow class. These are SetAdded and SetModified, and change the RowState property of the row to Added or Modified. However, they can only be called on rows that are currently marked at Unchanged, and not on rows that are marked as Added, Deleted or Modified.

     

    The DataAdapter.AcceptChangesDuringUpdate Property

    In version 1.x of the DataAdapter class, you can change the AcceptChangesDuringFill property from its default value of true to false, so that incoming row values only set the current values in the rows in the DataSet table, and are not propagated to the original values. In other words, the Update method does not call the AcceptChanges method on each of the rows as it fills them. This is useful when, for example, you want to keep existing values in the original values of the rows - perhaps to check for changes to the data since the DataSet was last filled.

     

    In version 2.0, Microsoft has added the AcceptChangesDuringUpdate property to the DataAdapter. This has the same effect as AcceptChangesDuringFill, except it affects the outcome when the Update method is called. Effectively, it prevents the AcceptChanges method from being executed on each row. It is useful where, for example, a SQL batch statement or stored procedure you are using to perform the updates returns values. These values will replace the current values of the row in the DataSet table, and by default are copied to the original values by the AcceptChanges method. By changing the value of this property from the default true to false, you prevent the values being copied into the original values of the rows.

    The DataView.ToTable Method

    In version 2.0 of ADO.NET, the DataView class gains a new method named ToTable. This allows you to convert an existing DataView instance into a DataTable¸ which might be useful if you have existing code that creates a DataView, or a class that exposes only a DataView and not the original DataTable. There are three overloads of the ToTable method that allow you to use the existing view and column names as the new table and column names, or specify different table and/or column names. You can find out more from the .NET Framework SDK - search for "DataView.ToTable".

    Stand-alone DataTable Instances

    Those developers who looked at the Technical Preview release of ADO.NET will have noticed a class named DbTable that appeared there - with no obvious links to existing classes or the natural hierarchy of the other ADO.NET objects. The idea was to allow developers to work with single tables or rowsets, without having to create a DataSet to manage them. DbTable was designed to be used standalone, but could also be stored in a DataSet.

     

    However, the whole approach was confusing and this class disappeared in Beta 1. Instead, the more obvious approach was taken, by surfacing the existing DataTable class with new methods and properties that allow it to be used in stand-alone mode. Effectively, it still lives in a DataSet but you no longer have to create and manipulate this DataSet. You just work directly with the individual DataTable instance.

     

    What this means is that common operations for which you would have required a DataSet in version 1.x are now supported against DataTable. These include the ReadXml, ReadXmlSchema, WriteXml, WriteXmlSchema, Clear, Clone, Copy, Merge, GetChanges methods. The DataSet is also auto-serializable, and so can be returned from a Web Service or via Remoting. And you can use the Fill and Update methods of the DataAdapter on a DataTable:

     

    myDataAdapter.Fill(myDataTable)

    myDataAdapter.Update(myDataTable)

     

    You can also fill an array of DataTable instances:

     

    myDataAdapter.Fill(myDataTableArray)

     

    There are also other overloads of the Fill method that allow you to fill subsets of rows in a DataTable.

    Using the Load Method with a DataSet and DataTable

    The stand-alone DataTable also supports the Load method, as mentioned in connection with the DataSet class earlier in this article. Here, we'll look at the Load method in more detail, both for the DataTable and the DataSet classes. The only difference between the two is that, when calling Load on DataSet, you must specify the tables to load. When calling it on DataTable, it loads the current DataTable instance.

     

    The syntax for the three overloads of the Load method for the DataSet class is shown in Table 3. You must always specify a DataReader containing the source data (it can, of course, return more than one rowset), and a LoadOption value. You must also specify the target tables as an array, either by name or as references to DataTable instances. And, finally, you can specifya handler for the FillError event that is raised if an error occurs during the Load process.  

     

    Table 3 - The Load Method Overloads for the DataSet Class

    Method overload

    Description

    Load(data-reader, load-option, table-name-array)

    Takes a reference to a class that implements the IDataReader interface, such as a SqlDataReader or OleDbDataReader, and loads the rows it exposes into the specified tables of the DataSet. The target tables are specified as a String array of table names (less than or equal to the number of rowsets available through the DataReader). The load-option parameter is a value from the LoadOptions enumeration that determines how the values of incoming rows will be used when existing matching rows are present in the DataSet tables.

    Load(data-reader, load-option,data-table-array)

    Loads data from a reader instance that implements the IDataReader interface into the tables in a DataSet. Works exactly like the previous overload except that the list of target tables is specified as an array of DataTable instances instead of the table names.

    Load(data-reader, load-option, fill-error-handler, data-table-array)

    Loads data from a reader instance that implements the IDataReader interface into the tables in a DataSet. Works exactly like the previous overload except that a reference to a delegate (event handler) can be specified. This event handler will be executed if there is an error while loading the data.

     

    For a DataTable, there are also three overloads of the Load method, as shown in Table 4. These also allow you to specify a LoadOption value, and (if required) a handler for the FillError event that is raised if an error occurs during the Load process. There is no need to specify the target table, as the data is loaded into the DataTable instance on which the method is called.

     

    Table 4 - The Load Method Overloads for the DataTable Class

    Method overload

    Description

    Load(data-reader)

    Takes a reference to a class that implements the IDataReader interface, such as a SqlDataReader or OleDbDataReader, and loads the rows it exposes into this DataTable.

    Load(data-reader, load-option)

    Loads data from a reader instance that implements the IDataReader interface into this DataTable. The load-option parameter is a value from the LoadOptions enumeration that determines how the values of incoming rows will be used when existing matching rows are present in the DataSet tables.

    Load(data-reader, load-option, fill-error-handler)

    Loads data from a reader instance that implements the IDataReader interface into this DataTable. Works exactly like the previous overload except that a reference to a delegate (event handler) can be specified. This event handler will be executed if there is an error while loading the data.

     

    Remember that the Load method (like the Fill method) will match incoming rows against any existing rows if a primary key is declared in the table. It must do this to maintain data integrity - it cannot insert a duplicate row that has the same primary key value as an existing row. However, if the table already contains rows and there is no primary key defined, the incoming rows are added to the table after the existing rows.

     

    If the schema of the table (i.e. the names and ordering of any existing columns) does not match the incoming rows, the schema may be adjusted. If there are columns in the incoming rows that are not in the existing table, any that are missing are appended to the end of the Columns collection of the table. This can cause Load errors if the new columns have constraints, for example if they cannot contain null values, and you have used PreserveCurrentValues (PreserveChanges in Beta 2) as the LoadOption for the process. The new columns in existing rows will contain null values (unless you have specified default values), and so an error will occur when the Load method tries to re-enable constraints for the row after loading the data.

    Using a DataReader to Load Tables in a DataSet

    As an example of using the Load method, the code listed below demonstrates how you can create a DataReader (in this case a SqlDataReader) that returns multiple rowsets using a SQL batch statement, and then push these rows into the tables in a DataSet. The code creates the DataReader, and then declares a String array containing two table names, Orders and OrderDetails. Then it creates a new empty DataSet and calls its Load method - passing in the DataReader and the array of table names.

     

    All the overloads of the Load method for a DataSet require a value from the LoadOption enumeration to be specified, and we use OverwriteRow (OverwriteChanges in Beta 2) here. In this case, because the DataSet is empty, the value we choose makes no difference - there are no existing rows that could match the incoming rows, and there is no primary key defined either. 

     

    // connection and query details

    String connect = "your-connection-string";

    const String SQL = "SELECT TOP 5 OrderID, OrderDate, ShipName, ShipCity, "

                     + "ShipCountry FROM [Orders];"

                     + "SELECT TOP 5 * FROM [Order Details];"

                     + "SELECT TOP 5 CustomerID, CompanyName, Address, City, "

                     + "Country FROM [Customers]";

     

    // create a connection, command and get a DataReader

    using (SqlConnection con = new SqlConnection(connect))

    {

      SqlCommand cmd = new SqlCommand(SQL, con);

      con.Open();

      reader = cmd.ExecuteReader(CommandBehavior.CloseConnection);

     

      // create an array of table names for first two incoming rowsets

      String[] tablenames = new String[] {"Orders", "Order Details"};

     

      // load the data from the DataReader

      // no existing data, but have to provide a value for LoadOptions

      // so use OverwriteRow - makes no difference in this case

      DataSet ds = new DataSet();

      ds.Load(reader, LoadOption.OverwriteRow, tablenames);

     

      // load another table directly using the Load method

      DataTable dt = new DataTable();

      dt.TableName = "Customers";

      dt.Load(reader);

      ds.Tables.Add(dt);

     

      // display the results

      ...

    }

     

    So far, however, we've only loaded two tables (Orders and OrderDetails). To demonstrate another way of using the Load method, the code then calls it on an individual DataTable. To achieve this, we have to first create a new DataTable. We assign the name Customers to this table then call its Load method to load the remaining rowset into it from the DataReader, and finally add it to the Tables collection of the DataSet. The remaining code (not shown here) displays the progress messages you see at the top of the page shown in Figure 5, and then assigns the tables to three GridView controls to display the contents:

     

    Figure 5 - Loading DataTables from a DataReader using the Load Method

    The BeginLoadData and EndLoadData Methods of the DataTable

    The DataTable class also exposes two more methods, named BeginLoadData and EndLoadData, which can be used to disable constraint checking and internal index maintenance in a DataTable while data is being loaded. This speeds up load times, and only raises any constraint violations (such as primary key duplication) when the EndLoadData method is called. If you intend to load a large number of rows, you should consider using these methods:

     

    try

    {

      myDataTable.BeginLoadData();

      myDataTable.Load(myDataReader);

      myDataTable.EndLoadData();

    }

    catch (Exception e)

    {

       // ... handle error ...

    }

    Using the FillErrorEventHandler to Catch Load Errors

    You can also create an error handler routine that will be executed if an error occurs during the loading of the data into the target table. This is basically the same process as handling the existing FillError event of the DataAdapter class in version 1.x of ADO.NET. However, as there is no DataAdapter involved in the Load process, the technique for connecting the event handler to the method involves passing a delegate reference into the Load method:

     

    myDataSet.Load(data-reader, load-option, new FillErrorHandler(myFillHandler), tables);

     

    Then you create the event handler - where, for example, you can access the table name, the values in the row that caused the error, and set the Continue property of the FillErrorEventArgs to specify if processing should halt or continue with the next row:

     

      private void MyFillHandler(Object sender, FillErrorEventArgs args)

      {

     

        // write message to a StringBuilder for display later

        builder.Append("Error while loading table " + args.DataTable.TableName);

        builder.Append("Values of the row being loaded are:<br />");

        foreach (Object o in args.Values)

        {

          builder.Append(" - " + o.ToString() + "<br />");

        }

     

        // tell Load method to continue with next row

        args.Continue = true;

      }

     

    Figure 6 shows an example that uses both the Load method by itself, and the sequence of BeginLoadData, Load and EndLoadData methods, to load two DataTable instances directly. It also contains a handler for the FillError event, which you can view from the source code, or by clicking the [view source] link that is at the foot of all the example pages on our Web site.

     

    Figure 6 - Using the BeginLoadData, Load and EndLoadData Methods