asp tutorials, asp.net tutorials, sample code, and Microsoft news from 15Seconds
Data Access  |   Troubleshooting  |   Security  |   Performance  |   ADSI  |   Upload  |   Email  |   Control Building  |   Component Building  |   Forms  |   XML  |   Web Services  |   ASP.NET  |   .NET Features  |   .NET 2.0  |   App Development  |   App Architecture  |   IIS  |   Wireless
 
Pioneering Active Server
 Power Search





Subscribe Now!
Free Newsletter
Reference
News
Articles
Archive
Writers
Code Samples
Components
Tools
FAQ
Feedback
Books
Links
DL Archives
Community
Messageboard
List Servers
Mailing List
WebHosts
Consultants
Tech Jobs
15 Seconds
Home
Site Map
Press
Legal
Privacy Policy
internet.commerce














internet.com
IT
Developer
Internet News
Small Business
Personal Technology

Search internet.com
Advertise
Corporate Info
Newsletters
Tech Jobs
E-mail Offers

HardwareCentral
Compare products, prices, and stores at Hardware Central!

-->
A Simpler Way of Getting .NET Objects out of ADO.NET
By Benjamin Hautefeuille
Rating: 4.2 out of 5
Rate this article


  • email this article to a colleague
  • suggest an article

    Introduction


    Microsoft .NET is rapidly becoming the platform of choice for implementing enterprise-class applications from reusable components. As an early Visual Studio .NET user, I have been able to appreciate Microsoft's consistent effort to provide a new stack of services and components in the .NET framework, built to support today's need for multi-tier programming environments and architected to seamlessly integrate with XML. The result is a robust platform for building scalable applications.

    Although ADO.NET is a major evolution of ADO that now provides native access to XML and support for in-memory database, it does not natively support access to business objects. However, I discovered that the ADO.NET data provider from Matisse Software is capable of natively manipulating objects.

    This article describes my experience in migrating a life science application from a relational design to Matisse - A Post-Relational SQL database - that provides an innovative ADO.NET data provider.

    The article also introduces the Matisse database (http://www.matisse.com/) and its native ADO.NET data provider. It compares two situations: developing a complex application, first using ADO.NET with a relational database, and then using Matisse. Furthermore, the article discusses the limits of the current ADO.NET feature set and presents a very viable alternative with Matisse.

    Modeling a Life Science Application

    Most life science projects can be characterized by a rich and complex data model as illustrated in Figure 1, and also often by data intensive processing. For the past few months, I have been involved in such a project, facing the common problem of seamlessly mapping the object representation in .NET to the underlying relational schema.

    Figure 1 - UML Class Diagram for a Post-Genomic Type of Application

    For the purpose of this article, I will not use my complex life science data model. Instead I will use a simpler one representative of a network of information. Consider the case of a project management application in which Projects, Tasks and Team Members are naturally tied together. While the object model naturally captures the real-world relationships and concept commonalities, the UML class diagram (see Figure 1) illustrates that indeed a simple real-world business process can end up being described by a somehow complex network of entities.

    This application is clearly representative of the increasing modeling complexity that architects and developers are facing when developing new enterprise-class applications and services, which serve today's rapidly changing business needs.

    The Microsoft .NET platform provides architects and developers with a powerful environment to implement enterprise-class solutions. Visual Studio .NET, in particular, facilitates the integration of reusable components using object-oriented technology.

    Figure 2 - UML Class Diagram for a Project Management Application

    While Microsoft has enhanced many of the existing components for the .NET platform, it would have been very beneficial to application design, as I will demonstrate in this article, that ADO.NET also provides a means to natively manipulate .NET objects.

    In the case of our life science application, this situation significantly increases complexity, forcing our .NET developer team to write a great deal of code to serialize objects into rows and to map back rows into networks of objects.

    I will describe below the development process I went through and highlight the ADO.NET limitations that led me to seek an alternative solution.

    Usual Implementation with a Relational Database

    The application class diagram in Figure 2 is representative of the type of object model that database architects have to map into a valid relational schema. It may not seem trivial for an Object developer to transform class inheritance and many-to-many relationships into tables, but for a skilled database architect, all that is required is applying a basic set of rules that have been proven to do the job.

    As the diagram in Figure 3 illustrates, a common way to deal with class inheritance consists of flattening the class hierarchy into a single table where class typing is preserved into an additional column. In this example, the a ProjectMember and a ProjectManager classes are flattened into a ProjectMembers table, which groups all properties defined in the class hierarchy and contains an additional a EmpType column.

    Figure 3 - Relational Diagram for a Project Management Application

    Modeling many-to-many relationships for relational databases requires the database architect to create a two-column table. The a Predecessors/Successors, a Members/WorksOn and a Assignee/AssignedTo associations defined in the UML diagram result in the addition of three intermediate tables, respectively PredecessorsSuccessors, AssignedProjects and AssignedTasks, in the relational schema.

    The a Manages/ManagedBy and a Tasks/Project one-to-many associations are mapped to foreign key columns and constraints in both the Project and Task tables.

    It did not take me long to turn the UML class diagram into its relational equivalent. But I must admit that after transforming the class diagram through the normalization process, the database schema of our life science application translated into a thicket of tables. In the end, my neatly designed .NET object model lost a great deal of its clarity and simplicity.

    Even though any model can ultimately be turned into a relational schema, the task can become quite daunting, especially when the business model is fairly complex. It then becomes very difficult and time consuming to program, verify, debug and maintain the relationships between the numerous tables. Performance of the queries is negatively affected as well.

    Once the database schema is created to support the application, it becomes time to work on the best strategy to move data back and forth between the application and the database, using ADO.NET.

    ADO.NET: Data Access Services for .NET

    If you already know ADO, you will have a head start on learning ADO.NET. The main difference with ADO is the re-factoring of the ADO Recordset object into three explicit ADO.NET objects: the DataReader, which provides fast, forward-only, read-only access to query results; the DataSet, which provides an in-memory relational representation of data; and the DataAdapter, which provides a bridge between the DataSet and the data source. This is actually a major breakthrough in terms of architecture.

    However, since ADO.NET objects return result-sets in a tabular format of atomic values regardless of whether DataReader or DataSet are used, rows need to be mapped to a set of objects to truly realize the benefits from the .NET object model and programming languages.

    Consequently, our .NET developer team needed to develop an Object-Relational mapping layer. There are many ways to do this; for instance, one can use an Object-Relational mapping tool to generate the source code, develop a persistent object layer, or embed SQL statements in the business layer and hard-code the mapping. Figure 4 shows the source code that needs to be executed each time data is brought back from the database into the application and thus converted into objects.

    Figure 4 - An O-R Mapping Layer Necessary to Recreate a Network of Objects From Rows.

    Object navigation commonly used in .NET applications requires extensive use of multi-table joins. However, joins are computationally intense and each join is computed at runtime to link information together. In this application example, reconstructing a Project object with its associated Tasks and Project Members requires to run three SQL queries and to execute a bunch of code.

    This object-relational mapping solution does not scale well enough to meet the business requirements of my life science application. A simple scalability benchmark demonstrates that the time to recreate a Project object with its associated Tasks and Project Members grows significantly as the number of Project and Tasks and Project Members increases.

    I, therefore, decided to use the newly created DataSet object. The DataSet is an in-memory database cache for use in disconnected operations; it contains a complete collection of tables, relationships, and constraints as shown in Figure 5. Within the DataSet you can define relations to navigate from a table populated from the a Projects database table to a related table populated from the a Tasks database table, etc. This recreates your database schema in-memory and keeps a limited amount of data at hand, thus avoiding multiple and costly accesses to the database. The relational capabilities of the DataSet provide an advantage over the Recordset, which is limited to exposing the results from multiple tables either as a single joined result, or by returning multiple distinct result sets, requiring the .NET developer to handle and relate the results manually. In conclusion, the DataSet provides much greater flexibility when dealing with related result sets and navigation for that matter.

    Figure 5 - DataSet, Just Another Syntax to Build an O-R Mapping Layer.

    While the use of DataSet improves the performance and scalability of the database updates, it does not provide any benefit compared to the DataReader solution for accessing the data. Since the .NET objects manipulated by the application are already caching the data, the DataSet caching feature does not add much value.

    According to the ADO.NET documentation, a DataReader is simply a stateless stream that allows you to read data as it arrives, and then drop it without storing it to a DataSet for more navigation. Therefore, the stream approach is faster with less overhead.

    Since ADO.NET combined with a relational database did little to meet the intrinsic requirement of scalability for my type of application, I became interested in alternative solutions. Consequently, I decided to look for database technologies that would scale on the .NET platform.

    Matisse: A Post-Relational SQL Database Solution

    When I first heard about the Matisse database management system, documented in Matt Culbreth's 15Seconds article, "Using Object Databases in .NET", I was excited to learn that it supports the .NET platform, provides a native ADO.NET data provider, and combines support for objects and SQL together.

    Matisse Software also claims support for scalable, enterprise capabilities implemented in relational databases. Like relational databases, Matisse supports the SQL-99 syntax, but the Matisse data model is also truly object-based. While Matisse SQL queries are relational in their syntax, they take advantage of the object paradigm by supporting inheritance, polymorphism and true navigation.

    For database architects who have been using relational products for years, it may require some effort to get adjusted to the Matisse terminology. It was, however, easy to discover that Classes, Objects and Attributes can be mapped into familiar relational concepts such as Tables, Rows and Columns, respectively. But when it comes to mapping primary keys, foreign keys, intermediate tables and joins to their Matisse counterpart, an open mind and a fair amount of documentation reading are required.

    In any event, my time was well spent. I later discovered that mastering these concepts had a significant impact on the database schema and consequently on the overall performance of the system. Figure 6 summarizes the corresponding terminologies between the relational technology and Matisse's. It certainly shows that they are not that far apart.

    Figure 6 - Mapping Between Relational and Object Terminology

    Defining the database schema for Matisse and loading the schema into the database turned out to be an easy task. Indeed, Matisse provides a plug-in for Rose Modeler from Rational that allows database architects to import and export a UML schema from and to the database. Later, I saw that database schemas could be defined in plain SQL DDL, which I used extensively to create and drop indexes and to alter classes of the application. However, unlike relational databases, Matisse preserves the original UML data model.

    Once the database schema was loaded, I used the Matisse mt_stbgen utility to generate the C# class files corresponding to the classes defined in the database schema. At this point, the C# classes mapped directly to the database classes and therefore eliminated the need for an Object-Relational mapping layer, as shown in Figure 7. Note that if you are a VB programmer, the Matisse utility can generate VB.NET source code as well.

    Figure 7 - No Mapping, Direct Object Manipulation.

    One could question why I still needed to use ADO.NET. My experience has been that to build scalable applications, data-intensive processing needs to take place where the data lands, on the server and not on the client, further reducing resource contention as well as network traffic and taking advantage of the faster processing speeds of most server architectures. So, in my opinion, SQL is the only scalable means to filter objects based on a complex set of predicates or to compute values that demand visiting large networks of objects.

    Matisse ADO.NET: Object Access Services for .NET

    As discussed earlier in this article, ADO.NET is primarily designed to access relational data sources. The DataSet and DataReader objects support only a two-dimension table format. Hence, you might be wondering, as I was, how one can get objects out of ADO.NET.

    Based on the ADO.NET architecture, the answer is that you can't unless you are using Matisse. Matisse has added a simple but very powerful feature to the standard ADO.NET interface: the GetObject() method. In addition to the implementation of the DataReader standard interface, Matisse has added the GetObject() method. GetObject() returns objects instead of rows (see Figure 8). This feature alone eliminates the need for an O-R mapping layer. Roughly, 25% of my application code was eliminated with that single function.

    Figure 8 - Matisse ADO.NET Can Return Objects, Not Just Tables.

    Once you get objects out, you can navigate from one object to another using standard .NET programming. Unlike in the relational solution shown in Figure 4, there is no need to rebuild objects from rows and to link them before you can navigate through a network of objects. Figure 9 illustrates the simplicity to navigate from one object to another. When you hold on a Project, access to the project manager and to the project members is pure relationship navigation.

    Figure 9 - Data Navigation Made Simple.

    While the Matisse ADO.NET data provider can return objects, it can also return data in table format. Objects are flattened into rows to integrate seamlessly with third party database tools handling only relational format. Other cases include returning tables resulting from a set of computed values (see Figure 10).

    Figure 10 - Matisse Also Supports Two-Dimensional Tables.

    Simplicity

    There is no doubt that I simplified the application code while preserving the original .NET object model. Once the database schema was loaded straight from the UML class diagram and the corresponding C# classes were generated, porting tasks mainly consisted of removing source code from the existing Life science application.

    Moreover, I found that Matisse SQL queries were much simpler than their relational counterparts, even if I had to get used to a different thought process. Database developers know relational algebra, but they are not used to Matisse extensions to the relational algebra. Inevitably, there is a small learning curve to understand the relationship between join and navigation, which consists of defining join expressions with OID as primary key and relationship as foreign key. But, once again, the few hours I invested in learning Matisse vastly paid off in terms of simplicity and performance.

    Below are queries written for a relational database (Figure 11) and for Matisse (Figure 12), which clearly illustrate the difference, in terms of simplicity of the code, between the two approaches.

    Figure 11 - A Relational Query Can Be Complex to Figure Out.

    Figure 12 - The Matisse SQL Query is Compact and Intuitive.

    The queries in Figure 11 and 12 list managers currently managing projects and those that are also team members of their projects, but it is much easier and quicker for developers that are shown both approaches to figure out what the Matisse SQL query does compared to the relational one.

    Performance Improvement

    One of the other reasons why I migrated the life science application to Matisse was that I wanted to improve the overall scalability of the application.

    The benchmark I ran was quite simple, yet representative of the application's behavior. It consisted of selecting projects with their team members. It consisted of basically running a SQL query with one Equi-Join. While the execution time increased proportionally with the volume of data in the relational database, the execution time only slightly increased for Matisse. The results were quite impressive with Matisse. They ranged from 10 times to 30 times faster for a database with about one million rows. Further comparative testing showed that the relational product was very sensitive to the number of joins executed in a single query. Furthermore, with queries executing two joins or more, the performance was severely impacted whereas with Matisse, the change in performance impact was negligible.

    Conclusion

    Next generation database management systems, such as Matisse, have their place in the Microsoft .NET environment. While object techniques are now ubiquitous in application design and programming, architects and .NET developers until now still needed to cope with inefficient means to map reusable business components into relational database tables.

    Although I was initially skeptical about the business value of non-relational databases, I found that Matisse had a very practical approach to supporting database modeling with object techniques. The blending of SQL, tables and object modeling classifies Matisse in the Post-Relational databases category. Nonetheless, the quasi-relational Matisse model proved to be somewhat of a mental shift mostly because of my strongly rooted habits.

    I was quite impressed with the Matisse innovation with respect to ADO.NET. It demonstrates that ADO.NET can evolve beyond the Microsoft architecture to benefit .NET developers. One method in particular - GetObject() - added by Matisse to the ADO.NET DataReader object transformed the architecture of my application and made it easier to maintain and more scalable.

    I also discovered that ADO.NET is a natural evolution of the data access architecture of ADO, built to support n-tier programming environments and architected to seamlessly integrate with XML. With the new DataSet object, ADO.NET provides developers with first-class support for disconnected operations, greatly improving the scalability of n-tier web-based applications and services. While ADO.NET is not a revolution in the data access services model, it is definitely a balanced evolution, which provides most of the ingredients to support today's rapidly changing business needs.

    In conclusion, Matisse represents a pragmatic evolution of database technology, which seamlessly integrates data access services in the object framework of the .NET platform.

    About the Author

    Benjamin Hautefeuille is a Project Manager with Victoire, Inc in San Mateo, California. Victoire builds custom Web Applications and manages IT infrastructure for companies in the financial, travel, life science and wireless industries. In his prior position, he worked as a consultant and head of R&D with Victoire in Europe.

    Ben can be reached at ben@victoire.com.

  • Rate This Article
    Not HelpfulMost Helpful
    1 2 3 4 5
    Supporting Products/Tools
    Proposion N2N
    Proposion N2N connects Microsoft .NET applications to Lotus Notes and Lotus Domino databases. This ADO.NET managed data provider allows you to perform blindingly fast queries and updates of Notes data from ASP.NET pages, .NET web services, Windows, or Mobile applications. An innovative SQL-like query language leverages the unique features of Notes and makes collaborative software accessible to relational database programmers.
    [Top]
    Other Articles
    Sep 15, 2005 - Building an Image Keyword System
    Unlike text-based file formats image files aren't made up of words, which makes searching for an image file by keyword difficult. Instead of being able to simply open the file to see what it contains, we're stuck looking at the text around it and other metadata to determine the image's meaning. In this article, Ziran Sun shows you how to build a simple database-based image keyword system that allows you to associate keywords with images and use these keywords to make finding images easier.
    [Read This Article]  [Top]
    Apr 7, 2005 - A Step-by-Step Guide To Using MySQL with ASP.NET - Part 2
    In the second part of of his article on using MySQL with ASP.NET, Ziran Sun covers how to add a new MySQL user to the database server, assign the user the appropriate permissions, connect to the database, and build a simple ASP.NET page to perform a query.
    [Read This Article]  [Top]
    Feb 10, 2005 - A Step-by-Step Guide To Using MySQL with ASP.NET - Part 1
    Back in the days of classic ASP, if you were building a database-driven web site, your choice was either to invest a lot of money to get a copy of Microsoft SQL Server (or some other enterprise-ready database) or invest a lot of time finding a way to deal with the performance and scalability limitations of Microsoft Access. Luckily these days there's another viable alternative: MySQL.
    [Read This Article]  [Top]
    Jan 27, 2005 - Moving a Database from SQL Server 7.0 to SQL Server 2000
    Moving or copying a SQL Server database from one machine to another requires a lot of preparation in order to ensure a smooth transfer. In this article, Dina Fleet Berry examines the different methods and highlights the different issues associated with each of them.
    [Read This Article]  [Top]
    Jan 6, 2005 - Debugging a SQL Stored Procedure from inside SQL Server 2000 Query Analyzer
    There are many times when using SQL Server 2000 Query Analyzer to debug SQL statements is a better choice than debugging in Visual Studio .NET. In this article, Dina Fleet Berry explains why and walks you through the debugging process step-by step.
    [Read This Article]  [Top]
    Nov 24, 2004 - Persisting .NET Objects to SQL Server Using SQLXML and Serialization
    As a follow up to his article on retrieving objects from SQL Server using SQLXML and serialization, Gianluca Nuzzo discusses saving objects back to SQL Server using a schema definition file and updategrams.
    [Read This Article]  [Top]
    Sep 14, 2004 - Transaction Processing in ADO.NET 2.0
    One area that stands out when comparing ADO.NET 1.x to ADO.NET 2.0 is transaction processing. Bill Ryan shows just how easy transaction processing has become with the TransactionScope object in ADO.NET 2.0.
    [Read This Article]  [Top]
    Sep 8, 2004 - Custom Object Data Binding with .NET
    Developers often use brute force coding to marshal data between the GUI and application objects. In this article, Luther Stanton explains how to use .NET's out-of-the box data-binding functionality to make this job much easier.
    [Read This Article]  [Top]
    Sep 2, 2004 - Queue MSMQ Messages from SQL Server
    Learn how to create a console application to queue a message in Microsoft Message Queuing (MSMQ) and then use an extended stored procedure to call the console application from a SQL Server trigger.
    [Read This Article]  [Top]
    Aug 30, 2004 - Tuning Up ADO.NET Connection Pooling in ASP.NET Applications
    Connection pooling increases the performance of Web applications by reusing active database connections instead of creating a new connection with every request. This article shows how to monitor the connection pool, diagnose a potential problem, and apply the appropriate fix.
    [Read This Article]  [Top]
    Mailing List
    Want to receive email when the next article is published? Just Click Here to sign up.

    Support the Active Server Industry