asp tutorials, asp.net tutorials, sample code, and Microsoft news from 15Seconds
Data Access  |   Troubleshooting  |   Security  |   Performance  |   ADSI  |   Upload  |   Email  |   Control Building  |   Component Building  |   Forms  |   XML  |   Web Services  |   ASP.NET  |   .NET Features  |   .NET 2.0  |   App Development  |   App Architecture  |   IIS  |   Wireless
 
Pioneering Active Server
 Power Search








Active News
15 Seconds Weekly Newsletter
• Complete Coverage
• Site Updates
• Upcoming Features

More Free Newsletters
Reference
News
Articles
Archive
Writers
Code Samples
Components
Tools
FAQ
Feedback
Books
Links
DL Archives
Community
Messageboard
List Servers
Mailing List
WebHosts
Consultants
Tech Jobs
15 Seconds
Home
Site Map
Press
Legal
Privacy Policy
internet.commerce














internet.com
IT
Developer
Internet News
Small Business
Personal Technology
International

Search internet.com
Advertise
Corporate Info
Newsletters
Tech Jobs
E-mail Offers

HardwareCentral
Compare products, prices, and stores at Hardware Central!

Building A Class Library Using the Calitha BNF Engine
By Dina Fleet Berry
Rating: 5 out of 5
Rate this article


  • email this article to a colleague
  • suggest an article

  • download source code
  • Introduction

    As users of search engines, we have to learn new query syntax with every new search engine or database. It would be nice to be able to search our own databases with Google-style syntax so that non-SQL programmers could search the data. This article will build a .NET class library that converts simple Google-style syntax into Transact-SQL syntax for full-text searches. Our full-text search code uses the CONTAINS portion of the WHERE clause to search columns.

    The IT instructions for installing and configuring a full-text search on a table is covered in a different article found here: Perform SQL Queries Using Google-Style Syntax. This article will focus on the GoldParser/Calitha usage and .NET C# programming required. The class library built in this article is called GoogleToSQL.dll.

    In order to demonstrate the GoldParser/Calitha parsing, the GoogleToSQL.dll will take a simple Google-style query and transform it into the Transact SQL contains statement. A web site calls the GoogleToSQL class library, finish creating the SQL statement and run the statement against a database in order to find rows that match that full-text query.

    How to Convert From One Language to Another

    Converting from one programming language to another can be a difficult task. If both languages can be described by Backus-Naur form (BNF), a conversion is much easier. GoldParser (www.goldparser.com) is a free program that can help with that conversion.

    The BNF Grammar File

    The first step is to write the BNF that defines the Google-style search. Using GoldParser's Parse Builder, you can write the BNF contained in the grammar file. I used the grammar as described on the Google web site (http://code.google.com/apis/base/query-lang-spec.html) to write the BNF file, simplifying it for this article. However, if you want a strict Google Syntax, the grammar provided is a good starting point. Since GoldParser is a compiler, some of the syntax and keywords of the grammar file are specific to GoldParser and not Google or BNF per se. GoldParser has documentation on the web site to help understand the GoldParser syntax.


    Figure: Google BNF displayed in Gold Parser Builder

    Parts of GoldParser's BNF Grammar File

    The top of the GoogleToSQL.grm file has information about author and comments. The next section tells the compiler where to start and what to start with. For this example, the starting symbol is called '<Query>'. '<Query>' is not a keyword; its specific name is meaningful to the human reader of the file. The next section is the Delimiters section. For this example, it is called {GoogleValid} which used the GoldParser keyword '{All Valid}'. GoogleValid also removes several characters from that valid set such as the pipe '|' character which are not valid characters to Google. The next section is the smallest unit to parse that is made up of {GoogleValid} characters, called a 'Term'. Examples of a term are a word like 'hawaii' or a numeric value such as '1421'.

    The next section starting with <Query> is the definition of the BNF tree that will be followed. The purpose of the BNF tree is to parse text and determine its linguistic value. Google has symbols that need to be included in the BNF such a negation (-), wildcards (*), and Boolean values (true or false). These symbols are also part of the BNF because the parser needs to know that these symbols are different than a term.

    Compile the Grammar File

    Once you have the file, you can compile the file with the GoldParser and test out different search queries to understand the parsing of the query. Compile the file by clicking on the 'Continue' button at the bottom right of the GoldParser's Parse Builder window. The compilation is in 4 steps so you will need to keep clicking the button until it says to save the file. When you save the file, you have successfully compiled the grammar. If there is an error in the grammar, you will be shown the error and can fix it.

    Testing the Grammar

    Let's step through a test query to understand the BNF better. While in the Gold Parser Builder with the grammar file open and compiled, open a test window.

    As an example of a query, let's use 'vacation hawaii'. This query is made up of two 'terms'. It has no other symbols or delimiters for the BNF to figure out. Enter the test query in the top window and click on start button on the bottom right. Either step through the parsing with the 'Step' button, or parse the entire query with the 'Parse All' button. Notice how the Parse Actions window fills up with different items as the Google BNF is tested against the query.

    The test window for parse actions looks like the following figure when all the parsing is done. Three token types denoted by 'Token Read' actions (in black) are found. Two token types are the terms 'vacation' and 'hawaii' and the third is the 'eof' symbol meaning the end of the file or query was found. The other rows in the lower window show how the BNF grammar was used. Shifting is the process of eliminating a non-essential character and reducing is parsing the actual token to determine it's parts.

    The parse tree also shows this information but in a different form that is more related to the BNF provided in the grammar file. The words in the '<>' sections are not GoldParser keywords but are terms I've defined.

    Now that you have a compiled grammar file (A raw grammar file (GoogleToSQL.grm) and a compiled grammar file (GoogleToSQL.cgt) are provided in this article's download.), you need a c# .NET library that can interpret the compiled grammar file. For this article, I use the free Calitha engine found at www.calitha.com. You will need to download the calitha engine in order to use it in the class library. When using the Calitha class library, you will need to register events to listen for. These events are the Parse Actions. If you want to listen to the Token Read action, you need to register to listen for that event.

    Now that the BNF grammar file is finished, let's create the database objects in SQL server in order for the full-text search to work.

    Microsoft Full-Text Search

    Microsoft has added Full-Text Search to SQL server to help with complex queries. In order for Full-Text Search to work, you need to have Full-Text Search installed from SQL Server. Full-Text Search is an installation option of the Database Engine Install, when you install SQL Server.

    Full-Text Search T-SQL

    A full-text search in SQL Server uses the standard SELECT statement but the WHERE clause needs to have a CONTAINS clause, for example:

        SELECT * FROM TableName
        WHERE CONTAINS (ColumnName,'query')

    This article's download contains a SQL script to create and delete all objects necessary for a full-text search. The table name is tblFullSearch and the column name is colText so the above Transact-SQL for the sample query is:

        SELECT * FROM tblFullSearch
        WHERE CONTAINS (colText,'vacation and hawaii')

    An alternate form is:

        SELECT * FROM tblFullSearch
        WHERE
            CONTAINS (colText,'vacation')
            AND
            CONTAINS (colText,'hawaii')

    This query is interpreted as find all rows that have both the words 'vacation' and 'hawaii' in them regardless if they are next to each other or have text between them.

    Notice that the Google syntax does not need the 'and' but assumes a logical 'and' whereas the Transact SQL needs the logical 'and' stated. Also notice that the logical 'and' for the Transact SQL is inside of the single quotes meaning a literal. All text inside the single quotes is not a literal value but is in itself syntax specific to full-text search. If the query was to be interpreted as a literal value, it would need to appear like:

        SELECT * FROM tblFullSearch
        WHERE CONTAINS (colText,' "vacation and Hawaii" ')

    This query is interpreted as find all rows that have the literal phrase of "vacation and Hawaii". These three words need to be found in this word in order for a row to match.

    The use of the double quotes is not used in Transact SQL but is used in the full-text search query to denote a literal phrase to search for.

    The GoogleToSQL.dll .NET Class Library

    The GoogleToSQL class library will convert the Google-style phrase 'vacation hawaii' to the T-SQL phrase 'vacation and hawaii'. The library constructor takes a column name in order to use the column name in the CONVERT clause. Then the .Convert method is called and the CONTAINS statement is returned. The rest of the SELECT statement will be provided by the calling application. In Visual Studio, create a class library and add the GoldParserEngine.dll and the CalithaLib.dll files as references and put them in the bin directory.

    In order to use the Calitha engine, add the following code to the top of the file:

        using com.calitha.commons;
        using com.calitha.goldparser;

    The constructor takes the column name for the CONTAINS statement, opens the compiled grammar file, and starts listening for events:

        public ContainsClause(String columnName)
        {
            // make sure column name is provided
            if (columnName.Length == 0)
            throw new Exception("ContainsClause c'tor: input param columnName is empty");

            _columnName = columnName;

            // read web.config for location of compiled grammar file
            _fileLocation = System.Configuration.ConfigurationSettings.AppSettings["CompiledGoldParserFileLocation"];

            using (FileStream fileStream = new FileStream(_fileLocation, FileMode.Open))
            {
                // create grammar file reader from Calitha library
                CGTReader reader = new CGTReader(fileStream);
                _lalrParser = reader.CreateNewParser();

                // register 3 events to listen for
                _lalrParser.OnTokenRead += new LALRParser.TokenReadHandler(_lalrParser_OnTokenRead);
                _lalrParser.OnReduce += new LALRParser.ReduceHandler(_lalrParser_OnReduce);
                _lalrParser.OnShift += new LALRParser.ShiftHandler(_lalrParser_OnShift);
            }
        }

    The event this code listens for is the OnTokenRead event. Using the example of 'vacation hawaii', we will get 3 tokens: vacation, Hawaii, and eof. As we get each token that is not eof, we need to add the Transact SQL syntax 'AND'.

        private void _lalrParser_OnTokenRead(LALRParser parser, TokenReadEventArgs args)
        {
            // Get Symbol Name
            String sCurrentItemName = args.Token.Symbol.Name.ToLower();
            // Get Symbol Value
            String sCurrentItemValue = args.Token.Text.ToLower();

            // if we are not at end of query and 'AND' is required, append it
            if ((_andRequired) && (sCurrentItemName != "(eof)"))
                _sQuery += " and ";

            // parse action items
            switch (sCurrentItemName)
            {
                case "-":
                    _sQuery += " not ";
                    break;
                case "term":
                    _sQuery += String.Format("contains (" + _columnName + ",'{0}')", sCurrentItemValue);
                    _andRequired = true;
                    break;
                case "eof":
                    break;
            }
        }

    The Test Website

    In order to test the class library, you will need a calling application. For this article, I'll use a web application, which is included in the download. Create a .NET web site and add the compiled grammar file GoogleToSQL.cgt to the file system. Change the web.config file to reference that location in the AppSettings section named 'CompiledGoldParserFileLocation'. Make sure to reference the GoogleToSql.dll, GoldParserEngine.dll, CalithaLib.dll.

    Build a web page that will take a query and pass the query to the GoogleToSQL.dll. After the query is converted to SQL, you can finish the SQL syntax and execute the query.

    The code for this web page can be found in the associated article found here: Perform SQL Queries Using Google-Style Syntax.

    Summary

    One way to convert from one language to another is to have the BNF form of the language you are converting, and using a parsing engine. Extending the BNF and adding to the code is left to the user to produce the exact syntax needed for your web site.

    Required Files

    Required 3rd Party Files:

    Required Files Provided By Author:

    • GoogleToSql.dll
    • GoogleToSql.cgt
    • GoogleToSql.grm

    Web Site Files Provided By Author:

    • Default.aspx
    • Default.aspx.cs
    • Default.aspx.designer.cs
    • TestWeb.csproj
    • Web.config
    • CreateDB.sql

    Zip File Containing Files Listed Above: 070719.zip (2 MB)

  • Rate This Article
    Not HelpfulMost Helpful
    1 2 3 4 5
    Other Articles
    Jul 21, 2005 - N-Tier Web Applications using ASP.NET 2.0 and SQL Server 2005 - Part 1
    While the .NET Framework made building ASP.NET applications easier then it had ever been in the past, .NET 2.0 builds on that foundation in order to take things to the next level. This article shows you to how to construct an N-Tier ASP.NET 2.0 Web application by leveraging the new features of ASP.NET 2.0 and SQL Server 2005.
    [Read This Article]  [Top]
    Apr 28, 2005 - New Files and Folders in ASP.NET 2.0
    With the release of ASP.NET 2.0, Microsoft has greatly increased the power of ASP.NET by introducing a suite of new features and functionalities. As part of this release, ASP.NET 2.0 also comes with a host of new special files and folders that are meant to be used to implement a specific functionality. This article examines these new files and folders in detail and provides examples that demonstrate how to utilize them to create ASP.NET 2.0 applications.
    [Read This Article]  [Top]
    Mar 10, 2005 - The DataSet Grows Up in ADO.NET 2.0 - Part 2, Cont'd
    Alex Homer continues his detailed look at the major changes to the DataSet class. In this part, he looks at two features that allow developers to work with data in a more structured and efficient way when using the DataSet with a SQL Server 2005 database server.
    [Read This Article]  [Top]
    Mar 9, 2005 - The DataSet Grows Up in ADO.NET 2.0 - Part 2
    Alex Homer continues his detailed look at the major changes to the DataSet class. In this part, he looks at two features that allow developers to work with data in a more structured and efficient way when using the DataSet with a SQL Server 2005 database server.
    [Read This Article]  [Top]
    Mar 3, 2005 - The DataSet Grows Up in ADO.NET 2.0 - Part 1, Cont'd
    In this article, Alex Homer looks at the changes between the version 1.x and version 2.0 DataSet and their associated classes, showing you how you can take advantage of the new features to improve your applications' capabilities and performance.
    [Read This Article]  [Top]
    Mar 2, 2005 - The DataSet Grows Up in ADO.NET 2.0 - Part 1
    In this article, Alex Homer looks at the changes between the version 1.x and version 2.0 DataSet and their associated classes, showing you how you can take advantage of the new features to improve your applications' capabilities and performance.
    [Read This Article]  [Top]
    Feb 16, 2005 - Writing a Custom Membership Provider for the Login Control in ASP.NET 2.0
    In ASP.NET 2.0 and Visual Studio 2005, you can quickly program custom authentication pages with the provided Membership Login controls. In this article, Dina Fleet Berry examines the steps involved in using the Login control with a custom SQL Server membership database.
    [Read This Article]  [Top]
    Dec 29, 2004 - ClickOnce Deployment in .NET Framework 2.0
    In this article, Thiru Thangarathinam examines .NET 2.0's new ClickOnce deployment technology that is designed to ease deployment of Windows forms applications. This new technology not only provides an easy application installation mechanism, it also eases deployment of upgrades to existing applications.
    [Read This Article]  [Top]
    Dec 15, 2004 - A Sneak Peek at ASP.NET 2.0's Administrative Tools
    With ASP.NET 2.0, Microsoft has made great strides in increasing developer productivity and has made implementing previously complex solutions relatively easy. Where this version of ASP.NET really shines, however, is in its new administrative tools that allow developers to spend less time managing the configuration of the servers and software and more time developing great code.
    [Read This Article]  [Top]
    Nov 17, 2004 - The ASP.NET 2.0 TreeView Control
    Thiru Thangarathinam introduces ASP.NET 2.0's new TreeView control which provides a seamless way to consume and display information from hierarchical data sources. The article discusses this new control in depth and explains how to use this feature rich control in your ASP.NET applications.
    [Read This Article]  [Top]
    Mailing List
    Want to receive email when the next article is published? Just Click Here to sign up.

    Support the Active Server Industry



    JupiterOnlineMedia

    internet.comearthweb.comDevx.commediabistro.comGraphics.com

    Search:

    Jupitermedia Corporation has two divisions: Jupiterimages and JupiterOnlineMedia

    Jupitermedia Corporate Info


    Legal Notices, Licensing, Reprints, & Permissions, Privacy Policy.

    Advertise | Newsletters | Tech Jobs | Shopping | E-mail Offers

    Solutions
    Whitepapers and eBooks
    Microsoft Article: Will Hyper-V Make VMware This Decade's Netscape?
    Microsoft Article: 7.0, Microsoft's Lucky Version?
    Microsoft Article: Hyper-V--The Killer Feature in Windows Server 2008
    Avaya Article: How to Feed Data into the Avaya Event Processor
    Microsoft Article: Install What You Need with Windows Server 2008
    HP eBook: Putting the Green into IT
    Whitepaper: HP Integrated Citrix XenServer for HP ProLiant Servers
    Intel Go Parallel Portal: Interview with C++ Guru Herb Sutter, Part 1
    Intel Go Parallel Portal: Interview with C++ Guru Herb Sutter, Part 2--The Future of Concurrency
    Avaya Article: Setting Up a SIP A/S Development Environment
    IBM Article: How Cool Is Your Data Center?
    Microsoft Article: Managing Virtual Machines with Microsoft System Center
    HP eBook: Storage Networking , Part 1
    Microsoft Article: Solving Data Center Complexity with Microsoft System Center Configuration Manager 2007
    MORE WHITEPAPERS, EBOOKS, AND ARTICLES
    Webcasts
    Intel Video: Are Multi-core Processors Here to Stay?
    On-Demand Webcast: Five Virtualization Trends to Watch
    HP Video: Page Cost Calculator
    Intel Video: APIs for Parallel Programming
    HP Webcast: Storage Is Changing Fast - Be Ready or Be Left Behind
    Microsoft Silverlight Video: Creating Fading Controls with Expression Design and Expression Blend 2
    MORE WEBCASTS, PODCASTS, AND VIDEOS
    Downloads and eKits
    Sun Download: Solaris 8 Migration Assistant
    Sybase Download: SQL Anywhere Developer Edition
    Red Gate Download: SQL Backup Pro and free DBA Best Practices eBook
    Red Gate Download: SQL Compare Pro 6
    Iron Speed Designer Application Generator
    MORE DOWNLOADS, EKITS, AND FREE TRIALS
    Tutorials and Demos
    How-to-Article: Preparing for Hyper-Threading Technology and Dual Core Technology
    eTouch PDF: Conquering the Tyranny of E-Mail and Word Processors
    IBM Article: Collaborating in the High-Performance Workplace
    HP Demo: StorageWorks EVA4400
    Intel Featured Algorhythm: Intel Threading Building Blocks--The Pipeline Class
    Microsoft How-to Article: Get Going with Silverlight and Windows Live
    MORE TUTORIALS, DEMOS AND STEP-BY-STEP GUIDES