asp tutorials, asp.net tutorials, sample code, and Microsoft news from 15Seconds
Data Access  |   Troubleshooting  |   Security  |   Performance  |   ADSI  |   Upload  |   Email  |   Control Building  |   Component Building  |   Forms  |   XML  |   Web Services  |   ASP.NET  |   .NET Features  |   .NET 2.0  |   App Development  |   App Architecture  |   IIS  |   Wireless
 
Pioneering Active Server
 Power Search





Active News
15 Seconds Weekly Newsletter
• Complete Coverage
• Site Updates
• Upcoming Features

More Free Newsletters
Reference
News
Articles
Archive
Writers
Code Samples
Components
Tools
FAQ
Feedback
Books
Links
DL Archives
Community
Messageboard
List Servers
Mailing List
WebHosts
Consultants
Tech Jobs
15 Seconds
Home
Site Map
Press
Legal
Privacy Policy
internet.commerce














internet.com
IT
Developer
Internet News
Small Business
Personal Technology
International

Search internet.com
Advertise
Corporate Info
Newsletters
Tech Jobs
E-mail Offers

HardwareCentral
Compare products, prices, and stores at Hardware Central!

Integrating User Search with ASP and SQL Server Full-Text Search
By Robert Dominy
Rating: 3.4 out of 5
Rate this article


  • email this article to a colleague
  • suggest an article

    Introduction

    Microsoft Index Server is a built-in feature of Windows 2000. It provides a service for automatically indexing documents such as HTML, text, and Office documents. The Index Server search facilities can then be used to quickly locate a document.

    It allows you to perform full-text searches and retrieve all types of information from any Web browser, in just about any format, with just the click of a mouse or button. On the desktop, the search facilities are built into the standard Windows file search dialog, providing faster searches and an advanced query language. For Microsoft Internet Information Services (IIS) and ASP, Index Server is exposed through an API, a set of predefined search pages, and through integration with the FrontPage server extensions. Using these extensions, Web developers can very easily integrate advanced search capabilities into a site with little or no work.

    As an example, look at my genealogy site (see http://www.dominy.com/genealogy/search.htm). On this site I have provided a couple of search forms created in FrontPage that use the FrontPage extensions. It not only allows the user to provide simple keyword searches, but also advanced keywords, using search operators such as "near," "and," "or," and "not." When searching for common names among a lot of information, providing good search capabilities cannot only be a real time saver for the end user, but may also make the difference between whether they ever get to the information they need or simply give up in frustration.

    All of this is fine if your information sits in relatively static documents on your Web server, but what if your information resides in a database? SQL Server 2000 integrates nicely with Index Server and allows you to specify specific tables and columns for indexing. You can then use ActiveX Data Objects (ADO) or Open Database Connectivity (ODBC) to search the database using the SQL statement "CONTAINS." More information about how to set up indexing is available in the SQL Server Books Online. Look under the topic "full-text indexes" and the SQL "CONTAINS" predicate. (See http://msdn.microsoft.com/library/default.asp?PP=/library/toc/psdk/data/data0-3-5-9.xml&tocPath=data0-3-5-9&URL=/library/psdk/sql/cm_fullad_3bs2.htm and http://msdn.microsoft.com/library/psdk/sql/ts_ca-co_2y2h.htm)

    The Problem

    The missing piece in the puzzle is the user query interface. It appears (and I say "appears" because I haven't found a way yet) that the user query interface built into Index Server is not available to data that is indexed in SQL Server.

    Why is that a big deal? The purpose of the query interface is to transform ordinary, untrained, user input into an internal query. The SQL CONTAINS statement, on the other hand, is meant to be used by programmers. Although the basic syntax is not that complex, it is very structured and unforgiving. Take the following example:

        SELECT * FROM Biographies WHERE CONTAINS ('Joe Martin*')
    
    This will return all rows that contain the phrase "Joe Martin" where the second term ends in a wild card and will match variations in the name such as "Joe Martinez," "Joe Martin's," etc.

    One could construct the search page such that the users input is just passed directly into the CONTAIN statement, but as I've noted, the syntax is very unforgiving, so if the user passes in any of the following, SQL will choke with a syntax error:

    • Joe Martin* (the phrase with a wild card has to be in quotes)
    • Joe and Martin not Smith (unlike IIS and Index Server, SQL wants the "not" after an "and")
    • 'Joe Martin*' (single quotes not allowed, double quotes only)
    • and so on

    The Solution

    To solve this I created a relatively simple search-string parser that translates the users search into an appropriately well-formed SQL statement. I've tried to keep the syntax and capabilities of the user query similar to that provided by IIS so that the user doesn't need to learn a different syntax. To test-drive the parser, see http://www.angusog.com/dev/search_example.asp.

    Example search queries:

    information retrieval -- Finds records containing both "information" and "retrieval".

    information and retrieval -- Same as above.

    information or retrieval -- Finds records containing "information" or "retrieval".

    information near retrieval -- Finds records containing both "information" and "retrieval", where the two terms are located close to each other within the document.

    information retrieval -- Finds records containing the exact phrase "information retrieval". This would not match a document with the phrase "information systems retrieval," for example.

    information not retrieval -- Finds records containing "information", but not "retrieval".

    (information not retrieval) and Web -- Finds records containing "Web" plus "information", but not "retrieval".

    Web* -- Finds records containing words starting with "Web". This wild card only works at the end of a word or phrase, for example, "information ret*" or info*, but not "info*tion".

    The parser is implemented as a server-side JavaScript object, but should also be usable from Visual Basic or could be converted into Visual Basic without too much effort.

    There are three basic steps in processing a user-submitted query:

    1. Break the string into tokens.
    2. Validate the query.
    3. Create the SQL Server-compatible CONTAINS query statement.
    A sample sequence of calling the object is shown below:
    var userSearch = Request.Form("userSearch").Item;
    var sqlSearchStr = "";
    
    // Create the UserQuery object
    var uq = new UserQuery(); 
    var isOK = uq.ParseTokens(userSearch); 
    if (isOK) 
    { 
        isOK=uq.Validate(); 
        if (isOK) 
        { 
            sqlSearchStr = uq.GetMSSQLSearchStr();
        }
    }
    
    // If the parser ran into any problems, report the error to the user
    if (!isOK)
    	Response.Write(uq.error);
    // Otherwise, just plug the sqlSearchStr into your SQL CONTAIN statement
    
    A fully constructed SQL statement might look like the following:
    SELECT * FROM Biography WHERE
    CONTAINS (*,'WEB and SERVER near SQL');
    
    The UserQuery object constructs the search part of the phrase "WEB and SERVER near SQL", based on the user-submitted search string.

    By default UserQuery joins words with the AND operator. You may prefer to join them using the OR operator instead.- Just change the function "UserQuery_InsertDefaultOperator."

    As with most software, there's plenty of room for improvement. The code has not yet gone through alpha or beta testing so there are likely to be some bugs lurking. Tops on my list of to-dos are better error messages and even more flexible search syntax.

    View Source -- http://www.angusog.com/dev/source.asp?file=textsearch_inc.asp
    Download Source -- http://www.angusog.com/dev/source/SQLUserQuerySource.zip

    About the Author

    Robert Dominy is an independent software developer and consultant with over 16 years of industry experience. He is currently working on a Web-based tool that helps facilitate research on the Web. He can be reached at bob@angusog.com or visit his site at http://www.angusog.com.

  • Rate This Article
    Not HelpfulMost Helpful
    1 2 3 4 5
    Supporting Products/Tools
    Proposion N2N
    Proposion N2N connects Microsoft .NET applications to Lotus Notes and Lotus Domino databases. This ADO.NET managed data provider allows you to perform blindingly fast queries and updates of Notes data from ASP.NET pages, .NET web services, Windows, or Mobile applications. An innovative SQL-like query language leverages the unique features of Notes and makes collaborative software accessible to relational database programmers.
    [Top]
    Other Articles
    Sep 15, 2005 - Building an Image Keyword System
    Unlike text-based file formats image files aren't made up of words, which makes searching for an image file by keyword difficult. Instead of being able to simply open the file to see what it contains, we're stuck looking at the text around it and other metadata to determine the image's meaning. In this article, Ziran Sun shows you how to build a simple database-based image keyword system that allows you to associate keywords with images and use these keywords to make finding images easier.
    [Read This Article]  [Top]
    Apr 7, 2005 - A Step-by-Step Guide To Using MySQL with ASP.NET - Part 2
    In the second part of of his article on using MySQL with ASP.NET, Ziran Sun covers how to add a new MySQL user to the database server, assign the user the appropriate permissions, connect to the database, and build a simple ASP.NET page to perform a query.
    [Read This Article]  [Top]
    Feb 10, 2005 - A Step-by-Step Guide To Using MySQL with ASP.NET - Part 1
    Back in the days of classic ASP, if you were building a database-driven web site, your choice was either to invest a lot of money to get a copy of Microsoft SQL Server (or some other enterprise-ready database) or invest a lot of time finding a way to deal with the performance and scalability limitations of Microsoft Access. Luckily these days there's another viable alternative: MySQL.
    [Read This Article]  [Top]
    Jan 27, 2005 - Moving a Database from SQL Server 7.0 to SQL Server 2000
    Moving or copying a SQL Server database from one machine to another requires a lot of preparation in order to ensure a smooth transfer. In this article, Dina Fleet Berry examines the different methods and highlights the different issues associated with each of them.
    [Read This Article]  [Top]
    Jan 6, 2005 - Debugging a SQL Stored Procedure from inside SQL Server 2000 Query Analyzer
    There are many times when using SQL Server 2000 Query Analyzer to debug SQL statements is a better choice than debugging in Visual Studio .NET. In this article, Dina Fleet Berry explains why and walks you through the debugging process step-by step.
    [Read This Article]  [Top]
    Nov 24, 2004 - Persisting .NET Objects to SQL Server Using SQLXML and Serialization
    As a follow up to his article on retrieving objects from SQL Server using SQLXML and serialization, Gianluca Nuzzo discusses saving objects back to SQL Server using a schema definition file and updategrams.
    [Read This Article]  [Top]
    Sep 14, 2004 - Transaction Processing in ADO.NET 2.0
    One area that stands out when comparing ADO.NET 1.x to ADO.NET 2.0 is transaction processing. Bill Ryan shows just how easy transaction processing has become with the TransactionScope object in ADO.NET 2.0.
    [Read This Article]  [Top]
    Sep 8, 2004 - Custom Object Data Binding with .NET
    Developers often use brute force coding to marshal data between the GUI and application objects. In this article, Luther Stanton explains how to use .NET's out-of-the box data-binding functionality to make this job much easier.
    [Read This Article]  [Top]
    Sep 2, 2004 - Queue MSMQ Messages from SQL Server
    Learn how to create a console application to queue a message in Microsoft Message Queuing (MSMQ) and then use an extended stored procedure to call the console application from a SQL Server trigger.
    [Read This Article]  [Top]
    Aug 30, 2004 - Tuning Up ADO.NET Connection Pooling in ASP.NET Applications
    Connection pooling increases the performance of Web applications by reusing active database connections instead of creating a new connection with every request. This article shows how to monitor the connection pool, diagnose a potential problem, and apply the appropriate fix.
    [Read This Article]  [Top]
    Mailing List
    Want to receive email when the next article is published? Just Click Here to sign up.

    Support the Active Server Industry



    JupiterOnlineMedia

    internet.comearthweb.comDevx.commediabistro.comGraphics.com

    Search:

    Jupitermedia Corporation has two divisions: Jupiterimages and JupiterOnlineMedia

    Jupitermedia Corporate Info


    Legal Notices, Licensing, Reprints, & Permissions, Privacy Policy.

    Advertise | Newsletters | Tech Jobs | Shopping | E-mail Offers