Why Build an Image Keyword System?
Unlike text-based file formats (HTML, XML, plain text and even PDF and MS Word documents), image files aren't made up of
text. This makes searching for an image file by keyword difficult. Instead of being able to simply open the file to
see what words it contains, we're stuck looking at the text around it and other "metadata" to determine the image's
meaning. If you're dealing with just images, then there is no surrounding text or metadata... so what should you do if
you need to deal with a lot of images on a routine basis?

Well, you could give them all really descriptive file names. The problem with that is that descriptive names mean
long names, and dealing with long file names can be a pain... especially on the web. Creating a hyperlink
or image tag for a file whose name is a hundred characters long (and includes spaces) is not going to get you into
anyone's good graces. Aside from using long, descriptive filenames, building a simple keyword system to associate
keywords with the images is probably the easiest option. That's what this article is going to cover.

Database Design
Okay, so we've decided to build a system to hold keywords that describe each image in our image library.
So where do we store these keywords? Because they're designed to manage data and are easy to query, a
database seems the natural choice. A first pass at a table design usually leads to something like this: 
Some sample data in this table might look like this: 
While this design is nice and simple, it does have some problems. First of all, it's not clear from the design
in what format keywords should be entered. In one record they're comma-delimited and in the next
there are no commas. There's also nothing to keep keywords uniform. The keywords for the image "reddog.gif"
were probably meant to be "red" and "dog", but a typo has left "dog" misspelled. Luckily
the misspelled "dogg" actually contains the correct spelling of "dog", but it'd only be sheer luck
that the image would come up in a search for the word "dog". Finally, because the amount of data is so small
it's also fairly obvious that "wetdog.gif" and "reddog.gif" were both meant to be associated with
the keyword "dog", but as the size of the data set grows it'll be harder to determine which images are related.
This makes searching for secondary images with keywords similar to a primary image more complicated than it should be.

So what's to be done? Well, first off, we move the keywords to a table of their own and give each keyword an
integer for a primary key. This allows us to correct keyword typos without destroying associations that may
have already been formed. Because each image can have many keywords and keywords can belong to many images,
the easiest way to handle these relationships is via a relationship table that does nothing but relate images
to keywords. So we end up with three tables: 
Notice that both the image and keyword tables have primary key identity columns, while the imagekeyword
relationship table uses a combination of ImageId and KeywordId to form a primary key. This ensures that
each image is associated with each keyword only once.   
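As a rough sketch of the three-table design, here is how it might look using Python's built-in sqlite3 module. The sample itself targets SQL Server, so SQLite is only a stand-in here, and the FileName and Keyword column names are my assumptions (only the table names, ImageId and KeywordId come from the description above). The associate helper is hypothetical too; it just shows the composite primary key rejecting a duplicate association.

```python
import sqlite3

# Assumption: SQLite stands in for the article's SQL Server database, and the
# FileName / Keyword column names are invented for this sketch.
SCHEMA = """
CREATE TABLE image (
    ImageId  INTEGER PRIMARY KEY,  -- identity-style integer key
    FileName TEXT NOT NULL
);
CREATE TABLE keyword (
    KeywordId INTEGER PRIMARY KEY,  -- identity-style integer key
    Keyword   TEXT NOT NULL
);
CREATE TABLE imagekeyword (
    ImageId   INTEGER NOT NULL REFERENCES image(ImageId),
    KeywordId INTEGER NOT NULL REFERENCES keyword(KeywordId),
    PRIMARY KEY (ImageId, KeywordId)  -- each pair may appear only once
);
"""


def create_database():
    """Create an in-memory database holding the three tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves this off by default
    conn.executescript(SCHEMA)
    return conn


def associate(conn, image_id, keyword_id):
    """Link an image to a keyword (hypothetical helper).

    Returns False if the database rejects the row, for example because the
    pair already exists or one of the ids is unknown.
    """
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO imagekeyword (ImageId, KeywordId) VALUES (?, ?)",
                (image_id, keyword_id),
            )
        return True
    except sqlite3.IntegrityError:
        return False


if __name__ == "__main__":
    conn = create_database()
    conn.execute("INSERT INTO image (ImageId, FileName) VALUES (1, '19150545.jpg')")
    conn.execute("INSERT INTO keyword (KeywordId, Keyword) VALUES (1, 'house')")
    print(associate(conn, 1, 1))  # first association is accepted
    print(associate(conn, 1, 1))  # the same pair again is rejected
```

Letting the primary key do the duplicate check means the pages never have to look before they insert; they just try the insert and handle the rejection.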
So, looking at the sample data above, the image table tells us that image file "19150545.jpg" has
an id of 1. Then, by checking the imagekeyword table we find out that the image with an id of 1 is associated with
keyword ids 1 and 3. Finally the keyword table tells us that keywords 1 and 3 are "house" and "tree"
which would lead us to believe that image file "19150545.jpg" should be a picture containing a house and a tree.

At first this method may seem like a lot more work, but luckily that's really not the case. Aside from a couple of
ugly joins and queries when searching for multiple keywords, this design actually makes most queries easier and
more efficient!
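To make the lookup described above concrete, here is a minimal sketch in Python with sqlite3 (again a stand-in for SQL Server, with assumed FileName and Keyword column names). It loads only the rows the walk-through mentions, image 1 and keywords 1 and 3, then follows the same path through the relationship table with a pair of joins. The rename_keyword helper is hypothetical; it shows that editing a keyword's text leaves its associations alone, because they hang off the integer id.

```python
import sqlite3


def build_sample(conn):
    """Create the three tables and load the rows mentioned in the text."""
    conn.executescript(
        """
        CREATE TABLE image (ImageId INTEGER PRIMARY KEY, FileName TEXT NOT NULL);
        CREATE TABLE keyword (KeywordId INTEGER PRIMARY KEY, Keyword TEXT NOT NULL);
        CREATE TABLE imagekeyword (
            ImageId   INTEGER NOT NULL,
            KeywordId INTEGER NOT NULL,
            PRIMARY KEY (ImageId, KeywordId)
        );
        INSERT INTO image VALUES (1, '19150545.jpg');
        INSERT INTO keyword VALUES (1, 'house'), (3, 'tree');
        INSERT INTO imagekeyword VALUES (1, 1), (1, 3);
        """
    )


def keywords_for_image(conn, file_name):
    """Return the keywords attached to one image, ordered by keyword id."""
    rows = conn.execute(
        """
        SELECT k.Keyword
        FROM image AS i
        JOIN imagekeyword AS ik ON ik.ImageId = i.ImageId
        JOIN keyword AS k ON k.KeywordId = ik.KeywordId
        WHERE i.FileName = ?
        ORDER BY k.KeywordId
        """,
        (file_name,),
    ).fetchall()
    return [row[0] for row in rows]


def rename_keyword(conn, keyword_id, new_text):
    """Change a keyword's text; rows in imagekeyword are not touched."""
    with conn:
        conn.execute(
            "UPDATE keyword SET Keyword = ? WHERE KeywordId = ?",
            (new_text, keyword_id),
        )


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    build_sample(conn)
    print(keywords_for_image(conn, "19150545.jpg"))  # ['house', 'tree']
    rename_keyword(conn, 3, "trees")  # hypothetical edit to keyword 3
    print(keywords_for_image(conn, "19150545.jpg"))  # ['house', 'trees']
```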

The Web Pages
With the database design in place and the tables full of our sample data, now we get to the fun part... actually
writing some code to get this thing running. Building forms to insert a new image or keyword is trivial, so
I'm not going to waste your time going over them. As the sample stands there are three main pages:

Show All Images (default.aspx): This page simply shows all the images in the database. Click on an image to go to the image details page for that image.
This page really has very little to do with the sample, but it serves as a nice start page and an easy way to get
to the detail page for any image. You can get the same result by going to the search page, selecting every
keyword checkbox and using the "Any" search option, but the code for this page is much simpler than the code for
the search page.

Image Details (details.aspx): Displays the image and associated details. Also displays / allows you to edit the keywords that are associated with the selected image.
This page shows you all the details of the selected image and at the same time allows you to associate/disassociate
keywords with the current image. Depending on your situation you might want the editing to be locked up in an
administrative area, but as it stands currently anyone who can see the image details page can edit the keyword associations.

Keyword Search (search.aspx): The search page allows you to search the images based on one or more keywords.
This is sort of the whole point... an easy way to search for images based on keyword. You can change the interface
to work any way you want (text links of each keyword would work well), but as it stands I'm using checkboxes
because they're the easiest way to select multiple keywords. As I mentioned earlier, the query for multiple keywords is
a little ugly, but once it's written you don't need to worry about it... just use the web page.
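The query in the download may well look different, but to give a rough idea of the shape of a multi-keyword search, here is a sketch in Python with sqlite3 (a stand-in for SQL Server). The "Any" option comes from the search page; the match_all mode, the search_images function, the column names and the two extra image rows are all assumptions made up for illustration.

```python
import sqlite3


def build_sample(conn):
    """Create the tables plus a little data; images 2 and 3 are invented."""
    conn.executescript(
        """
        CREATE TABLE image (ImageId INTEGER PRIMARY KEY, FileName TEXT NOT NULL);
        CREATE TABLE keyword (KeywordId INTEGER PRIMARY KEY, Keyword TEXT NOT NULL);
        CREATE TABLE imagekeyword (
            ImageId   INTEGER NOT NULL,
            KeywordId INTEGER NOT NULL,
            PRIMARY KEY (ImageId, KeywordId)
        );
        INSERT INTO image VALUES
            (1, '19150545.jpg'),
            (2, 'hypothetical-house.jpg'),
            (3, 'hypothetical-tree.jpg');
        INSERT INTO keyword VALUES (1, 'house'), (3, 'tree');
        INSERT INTO imagekeyword VALUES (1, 1), (1, 3), (2, 1), (3, 3);
        """
    )


def search_images(conn, keyword_ids, match_all=False):
    """Return file names of images tagged with the given keyword ids.

    match_all=False: an image needs at least one of the keywords ("Any").
    match_all=True:  an image needs every one of the keywords.
    """
    ids = sorted(set(keyword_ids))
    if not ids:
        return []
    placeholders = ", ".join("?" for _ in ids)
    # (ImageId, KeywordId) is unique, so COUNT(*) per image is the number
    # of distinct requested keywords that the image carries.
    sql = f"""
        SELECT i.FileName
        FROM image AS i
        JOIN imagekeyword AS ik ON ik.ImageId = i.ImageId
        WHERE ik.KeywordId IN ({placeholders})
        GROUP BY i.ImageId, i.FileName
        HAVING COUNT(*) >= ?
        ORDER BY i.ImageId
    """
    required = len(ids) if match_all else 1
    return [row[0] for row in conn.execute(sql, (*ids, required))]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    build_sample(conn)
    print(search_images(conn, [1, 3]))                  # all three images
    print(search_images(conn, [1, 3], match_all=True))  # ['19150545.jpg']
```

Only the placeholders are built with string formatting; the keyword ids themselves are always passed as parameters, so checkbox values never end up inside the SQL text.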
Well that's it... I'm not going to bother posting the code here in the body of the article because after all it is code...
it's meant for computers to read - not people!

The Code
You can download the code for the web pages above, the sample images, the database creation script (for SQL Server),
and the actual data exported to .csv files all zipped up in a nice little package from: here (245 KB).

Conclusion
I hope you found the concept interesting and if the code pertains to something you're working on, please download it and take a look.
I'll be disappointed if I'm the only one that thinks the keyword editing section on the image detail page is sort of cool!