asp tutorials, asp.net tutorials, sample code, and Microsoft news from 15Seconds
Defeating the Spam Spiders
By Gaddo F. Benedetti

    Introduction


    Most people have been spammed by now. Promises of removal from a spammer's list are, more often than not, either ignored or treated as confirmation that the E-mail address is live and in use. Hunting spammers down and reporting them (or dealing with them personally) makes their lives more difficult, but it is laborious at best and rarely a solution. Prevention is the better approach, and that means stopping spammers from harvesting E-mail addresses on the Web.

    How It Is Done

    Spammers collect addresses with automated agents, or spiders. When pointed at a Usenet group or Web site, such an agent seeks out E-mail addresses by their telltale @ character or by a handy mailto link in the HTML. Once it finds one of these signposts, all the agent needs to do is read everything on either side of the @ until it comes across a space or a question mark, the latter denoting an attached query string.
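    To make the scanning concrete, here is a minimal sketch of what such an agent might do. The function name harvestAddresses and the pattern it uses are assumptions made for illustration; they are not taken from any real harvesting tool. The second test page uses a script call of the kind this article goes on to build.

```javascript
// Hypothetical harvester: scans raw page source for address-shaped text.
// The pattern is an illustrative assumption, not code from a real spam agent.
function harvestAddresses(pageSource) {
  // Runs of characters on either side of "@", stopping at whitespace,
  // quotes, angle brackets, ":" or "?" (the start of a query string).
  const pattern = /[^\s"'<>?:@]+@[^\s"'<>?:@]+/g;
  // A Set removes duplicates such as the href and the link text.
  return Array.from(new Set(pageSource.match(pattern) || []));
}

const plainPage = '<A HREF=mailto:aname@acompany.com>aname@acompany.com</A>';
const scriptedPage = '<SCRIPT>EncEmail("acompany","aname","com","subject=Hello")</SCRIPT>';

console.log(harvestAddresses(plainPage));    // one address: aname@acompany.com
console.log(harvestAddresses(scriptedPage)); // nothing found: no "@" in the source
```

    The plain link gives the address away twice over, while the scripted page contains no @ at all for the pattern to latch on to.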

    Thus any Web developer who is producing an on-line discussion group, or who simply wishes to protect the contact addresses on a Web site from such an agent, must regrettably reformat and encrypt them in a way that causes the agent to fail. If the developer’s aim is to create a mailto link such as <A HREF=mailto:aname@acompany.com>aname@acompany.com</A>, this is not easy: the link must be assembled client side, which makes any form of encryption difficult. However, these agents are quite specialized and have difficulty with addresses formatted differently from the norm, so a simple client-side JavaScript function such as this could be used:

    
    	<SCRIPT LANGUAGE="Javascript">
    	<!--
    		function EncEmail(strDomain,strName,strRoot,strQuery)
    			{
    			var strMailLink="<A HREF=\"mailto:"+strName+"@"+strDomain+"."+strRoot;
    			if(strQuery!=""){strMailLink=strMailLink+"?"+strQuery};
    			strMailLink=strMailLink+"\">"+strName+"@"+strDomain+"."+strRoot+"</A>";
    			document.write(strMailLink);
    			}
    	//-->
    	</SCRIPT>
    
    
    Once the function is embedded at the top of the page, the developer need only call it to produce something that looks and behaves like a simple mailto link:
    
    	<SCRIPT LANGUAGE="Javascript">
    	<!--
    		EncEmail("acompany","aname","com","subject=Hello")
    	//-->
    	</SCRIPT>
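    To see exactly what that call writes into the page, the same string assembly can be run outside a browser. The sketch below is an assumption-laden stand-in: buildMailLink is a hypothetical name, and it returns the markup rather than calling document.write.

```javascript
// Same string assembly as EncEmail, but returning the markup instead of
// calling document.write, so it can be inspected outside a browser.
// buildMailLink is a hypothetical name used only for this sketch.
function buildMailLink(strDomain, strName, strRoot, strQuery) {
  const address = strName + "@" + strDomain + "." + strRoot;
  let href = "mailto:" + address;
  if (strQuery !== "") {
    href += "?" + strQuery; // append the query string only when one is given
  }
  return '<A HREF="' + href + '">' + address + "</A>";
}

console.log(buildMailLink("acompany", "aname", "com", "subject=Hello"));
// <A HREF="mailto:aname@acompany.com?subject=Hello">aname@acompany.com</A>
```

    The browser ends up with an ordinary mailto link, while the page source only ever holds the separate pieces.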
    
    
    However, like everyone on the Internet, spammers adapt and could soon add a new subroutine to their agents, teaching them to take the address from the call to the function when the parameters are passed. This is where ASP comes in to make their lives even more difficult.

    A person with even a little knowledge of JavaScript would have no difficulty working out which parameter is which in a function call such as EncEmail("bigcorp","jsmith","com","") - obviously jsmith@bigcorp.com. Agents, on the other hand, may be able to spot “com” or an equal sign in a query string parameter, but they cannot tell whether jsmith is a name or not; at best they can assume that the name always sits in the same position in the list. If the page keeps changing the function (or, more correctly, the order in which the parameters are passed) between downloads from the server, harvesting E-mail addresses becomes very difficult indeed for the spammer.

    The first half of the ASP script jumbles up this order and writes the basic JavaScript function to the page. We need both the list of permutations and a seed number with which to select one. Fortunately, four parameters can be passed in only 4 x 3 x 2 x 1 = 24 possible orders. This is a number we can easily seed from the server’s clock: all 24 permutations fit into a ten-minute period (600 seconds), one every 25 seconds.

    Thus we call upon the time (NOW), pick out the seconds and the last digit of the minutes, calculate the number of seconds elapsed in the current ten-minute period, and divide by 25 (rounding down by making the result an integer) to get the index of our selected order, iPermutation:

    
    <%
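    	' Read the clock with Minute() and Second(); picking characters out of
    	' the text form of NOW only works if the server writes times as h:mm:ss AM/PM.
    	sNow = NOW
    	iMin = Minute(sNow) Mod 10
    	iSec = Second(sNow)
    	iTime = (iMin * 60) + iSec
    	iPermutation = Int(iTime / 25)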
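    The arithmetic can be checked with a couple of worked values. The sketch below restates the calculation in JavaScript purely for illustration; permutationIndex is a hypothetical name, and the dates are arbitrary examples.

```javascript
// Maps a moment in time to a permutation index in the range 0..23:
// seconds elapsed within the current ten-minute window, divided by 25.
function permutationIndex(date) {
  const lastMinuteDigit = date.getMinutes() % 10;
  const secondsIntoWindow = lastMinuteDigit * 60 + date.getSeconds();
  return Math.floor(secondsIntoWindow / 25);
}

// 10:23:45 -> last minute digit 3, so 3 * 60 + 45 = 225 seconds, index 9
console.log(permutationIndex(new Date(1999, 7, 19, 10, 23, 45))); // 9
// 10:29:59 -> 9 * 60 + 59 = 599 seconds, index 23 (the highest possible)
console.log(permutationIndex(new Date(1999, 7, 19, 10, 29, 59))); // 23
```

    Because the window never exceeds 599 seconds, the index always stays within the 24 available permutations.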
    
    
    As for the permutations themselves, we could compute the selected one with an algorithm. (Given five parameters we would certainly want to, rather than list all 120!) Seeing as there are only 24, and for the sake of simplicity, we’ll just list them out and assign them to an array, each as a four-character string:
    
    	Dim sPerm(23)	' 24 elements, indexes 0 to 23
    	sPerm(0) = "1234"
    	sPerm(1) = "1243"
    	sPerm(2) = "1324"
    ...and so on till...
    	sPerm(23) = "4321"
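    For readers curious about the algorithmic route, here is one possible sketch, written in JavaScript for illustration only. It assumes the list is meant to be in ascending (lexicographic) order, which agrees with the four entries shown above; the function name permutations is hypothetical.

```javascript
// Generates every ordering of the given characters in lexicographic order
// (assuming the input itself is sorted), so "1234" comes first and "4321" last.
function permutations(chars) {
  if (chars.length <= 1) return [chars];
  const result = [];
  for (let i = 0; i < chars.length; i++) {
    // Fix chars[i] as the first character, then permute the remainder.
    const rest = chars.slice(0, i) + chars.slice(i + 1);
    for (const tail of permutations(rest)) {
      result.push(chars[i] + tail);
    }
  }
  return result;
}

const sPerm = permutations("1234");
console.log(sPerm.length);                 // 24
console.log(sPerm[0], sPerm[1], sPerm[2]); // 1234 1243 1324
console.log(sPerm[23]);                    // 4321
```

    The same function given "12345" would return the 120 orderings for five parameters without any of them having to be typed out.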
    
    
    Now we can work out which JavaScript variable stands for which part of the address by picking the digits out of the selected permutation with the Mid function. The JavaScript function will take four parameters, strVar1, strVar2, strVar3 and strVar4, and each of our four VBScript variables (one per part of the mailto) holds the name of the parameter that carries that part:
    
    	sName = "strVar" & Mid(sPerm(iPermutation), 1, 1)
    	sDomain = "strVar" & Mid(sPerm(iPermutation), 2, 1)
    	sRoot = "strVar" & Mid(sPerm(iPermutation), 3, 1)
    	sQuery = "strVar" & Mid(sPerm(iPermutation), 4)
    
    
    Thus if iPermutation is 2, sPerm(iPermutation) will be "1324". In the JavaScript function, sName (the name) will then become strVar1, sDomain (the domain name) will become strVar3, sRoot (the .com part of the domain name) will become strVar2, and sQuery (any attached query string) will become strVar4. Finally, we can write the function to the page, first remembering to close our ASP script:
    
    %>
    <SCRIPT LANGUAGE="Javascript">
    <!--
    		function EncEmail(strVar1,strVar2,strVar3,strVar4)
    			{
    			var sMailLink;
    			sMailLink="<A HREF=\"mailto:"+<%=sName%>+"@"+<%=sDomain%>+"."+<%=sRoot%>;
    			if(<%=sQuery%>!=""){sMailLink=sMailLink+"?"+<%=sQuery%>}
    			sMailLink=sMailLink+"\">"+<%=sName%>+"@"+<%=sDomain%>+"."+<%=sRoot%>+"</A>"
    			document.write(sMailLink);
    			}
    //-->
    </SCRIPT>
    
    
    However, while we've written the JavaScript function into the page, we are still missing a means to call it when we actually want to insert a mailto in our page. Essentially, what we want is a simple function call within the page, such as
    
    	<%=EncryptEmail("gbenedetti", "hotmail", "com", "")%>
    
    
    that we can use anywhere on the page to insert a matching JavaScript call to the JavaScript function.

    We can begin our function (remembering to open our ASP Script again) thus:

    
    <%
    	Public Function EncryptEmail(sNewName, sNewDomain, sNewRoot, sNewQuery)
    
    
    First we need to work out the order in which the JavaScript function expects its parameters, so that we can place our own values correctly. To do so we create a new array, sVar(3), and run through our JavaScript variable names (strVar1, etc.).
    
    		Dim sVar(3), i	' four slots, one per JavaScript parameter
    		For i = 1 To 4
    			If Right(sName, 1) = CStr(i) Then sVar(i - 1) = sNewName
    			If Right(sDomain, 1) = CStr(i) Then sVar(i - 1) = sNewDomain
    			If Right(sRoot, 1) = CStr(i) Then sVar(i - 1) = sNewRoot
    			If Right(sQuery, 1) = CStr(i) Then sVar(i - 1) = sNewQuery
    		Next
    
    
    In other words, we have already created the client-side function that writes the mailto, but that function is called from JavaScript, so the parameters taken in by the ASP function still have to be put in the right order. We run through the slots with the For/Next loop, checking which number (such as 2 for strVar2) each part was assigned. The same thing can also be done directly:
    
    		sVar(CInt(Right(sName, 1)) - 1) = sNewName
    		sVar(CInt(Right(sDomain, 1)) - 1) = sNewDomain
    		sVar(CInt(Right(sRoot, 1)) - 1) = sNewRoot
    		sVar(CInt(Right(sQuery, 1)) - 1) = sNewQuery
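    Either way, the effect is to drop each part of the address into the slot named by the permutation. The following JavaScript sketch shows that placement for the "1324" case; arrangeArguments is a hypothetical name, and the permutation is passed in directly rather than read from sName, sDomain, sRoot and sQuery.

```javascript
// perm is a four-character string such as "1324". Its characters give the
// argument positions of name, domain, root and query, in that order.
function arrangeArguments(perm, name, domain, root, query) {
  const parts = [name, domain, root, query];
  const slots = new Array(4);
  for (let i = 0; i < 4; i++) {
    // A digit d means "this part travels as argument number d".
    slots[Number(perm[i]) - 1] = parts[i];
  }
  return slots;
}

console.log(arrangeArguments("1324", "gbenedetti", "hotmail", "com", ""));
// name lands in slot 1, root in slot 2, domain in slot 3, query in slot 4
```

    With "1324" the call written to the page would therefore pass the root before the domain, which is the order the generated EncEmail function expects for that permutation.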
    
    
    Which of the two you use is a matter of preference. Now we can simply build the call to the function and return it, so that it is written to the page:
    
    		Temp = vbCrLf & "<SCRIPT LANGUAGE=" & Chr(34) & "JavaScript"
    		Temp = Temp & Chr(34) & ">" & vbCrLf & "<!--" & vbCrLf & vbTab
    		Temp = Temp & "EncEmail(" & Chr(34) & sVar(0) & Chr(34) & "," & Chr(34)
    		Temp = Temp & sVar(1) & Chr(34) & "," & Chr(34) & sVar(2) & Chr(34)
    		Temp = Temp & "," & Chr(34) & sVar(3) & Chr(34) & ")" & vbCrLf
    		Temp = Temp & "//-->" & vbCrLf & "</SCRIPT>"
    		EncryptEmail = Temp
    	End Function
    %>
    
    
    And that's it. What remains is how to implement your script, especially if you're likely to use it in multiple pages throughout a Web site. To do so, it would be best to save it all as an include file, embedded in the pages you wish to protect and inserted at the point where you would want the JavaScript function to be written on your page. Perhaps you would insert it just after the <BODY> tag, with <!--#Include File="nospam.inc"--> and call it from the page when you need it thereafter.

    We can stop here, but I should point out the limitations, issues, and possible improvements on such a script. First, you will note that it uses quite a few variables, which is fine as long as no other script on your page declares variables with the same names. Solving this is simple enough: cut down on variables, and rename those that remain so that the chances of a clash are slim - rename sQuery to, say, noSpam_Query, and so on. Ultimately, wrapping it up as a COM component would best solve this issue, and using VB would require very little recoding. The component would expose one method to write the JavaScript function and one function to write the JavaScript call to it.

    The script is also not overly flexible at present. All it does is create a simple <A HREF=mailto:aname@acompany.com>aname@acompany.com</A> mailto. It does not allow for a mailto such as <A HREF=mailto:aname@acompany.com>Contact Us</A>. The display text could be passed as a new parameter, with an empty string denoting the default aname@acompany.com display value. Also, when we call the EncryptEmail function, it would be nice to pass a single parameter - the address - and let the function first break up the E-mail address and then encrypt it. I'll leave that for another day.
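    As a rough idea of how the breaking-up step might look, here is a sketch in JavaScript. It is not part of the article's download: splitAddress is a hypothetical helper, and treating everything after the last dot as the root is an assumption of this sketch.

```javascript
// Splits "name@domain.root?query" into the four parts EncryptEmail expects.
// splitAddress is a hypothetical helper; it treats everything after the
// last dot as the root, which is an assumption of this sketch.
function splitAddress(address) {
  const queryStart = address.indexOf("?");
  const query = queryStart === -1 ? "" : address.slice(queryStart + 1);
  const bare = queryStart === -1 ? address : address.slice(0, queryStart);
  const at = bare.indexOf("@");
  const lastDot = bare.lastIndexOf(".");
  // Require a non-empty name, domain and root.
  if (at < 1 || lastDot < at + 2 || lastDot === bare.length - 1) {
    throw new Error("Not a usable e-mail address: " + address);
  }
  return {
    name: bare.slice(0, at),
    domain: bare.slice(at + 1, lastDot),
    root: bare.slice(lastDot + 1),
    query: query,
  };
}

console.log(splitAddress("aname@acompany.com?subject=Hello"));
// name "aname", domain "acompany", root "com", query "subject=Hello"
```

    The four values could then be handed to the existing routine unchanged, so the permutation logic would not need to know that the address arrived in one piece.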

    Finally, the encryption could be better. We could even decide to use XML, but that is somewhat overemphasizing the importance of encryption. A spider collecting E-mail addresses will not be able to read this script unless it is “taught” how to parse JavaScript. Ultimately a mailto must be client side and allow the browser to interpret it, and if a spider can read JavaScript, or whatever scripting or markup language we've used to encrypt it, it will be able to harvest the address.

    However, this will keep us one step ahead of them for now.

    Download the Code

    You can download the complete source for the sample contained in this article:

    http://15seconds.com/files/990819.zip

    About the Author

    Gaddo F. Benedetti is an Internet and software developer based in Dublin, Ireland. Most of his work has been creating Web spiders, agents, and CBR search agents through VB applications and ASP. He is a great believer in embedding other languages in his ASP (or anything that sounds good at the time) until the headaches of actually trying to implement them kick in.
