TechTrax, brought to you by MouseTrax Computing Solutions

JavaScript Email Cloaking

by Shawn K. Hall

Skill rating level 8.

This article is reprinted with permission from the author. All rights reserved. Original article: http://reliableanswers.com/js/mailme.asp

For webmasters, spam is far more than a simple nuisance. Since web-spiders exist with the sole purpose of collecting/harvesting email addresses from websites, webmasters have come up with some very interesting (and often quite useless) means of attempting to prevent their addresses from being scalped by those scum-sucking filth that denigrate the Internet. Since a webmaster's purpose for having a website is generally to enable customers to contact them, removing the email address altogether is hardly an option.

The solutions offered up to now have been anything but useful:

  • Remove the email addresses. (get real!)

  • Replacing your entire contact mechanism with a contact form. Though this works (I use that word loosely; many contact forms actually send the target email address to the client in plain text in a fashion that makes it just as easy to scalp), it can be very frustrating to visitors of your site that don't fit "the mold" that you've framed your contact form around. It's also outrageously annoying to have to use a custom interface just to contact a potential vendor about their product so you can decide whether you want to spend your money on them.

  • "Encoding" it using html entities for the characters or including the letters "nospam" (or similar) within the address. This is equally ineffective. It literally takes only a split second to remove "nospam" or decode the html entities from a list of thousands of email addresses. Have no fantasy that using these methods will prevent your email address from being scalped by automated means.
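To show how little work the last item takes to undo, here is a minimal sketch of the cleanup a harvester could run. The function names decodeEntities and stripNoSpam are made up for this illustration and are not taken from any real spider.

```javascript
// Hypothetical harvester-side cleanup (illustration only).

// Turn numeric HTML entities (decimal and hex) back into characters.
function decodeEntities(text) {
  return text
    .replace(/&#x([0-9a-f]+);/gi, (_, hex) => String.fromCodePoint(parseInt(hex, 16)))
    .replace(/&#(\d+);/g, (_, dec) => String.fromCodePoint(parseInt(dec, 10)));
}

// Remove the "nospam" marker wherever it appears, in any letter case.
function stripNoSpam(address) {
  return address.replace(/nospam/gi, "");
}

const entityEncoded =
  "&#119;&#101;&#98;&#64;&#101;&#120;&#97;&#109;&#112;&#108;&#101;&#46;&#99;&#111;&#109;";

console.log(decodeEntities(entityEncoded));        // web@example.com
console.log(stripNoSpam("webNOSPAM@example.com")); // web@example.com
```

Two one-line replace operations are enough, which is why neither trick slows an automated harvester down.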

The code on this page serves as a solution that actually works to reduce, if not completely eliminate, spam that is sourced from the email address(es) exposed on your website. This solution is not guaranteed or warranted in any way, and the only support I offer is to paid users of my internet hosting services or via my "DesignAdvice" discussion group. There are several reasons why this method is not warranted, such as the fact you are obtaining it for free, I have no means of knowing whether the email address(es) you are cloaking are being used/posted/relayed elsewhere (which opens them up to more extensive opportunities for exploitation), and the simple fact that any method used to cloak email addresses will inevitably be cracked. The beauty of this method is that it takes that simple fact into account and provides wrapping functions for both server and client that make enhancing or replacing the encoding/decoding algorithm a simple activity (assuming you know a little script).

After setup, this system is relatively painless to maintain. It is more difficult to use if you're not using a server-side scripting language of any kind (that is so '90s), but even then it's still relatively painless.

Client-Side Code

I'll take you from theory to implementation and give you the scripts necessary to make it work with ASP and PHP.

A simple link looks like this, before conversion:

<a href="/contact/" title="Contact Me!">Contact Me!</a>

We're not going to get rid of the current href, since this way, even if they don't have javascript or an email client on their system, they can still contact you using a contact form (at "/contact/").

All we need to do is add onmouseover and onfocus handlers to change the actual address to the 'real' email address when they interact with it:

<a href="/contact/" title="Contact Me!"
onmouseover="javascript:this.href=mailMe('example%23com','webmaster');" onfocus="javascript:this.href=mailMe('example%23com','webmaster');"
>Contact Me!</a>

(Note: HTML attribute names are case-insensitive, so "onMouseOver" and "onmouseover" behave the same in HTML; XHTML requires the lowercase forms "onmouseover" and "onfocus". Most browsers don't 'care', but some strict parsers do, so lowercase is the safe choice.)

Now we'll break apart the script:

javascript:this.href=mailMe('example%23com','webmaster');

javascript: tells the browser what language we're using. Current browsers parse it as a harmless statement label and work without it, but some older browsers will fail to parse the handler correctly if we don't include it.

this.href= means that it will change the current anchor tag (<a />) href attribute (where it goes) to the result of the following function: mailMe().

mailMe() is just my own pet name for the function. I highly suggest you rename it for your own use to something even more obscure, if possible. The idea is that the more unique your own method is (following these simple guidelines), the less likely Joe Blow's Email Scalping Spider will be able to collect the email addresses from your site.

My mailMe() javascript function looks like this:

function mailMe(sDom, sUser){
  return("mail"+"to:"+sUser+"@"+sDom.replace(/%23/g,"."));
}

What this does is swap the input positions for user and domain and replace the 'munged' text with the correct text (using a Regular Expression replace operation).
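As a quick check, the function can be called directly, for example in a browser console. The second address below is made up for illustration.

```javascript
// The mailMe() function from above, repeated so this snippet runs on its own.
function mailMe(sDom, sUser) {
  return "mail" + "to:" + sUser + "@" + sDom.replace(/%23/g, ".");
}

console.log(mailMe("example%23com", "webmaster"));
// mailto:webmaster@example.com

// Every "%23" in the domain is replaced, not just the first one.
console.log(mailMe("mail%23example%23org", "sales"));
// mailto:sales@mail.example.org
```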

You may find it tempting to add a target attribute to these links in order to better control the environment for new windows should the user not have scripting enabled. I recommend against this, as some browsers treat mailto links as a natural browsing action and it leaves an orphaned window or tab after the email client finally intercepts the requests and generates a compose message window.

Munging the text is a "good idea." Otherwise it just takes "the right RegEx pattern" to turn the pieces in your markup back into the original addresses. The domain in the script above has had all of its .'s replaced with "#" (which is an invalid character within a domain, so likely to break/fail most regex-based spiders scalping the site, if not immediately, then when they actually attempt to send the email) and has then been 'urlencoded' to make it slightly less obvious.
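The effect on a simple harvester can be sketched as follows. The pattern below is an assumption: a deliberately naive address regex of the kind a basic spider might use, not one taken from any real tool.

```javascript
// A naive "looks like an email address" pattern (assumed for illustration).
const naivePattern = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;

// A plain mailto link versus the cloaked link described in this article.
const plainMarkup = '<a href="mailto:webmaster@example.com">Contact Me!</a>';
const cloakedMarkup =
  '<a href="/contact/" ' +
  'onmouseover="javascript:this.href=mailMe(\'example%23com\',\'webmaster\');"' +
  ">Contact Me!</a>";

console.log(plainMarkup.match(naivePattern));   // [ 'webmaster@example.com' ]
console.log(cloakedMarkup.match(naivePattern)); // null
```

The cloaked markup contains no "@" at all, so a pattern built around that character has nothing to match. A spider written specifically against this scheme could still reassemble the address, which is why the algorithm is meant to be easy to change.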

Now how do you make the script part? You can hand-code it (yuck!) using this pattern:

mailMe('example%23com','webmaster');

  • "example%23com" is the domain, in this case "example.com", where "." has been replaced with "#" then URL-encoded to produce "%23". So replace all the "."'s in your domain with "%23"

  • "webmaster" is the account at the domain to send to. This would result in webmaster@example.com

So to convert the code above to send to "admin@yahoo.co.uk" you would use:

mailMe('yahoo%23co%23uk','admin');

Great! You can do it by hand now.

Yuck! By hand!?!

Ok, let's make it easier on ourselves. Seriously - what would we do if we found out spiders were suddenly capable of parsing this method and we wanted to change the encoding mechanism globally across our entire site all at once? We'd have to use our own regex to fix it (which introduces untold numbers of cans of worms) or we just make two changes:

  1. We change the function reference in our include file to encode it differently.
  2. We change the function in our referenced javascript file to decode it differently.

Sweet.
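As a sketch of what such a paired change could look like, here is a hypothetical replacement algorithm that also reverses the encoded domain. The names encodeDomain and decodeDomain are invented for this example. In the real setup the encoder would live in your server-side include (ASP or PHP); it is written in JavaScript here only so both halves can be read and run together.

```javascript
// Encoder: "." becomes "#", the result is URL-encoded ("#" -> "%23"),
// and then the whole string is reversed. (Stand-in for the server-side half.)
function encodeDomain(domain) {
  return encodeURIComponent(domain.replace(/\./g, "#"))
    .split("")
    .reverse()
    .join("");
}

// Decoder: undo the reversal first, then swap "%23" back to ".".
function decodeDomain(encoded) {
  return encoded.split("").reverse().join("").replace(/%23/g, ".");
}

// Client-side mailMe() updated to use the new decoder.
function mailMe(sDom, sUser) {
  return "mail" + "to:" + sUser + "@" + decodeDomain(sDom);
}

const encoded = encodeDomain("example.com");
console.log(encoded);                      // moc32%elpmaxe
console.log(mailMe(encoded, "webmaster")); // mailto:webmaster@example.com
```

The markup on every page changes automatically because it is generated by the include file, and the one shared script file decodes it.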

Server-Side Code

Mind you, for server-side functions I use the same function name since the server-side code in no way interferes with the client side code. Maybe it's strange to some, but it relieves me from having to remember two function names. :)

So, for ASP:

<%
  Function mailMe(sAddress, sCaption, sTitle)
  '=mailMe("user@example.com","Display","Title")
    Dim sBuild, sSplit, sSplit2, sMailMe
    ' sSplit(0) is the user, sSplit(1) is the domain (plus any "?query")
    sSplit = Split(sAddress, "@", 2)
    If InStr(1, sSplit(1), "?", 1) > 0 Then
      sSplit2 = Split(sSplit(1), "?", 2)
      ' munge the domain only: "." becomes "#", which URLEncode turns into "%23"
      sMailMe = "mailMe('" _
        & Server.URLEncode( Replace( sSplit2(0), ".", "#") ) & "?" _
        & Replace( sSplit2(1), "'", "\'") & "','" _
        & Server.URLEncode( sSplit(0) ) _
        & "')"
    Else
      sMailMe = "mailMe('" _
        & Server.URLEncode( Replace( sSplit(1), ".", "#") ) _
        & "','" _
        & Server.URLEncode( sSplit(0) ) _
        & "')"
    End If
    ' attach the same decoding call to both mouse and keyboard events
    sBuild = "onmouseover=""javascript:this.href=" _
      & sMailMe & ";"" " _
      & "onfocus=""javascript:this.href=" _
      & sMailMe & ";"""
    sBuild = "<a href=""/contact/"" " _
      & sBuild _
      & " title=""" & sTitle & """>"
    ' with no caption, fall back to the user name (never the full address)
    If sCaption = "" Then
      sBuild = sBuild & sSplit(0)
    Else
      sBuild = sBuild & sCaption
    End If
    mailMe = sBuild & "</a>"
  End Function
%>

This is called using:

<% =MailMe("Webmaster@example.com","Contact Me!","Send me email") %>

...where "Webmaster@example.com" is the target email address, "Contact Me!" is the display text (DO NOT USE THE EMAIL ADDRESS HERE!!!) and "Send me email" is the title attribute (what is displayed on mouse-over or read out by voice interfaces and the like). The function returns the complete HTML anchor tag (breaks added for readability):

<a href="/contact/"
  onmouseover="javascript:this.href=mailMe('example%23com','Webmaster');"
  onfocus="javascript:this.href=mailMe('example%23com','Webmaster');"
  title="Send me email">Contact Me!</a>

The PHP code to do the exact same thing:

<?php
function mailMe($saddress,$scaption,$stitle){
//begin parsing: user, domain and optional "?query" part
//(explode/str_replace replace split/ereg_replace, which were removed in PHP 7)
  list($eaddress, $sdomain) = explode('@', $saddress, 2);
  list($sdomain, $aextra) = array_pad(explode('?', $sdomain, 2), 2, '');
  $sdomain = str_replace('.', '#', $sdomain);

//create the js address
  $smailme = "mailMe('".urlencode( $sdomain );
  if($aextra != "" ){
    //encode the query but keep "=" and "&" as delimiters
    //("&" is written as "&amp;" because it lands inside an HTML attribute)
    $smailme .= "?" . str_replace( array('%3D', '%26'), array('=', '&amp;'),
      rawurlencode( $aextra ) );
  }
  $smailme .= "','" . urlencode( $eaddress ) . "')";

//build the js events
  $sbuild =" onmouseover=\"javascript:this.href=$smailme;\"";
  $sbuild.=" onfocus=\"javascript:this.href=$smailme;\"";

//return
  return "<a href=\"/contact/\"$sbuild title=\"$stitle\">$scaption</a>";
}
?>

Lastly, it should be noted that both of these server-side functions also accept query values, such as a specific subject, appended to the address. You could readily use something like this and it would come through correctly:

<% =mailMe("Webmaster@example.com?subject=hey out there!","Contact Me!","Send me email") %>

or

<?php echo mailMe("Webmaster@example.com?subject=hey out there!","Contact Me!","Send me email") ?>

Well, now you can do it all on the server. Add a <script> tag on the client that references your decoding javascript file with the mailMe() function in it and you're golden.
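If you have neither ASP nor PHP available, the same link could be generated ahead of time. What follows is a rough JavaScript counterpart of the server-side helpers, for instance for a Node-based build step. The name buildMailMeLink and the build-step idea are assumptions for this sketch, not part of the original scripts.

```javascript
// Hypothetical JavaScript counterpart of the server-side mailMe() helpers.
function buildMailMeLink(address, caption, title) {
  // Split "user@domain?query" into its three parts.
  const at = address.indexOf("@");
  const user = address.slice(0, at);
  let domain = address.slice(at + 1);
  let query = "";
  const q = domain.indexOf("?");
  if (q !== -1) {
    query = domain.slice(q + 1);
    domain = domain.slice(0, q);
  }

  // Replace "." with "#", then URL-encode so "#" becomes "%23".
  let call = "mailMe('" + encodeURIComponent(domain.replace(/\./g, "#"));
  if (query !== "") {
    // encodeURI keeps "=" and "&" but encodes spaces; quotes are escaped so
    // they cannot end the JavaScript string, "&" so it is valid in an attribute.
    call += "?" + encodeURI(query).replace(/'/g, "%27").replace(/&/g, "&amp;");
  }
  call += "','" + encodeURIComponent(user) + "')";

  // The same decoding call goes on both the mouse and keyboard events.
  const events =
    ' onmouseover="javascript:this.href=' + call + ';"' +
    ' onfocus="javascript:this.href=' + call + ';"';

  // With no caption, fall back to the user name (never the full address).
  return '<a href="/contact/"' + events + ' title="' + title + '">' +
    (caption || user) + "</a>";
}

console.log(buildMailMeLink("Webmaster@example.com", "Contact Me!", "Send me email"));
```

The printed anchor carries mailMe('example%23com','Webmaster') in both event handlers, matching the output shown for the ASP function.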

Can it be broken by intelligent spiders? Of course. You sit Joe Cracker down with the code for an hour or so and he'll correct his RegEx-capable parsing spider to decode even this. But by using a server-side mechanism like this one, you can change the encoding/decoding algorithm month-to-month, week-to-week, or even hour-to-hour and effectively stay ahead of even the "intelligent" spiders out there. For my part, I have used this method on my site for years (changed ever-so-slightly over that time) and have not once had a single spam message go to any of the addresses that use this mechanism (that are solely used for those aspects of the site, mind you).

It's not too late to change over. Even if you're already getting spam to your email addresses you can switch to this method (or similar) and avert the inevitable increase in spam that will occur when more and more spiders collect your addresses.

Why bother? Who cares? It's just email!

Why should you be concerned about email obfuscation and cloaking?

Because there are some really unethical people out there that literally do nothing all day long but collect email addresses using web-spiders. No joke. The method described here not only prevents your email address from being obvious simple text (which every email spider will get, regardless of where it is on the page), but it also performs levels of obfuscation and abstraction with event-driven decoding that make it usable in nearly every "real" browser in the world, while degrading gracefully for those wacko no-script people that refuse to allow simple javascripts to function. Using the scripts here you should have no reason for anyone to be able to say they couldn't contact you through your website, while also having very low likelihood of your email address ever being scalped from this code and used for spam.

I've seen dozens of 'solutions' which are, for want of a better term, garbage. Hex-encoding the string, char-encoding the URL, javascript solutions that just split it using "' + '"... well, they're all garbage. I have demonstrated how they all fail using very simple regular expressions (which have been a primary component of every basic operating system for the last 9 years). This method provides you with the potential to change your algorithms at any point, globally across your site, by changing only 2 files. Talk about fast updates. :)
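For a sense of how simple those regular expressions can be, here is a minimal sketch against two of the tricks just mentioned. The sample strings and the helper name joinSplitLiterals are made up for this illustration.

```javascript
// Undo the "' + '" trick: remove every quote-plus-quote seam so the
// adjacent string literals collapse back into one.
function joinSplitLiterals(source) {
  return source.replace(/(['"])\s*\+\s*\1/g, "");
}

const splitSample = "'web' + 'master@' + 'example.com'";
console.log(joinSplitLiterals(splitSample)); // 'webmaster@example.com'

// Hex (percent) encoding needs no regex at all: one built-in call decodes it.
const hexSample = "%77%65%62%40%65%78%61%6d%70%6c%65%2e%63%6f%6d";
console.log(decodeURIComponent(hexSample)); // web@example.com
```

Once the pieces are rejoined or decoded, the result is an ordinary address that any harvesting pattern will pick up.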

While I have no fantasy that I'll be able to stop every email harvester in the world, I'll do my best to prevent as many as possible from getting my email address. Email is a time-sink, and even if I use the best anti-spam products and services in the world, spam wastes time and bandwidth that could be better used doing, well, anything else. And don't forget - it's not just about preventing spam, it's about ensuring the ability to receive legitimate contacts. That's at least equally important.

See Shawn's related article: Examples of Flawed Methods for further information.
