Protecting your email address: how to prevent spam

As a website operator, you’re advised to include your email address on your website so that visitors can contact you easily. The problem is that a published email address can quickly be inundated with spam. This isn’t just annoying: it also puts you at risk from phishing emails and from malicious programs that cyber criminals hide in attachments. Necessity being the mother of invention, website operators try all kinds of tricks to stop spambots from gaining access to their email addresses. We compare the most popular methods and explain their advantages and disadvantages.

Email harvesting: how spambots stalk their prey

Email harvesting is the automated collection of email addresses for unsolicited advertising, phishing attacks, or spreading malicious software. For this purpose, specialized programs (known as ‘email harvesters’) search websites, mailing lists, internet forums, or social media platforms for email addresses. The characteristic syntax shared by all email addresses gives the sought-after contact information away. Simple search patterns scan a website’s source text for @ signs. This sign is rarely found in natural text, but is used in email addresses to separate the username from the domain. Transcribing the address provides little protection, because more refined spambots also search for popular alternative spellings such as [at], [AT], (at), (AT):

User@domain.com

User[at]domain.com

If the @ sign or its equivalent is followed by two character strings separated by a dot, this is a clear indication for the harvester that it has found an email address. Even transcribing the dot in front of the top-level domain offers comparatively little protection and makes the address harder to read:

User[AT]domain[DOT]com
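To check whether a transcription would still be caught, the search pattern described above can be approximated with a regular expression. The following is only a rough sketch: the names AT, DOT, ADDRESS, and findAddresses are made up for this illustration, and real harvesters are assumed to use broader rules.

```javascript
// Illustrative pattern only; not taken from any real harvester.
// AT matches "@" as well as [at], (at), [AT], (AT).
// DOT matches "." as well as [dot], (dot), [DOT], (DOT).
const AT = String.raw`(?:@|\s*[\[(]at[\])]\s*)`;
const DOT = String.raw`(?:\.|\s*[\[(]dot[\])]\s*)`;

// Username, separator, then at least two domain labels joined by a dot.
const ADDRESS = new RegExp(
  `[a-z0-9._%+-]+${AT}[a-z0-9-]+(?:${DOT}[a-z0-9-]+)+`,
  "gi"
);

// Returns every substring of the text that looks like an email address.
function findAddresses(text) {
  return text.match(ADDRESS) || [];
}

console.log(findAddresses("Write to User@domain.com or User[AT]domain[DOT]com"));
```

If a transcription shows up in the result, a pattern-based harvester can be expected to find it as well.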

Even more revealing than the @ sign, is the HTML email link 'mailto:user@domain.com'. This allows website visitors to open their preferred email program with a simple click. The recipient address is automatically copied to the corresponding field. This is practical, but still doesn’t stop the spambots from realizing this is an email address. Website operators are therefore advised not to use traditional patterns when it comes to providing contact options. At the same time, human site visitors should still be able to read the address so they can easily contact you if they need to.

Classic representation of the email address without protection

To protect an email address from being read out automatically by email harvesters, it helps to look at how it is usually integrated into a web page. A simple, easily accessible email address can be inserted into any HTML page using the following code example:

<p>If you have any questions or suggestions, please write an email to: 
  <a href="mailto:user@domain.com">user@domain.com</a>.
</p>

When a visitor accesses a web page with this code, the web browser displays the following information including a clickable mailto link:

If you have any questions or suggestions, please write an email to: user@domain.com.

From the user’s point of view, this is the ideal way to present an email address online. To keep the display user-friendly, the most popular protection method is to make the address unrecognizable in the source text without changing how it looks in the browser. Alternatively, it is possible to separate the email address from the actual website and forward visitors to the mailto link with a server-side redirect. Rewriting the email address in the browser view, by contrast, is becoming less and less common: it is less convenient for the user and not much more effective at preventing spam.

Effective tricks that can be used to protect you against spam rely on substitutions, masks, or encryptions in the source text, which hinder the spambot and not the user.

Substituting the email address

Protection strategies based on substitution involve removing the entire email address from the source text and replacing it with a graphical representation or a redirect to the mailto link.

Integrate email address as a graphic

If an email address is implemented as a graphic, it can still be read by the human eye, but texts written as graphics are hard for email harvesters to recognize. Occasionally there are spambots that are able to convert image elements into text elements using OCR (Optical Character Recognition), but this isn’t common. Including corresponding contact information as a graphic therefore offers a comparatively high protection against spam. However, website operators must realize that these can limit the user-friendliness of their website. The following HTML code shows how an email address can be integrated into a website as a graphic file:

<img src="Path/graphicfile.png" width="120" height="20" alt="If you have any questions or suggestions, please write an email to: user@domain.com">

Website visitors then see the address as an image. It is legible for most people, but the text can neither be copied nor linked to a mailto link. While most users merely find it inconvenient to type an email address manually, text in the form of a graphic is often not accessible at all for users with visual impairments. Therefore, it makes sense to include a description of the graphic as alt text. Alt text can be read out by screen readers, but the downside is that spambots can read it as well, so this method alone is not recommended as a preventative measure against spam.

HTML email link via redirect

In order to effectively protect email addresses from harvesters, it’s a good idea to separate them from the website. A script is generally used, which redirects human users to the mailto link after the first click. This opens the user’s email program and displays the address. For spambots that scan the source code of a website, this link will look like a file link and therefore prevents automatic reading. This protection mechanism can, for example, be implemented as a link to a PHP file that contains the redirect:

<p> If you have any questions or suggestions, please write us an
  <a href="redirect-mailto.php">email</a>.
</p>

The content of the redirect-mailto.php file is a script that redirects to the actual mailto link:

<?php
header("Location: mailto:user@domain.com"); 
?>

Since PHP is processed on the server side, the email address never appears in the source code of the page, so spambots that only scan the source code cannot read it. A bot that follows the link, however, does receive the address in the redirect. If the email address needs to be displayed on the website, it’s recommended that you combine this method with graphically integrating the email address.

The disadvantage of this spam prevention solution is that users need a handler for mailto: to get to the email address. In practice, this is usually an email program such as Outlook or Thunderbird. However, web mailers can also be entered as handlers in new browsers.

Masking the email address

If you don’t want to completely replace an email address with a graphic or a redirect, there are alternative strategies. They make it possible to mask an email address by encoding it, by inserting additional elements, or by only assembling it in the browser using JavaScript. Simple encoding can be implemented with HTML entities, for example, as well as with URL or HEX encoding. Simple masking strategies rely on HTML comments, HTML elements, and CSS. Masking an email address by composing it dynamically is a bit more complex.

Unlike simple transcription, these methods only manipulate the address in the source code and do not affect how it is displayed in the browser.

Masking by character encoding

The character encodings commonly used to mask email addresses in the source code are based on HTML entities, HEX code, or the percent notation of URL encoding. These notations were originally developed to represent special characters using standard characters. They are suitable for masking email addresses because the browser automatically translates the character references back for display. To mask the characters of the email address user@domain.com using HTML entities, they are first written in the alternative notation:

&commat; = @

&period; = . (dot)

This results in the following source code:

<p> If you have any questions or suggestions, please write an email to: 
  <a href="mailto:user&commat;domain&period;com"> user&commat;domain&period;com</a>
</p>

Since HTML entities have only been defined for special characters, this character encoding can mask neither the entire email address nor the significant text string mailto:. Alternatively, a representation using HEX encoding is possible. Here, the Unicode character number is used according to the following basic schema:

&#characternumber;

A hexadecimal character number is marked with a preceding lowercase 'x'. Thus the letter 'm' can be written as '&#x6d;' in hexadecimal or '&#109;' in decimal. The email address user@domain.com including the mailto link then looks like this:

<p>If you have any questions or suggestions, please write an 
<a href="&#x6d;&#x61;&#x69;&#x6c;&#x74;&#x6f;&#x3a;&#x75;&#x73;&#x65;&#x72;&#x40;&#x64;&#x6f;&#x6d;&#x61;&#x69;&#x6e;&#x2e;&#x63;&#x6f;&#x6d;">email</a>.
</p>

The corresponding reference characters for translating an email address can be easily found from lists available online. A clear overview is provided on htmlarrows.com. If you want to encode the complete email address, we recommend encoding programs that are offered free of charge as web applications on numerous websites.
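What such an encoding program does can be sketched in a few lines of JavaScript. The function name toHexEntities is an assumption made for this example and is not part of any particular tool; the sketch simply turns every character into its hexadecimal character reference.

```javascript
// Hypothetical helper: encodes each character of a string as a
// hexadecimal character reference of the form &#x...;
function toHexEntities(text) {
  return Array.from(text)
    .map(function (ch) {
      return "&#x" + ch.codePointAt(0).toString(16) + ";";
    })
    .join("");
}

// The result can be pasted into the href attribute of a link.
console.log(toHexEntities("mailto:user@domain.com"));
```

The browser translates the references back when it parses the page, so the link behaves like an unmasked mailto link for the visitor.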

Another way to protect email addresses from spam is URL encoding (also known as percent-encoding). This method was originally developed to represent special characters in a URL in a form that the browser can interpret. Each encoded character is written as a three-character combination: a percent sign followed by the two-digit hexadecimal ASCII code of the character. The following example shows an @ sign masked by URL encoding:

<p>If you have any questions or suggestions, please write an
  <a href="mailto:user%40domain.com">email</a>.
</p>

In principle, masking an email address by character encoding is quick and easy. Today, however, the protection is comparatively low, since most spambots are programmed to decode this simple form of masking.
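The URL-encoded form can also be generated with a short script. The sketch below uses the built-in encodeURIComponent function, which encodes the @ sign but leaves letters and dots untouched, and a hypothetical helper named percentEncodeAll that encodes every character. It is assumed here that the mailto: prefix itself stays unencoded so that the browser still recognizes the link type.

```javascript
// Built-in function: only characters such as "@" are encoded.
const partly = encodeURIComponent("user@domain.com");
console.log("mailto:" + partly); // mailto:user%40domain.com

// Hypothetical helper: encodes every byte of the UTF-8 representation
// as a percent sign followed by two hexadecimal digits.
function percentEncodeAll(text) {
  const bytes = new TextEncoder().encode(text);
  return Array.from(bytes, function (b) {
    return "%" + b.toString(16).padStart(2, "0").toUpperCase();
  }).join("");
}

console.log("mailto:" + percentEncodeAll("user@domain.com"));
```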

Masking by supplementing

In principle, email addresses can be hidden from spambots by inserting additional characters into them. The hope is that programs will no longer recognize the address as a whole and therefore cannot read it out automatically. HTML comments provide a simple way to do this.

<!-- Comment -->

Ideally, these include just the characters that are normally used in email addresses.

<!-- abc@def -->

<!-- @abc.com -->

If comments like these are added into the email address, spambots (who scan the website) will stumble across the following code:

<p>If you have any questions or suggestions, please write an email to:
us<!-- abc@def -->er@domai<!-- @abc.com -->n.com. 
</p>

In the browser view, however, the HTML comments are invisible.

Alternatively, it is possible to insert any characters without comments, as long as they are hidden in the browser view using CSS. In the following example, the email address is interrupted by a span element. The content between the start and end tags isn’t displayed because the display property is set to the value none.

<style type="text/css">
span.spamprotection {display:none;}
</style>

<p>If you have any questions or suggestions, please write an email to:
user<span class="spamprotection">CHARACTER SEQUENCE</span>@domain.com. 
</p>

While a human user sees the correct email address in the web browser, a spambot is expected to read out the address including the inserted text from the span element. This gives website operators the option of using the email address userCHARACTERSEQUENCE@domain.com as a so-called honeypot in order to identify sender addresses and block them.

A disadvantage of masking by supplementing is that with this method the email address can’t be connected with an HTML email link. In this case, users must manually copy the address into their email program.

Reversing a string

CSS can be used not only to hide additional characters in the source code, but also to reverse the string. This enables website operators to store email addresses in the wrong order in the source code in order to deceive spambots.

<style type="text/css">
span.ltrText {unicode-bidi: bidi-override; direction: rtl}
</style>
<p>If you have any questions or suggestions, please write an email to:
<span class="ltrText"> moc.niamod@resu</span>.
</p>

While spambots find the character string moc.niamod@resu in the source code, the CSS property unicode-bidi with the value bidi-override ensures that all characters within the correspondingly marked span elements are rendered by the browser in the order specified by the direction property, in this case from right to left (rtl).

With this masking, the email address is stored in reverse order in the source code but displayed normally in the browser. More advanced spambots, however, are not fooled by this trick.
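The reversed string for the source code doesn’t have to be typed by hand. A minimal sketch, with reverseForSource as a made-up function name:

```javascript
// Hypothetical helper: returns the address in reverse character order,
// ready to be placed inside the span element with direction: rtl.
function reverseForSource(address) {
  return Array.from(address).reverse().join("");
}

console.log(reverseForSource("user@domain.com")); // moc.niamod@resu
```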

Dynamic composition with JavaScript

JavaScript offers another way to make sure the correct email address is displayed in the browser. The address is divided into several parts that are dynamically composed by the browser when the website is called up.

<script type="text/javascript">
var part1 = "user";
var part2 = Math.pow(2,6);
var part3 = String.fromCharCode(part2);
var part4 = "domain.com";
var part5 = part1 + String.fromCharCode(part2) + part4;
document.write("If you have any questions or suggestions, please write an email to: " +
   "<a href=\"" + "mai" + "lto" + ":" + part5 + "\">" + part1 + part3 + part4 + "</a>.");
</script>

In lines 2 to 6, the individual sections of the email address are defined. The @ sign is defined in two steps. The Math.pow(2,6) function in part2 determines the number of the character in the ASCII-compatible character sets (2^6 = 64). This is converted to the corresponding character in part3 by the function String.fromCharCode(part2). The output of the parts defined in part1 to part5 is performed in lines 7 and 8 by the document.write() function. The email address becomes available only after client-side execution of the script. It’s also possible to have a variant where the script is only started once the user has clicked.

Anti-spam methods that use scripts for dynamic composition are based on the assumption that email harvesters don’t fully execute JavaScript. As long as this holds, the level of protection can be assumed to be high. The disadvantage of this method is that users who have deactivated JavaScript in their browser won’t see the contact information. This doesn’t affect many users today, though.
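The click-triggered variant mentioned above could look roughly like the following sketch. The function names composeAddress and revealOnClick are assumptions for this example, and the link element passed in is expected to exist in the page.

```javascript
// Hypothetical helper: assembles the address from its parts at call time.
function composeAddress(user, domain) {
  var at = String.fromCharCode(Math.pow(2, 6)); // character number 64 is "@"
  return user + at + domain;
}

// Hypothetical wiring: the address is only written into the link
// once the visitor clicks on it for the first time.
function revealOnClick(link, user, domain) {
  link.addEventListener("click", function () {
    var address = composeAddress(user, domain);
    link.href = "mai" + "lto" + ":" + address;
    link.textContent = address;
  }, { once: true });
}
```

In a page, this might be called as revealOnClick(document.getElementById("contact"), "user", "domain.com"), where the id contact is a placeholder. Until the click, neither the address nor the mailto string appears in the document.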

Encrypting the email address

With JavaScript, email addresses can not only be assembled from individual parts, but the scripting language also enables you to encrypt the email address to protect it from spam. A common method for email encryption is ROT13, which can be implemented with just a few lines of JavaScript.

<script type="text/javascript">
function decode(a) {
  return a.replace(/[a-zA-Z]/g, function(c){
    return String.fromCharCode((c <= "Z" ? 90 : 122) >= (c = c.charCodeAt(0) + 13) 
                               ? c : c - 26);
  })
}; 
function openMailer(element) {
var y = decode("znvygb:hfre@qbznva.pbz");
element.setAttribute("href", y);
element.setAttribute("onclick", "");
element.firstChild.nodeValue = "Open email software";
};
</script>
<a id="email" href=" " onclick='openMailer(this);'>Email: please click</a>

Line 9 of the sample code contains the encrypted version of the email address user@domain.com including the mailto text string (znvygb:hfre@qbznva.pbz); the function that decrypts it is defined in lines 2 to 7. The function in lines 8 to 13 writes the decrypted mailto link into the link’s href attribute, so that the user’s preferred email program opens with the address in the recipient field.

The script is started by clicking on the link with the anchor text 'Email: please click' (line 15). After being clicked on, the link displays the text 'Open email software' (line 12).

Just like the JavaScript-based composition of the email address, the encryption method is based on the assumption that spambots can’t interpret the entire client-side script language or can only partly interpret it. Theoretically, the encrypted email address could be used as a honeypot. In this case the domain should not be encrypted.
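The encrypted string for your own address can be produced with a small helper. Because ROT13 shifts every letter by 13 of 26 positions, applying it twice restores the original text, so the same routine serves for encoding and decoding. The function name rot13 is chosen for this sketch only.

```javascript
// Shifts every letter by 13 positions within its own case;
// digits and characters such as "@", ":" and "." stay unchanged.
function rot13(text) {
  return text.replace(/[a-zA-Z]/g, function (c) {
    var base = c <= "Z" ? 65 : 97; // code of "A" or "a"
    return String.fromCharCode(((c.charCodeAt(0) - base + 13) % 26) + base);
  });
}

console.log(rot13("mailto:user@domain.com")); // znvygb:hfre@qbznva.pbz
```

The output is the value that is passed to a decode() function like the one in the sample code above.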

CAPTCHAs

CAPTCHAs offer another way of protecting an email address from spam. Encrypted email addresses are only displayed in plain text once a check has established that the user is human. These checks come in different forms, such as asking the user to type a letter or number combination. Easy calculations, combination tasks, and puzzles are also options for CAPTCHAs. A free CAPTCHA service is provided by Google with reCAPTCHA.

CAPTCHAs offer a comparatively high level of protection against spam since email addresses are either not displayed at all or only in the encrypted form in the source code. CAPTCHAs can also be easily integrated into a website’s design. However, the additional effort required to get to an email address does have a negative impact on a site’s user friendliness since it hinders the user from easily accessing the contact information.

Alternative: feedback form?

Instead of posting an email address on their website many website operators provide a feedback form that allows visitors to enter their messages as well as leave their name and contact address. These are redirected in the background to a stored recipient address. Integrating it into the website can be done using server-side programming languages such as PHP. In order to prevent spambots from automatically filling out these forms and sending them, they are usually secured by CAPTCHAs.

Conclusion

The strategy you should use to protect your email address depends primarily on your presentation requirements and on the technical possibilities available. Redirecting to the mailto link with the help of PHP or a similar server-side programming language is a good protection method. However, this must be supported by your hosting provider. If you decide to display your email address on your website, it’s recommended to display it as a graphic.

Transcribing the address and encoding it using HTML entities, HEX code, or URL encoding offer less protection in comparison. However, these encodings can serve as a preliminary step for subsequent encryption. Masking or encrypting via JavaScript provides reliable protection against spambots, and you could also consider presenting your email address as a graphic. This is particularly worthwhile when the address is not written out on the website itself, but only passed to the mailto handler.
