Taming the Weeds: Keeping Your Website User Data Clean

Whether you view your Website’s user data through Google Analytics or another program, keeping that data clean and free of spam is increasingly challenging. Website referral spam is a growing problem.

Sites like semalt.com, free-social-buttons.com, and darodar.com (the list of spammy referral sites keeps getting longer) send out bots that impersonate a visit to your Website. Your Analytics program records the visit, just as it would a referral from Google search or a click from a Facebook post, and this all shows up in your traffic data.

Spammers do this because it puts their Website in front of anyone reading the Analytics reports, and the resulting links can improve the spammer’s ranking in search engines that use link-counting algorithms. All this can be done on a massive scale at relatively low cost, which explains its emergence.

A Pesky Problem and Risk Factors

Referral spam does much more than just pollute data. The bot traffic can weigh on a server and adversely affect a Website’s load times and performance, which can lead to higher bounce rates and, in turn, may hurt search engine rankings. These junk bot visits use valuable server resources and could also be probing for plugin, server, or other vulnerabilities to harm or hack your site.

A referral spam address can also pose a threat, as the site behind the URL may host malware designed to steal valuable information.


Keeping the Lid on Spam

Getting rid of referral spam, like other forms of spam on the Internet, can be a bit like playing whack-a-mole. Knock one out and others keep popping up. This pseudo traffic may make it appear that your Website traffic is surging, and it can make up the lion’s share of traffic if left unchecked.

But there are ways to keep referrer spam from polluting your Website user data. The good news is that, unlike link spam, referral spam is largely a nuisance that can be filtered out.

Setting Up Regular Expressions in Google Analytics

Under Filters in Google Analytics, you can set up regular expressions to exclude data from known referral spam sites. While creating regular expressions can require programming expertise, Ben Travis’ post ‘Removing Referral Spam from Google Analytics’ is an excellent resource for setting up these filters.

It’s important to know that regular expressions in these filters are limited to 255 characters, so you’ll need to create additional filters if the character limit is reached. There are also other means of setting up referral exclusions through Analytics or through your own Website.
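As a rough sketch of how a long list of spam domains could be split across several filters, the Python below packs domains into pipe-separated patterns that each stay within the 255-character limit. The function name build_filter_patterns and the max_length parameter are made up for this illustration and are not part of Google Analytics; dots are escaped so they match literally.

```python
MAX_PATTERN_LENGTH = 255  # per-filter character limit mentioned in the post


def build_filter_patterns(domains, max_length=MAX_PATTERN_LENGTH):
    """Pack escaped domains into '|'-joined patterns no longer than max_length."""
    patterns = []
    current = ""
    for domain in domains:
        # Escape dots so 'semalt.com' matches only a literal dot.
        escaped = domain.replace(".", r"\.")
        if len(escaped) > max_length:
            raise ValueError(f"{domain!r} alone exceeds the pattern limit")
        candidate = escaped if not current else current + "|" + escaped
        if len(candidate) <= max_length:
            current = candidate
        else:
            # The current pattern is full; start a new one for another filter.
            patterns.append(current)
            current = escaped
    if current:
        patterns.append(current)
    return patterns


spam_domains = ["semalt.com", "free-social-buttons.com", "darodar.com"]
for number, pattern in enumerate(build_filter_patterns(spam_domains), start=1):
    print(f"Filter {number}: {pattern}")
```

With only three domains everything fits in one pattern; as the list grows, each additional pattern returned would go into its own filter.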

More Analytics Filtering and Other Options

Another, perhaps simpler, way to set up filtering in Analytics is under Property→Tracking Info→Referral Exclusion List→+Add Referral Exclusion, where you can add individual sites to exclude as referral sources in your Analytics data.


Certain referrals can also be blocked in a site’s .htaccess file. You can add code such as the following to .htaccess to block the worst offenders.

# Return 403 Forbidden when the Referer header matches a known spam domain
RewriteEngine On
RewriteCond %{HTTP_REFERER} semalt\.com [NC,OR]
RewriteCond %{HTTP_REFERER} free-social-buttons\.com [NC,OR]
RewriteCond %{HTTP_REFERER} darodar\.com [NC]
RewriteRule .* - [F]

This blocks the request before it has a chance to register as a referral. But be aware that some spammers never actually visit your Website; they only impersonate a visit by sending fake data straight to Analytics, so modifying .htaccess won’t help against them.
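To get a feel for which referrers rules like these would catch, the matching can be approximated in Python. This is only a sketch under stated assumptions: the helper is_spam_referrer and the sample URLs are invented for illustration, the patterns escape their dots so they match literally, and the search is case-insensitive and unanchored, in the spirit of a RewriteCond with the NC flag.

```python
import re

# Spam domains named in the post; dots are escaped so they match literally.
SPAM_REFERRER_PATTERNS = [
    r"semalt\.com",
    r"free-social-buttons\.com",
    r"darodar\.com",
]


def is_spam_referrer(referrer):
    """Return True if any pattern appears anywhere in the referrer, ignoring case."""
    return any(
        re.search(pattern, referrer, re.IGNORECASE)
        for pattern in SPAM_REFERRER_PATTERNS
    )


# Made-up sample referrers, including a request with no referrer at all.
samples = [
    "http://semalt.com/some-page",
    "http://WWW.DARODAR.COM/",
    "https://www.google.com/search?q=analytics",
    "",
]
for referrer in samples:
    action = "403 Forbidden" if is_spam_referrer(referrer) else "allowed"
    print(f"{referrer or '(no referrer)'}: {action}")
```

Running a list of referrers from your own server logs through a check like this is one way to preview the effect of a blocklist before editing .htaccess.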

Be Cautious With Traffic Reports

If your SEO provider is reporting a surge in your Website’s traffic, this may seem like a huge positive, but it’s important to understand where traffic is coming from and measure the metrics that matter. Referral spam is junk traffic.

Website visits are all about quality over quantity, so be sure the reporting and data you’re receiving show the complete referral sources. By taking steps to block and exclude referral spam, you can filter out junk data and understand the true quality and quantity of visitors your site is receiving.
