What Is Analytics Referrer Spam, and Why Should You Block It?


Website owners use Google Analytics to track their visitors. It provides many tools and features for analyzing the activity on a website.

Most website owners use Google Analytics to check daily traffic data and to plan a new ad campaign or write new content based on those statistics. Referrer spam can break these strategies. The records that these referrals leave in your reports do not come from genuine visits, and they carry worthless information that distorts metrics such as Bounce Rate and Avg. Session Duration.

If you want to get the best out of this capable tool and draw reliable insights from it, it is essential to keep these unwanted visits out of your reports as soon as you spot them.

Meaning Of Referrer Spam

Referrer spam happens when your site gets fake referral traffic from spam bots and this fake activity is recorded by your Google Analytics.

Genuine Referrer and Referrer Spam

Genuine referrers are sites that link to your page from their own pages, for example an article on Best Agriculture Website Templates that collects useful links for its visitors. In other words, the referrer normally indicates where the visitor is coming from.

Referrer spam consists of referrals that only show up in Google Analytics reports. It looks as if you have had a lot of visitors, but in reality none of these visits were real people visiting your website. Referrer spam is a particular problem for smaller websites, because you need to check each referral visit to confirm whether it is genuine or not.

“A referrer is a simple HTTP header that’s passed along when a browser goes from one page to another page, normally used to indicate where a user’s coming from. But users can change it, and some people will set referrer at pages they want to promote and visit tons of people around the web — people see it and say ‘Oh, I should check it out’. It’s not necessarily a link… there are some people who try to drive traffic by visiting a ton of websites with an automated script and setting the referrer to be the URL they want to promote… there’s no ‘authentication’… You can’t automatically assume that it was the owner of the URL if you see something showing up in your dashboard. Somebody is trying to do some hijinx.”

Matt Cutts, Head of Google Webspam Team

Referrer Spam: Can’t I Just Ignore It?

Referrer spam has several possible negative effects and therefore shouldn’t be ignored.

First, if your site publishes its referrers, it ends up linking out to the URLs that the spammers are promoting, and it might be downgraded in the SERPs or even removed. You’ll be inadvertently creating a lot of links to websites that promote products and services from the shadier side of the web, like Viagra pills, which can make your site look spammy. Sites that use referrer spam to promote themselves are often punished by Google, and chances are Google will re-evaluate the sites linking to them as well. If you do nothing else, make sure you’re not displaying a list of referrers on your website.

Second, as mentioned at the start, referrer spam messes up your website analytics data, making it difficult to know how well your website is performing. This is especially true when you get a high volume of referral spam: it can distort your conversion rates and skew your reporting if you don’t realise you have referral spam. Fortunately, you can make some tweaks to Google Analytics to filter the referrer spam out of your results.

Why are they hitting my site?

You may wonder what the spammers gain from this. There are a few reasons for hitting your site: the main one is traffic, the second is lead generation, and the third is affiliate traffic to shopping sites such as AliExpress.com and eBay.com.

Two Types Of Referrer Spam

In the context of Google Analytics, referral spam comes in two fundamental types: spammy web crawlers and ghost referral visits.

Ghost Referrals

For example: hulfingtonpost.com / social-buttons.org / darodar.com

Ghost referral spam is the most common spam in our analytics, and it’s called Ghost because these referrers actually NEVER VISITED YOUR SITE. Using some software magic, they post fake pageviews to Google’s tracking service using a random series of tracking IDs. When they pick a series that includes your tracking ID, Google records a referral visit from their source in your reports.

Ghost Referrer Spam targets arbitrary Google Analytics tracking IDs. Its main purpose is to get visits from website owners who become curious about the referral in their reports and try to visit the site, so don’t go there.

Google-Analytics-Ghost-Referrer-Spam

For now, the best way to stop them is with filters in Google Analytics. A commonly proposed solution is to use the .htaccess file, but I think the .htaccess file cannot stop Referrer Spam completely: it is a configuration file that controls access to your website, and ghost referrals never access your website, so it’s pointless to try to stop them from there.

People often get confused and think they successfully blocked Ghost Referrer Spam from the .htaccess file because this kind of spam usually shows up for only a few days and then disappears, so it is just a coincidence.

For more information, see Google’s Measurement Protocol documentation.

Crawler Referrer Spam

A web crawler is an Internet bot that browses your site, mostly for the purpose of web indexing. Google’s bots, for example, find pages and index them so that you can find them when you search. These are good crawlers.

Crawler Referrer Spam also browses sites, but with a different purpose, such as getting traffic to the spammer’s site. These crawlers usually ignore rules such as robots.txt, which is supposed to stop spiders from “crawling” specific pages.

The difference from Ghosts is that Crawler Referrer Spam really does access your site, so you can stop it from the .htaccess file, which is the recommended approach, although filtering in Google Analytics will also work.


How to detect Referrer Spam?

The easiest way is to look for unusual and suspicious referrals. Their websites won’t have any genuine connection to yours. However, you need to avoid visiting them, so to make sure, you can use the information they leave in your reports.

By Checking Referral Hostname

This is currently the easiest way to spot Ghost Referrer Spam. This sort of spam will have this field either as not set or as a fake hostname. To discover the hostname they are using:

  • Go to Reporting in the top tabs and select a wide timeframe, the bigger the better
  • In the lateral bar select Acquisition
  • Expand All Traffic and select All Channels
  • Click on Secondary Dimension, type Hostname and select it

How to detect Referrer Spam

This will help you identify Ghost Referrer Spam (red), because it uses invalid hostnames since the spammers don’t know who they are targeting. Crawler Referrer Spam (orange), on the other hand, uses valid hostnames.

how-to-detect-Referrer-Spam-by-hostname

Note: Some valid hostnames besides your own are webcache.googleusercontent.com and translate.googleusercontent.com, which are used by Google.
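The hostname check can also be run over an exported report. The following is only a minimal sketch: the `VALID_HOSTNAMES` set, the function names and the sample rows are assumptions made for illustration, not part of Google Analytics.

```python
# Hostnames assumed valid for this example; replace them with your own list.
VALID_HOSTNAMES = {
    "webresourcesfree.com",
    "webcache.googleusercontent.com",
    "translate.googleusercontent.com",
}


def is_valid_hostname(hostname):
    """Return True if hostname is a valid host or a subdomain of one."""
    if not hostname or hostname == "(not set)":
        return False
    host = hostname.lower().rstrip(".")
    return any(host == valid or host.endswith("." + valid) for valid in VALID_HOSTNAMES)


def split_referrals(rows):
    """Split (source, hostname) rows into likely ghosts and rows needing a closer look."""
    ghosts, others = [], []
    for source, hostname in rows:
        (others if is_valid_hostname(hostname) else ghosts).append(source)
    return ghosts, others


# Hypothetical rows, as they might appear in a Source + Hostname export.
rows = [
    ("darodar.com", "(not set)"),
    ("hulfingtonpost.com", "apple.com"),
    ("semalt.com", "www.webresourcesfree.com"),
    ("google.com", "webresourcesfree.com"),
]
ghosts, others = split_referrals(rows)
print(ghosts)  # ['darodar.com', 'hulfingtonpost.com']
print(others)  # ['semalt.com', 'google.com']
```

Note that semalt.com lands in the second list: a crawler uses a valid hostname, so the hostname alone cannot expose it.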

By Checking metric numbers

The spam usually leaves either very high or very low numbers.

  • Go to the reporting section in your Google Analytics.
  • Select Acquisition in the lateral bar.

Checking metric numbers

 

  • Expand All Traffic
  • Select Referrals. A table like this will show

Once you have this report, look for numbers like 0.00% or 100.00% in New Sessions or Bounce Rate, or an Avg. Session Duration of 0 or 1 second.

how-to-detect-Referrer-Spam-by-hostname

Although this is the normal behavior of referrer spam, some spammers are beginning to change it so that they appear less suspicious.
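If you export the Referrals table, the same metric check can be sketched in a few lines. The field names and the rule of requiring two extreme metrics at once are assumptions chosen for this example, not a Google Analytics feature.

```python
def looks_like_spam(row):
    """Flag a referral row whose metrics sit at the extremes described above."""
    extreme_bounce = row["bounce_rate"] in (0.0, 100.0)
    extreme_new = row["new_sessions_pct"] in (0.0, 100.0)
    tiny_duration = row["avg_session_seconds"] <= 1
    # Assumption: require at least two signals so one odd metric is not enough.
    return sum([extreme_bounce, extreme_new, tiny_duration]) >= 2


# Hypothetical rows from an exported Referrals report.
rows = [
    {"source": "ilovevitaly.com", "new_sessions_pct": 100.0,
     "bounce_rate": 100.0, "avg_session_seconds": 0},
    {"source": "example.org", "new_sessions_pct": 62.5,
     "bounce_rate": 48.1, "avg_session_seconds": 95},
]
suspects = [row["source"] for row in rows if looks_like_spam(row)]
print(suspects)  # ['ilovevitaly.com']
```

Treat the result as a list of candidates to review by hand, since a genuine referrer with very few sessions can show the same extreme values.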

By Checking Target page

Another clue that a referrer is actually spam is that the targeted page will be either the home page, identified by a slash, or a fake page.

Referrer-Spam-By-Checking-Target-page

If you are still unsure and you are suspicious of a referral, check this list, which I update frequently.

Identify types of Referrer Spam

Although you can stop all referrer spam using filters, sometimes it is good to identify the type so that you can use one of the alternative methods.

The main difference is the hostname they use. Ghosts use a fake hostname because they don’t really know which site they are targeting; Crawlers, on the other hand, use your domain name.

To confirm which type you are dealing with, check your server’s access logs. Crawlers leave entries there because they really visit your site, while Ghosts leave none.

This is why Ghost Referrer Spam (most of the spam) can’t be blocked from the “.htaccess” file or javascript or any other method outside of Google Analytics.
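Checking the access logs for a suspicious referrer can be sketched as follows. This assumes the common Apache "combined" log format; the domain list and the sample lines are invented for illustration.

```python
import re

# Matches the request, status, size, referrer and user agent of a combined-format line.
COMBINED = re.compile(
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

CRAWLER_DOMAINS = ("semalt.com", "buttons-for-website.com")


def crawler_hits(lines, domains=CRAWLER_DOMAINS):
    """Count log lines whose Referer header mentions one of the given domains."""
    counts = {domain: 0 for domain in domains}
    for line in lines:
        match = COMBINED.search(line)
        if not match:
            continue
        referer = match.group("referer").lower()
        for domain in domains:
            if domain in referer:
                counts[domain] += 1
    return counts


# Hypothetical log lines.
sample = [
    '203.0.113.5 - - [01/May/2015:10:00:00 +0000] "GET / HTTP/1.1" 200 5120 '
    '"http://semalt.semalt.com/crawler.php" "Mozilla/5.0"',
    '198.51.100.7 - - [01/May/2015:10:01:00 +0000] "GET /about HTTP/1.1" 200 2048 '
    '"https://www.google.com/" "Mozilla/5.0"',
]
hits = crawler_hits(sample)
print(hits)  # {'semalt.com': 1, 'buttons-for-website.com': 0}
```

A domain that shows up in Google Analytics but never appears in the logs is a ghost; a domain with log entries is a crawler and can be blocked on the server.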

To demonstrate this, I took a segment of a Google Analytics Report with all Referrer Spam (crawler and ghost) that hit me last month.

get-Referrer-Spam-List


Referrer Spam List  (updated 01-May-15)

I update the list often so you can keep it as a reference. Crawler Referrer Spam is marked with the word Crawler; these are the only ones that can be blocked from the .htaccess file.

The rest are Ghost Referrer Spam and they should be excluded with filters in Google Analytics.

Text file: Referrer Spam List.txt

Most recent Referrer Spam

googlsucks.com
best-seo-solution.com Crawler
best-seo-offer.com Crawler
simple-share-buttons.com
social-buttons.com
s.click.aliexpress.com
humanorightswatch.org
o-o-6-o-o.com
bestwebsitesawards.com
resellerclub scam
darodar.com
hulfingtonpost.com
ilovevitaly.com
buttons-for-website.com Crawler
buttons-for-your-website.com Crawler
blackhatworth.com
semalt.semalt.com Crawler
semalt.com Crawler
forum20.smailik.org
4webmasters.org
torture.ml
amanda-porn.ga
generalporn.org
depositfiles-porn.ga
youporn-forum.ga
rapidgator-porn.ga
meendo-free-traffic.ga
buy-cheap-online.info
www.Get-Free-Traffic-Now.com
addons.mozilla.org

Other:

ranksonic.info
savetubevideo.info
see-your-website-here.com
ranksonic.info
Iskalko.ru
BlackHatWorth.com
lomb.co
lombia.co
econom.co
cenoval.ru
7makemoneyonline.com
priceg.com
kambasoft.com
lumb.co

How to stop Referrer Spam

There are a couple of ways to stop Referrer Spam. There is no best or worst one; it all depends on the type of Referrer Spam and your needs.

You can only stop Ghost Referrer Spam with filters in Google Analytics. For Crawler Referrer Spam, blocking it from the .htaccess file is recommended, but Google Analytics filters will also work.

Using filters in Google Analytics

This method keeps Referrer Spam, whether Ghost or Crawler, from showing in your Google Analytics. You can add a filter for any of the referral spam in the above list.
  • Go to your Google Analytics account and select the Admin tab.

Google-Analytic-Admin

  • Under the View column, select Filters
  • Click on New Filter
  • Enter a meaningful name for the filter
  • Select Filter Type Custom and choose Exclude. In Filter Field, find and select Campaign Source. Put the Referrer Spam name in the Filter Pattern text box.

stop Referrer Spam

Use the filter verification tool to check that the filter works as intended. It will show you the data before and after applying the filter.

filter verification tool

This tool takes a sample of 7 days of data, so if the spam did not appear in those days the verification won’t work. The filter will still work anyway.

You can find more information about filter verification and how to use it here: Save time and protect your data, use filter verification

In this example, the filter will take care of all subdomains of simple-share-buttons.com, as you can see in the next image.

GA-simple-share-buttons.com-filter-verification

  • After you set everything Save it. You can repeat this process for all the other spam.

It’s important to consider that it may take up to 24 hours before filter effects become visible in your data. So don’t worry if you still see them after you apply it, they will disappear eventually.
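Instead of one filter per domain, several spam sources can share a single Filter Pattern separated by |. The sketch below builds such patterns from a list. The 255-character limit is an assumption to check against the current Google Analytics documentation, and the function name is made up for this example.

```python
MAX_PATTERN_LENGTH = 255  # assumed limit for one Filter Pattern; verify before relying on it


def build_filter_patterns(domains, max_length=MAX_PATTERN_LENGTH):
    """Join escaped domains with | into as few patterns as fit under max_length."""
    patterns, current = [], ""
    for domain in sorted(set(d.lower() for d in domains)):
        escaped = domain.replace(".", r"\.")  # match the dot literally
        candidate = escaped if not current else current + "|" + escaped
        if current and len(candidate) > max_length:
            # The current pattern is full: store it and start a new one.
            patterns.append(current)
            current = escaped
        else:
            current = candidate
    if current:
        patterns.append(current)
    return patterns


spam = ["darodar.com", "ilovevitaly.com", "simple-share-buttons.com"]
patterns = build_filter_patterns(spam)
print(patterns[0])  # darodar\.com|ilovevitaly\.com|simple-share-buttons\.com
```

Each returned pattern would go into its own Campaign Source exclude filter.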


Stopping Multiple Spammers with a Valid Hostname Filter

Adding a filter for every Referrer Spam may be the safest and simplest way to stop it, but it is not very efficient. Particularly if you manage many sites, adding a new filter for each spammer can become tedious and unmanageable.

One way to get rid of the vast majority of Ghost Referrer Spam is to create an INCLUDE filter for valid HOSTNAMES. Since the spammers generally don’t know who the target is, they use a hostname that isn’t yours.

The tricky part of this technique is getting a list of all VALID HOSTNAMES so that you don’t reject any genuine visits. This technique is ideal for small and medium sites, because the larger the site, the more hostnames it may have.

Note: It’s advisable to always keep a view without any filter. That way you can always see the full report and check that everything is OK.

To see a list of hostnames

  • Go to the Reporting tab in GA and select a wide timeframe, the bigger the better.
  • In the lateral bar select Acquisition.
  • Expand All Traffic and select All Channels.
  • Click on Secondary Dimension, type Hostname and select it.

How to detect Referrer Spam

 

  • Once there, you will see a table like this (1). Find and copy all the valid hostnames.

Identify-valid-hostname-GA

If you go one level down (2) by selecting Referral and adding Hostname as Secondary Dimension again, you can see which hostnames the Referrer Spam uses.

As you can see, the Referrer Spam marked in yellow (Crawler type) uses a valid hostname. For these, you will have to use one of the other methods. Some others use a well-known name like amazon.com.

A couple of hostnames you should always add besides your own are webcache.googleusercontent.com and translate.googleusercontent.com, which are used by Google.

The rest of the list should be completed with every domain, subdomain and any other valid hostname you may have; in my case, www.webresourcesfree.com and webresourcesfree.com.

6 Create a Regular Expression that matches all your hostnames.

Both of these REGEX examples will work for this filter, although only the second one, with escaped dots, matches the dots literally:

webresourcesfree.com|paypal.com|translate.googleusercontent.com|webcache.googleusercontent.com
webresourcesfree\.com|paypal\.com|translate\.googleusercontent\.com|webcache\.googleusercontent\.com

Notice that in the example I only use webresourcesfree.com and not www.webresourcesfree.com. This is because the expression matches any hostname containing webresourcesfree.com, for example blog.webresourcesfree.com, www.webresourcesfree.com or es.webresourcesfree.com.

REGEX Tips:
Don’t leave any spaces. The | characters separate hostnames, and the backslashes \ are used to escape the dots (dots are special characters in Regular Expressions), but Google Analytics also accepts the expression without them in simple REGEXs.
More about Regular Expressions.
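Before pasting the expression into Google Analytics, it can be tried out locally. This is a small sketch using Python's `re` module, which stands in for the Google Analytics matcher here; the sample hostnames are only examples.

```python
import re

hostnames = [
    "webresourcesfree.com",
    "translate.googleusercontent.com",
    "webcache.googleusercontent.com",
]

# Escape each dot so it matches a literal "." instead of any character.
pattern = "|".join(h.replace(".", r"\.") for h in hostnames)
regex = re.compile(pattern)

# re.search finds the pattern anywhere in the hostname, so subdomains match too.
for host in ["www.webresourcesfree.com", "es.webresourcesfree.com", "darodar.com", "(not set)"]:
    print(host, bool(regex.search(host)))

# Why escaping matters: an unescaped dot also accepts any other character.
unescaped = re.compile("webresourcesfree.com")
print(bool(unescaped.search("webresourcesfreeXcom")))  # True
print(bool(regex.search("webresourcesfreeXcom")))      # False
```

The first two hostnames match and the last two do not, which is the behaviour the INCLUDE filter needs.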

Once you have the REGEX built, you should add an INCLUDE Hostname filter.

7 Go to the Admin tab and select the View where you want to apply the filter.

8 Select Filters

Filters

If you haven’t created a view without filters, I recommend adding one so that you can check later whether the filter is working correctly.

 

9 Select New Filter

10 Select Create New Filter and enter a name for the filter

11 In Filter Type select Custom

12 Make sure you choose Include and select Hostname from the dropdown.

13 Finally, paste the REGEX that you built into Filter Pattern.

Valid Hostnames Filter analytics

I strongly recommend verifying this filter before applying it. It saves time on testing and protects your data.

So before saving, select Verify this filter. It will show a table with sample data from before and after applying the filter.

Filter Verification google analytics

After you make sure no valid data is excluded, you can save the filter.

This solution doesn’t require much maintenance, but it is VERY IMPORTANT that every time you add a hostname to your site you also add it to the REGEX.


By Using the .htaccess file

This file controls, among many other things, who can access your website, and it is useful for blocking Crawler Referrer Spam like semalt.com or buttons-for-website.com. Don’t try to use it for Ghost Referrer Spam; it won’t have any effect because, as we know, this kind of referrer never reaches your website.

Remember to always make a backup before changing anything, and be careful when you modify the file, because even one misplaced character can make your site unreachable.

One of the best-known Crawler Referrer Spammers is semalt.com. If you would rather not use the .htaccess file, you can use one of the solutions here: 4 ways to STOP semalt.com referral spam

If you are comfortable working with the .htaccess file, you can use it to block Crawler Referrer Spam with either of these modules:

With mod_rewrite

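## STOP REFERRER SPAM
# Enable the rewrite engine (harmless if it is already on)
RewriteEngine On
RewriteCond %{HTTP_REFERER} semalt\.com [NC,OR]
# The last condition must not carry the OR flag
RewriteCond %{HTTP_REFERER} buttons-for-website\.com [NC]
RewriteRule .* - [F]

You should add a backslash before every dot, and use the OR flag on every condition except the last one.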
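When the list of crawlers grows, writing these lines by hand invites mistakes with the dots and the OR flag. The following sketch generates the block from a list of domains; the function name is hypothetical, and the output should still be reviewed and backed up before it goes into a live .htaccess file.

```python
def build_rewrite_rules(domains):
    """Return mod_rewrite lines that answer 403 to requests referred by the given domains."""
    if not domains:
        # Without conditions the rule would block every request, so emit nothing.
        return ""
    lines = ["RewriteEngine On"]
    for index, domain in enumerate(domains):
        escaped = domain.replace(".", r"\.")
        # Every condition except the last one is chained with OR.
        flags = "[NC,OR]" if index < len(domains) - 1 else "[NC]"
        lines.append(f"RewriteCond %{{HTTP_REFERER}} {escaped} {flags}")
    lines.append("RewriteRule .* - [F]")
    return "\n".join(lines)


rules = build_rewrite_rules(["semalt.com", "buttons-for-website.com"])
print(rules)
```

The empty-list guard matters: a RewriteRule with no conditions in front of it would forbid every request to the site.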

With mod_setenvif

<IfModule mod_setenvif.c>
# Set Referrer Spam as spambot
SetEnvIfNoCase Referer semalt\.com spambot=yes
SetEnvIfNoCase Referer buttons-for-website\.com spambot=yes
## add all the SPAM sites you want

# Apache 2.2 syntax; Apache 2.4 needs mod_access_compat for these directives
Order allow,deny
Allow from all
Deny from env=spambot
</IfModule>

 


By Changing your tracking ID

This technique doesn’t exactly block Referrer Spam, but it makes your Google Analytics less visible to the spammers. It is a perfect solution for new websites.

Since this kind of spam normally targets UA-XXXXXXX-1 IDs, if you change your Google Analytics tracking ID to one that doesn’t end in 1, like UA-XXXXXXX-2, the vast majority of the Referrer Spam won’t reach you.

It’s a good way of preventing new Referrer Spam from showing in your Google Analytics without having to add a filter and wait for it to start working. However, this is ONLY recommended if you have a new site or you don’t care about splitting your statistics, since changing the tracking ID will start your statistics from 0.

As promised, I tested this solution, and it was 100% effective against Ghost Referral Spam. Here is a screenshot of an inactive test Google Analytics account with 3 tracking IDs. As you can see, the only one that got hit is UA-XXXXXXXX-1; the other 2, as expected, are untouched.

Referrer Spam Inactive Website Ghost

And here is the list of referrals that hit GA even with an inactive website. All the hits are from Ghost Referrer Spam; there are no filters or segments.

GA-Ghost-Referrer-Spam-tracking-ID

This is further proof that Ghost Referrer Spam can’t be stopped by .htaccess rules: since this site is not active, the “visits” didn’t visit anything.

To change your tracking-ID

1 First go to the admin tab in GA.

2 Click the drop-down below the PROPERTY column and select Create new property.

Traking ID GA Change

3 Choose a new Name and enter your website address

4 Press Get tracking ID button

Change tracking Id google analytics

5 You will get a new tracking ID, add it to your website.

analytics tracking id

For now, this seems to be the best option for avoiding Ghost Referrer Spam, so if you are creating a new website this is your best choice.


Excluding good Bots and Spiders

These kinds of crawlers are not bad. On the contrary, they keep the internet running and help us get better results for our searches. But they also add records when visiting your site, records that are not useful.

You shouldn’t block these bots and spiders, because that can make you less visible on the web, but you can safely exclude them from your Analytics.

Google recently added a function that makes this a lot easier. You can see full details here: Google Analytics Bots and Spider Filtering.

1 Go to the Admin tab

2 Select the View you want to apply it to

3 Click on View Settings

View Settings analytics

4 Check Exclude all hits from known bots and spiders, almost at the bottom of the settings screen.

Exclude all hits from known bots and spiders

5 Save and it’s DONE!

Selecting this option will exclude all hits that come from bots and spiders on the IAB known bots and spiders list. The backend will exclude hits matching the User Agents named in the list as though they were subject to a profile filter. This will allow you to identify the real number of visitors that are coming to your site. -Google Analytics


Conclusion

Referrer Spam affects many of us to a greater or lesser extent and it shouldn’t be taken lightly.

Even on high volume websites were data spamming would be marginal, you still have to explain why there’s such a discrepancy. As an analyst you can’t dismiss it simply by saying “nah… we’re not too sure what it is, but I heard about that spamming thing…”

Stéphane Hamel

I think you should go for the Valid Hostname filter, since you won’t have to worry much about future occurrences of Referrer Spam. If not, use the method that best fits your needs and your knowledge. The only thing that should be clear now is that you can’t block Ghost Referrer Spam (99% of the spammers) with .htaccess rules.

If you are starting with fresh analytics, I recommend using a tracking ID ending in 2 or more (UA-XXXXXX-2), which will protect you from most of the Referrer Spam, as I demonstrated.

I hope this helps you better understand how Referrer Spam affects your analytics and gives you better, cleaner and more useful statistics.

Hopefully, Google will soon add another checkbox to exclude malicious crawlers and spammers, so we can worry more about analysing our data and less about cleaning it.


If you want to go deeper after stopping the spam, you may need to get clean reports from your past affected data. You can see how to do it in this article: Remove Referrer Spam from historical data with Segments

Since spammers keep using new techniques, I regularly update this article with new information and solutions as I test them, so keep an eye on it. If you know of any other techniques, or of a Referrer Spam that is not on the list, please tell me in the comments and I’ll add it to the article.


Resources

These are other great posts about Referrer Spam:

Thanks to Ben and Nick for the ideas and help to build this article.

All the Google Analytics screenshots are taken from this site’s reports. I edited some of them by removing blank spaces between the information, so they may not match the original Google Analytics screens exactly, but all the information is genuine.
