Google warning: is your site abused through redirects?

April 4, 2009 by admin 

Google recently wrote on one of its official blogs that spammers can take advantage of your website without ever setting a virtual foot on your server. They do this by abusing open redirects.

What are open redirects?

Many websites use links that redirect their website visitors to another page. Some redirects are left open to any arbitrary destination. These redirects can be abused by spammers to trick web surfers and search engines into following links that seem to be pointing to your website although they redirect to a spammy website.

That means that people who think that they visit your website will be redirected to highly questionable web pages that might contain adult content, viruses, malware or phishing attempts.

Which redirects on your website could be abused?

Spammers are very inventive. According to Google, they have managed to use redirect spam on a wide range of websites, including the websites of large well-known companies and the websites of small local government agencies.

For example, the following redirection types can be abused:

  1. Scripts that redirect users to a file on the server can be abused by spammers. The links on your website could look like this:

    http://www.example.com/download.php?url=http://www…

    http://www.example.com/get/pdf/?http://www…

  2. Site search result pages with automatic redirect options. If the result pages of your internal site search feature contain a URL variable that sends your website visitors to other pages, spammers might be able to exploit them:

    http://www.example.com/search?q=keyword&page=1&url=…

  3. Affiliate tracking links. Affiliate tracking links often allow people to direct website visitors to other pages. Spammers might enter their own URLs in the tracking links. Example:

    http://www.example.com/track.php?affid=123&url=…

  4. Proxy pages. Proxy sites send people through to other websites and they can be abused by spammers:

    http://myproxy.example.com/?url…

  5. Interstitial pages. Some websites show an interstitial page when users leave a website to let users know that the information found on the link is not under their control. These URLs usually look like this:

    http://www.example.com/redirect/http://www…

    http://www.example.com/out?http://www…

    http://www.example.com/cgi-bin/redirect.cgi?http://www…

How to find out if your website is abused

Even if you find none of the URLs above on your website, your site still may have open redirects. Do the following to check if your website is abused by spammers:

  1. Run a site search on Google

    Go to Google.com and search for “site:yourdomain.com”, replacing yourdomain.com with your own domain name. If you see web pages that have nothing to do with your website, it is likely that someone is exploiting a security hole on your website.

  2. Check your web server logs for URL parameters like “=http:” or “=//”. If your redirection URLs get a lot of traffic, this could also be caused by spammers.
  3. If you get user complaints about content or malware that you know cannot be found on your website then your website users might have seen your URL before they were redirected to the malware site.
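The log check in step 2 can be automated. The following is a minimal Python sketch, assuming your web server writes one request per line to a plain-text access log. The function name `find_suspicious_requests` is made up for this example, and the “=https:” marker is an addition to the two markers named above. The sketch decodes percent-encoding first so that encoded destinations such as “url=http%3A%2F%2F…” are also found.

```python
from urllib.parse import unquote

# "=http:" and "=//" come from the checklist above; "=https:" is an
# extra marker added for this sketch.
SUSPICIOUS_MARKERS = ("=http:", "=https:", "=//")


def find_suspicious_requests(log_lines):
    """Return the log lines that contain a redirect-style URL parameter."""
    hits = []
    for line in log_lines:
        # Decode percent-encoding so "%3A%2F%2F" is treated like "://".
        decoded = unquote(line).lower()
        if any(marker in decoded for marker in SUSPICIOUS_MARKERS):
            hits.append(line)
    return hits


if __name__ == "__main__":
    # "access.log" is a placeholder path; use the location of your own log.
    with open("access.log", encoding="utf-8", errors="replace") as log:
        for hit in find_suspicious_requests(log):
            print(hit.rstrip())
```

Redirects that pass the destination without an equals sign (for example “/out?http://www…”) are not matched by these markers, so you may want to add patterns that fit the URL formats of your own scripts.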

What you can do to protect your website

It’s not easy to make sure that your redirects aren’t exploited. The reason is that an open redirect is not necessarily a bug or a security flaw. Still, there are some things that you can do to protect your website:

  1. Check the referrer. Your redirect scripts should only work if they are accessed from another web page of your website. The redirect script should not work if the user accesses the script directly or from a search engine.
  2. If possible, make sure that the script can only redirect to web pages and files that are on your own websites. You could use a whitelist of allowed destination domains.
  3. Use the robots.txt file of your website to exclude search engines from the redirect scripts on your website. That will make your website less attractive for hackers.
  4. Add a signature or a checksum to your redirect links so that only you can use the script.
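Points 2 and 4 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the host names in `ALLOWED_HOSTS`, the value of `SECRET_KEY` and all function names are hypothetical, and the HMAC-SHA256 signature is one possible way to implement the “signature or checksum” idea, not a method prescribed by Google.

```python
import hashlib
import hmac
from urllib.parse import urlsplit

# Hypothetical values for illustration; replace them with your own.
ALLOWED_HOSTS = {"www.example.com", "downloads.example.com"}
SECRET_KEY = b"replace-with-a-long-random-secret"


def is_allowed_destination(url):
    """Accept only absolute http(s) URLs whose host is on the whitelist."""
    parts = urlsplit(url)
    return parts.scheme in ("http", "https") and parts.hostname in ALLOWED_HOSTS


def sign_destination(url):
    """Create the signature that your own pages append to redirect links."""
    return hmac.new(SECRET_KEY, url.encode("utf-8"), hashlib.sha256).hexdigest()


def is_valid_signature(url, signature):
    """Compare signatures in constant time before performing the redirect."""
    expected = sign_destination(url).encode("utf-8")
    return hmac.compare_digest(expected, signature.encode("utf-8"))
```

A redirect script would call `is_allowed_destination()` or `is_valid_signature()` before sending the visitor on, and show an error page if the check fails. The whitelist compares the parsed host name and not a substring of the URL, so a destination such as “http://www.example.com@spam.example/” is rejected.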

Open redirect abuse is a big issue for Google right now. If you secure your scripts, spammers will move over to other websites and leave your website alone.

Ranking test: can there be too many links to your home page?

January 18, 2009 by admin 

In an online webmaster forum, a webmaster described the link experiment that he did with his websites. He tried to find out how linking to the home page affected his rankings.

What did the webmaster test?

The webmaster tested the effect of links from sub pages of his website to his home page. He tried links to the home page of his website from the navigation and from the content and he tried links with and without keywords.

The test was done with a 4-year-old domain name with a dedicated IP address. The web pages were HTML only. The website ranks in the top 5 on Google for its main, second and third keyword phrases and it has a total of 90 pages with unique content.

What were the results of the test?

It seems that too many links to the home page of your website can have a negative effect on your rankings:

  1. Linking to the home page from every page in the content with the same keyword caused a six-page drop in rankings (-6 pages).
  2. Linking to the home page from every page in the content using keyword variations caused a three-page drop in rankings (-3 pages).
  3. Linking to the home page from the navigation with “main keyword” also caused a six-page drop in rankings (-6 pages).
  4. Linking to the home page from the first 10 pages listed on Google.com for “site:domain.com/*” increased the ranking from 5th to 3rd (+2 positions).

The webmaster also observed the following:

  • Linking from the content using keyword variations was effective to a point, after which the rankings dropped.
  • There seems to be a page threshold. If the number of pages that link is even slightly above the threshold, the rankings will drop.

Does this mean that you shouldn’t link to your home page?

It’s hard to tell whether the results of this experiment are valid because there are too many other variables that influence the rankings of a web page.

It doesn’t sound sensible that Google will downrank a web page that has a link to its home page on every page. Most users expect a link to the home page on every page of a website and even Google has a link to its home page from every page.

As Google’s usual webmaster advice is to focus on the website user, it seems implausible that Google would penalize home page links.

We think that it’s more likely that the ranking drops are caused by Google’s change filter. If you change your web page contents, Google will temporarily downrank your web pages. This has been described in a Google patent.

Five mistakes that keep search engine robots away from your website

October 15, 2008 by admin 

Many webmasters don’t get high rankings on Google and other search engines simply because the search engines’ indexing robots have difficulty indexing their web pages.

Search engine robots are very simple software programs. If an indexing robot cannot find the content of your website immediately, it will skip your site and go to the next link in the list. For that reason, it is very important to make sure that search engine robots can index your web pages without problems.

Here are the top 5 elements that drive search engine robots away:

Reason 1: Your robots.txt file is damaged or it contains a typo

If search engine robots misinterpret your robots.txt file, they might completely ignore your web pages.

Double check your robots.txt file and make sure that you use the disallow parameter only for web pages that you really don’t want to have indexed.
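One way to double check the file is to test it against a list of pages that you want to have indexed. The sketch below uses the robots.txt parser from the Python standard library. The function name `blocked_urls` and the example URLs are made up for illustration, and Python’s parser may interpret unusual rules differently from a particular search engine, so treat the result as a hint only.

```python
from urllib.robotparser import RobotFileParser


def blocked_urls(robots_txt, urls, user_agent="Googlebot"):
    """Return the URLs that the given robots.txt text blocks for user_agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(user_agent, url)]


if __name__ == "__main__":
    # Hypothetical robots.txt: the single "/" blocks the whole site,
    # where the webmaster probably meant "/redirect/".
    robots_txt = "User-agent: *\nDisallow: /\n"
    important_pages = [
        "http://www.example.com/",
        "http://www.example.com/products.html",
    ]
    for url in blocked_urls(robots_txt, important_pages):
        print("Blocked by robots.txt:", url)
```

If a page that you want to rank shows up in the output, check the corresponding Disallow line for a typo.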

Reason 2: Your URLs contain too many variables

URLs with many variables can cause problems with search engine robots. If your URLs contain too many variables, search engine robots might ignore your pages.

Here’s Google’s official statement about web pages with many variables:

“Google indexes dynamically generated webpages, including .asp pages, .php pages, and pages with question marks in their URLs. However, these pages can cause problems for our crawler and may be ignored.”

Reason 3: You use session IDs in your URLs

Many search engines don’t index URLs that contain session IDs because they can lead to duplicate content problems. If possible, avoid session IDs in your URLs. It is better to use cookies to store session IDs.
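The URL problems from reasons 2 and 3 can be spotted with a short script. The following Python sketch counts the variables in a URL and looks for typical session parameters. The names in `SESSION_PARAMS` are common examples and not a complete list, and the threshold in `MAX_VARIABLES` is an arbitrary value chosen for illustration, not a limit published by any search engine.

```python
from urllib.parse import parse_qsl, urlsplit

# Hypothetical list of session parameter names; extend it for your software.
SESSION_PARAMS = {"phpsessid", "sid", "sessionid", "session_id"}
# Arbitrary threshold for this sketch, not a documented search engine limit.
MAX_VARIABLES = 3


def audit_url(url):
    """Report how many variables a URL has and which look like session IDs."""
    query = urlsplit(url).query
    names = [name.lower() for name, _ in parse_qsl(query, keep_blank_values=True)]
    return {
        "variables": len(names),
        "too_many_variables": len(names) > MAX_VARIABLES,
        "session_params": [name for name in names if name in SESSION_PARAMS],
    }
```

Running `audit_url()` over the URLs in your sitemap or server log gives you a list of pages that may need simpler URLs.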

Reason 4: Your web pages contain too much code

Of course, your web pages can contain JavaScript code, CSS code and other script code that is not directly related to your content. Visit your website with a web browser and select “View source” or “View HTML source”.

If it is difficult for you to spot the actual content of your website in the source code, search engines might also have difficulty parsing your pages.
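If you prefer a number to a visual check, you can estimate how much of a page is readable text. The Python sketch below is a rough approximation that rests on an assumption of its own: it treats everything outside of script and style elements as content and divides the length of that text by the length of the full source. The names `TextExtractor` and `content_ratio` are made up for this example, and no search engine is known to use this exact measure.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect the text of a page, skipping script and style blocks."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def content_ratio(html):
    """Return the share of the page source that is readable text (0.0 to 1.0)."""
    if not html:
        return 0.0
    extractor = TextExtractor()
    extractor.feed(html)
    extractor.close()
    # Collapse whitespace so that indentation does not count as content.
    text = " ".join(" ".join(extractor.chunks).split())
    return len(text) / len(html)
```

A very low ratio does not prove that there is a problem, but it is a reason to look at the source code of that page more closely.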

Reason 5: Your website navigation causes problems

Fancy JavaScript or DHTML menus cannot be parsed by most search engine robots. Flash or AJAX menus are even worse when it comes to website navigation.

As mentioned above, search engine robots are very simple programs. They can follow HTML links; all other links can cause problems.
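To get an impression of what such a simple program sees, you can extract the plain HTML links from a page yourself. The Python sketch below is an illustration only and does not reproduce the behavior of any real search engine robot; the class and function names are made up for this example. It collects the href values of anchor tags and ignores “javascript:” links and onclick handlers.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the targets of plain <a href="..."> links."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        # Skip anchors without a target and script-driven pseudo links.
        if href and not href.lower().startswith("javascript:"):
            self.links.append(href)


def crawlable_links(html):
    """Return the link targets a simple HTML-only robot could follow."""
    collector = LinkCollector()
    collector.feed(html)
    collector.close()
    return collector.links
```

If important pages of your website are missing from the returned list, consider adding regular HTML links to them, for example in a footer or on a sitemap page.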

Optimized web page content and good inbound links are crucial for high search engine rankings. However, the best content and the best links won’t help you much if search engines cannot index your pages.

Make sure that search engine spiders can index your web pages without problems so that your web pages can get the rankings they deserve.