Google's new panda is off the leash

For most people in our business, the word Panda is no longer associated with cuddly Chinese bears we should fight to preserve, but with merciless Californian algorithmic updates we should fight to survive. In this article, we will look at the newborn of Google’s Panda family, Panda 4.0, and what it means for affiliates and operators worldwide. On the 20th of May 2014, Google’s spam-cop Matt Cutts announced to the world the rollout of Panda 4.0, Mountain View’s latest Panda algorithmic update. Keep reading to understand the difference between a Panda update and a Google penalty, why Panda 4.0 may affect your site, and how to recover if your site’s rankings were hit by this update.
 
Panda and Google penalties

In order to fully understand what Panda 4.0 is and what it means for affiliates and operators, we should first clarify the difference between a penalty, an algorithmic update and a data refresh. Google penalties are specific actions taken by Google against a certain website or a set of its pages. Such actions can be manual or algorithmic, and result in an intentional decrease in rankings for one or more keywords and pages. These decreases in rankings are site-specific, and most of the time are temporary or recoverable. Using a football metaphor, Google penalties are like the four-month ban given to the Uruguayan Luis Suarez for biting Giorgio Chiellini during a World Cup match: because of what he did, Suarez (and only Suarez) was penalised for four months. After this period, as long as he stops biting other players, he can go back to playing football as if nothing ever happened. 
Google algorithmic updates are more or less permanent changes to the algorithms regulating the ways in which Google finds, rates and ranks online content, involving potentially all sites indexed by Google. The decreases in rankings that may result from algorithmic updates are therefore not site-specific and are potentially permanent. Returning to our football metaphor, an algorithmic update is like a retroactive change in FIFA’s rules: if, after Suarez’s bite, FIFA had retroactively modified its rules by adding a Bite Rule stating “all matches in which a player is bitten will automatically be won by the team of the bitten player”, then Uruguay would have lost its match against Italy and ended its World Cup earlier (…and Italy would have won the World Cup – obviously ;-) ). In the same way, the results of any previous or future match involving biting would be changed forever, because the rules of the game themselves would have changed. 
 
Google data refreshes are data updates, concerning not Google’s algorithmic structure but the data processed by a specific branch of that algorithm. Resorting to football metaphors one last time, they are comparable to FIFA’s referees updating football results by applying the “Bite Rule” to ongoing matches, checking whether anybody has been bitten since their last check. Google Panda is an algorithmic update adding new elements to Google’s algorithm: an additional set of rules that modify the way in which Google considers and scores certain on-site parameters. The addition of this “Panda Factor” to the mix has changed the rules of the game for all sites: following the release of the first Panda update, each and every site has been assigned a “Panda Score”, which represents the quality level of the site and is used as an additional ingredient in Google’s secret ranking formula. Since its first release in February 2011, this specific algorithmic component has had more than 20 updates and data refreshes, the latest of which – Panda 4.0 – was particularly significant.
 
Goal and elements of the “panda score”

Defined by Google as an update designed to decrease rankings for “sites which are low-value add for users, copy content from other websites or sites that are just not very useful”, the Panda algorithm saw Google clearly opt for quality over relevance for the first time. Tackling mostly the content-farm and news-aggregation phenomena, the new algorithm aimed at increasing the rankings of content from high quality sources, preferring it to highly keyword-specific content from middle- and low-quality sources characterised by automated or industrialised content production. What most webmasters rightfully wondered following Panda’s first release was how Google could really grade a site’s quality. In order to “help” them, Google released a list of questions webmasters could use as a checklist to evaluate their content, but of course it did not reveal any real insight into the actual factors taken into consideration.
 
Google panda and online gaming sites

Key targets of Google’s Panda algorithm are sites with significant amounts of copied, “thin” or re-hashed content. Therefore, high quality affiliate sites and operators have historically not been significantly impacted. This is particularly true for bingo and casino sites, where the relatively limited number of keywords to target and a general lack of news have always meant rather small sites with limited amounts of content. However, a few elements typical of the sites in our industry have led to some Panda-related ranking issues for affiliates and operators alike. These include:

  • Syndicated content. Sports betting sites may think it’s a good idea to provide their readers with relevant sports news, sometimes by syndicating news from famous news outlets. As syndicated content is essentially copied content, this can put a site at risk. 
  • Pages with little to no content. Toplists, games, quotes, match results and stats are no doubt useful content to present to players. However, pages dedicated to this kind of content tend to have very little text, which may lead Google to assume they are low quality, from an informational point of view. 
  • Software-related content. Software providers in our industry tend to provide their partners not only with software solutions but also with game descriptions, information on legal aspects, etc. As this content is distributed to all operators using the same software provider, it is often used by many sites and perceived by Google as duplicate content. 
  • Copied, “spun” and automatically translated content. The worldwide appeal of our industry and the volatility of some of its trends have led some site owners towards content-production shortcuts like text-spinning and automated translation. Content produced in this way can be identified as such by Google relatively easily (see the toy sketch below), and becomes a target for its Panda algorithm.
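
To give a concrete feel for how easily “spun” or lightly re-worded copy can be flagged, here is a toy sketch in Python. It is not Google’s actual method – just a simple word-shingle comparison on two made-up snippets – but it shows why superficially rewritten content offers little protection.

```python
# Toy illustration only (not Google's actual algorithm): flagging near-duplicate
# or lightly "spun" text with a simple word-shingle (Jaccard) comparison.

def shingles(text: str, n: int = 3) -> set:
    """Return the set of n-word shingles from a lower-cased text."""
    words = text.lower().split()
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}

def similarity(a: str, b: str) -> float:
    """Jaccard similarity between the shingle sets of two texts (0.0 to 1.0)."""
    sa, sb = shingles(a), shingles(b)
    return len(sa & sb) / len(sa | sb) if sa and sb else 0.0

# Made-up example snippets: an "original" and a lightly spun copy of it.
original = ("Play online bingo with a generous welcome bonus and "
            "daily jackpot games reserved for new players.")
spun = ("Play online bingo with a big welcome bonus and "
        "daily jackpot games reserved for all new players.")

print(f"shingle similarity: {similarity(original, spun):.2f}")  # high score -> likely re-hashed copy
```

A high score on such a crude check is a reminder that light rewording leaves most of the underlying text intact.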

 
What’s new in panda 4.0

Since its launch, the Panda component of Google’s algorithm has had more than 20 updates and data refreshes, and Matt Cutts has recently declared that the algorithm now refreshes its data on a monthly basis. Because of this, when he tweeted on 20th May about the release of Panda 4.0 (see Figure 1), the search industry knew something big was coming: not a simple data refresh but a change in the way the “Panda Score” is calculated. Following these changes, popular sites like Ask.com and eBay.com – traditional examples of good quality sites – suffered great losses of traffic, approximately -50% for Ask and -33% for eBay, which left webmasters all over the world scratching their heads (see Figure 2). However, on closer inspection, sites that at first glance seemed to be high quality revealed some of the typical flaws addressed by Panda: duplicated content, empty pages and ad-heavy pages.

Apart from some big names, the kind of sites which have fallen victim to this Google update more than others are press release websites (see Figure 3) and the sites syndicating their content. This, too, should not come as a surprise to savvy webmasters, both because press-release articles with keyword-rich links have been mentioned before by Google as a risky practice and because, due to their nature, press-release sites consist mostly of duplicated content.

Interestingly, some technical aspects also seem to have been tackled by Panda’s latest release. As noticed by Joost de Valk in a recent article, when its robots cannot correctly access JavaScript and CSS resources and therefore have problems rendering a page, Google now tends to assume more readily that the webmaster is up to something fishy. The technique of blocking Google from accessing JS and CSS files to hide adverts and part of a site’s content is not new. What seems to have changed is Google’s approach to these situations: from “innocent until proven guilty” to “guilty until proven innocent”.

Considering the points above, the shared opinion is that the new update did not bring drastic changes to Google’s Panda algorithm, but slight tunings of the elements it considers – possibly with a higher focus on syndicated content and on how users experience a site – also taking into account client-side elements like JS and CSS. 
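
If you want a quick way to test whether your own robots.txt is keeping Googlebot away from the JS and CSS files a page needs, a few lines of Python using the standard library’s robots.txt parser are enough. This is only a rough sketch: the site and asset URLs below are hypothetical placeholders, and it checks robots.txt rules only, not what Google actually renders.

```python
# Rough sketch: check whether Googlebot is allowed to fetch a site's CSS/JS assets
# according to its live robots.txt. The site and asset URLs below are hypothetical.
from urllib.robotparser import RobotFileParser

SITE = "https://www.example.com"              # replace with your own domain
ASSETS = [
    f"{SITE}/assets/css/main.css",            # hypothetical stylesheet path
    f"{SITE}/assets/js/app.js",               # hypothetical script path
]

parser = RobotFileParser()
parser.set_url(f"{SITE}/robots.txt")
parser.read()                                  # downloads and parses the live robots.txt

for url in ASSETS:
    if parser.can_fetch("Googlebot", url):
        print(f"OK      : {url}")
    else:
        print(f"BLOCKED : {url} (Googlebot cannot fetch this resource)")
```

Any resource reported as blocked is a candidate for a closer look with Google Webmaster Tools’ “fetch and render” feature, discussed in the recovery checklist below.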
 

How to recover from panda 4.0

If you noticed a sudden drop in rankings following the 20th of May, your rankings may have been impacted by Panda’s fourth algorithm update. As the update may have started rolling out a few days earlier, you should consider this possibility even if your rankings started to decrease from the 17th instead of the 20th (a good tool to check for suspicious correlations between your traffic and Google updates is Barracuda Digital’s Panguin Tool). If Panda 4.0 seems to have had a heavy impact on your site, I recommend checking which areas are the most affected in terms of organic traffic, evaluating your content with an independent mindset and considering the following points and actions: 
 

  1. Is your content providing users coming from search engines with the kind of information and services they would expect to find? If not, improve the content and make sure the website delivers on its promises. Bounce rate, pages per session and time on site are good metrics to check in your analytics platform to get an idea of user satisfaction.
  2. Do you have any pages with little to no content at all? If so, consider removing such pages, adding a Meta Robots NOINDEX tag to them or populating them with fresh, useful content. A good way to check for “thin” pages is via ScreamingFrog’s word count feature (the sketch after this list shows a similar check you can run yourself).
  3. Are you hosting any copied content on your site – whether due to content syndication, manual copies or external content sources such as software providers? If so, consider clearly identifying “recycled” content by adding rel=canonical attributes to all copied pages, referring to the original URLs. Alternatively, you may remove such pages or – possibly better – add a Meta Robots NOINDEX tag to them. Even better, of course, would be to create new, good quality content. If you are not sure whether your content is truly original, for instance because you have been buying it from external providers, Copyscape is a good tool to check its originality.
  4. Are you syndicating your content to external sites? If so, make sure they are not ranking above your site (that is a signal Google may be thinking they are the content originators), and consider forcing them to use rel=canonical attributes on their sites pointing back to your original content.
  5. Are you intentionally or unintentionally preventing Google from being able to fully render the content of your site? If so, make sure Google’s spiders can access your JS and CSS files. Using Google Webmaster Tools’ new “fetch and render” feature is a good way to verify what Google can and cannot see on your site (the robots.txt check sketched in the previous section is another quick test).
  6. Are you duplicating any content within your own site? Google Webmaster Tools will help you identify any internally duplicated HTML Titles and Meta Descriptions – possibly related to entirely duplicated pages and sections. Once you identify the problem, in most cases you can solve it with careful use of rel=canonical attributes and Meta Robots NOINDEX tags (the sketch after this list also flags duplicated titles).
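
As a companion to points 2 and 6 above, here is a rough self-service audit sketch in Python. It is not a replacement for ScreamingFrog or Google Webmaster Tools – the URLs are hypothetical placeholders and the 250-word threshold is an arbitrary assumption – but it illustrates the kind of quick check you can run yourself: count the visible words on each page and flag duplicated HTML titles.

```python
# Rough audit sketch: flag "thin" pages (low visible word count) and duplicated
# HTML titles across a hand-picked list of URLs. Standard library only; the URLs
# and the 250-word threshold are assumptions to adapt to your own site.
from collections import defaultdict
from html.parser import HTMLParser
from urllib.request import urlopen

URLS = [                                   # hypothetical pages to audit
    "https://www.example.com/",
    "https://www.example.com/casino-games/",
    "https://www.example.com/bingo-rooms/",
]
THIN_PAGE_WORDS = 250                      # arbitrary threshold for a "thin" page


class PageStats(HTMLParser):
    """Collects the <title> text and a rough visible-word count for a page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.words = 0
        self._in_title = False
        self._skip = 0                     # depth inside <script>/<style> blocks

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if self._in_title:
            self.title += data.strip()
        elif not self._skip:
            self.words += len(data.split())


pages_by_title = defaultdict(list)
for url in URLS:
    html = urlopen(url).read().decode("utf-8", errors="replace")
    stats = PageStats()
    stats.feed(html)
    pages_by_title[stats.title].append(url)
    if stats.words < THIN_PAGE_WORDS:
        print(f"THIN PAGE ({stats.words} words): {url}")

for title, pages in pages_by_title.items():
    if len(pages) > 1:
        print(f"DUPLICATE TITLE '{title}' used on {len(pages)} pages: {', '.join(pages)}")
```

Pages flagged here are only candidates for review: a short toplist page may be perfectly legitimate, but it is worth deciding consciously whether to enrich it, canonicalise it or NOINDEX it.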

 
As Panda updates modify Google’s scoring and ranking algorithms in a complex and semi-permanent way, identifying the elements that led to a drop, and recovering from it, can take some time. However, as these updates mainly look at a site’s content and on-site elements, once a weak spot has been identified it can be fixed rather quickly by implementing on-site modifications. Also, as Google now refreshes its data on a monthly basis, you should be able to understand quite soon whether you are going in the right direction or not. So, watch out for low quality, duplicated, syndicated or re-hashed content you may be hosting on your site… and good luck cleaning up after the new cub in Google’s zoo ;-)
