Big Daddy: An Infrastructure Update for URLs

Big Daddy debuted in December 2005, rolled out over several months, and was completed in March 2006. It wasn’t an algorithm change; instead, the infrastructure update focused on the canonicalization of uniform resource locators (URLs), 301 and 302 redirects, indexing, and crawling.

What’s It For

URL canonicalization is the process of selecting the best URL among several similar choices. While www.sitesample.com, www.sitesample.com/, https://sitesample.com, and http://sitesample.com/ may all look the same from a human perspective, search bots treat each one as unique because their structures differ slightly. Canonicalization typically pertains to homepages and to how Google picks the web address that best represents your site among the relevant choices.
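As a rough illustration of the idea, the sketch below collapses the four variants above into a single form. The preferred scheme (https) and the choice to drop the www prefix are assumptions made for this example, not rules Google applies, and the function name canonicalize is hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumed site preference for this example: https, no "www." prefix.
PREFERRED_SCHEME = "https"


def canonicalize(url: str) -> str:
    """Map near-identical homepage URLs onto one preferred form."""
    # A bare "www.sitesample.com" has no scheme, so urlsplit would treat it
    # as a path. Prefixing "//" makes it parse as a host instead.
    if "://" not in url:
        url = "//" + url
    parts = urlsplit(url)

    # Hostnames are case-insensitive; strip the optional "www." prefix.
    # Note: .hostname drops any port, which is fine for this sketch.
    host = (parts.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]

    # Treat an empty path and "/" as the same page.
    path = parts.path or "/"

    return urlunsplit((PREFERRED_SCHEME, host, path, parts.query, ""))


variants = [
    "www.sitesample.com",
    "www.sitesample.com/",
    "https://sitesample.com",
    "http://sitesample.com/",
]
unique_forms = {canonicalize(v) for v in variants}
print(unique_forms)
```

All four inputs end up as the same string, which is the outcome a site owner wants a search engine to reach on its own.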

Google also announced changes to the way its bots handle 301 and 302 redirects. A 301 redirect is used for a page that has permanently moved to a new location or address, while a 302 is for temporary moves only. This part of the Big Daddy update addresses the problem of URL hijacking or typosquatting, which relies on typos or mistakes made by users when typing in a URL.
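To make the difference between the two status codes concrete, here is a minimal sketch of how a server might choose between them. The helper build_redirect and the example paths are hypothetical; only the status codes and the Location header come from HTTP itself.

```python
from http import HTTPStatus
from typing import Dict, Tuple


def build_redirect(target_url: str, permanent: bool) -> Tuple[int, Dict[str, str]]:
    """Return the status code and headers for a redirect response."""
    # 301 tells clients and crawlers the move is permanent;
    # 302 tells them the original URL is still the one to keep.
    status = HTTPStatus.MOVED_PERMANENTLY if permanent else HTTPStatus.FOUND
    return int(status), {"Location": target_url}


# A page that has moved for good.
print(build_redirect("https://sitesample.com/new-page/", permanent=True))

# A page that is only temporarily served from somewhere else.
print(build_redirect("https://sitesample.com/maintenance/", permanent=False))
```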

These are the common types of URL hijacking:

  • Common Misspelling – With sitesample.com, typosquatters take advantage of spelling errors, especially those made by foreign users. They may register URLs like sightsample.com or sitesimple.com.
  • Typographical Errors – Another typical way is to rely on typos like sietsmaple.com or sitesamlpe.com.
  • Variations – Typosquatters may also register a variation of the name, such as siteexample.com or sitemaple.com.
  • Different Top-Level Domain – Others take advantage of another top-level domain with sitesample.net or sitesample.org.
  • Country Code Top-Level Domain Abuse – Similar to registering a name under a different top-level domain, abusing a country code top-level domain means that typosquatters take the same brand or name under another country’s domain, such as sitesample.uk or sitesample.au. Even omitting the final “m” from sitesample.com produces sitesample.co, which falls under Colombia’s country code and can lead to a different website altogether.
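For the misspelling and typo cases in the list above, a site owner could flag lookalike domains by measuring how many single-character edits separate them from the real name. The sketch below uses plain Levenshtein distance; the brand name, the threshold of 2, and the function names are assumptions for illustration, and this is not a description of how Google detects such domains.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    # previous[j] holds the distance between the processed prefix of a
    # and the first j characters of b.
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute, or keep if equal
            ))
        previous = current
    return previous[-1]


def looks_like(candidate: str, brand: str = "sitesample", max_distance: int = 2) -> bool:
    """Flag a domain whose first label is close to, but not equal to, the brand."""
    name = candidate.lower().split(".")[0]
    return name != brand and edit_distance(name, brand) <= max_distance


for domain in ["sitesimple.com", "sitesamlpe.com", "sitesample.com", "example.org"]:
    print(domain, looks_like(domain))
```

This only compares the name before the first dot, so it would not catch the top-level domain cases such as sitesample.net; those need a separate check of which extensions have been registered.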

What Were Its Effects

While Big Daddy was more of an infrastructure update, it still affected the rankings of several websites. Some webmasters complained that none of their pages appeared in the index at data centers where the update initially rolled out, and that their pages were more likely to show up as supplemental results.

However, Google said that those sites had low trust scores in the search engine’s algorithm regarding inlinks and outlinks. The de-indexing was reportedly caused by excessive reciprocal links, links to spammy web neighborhoods, or the buying and selling of links, all practices targeted by the Jagger update, which rolled out from September to November 2005.

What It Means for You

Given Google’s emphasis on providing the best user experience, it shouldn’t come as a surprise that the search engine still favors canonical URLs, or web addresses that are clean and easily readable. Searchers generally want URLs that they can memorize, like sitesample.com/blog/this-is-a-great-example/ instead of sitesample.com/blog/12f5eae?=1qad/.

To ensure that Google picks the URL you want for your website, keep the structure of your web addresses consistent across all of your pages. If your existing links use https://sitesample.com/, it makes little sense to switch to https://www.sitesample.com/ unless you plan a significant overhaul of your web infrastructure.

Here are some situations where proper URL canonicalization is beneficial:

  • Having different URLs for the same content
  • Having various categories and tags pointing to similar content
  • Having mobile versions of your website displaying the same page, but on different URLs or domains
  • Having HTTP and HTTPS or www and non-www versions of URLs with the same output
  • Having syndicated content distributed among different channels and platforms

These are the various ways you can canonicalize multiple URLs:

  • Rel=canonical – This attribute is added to a link element in the head section of a page. It tells Google which URL is the original version when the same content is also featured at a separate URL.
  • 301 Redirect – As mentioned above, this one pertains to a page that has been permanently moved from point A to point B. This status code signals to Google that you don’t want anyone visiting point A anymore and would like all traffic to go to point B.
  • Passive Parameters – Google Search Console allows you to mark particular URL parameters as passive, either on specific URLs or on all of them. This tells the search engine to treat a URL as if the parameter doesn’t exist, because the parameter does not change what the page displays. You just need to log in to your Search Console account and go to the URL Parameters section to set them.
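The first and third methods above can be sketched together: strip the parameters you consider passive, then emit the rel=canonical link element for the cleaned URL. The parameter names in PASSIVE_PARAMS and both function names are hypothetical choices for this example; which parameters are actually passive depends on your own site.

```python
from html import escape
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical parameters that do not change what the page displays.
PASSIVE_PARAMS = {"utm_source", "sessionid"}


def strip_passive_params(url: str) -> str:
    """Remove passive query parameters and any fragment from a URL."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in PASSIVE_PARAMS
    ]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


def canonical_link_tag(url: str) -> str:
    """Build the link element that belongs in the page's head section."""
    href = escape(strip_passive_params(url), quote=True)
    return '<link rel="canonical" href="{}">'.format(href)


print(strip_passive_params("https://sitesample.com/blog/post/?utm_source=news&page=2"))
print(canonical_link_tag("https://sitesample.com/blog/post/?sessionid=abc"))
```

Parameters that do change the content, such as page in the first call, are kept so that distinct pages are not merged by mistake.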