Author Topic: 301s or canonicals?  (Read 4021 times)

jetboy

  • Inner Core
  • Sr. Member
  • *
  • Posts: 433
  • Hens of warfare!
301s or canonicals?
« on: August 03, 2016, 12:47:16 PM »
If I have three URLs my homepage can be accessed by:

https://myurl/
https://myurl/?parameter=one
https://myurl/?parameter=two

a canonical element pointing to https://myurl/ would prevent duplicate-content issues. In the days before canonical elements, I'd 301 https://myurl/?parameter=one and https://myurl/?parameter=two to https://myurl/ to fix this, optionally writing the values of parameter into the page source to be picked up by analytics. Would this now be considered bad practice?
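
For illustration only, here is a minimal Python sketch of the 301 approach described above: work out whether a requested URL carries a query string and, if so, which clean URL to redirect to and which parameter values to hand on to analytics. The function name redirect_plan is hypothetical, and dropping every query parameter is an assumption that suits this homepage example rather than a general rule.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qs


def redirect_plan(url):
    """Return (clean_url, params) if the URL should be 301ed, else None."""
    parts = urlsplit(url)
    if not parts.query:
        # No query string: this is already the clean URL, serve it with a 200.
        return None
    # Rebuild the URL without the query string or fragment.
    clean_url = urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))
    # Keep the parameter values so they can be written into the page
    # source (or logged) for analytics after the redirect.
    params = parse_qs(parts.query)
    return clean_url, params


for url in (
    "https://myurl/",
    "https://myurl/?parameter=one",
    "https://myurl/?parameter=two",
):
    print(url, "->", redirect_plan(url))
```

The server would answer with a 301 to clean_url whenever the function returns a tuple, and serve the page (with its canonical element) when it returns None.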

JasonD

  • Inner Core
  • Hero Member
  • *
  • Posts: 1420
  • Look at THAT!!!!
Re: 301s or canonicals?
« Reply #1 on: August 03, 2016, 03:42:53 PM »
I don't think the 301s will ever be considered bad practice, but the canonicals are the modern equivalent of dealing with these issues from an SE perspective, mostly because they are so easy to do and most SEOs are stupid nowadays and wouldn't know how to do something server side.

However, they don't fix the usability issues for a human, so in practical terms, I'd do both. Canonicalisation for the Search engines, as it's so easy, but double up with the 301 so people get clean URL strings.

The 301 and canonicalisation won't cause any SE issues and to me.... it feels clean for all.

jetboy

Re: 301s or canonicals?
« Reply #2 on: August 05, 2016, 09:11:06 AM »
Thanks Jason.

Rumbas

  • Global Moderator
  • Hero Member
  • *****
  • Posts: 2106
  • Viking Wrath
Re: 301s or canonicals?
« Reply #3 on: August 07, 2016, 02:10:55 PM »
>I'd do both. Canonicalisation for the Search engines,

Hmm, you sure?
Would the search engine be able to catch the canonical if they got a 301 first? I would probably only do canonicals?

Torben

  • Global Moderator
  • Sr. Member
  • *****
  • Posts: 305
Re: 301s or canonicals?
« Reply #4 on: August 07, 2016, 06:14:57 PM »
>Would the search engine be able to catch the canonical if they got a 301 first?
No.

You can set the canonical link on the primary url/page.

JasonD

Re: 301s or canonicals?
« Reply #5 on: August 07, 2016, 06:19:16 PM »
Hey Rasmus, there's two answers to that.

1. The search engine while seeing the header showing the 301 will also grab the content, as SEs rarely (and never GBot in the last 10 years or so) make a simple HEAD request, preferring to grab the whole page and header content in one go.

This has the desired effect that the message of the 301 is reinforced by the canonical tag, delivering the outcome we want.

However, I am going with answer #2.

Ermm, yeh. I am an idiot ;)

ergophobe

  • Inner Core
  • Hero Member
  • *
  • Posts: 9324
Re: 301s or canonicals?
« Reply #6 on: August 08, 2016, 10:10:01 PM »
Rasmus,

That was the first thing I thought of, similar to putting a robots noindex tag on a page and then blocking the crawl in robots.txt

But then I was thinking that if you have a non-canonical inbound link and you 301 it and then put a canonical tag on it, you send a stronger signal that the inbound link is not canonical... but then I couldn't work out the sequence in my head.

Quote
The search engine while seeing the header showing the 301 will also grab the content

But that isn't up to the search engine to decide. Yes, Googlebot (and for that matter your browser) will just make a GET request straightaway (essentially will never make a HEAD request), but when the request comes in the server will say "Sorry, not giving you that page. My webmaster told me to give you this one instead."

So Googlebot may *request* the content straightaway, but it isn't going to *get* it (so to speak) until the server decides it has fully negotiated the request and all the redirects, any content negotiation, etc etc.

So as far as Google knows, the content of example.com/page?param=random could be 100% different from the page it is 301ed to -- example.com/page. It couldn't and shouldn't make any assumptions about the content of example.com/page?param=random based on what it finds at example.com/page

Which gets me back to thinking that the only purpose of the rel canonical is to say "You may have found and indexed a link that you thought was good, but you have now noticed that you got redirected because that page is permanently moved to here. The reason is because I am the one true page for this content."

But I haven't the slightest clue whether or not Google cares about that second sentence (which is the signal rel canonical would be sending) or that it would add anything to the first sentence.


« Last Edit: August 08, 2016, 10:14:14 PM by ergophobe »

JasonD

Re: 301s or canonicals?
« Reply #7 on: August 08, 2016, 10:18:06 PM »
>But that isn't up to the search engine to decide. Yes, Googlebot (and for that matter your browser) will just make a GET request straightaway (essentially will never make a HEAD request), but when the request comes in the server will say "Sorry, not giving you that page. My webmaster told me to give you this one instead."


Ahhhh no. That's not the case.

Just as a page that delivers a 404 message can be just the status code in the header, it can also be the status code plus content, e.g. https://the.domain.name/fdsufidshfuidsbif

A 301 status code can have content too. It's just not usual to do it that way.

ergophobe

Re: 301s or canonicals?
« Reply #8 on: August 09, 2016, 12:24:57 AM »
Sure, but it's the content of the final destination page at the final URL. It just so happens in Gary's case that this is the same content, but Google can't know that.

Or am I missing something?

JasonD

Re: 301s or canonicals?
« Reply #9 on: August 09, 2016, 10:18:35 AM »
> Sure, but it's the content of the final destination page at the final URL

No, it doesn't have to be.

There can be content that is seen by the browser / requester / search engine bot alongside the header.

User enters http://site.dom/Page1.html --> Server returns 301 status code and content --> browser acts on status code and redirects to http://site.dom/page2.html --> Server sends 200 status code and content --> browser renders content

With a search engine there is rarely any direct rendering but the content is sent back to the mothership, alongside the status codes etc for further processing.

In short... You can have both a 301 and content (including canonical tags) for a Search Engine to see.
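
As a rough sketch of that point, the Python below starts a throwaway local server that answers every GET with a 301, a Location header and an HTML body containing a canonical link, then fetches it with http.client, which does not follow redirects on its own. The handler name, the helper fetch_without_following and the body markup are made up for the example, and the URL is just the placeholder from the post above. It only shows that the body travels with the 301 response; whether a search engine pays any attention to it is a separate question that this does not answer.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Placeholder destination URL, taken from the example in the post above.
CANONICAL = "http://site.dom/page2.html"

BODY = (
    "<html><head>"
    f'<link rel="canonical" href="{CANONICAL}">'
    "</head><body>Moved</body></html>"
).encode("utf-8")


class RedirectWithBody(BaseHTTPRequestHandler):
    """Answer every GET with a 301 plus an HTML body."""

    def do_GET(self):
        self.send_response(301)
        self.send_header("Location", CANONICAL)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(BODY)))
        self.end_headers()
        self.wfile.write(BODY)

    def log_message(self, *args):
        pass  # keep the demo output quiet


def fetch_without_following(host, port, path):
    """GET a path and return (status, Location header, body) as received."""
    conn = http.client.HTTPConnection(host, port, timeout=5)
    conn.request("GET", path)
    resp = conn.getresponse()
    result = (resp.status, resp.getheader("Location"), resp.read().decode("utf-8"))
    conn.close()
    return result


# Port 0 asks the OS for any free port.
server = HTTPServer(("127.0.0.1", 0), RedirectWithBody)
threading.Thread(target=server.serve_forever, daemon=True).start()

status, location, body = fetch_without_following(
    "127.0.0.1", server.server_port, "/Page1.html"
)
server.shutdown()
server.server_close()

print(status, location)
print(body)
```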

jetboy

Re: 301s or canonicals?
« Reply #10 on: August 09, 2016, 03:21:42 PM »
Code:
<HTML><HEAD><meta http-equiv="content-type" content="text/html;charset=utf-8">
<TITLE>302 Moved</TITLE></HEAD><BODY>
<H1>302 Moved</H1>
The document has moved
<A HREF="http://www.google.co.uk/?gfe_rd=cr&amp;ei=L_KpV7bxINDU8gfPka5Q">here</A>.
</BODY></HTML>

HTML source for a request for http://google.com from the UK. The RequestPolicy add-on for Firefox prevents the redirect, so makes the body visible. Would Curl or Wget see the same by default?

Quote
You can have both a 301 and content (including canonical tags) for a Search Engine to see

But the canonical on the destination page, right? The requested URL (with the 301 response code) wouldn't be a canonical source for any content.

JasonD

Re: 301s or canonicals?
« Reply #11 on: August 09, 2016, 04:37:42 PM »
> But the canonical on the destination page, right? The requested URL (with the 301 response code) wouldn't be a canonical source for any content.

The canonical can be on any of the dupe pages but pointing to the page you want to be definitive. Likely the same end destination of the 301s.

So let's presume 4 unique URLs -

page1.html

page1.html?foo=abc

page2.html?something

page2.html

We want page2.html to be what is shown in the SERPs and never page1.html, page1.html?foo=abc, or page2.html?something.

But page1.html and page2.html, with or without query strings, show the real-world development and changes made over time in the URL structure...


page1.html : Has a 301 and a canonical pointing to page2.html

page1.html?foo=abc : Has a 301 and a canonical pointing to page1.html

page2.html?something : Has a 301 and a canonical pointing to page2.html

page2.html : Has neither a canonical nor a 301, just a 200 status code and the normal content we want to end up showing in the SERPs

By doing the above, although there is a mishmash, the SE will only show page2.html, and the only URL seen in the browser bar by users (as the redirects occur too quickly to notice) will be page2.html.
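
To make the chain in that setup easier to follow, here is a small Python sketch that walks the redirects listed above, hop by hop, until it reaches a URL with no redirect. The REDIRECTS table just restates the four URLs from the post; the resolve function and its max_hops limit are hypothetical helpers for illustration, not anything a search engine is known to run.

```python
# Redirect targets as described in the post: each key 301s to its value.
# page2.html is absent because it answers with a plain 200.
REDIRECTS = {
    "page1.html": "page2.html",
    "page1.html?foo=abc": "page1.html",
    "page2.html?something": "page2.html",
}


def resolve(url, redirects, max_hops=10):
    """Follow redirects from url and return the list of URLs visited."""
    hops = [url]
    while url in redirects:
        if len(hops) > max_hops:
            # Guard against loops in a misconfigured redirect table.
            raise RuntimeError("too many redirects")
        url = redirects[url]
        hops.append(url)
    return hops


for start in ("page1.html", "page1.html?foo=abc", "page2.html?something", "page2.html"):
    print(" -> ".join(resolve(start, REDIRECTS)))
```

Note that page1.html?foo=abc takes two hops (via page1.html) before landing on page2.html, which is the mishmash mentioned above; every start URL still ends at page2.html.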