The Core

Why We Are Here => Water Cooler => Topic started by: 4Eyes on January 03, 2012, 05:32:56 PM

Title: Archive.org changes...
Post by: 4Eyes on January 03, 2012, 05:32:56 PM
... as I understand it, they are now only showing archives for sites that have a current robots.txt.

Not checked it, just what I have been told.
Title: Re: Archive.org changes...
Post by: Zwart on January 03, 2012, 05:37:15 PM
"current" ????
Title: Re: Archive.org changes...
Post by: littleman on January 03, 2012, 06:59:50 PM
I'm not seeing it on pages I have bookmarked.  Maybe they'll still be live, but removed from search?
Title: Re: Archive.org changes...
Post by: I, Brian on January 03, 2012, 10:26:06 PM
Am rebuilding a website after the host actually dropped the database completely in a server move (wasn't Clook!). Archive.org has a good record of the site from early last year, which I am using to rebuild the posts, and there's no robots.txt on the site.

Would presume archive.org are now looking to maintain crawls only on sites with a robots.txt, if they are making any changes based on it??
Title: Re: Archive.org changes...
Post by: 4Eyes on January 04, 2012, 10:12:22 AM
http://www.archive.org/post/406187/we-were-unable-to-get-the-robotstxt-document-to-display-this-page
... not very clear, but it seems to imply that archives are not shown unless a robots.txt is there.

Plenty of other threads in their forum on the general subject, but their forum is such a pain to use that I lost the will to live after 5 minutes.
Title: Re: Archive.org changes...
Post by: 4Eyes on January 04, 2012, 10:14:03 AM
... and there are a few posts on the illogical use of a current robots.txt to block historic pages.

Archive.org = A bag of worms IMO
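
For anyone wanting to check what their own current robots.txt would say to the archive crawler, here is a rough sketch using Python's standard urllib.robotparser. It assumes the crawler is matched by the `ia_archiver` user-agent token, and the `blocks_archiver` helper, the sample robots.txt and the example.com URL are all made up for illustration. It only reports what the rules say; it does not show how archive.org actually decides what to display.

```python
from urllib.robotparser import RobotFileParser


def blocks_archiver(robots_txt: str, url: str, agent: str = "ia_archiver") -> bool:
    """Return True if the robots.txt text disallows `agent` from fetching `url`.

    `ia_archiver` as the default agent is an assumption about which
    user-agent token the archive crawler is matched against.
    """
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(agent, url)


# Hypothetical robots.txt a site owner might be serving today.
example = """\
User-agent: ia_archiver
Disallow: /

User-agent: *
Disallow:
"""

if __name__ == "__main__":
    page = "http://example.com/old-page.html"
    print("ia_archiver blocked:", blocks_archiver(example, page))
    print("Googlebot blocked:", blocks_archiver(example, page, agent="Googlebot"))
```

If the first check comes back blocked for a page that was archived years ago, that would be the "current robots.txt blocking historic pages" situation those posts complain about.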