From: Bob Gezelter
Date: Thu, 30 Oct 2008 03:35:40 -0700 (PDT)
Local: Thurs, Oct 30 2008 9:35 pm
Subject: Errata: Googlebot erroneously appears to be adding "index.html" to URLs
Beginning on or about October 15, Googlebot appears to be adding
"index.html" to references to our site that contain only the domain
name (e.g., "http://www.rlgsc.com").

This is clearly erroneous, as there are a variety of default names for
home pages. It is registered as a "Crawl error -- page not found". The
URL as present in the referring web page would have been correct if
"index.html" had not been appended to it. In fact, the home page on the
sites that we build is typically "default.html".

Timely correction would be appreciated.

- Bob Gezelter, http://www.rlgsc.com


From: webado
Date: Thu, 30 Oct 2008 05:06:15 -0700 (PDT)
Local: Thurs, Oct 30 2008 11:06 pm
Subject: Re: Errata: Googlebot erroneously appears to be adding "index.html" to URLs
What referring page does it say has that url on it?

On Oct 30, 6:35 am, Bob Gezelter wrote:


From: Bob Gezelter
Date: Thu, 30 Oct 2008 16:37:46 -0700 (PDT)
Local: Fri, Oct 31 2008 10:37 am
Subject: Re: Errata: Googlebot erroneously appears to be adding "index.html" to URLs
On Oct 30, 8:06 am, webado wrote:

Webado,

One such example is: http://www.openvms-rocks.com. According to the
Errata list in Webmaster Tools, this page references http://www.rlgsc.com/index.html.
It does not; the reference is http://www.rlgsc.com.
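
A claim like this can be checked by parsing the referring page and listing the link targets it actually contains. The following is a minimal sketch, not the method Webmaster Tools uses: the names `LinkCollector` and `links_to` are hypothetical, and `SAMPLE_HTML` is made-up markup standing in for the real page.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect the absolute target of every <a href="..."> on a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the page's own URL.
                self.links.append(urljoin(self.base_url, value))


def links_to(html, base_url, target):
    """Return the links in `html` that are exactly equal to `target`."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    return [link for link in collector.links if link == target]


# Hypothetical markup, standing in for the real referring page.
SAMPLE_HTML = '<p>See <a href="http://www.rlgsc.com">RLGSC</a> for details.</p>'
BASE = "http://www.openvms-rocks.com/"

bare = links_to(SAMPLE_HTML, BASE, "http://www.rlgsc.com")
with_index = links_to(SAMPLE_HTML, BASE, "http://www.rlgsc.com/index.html")
```

With this sample markup, `bare` holds one match and `with_index` is empty, which is the situation described above: the page links to the bare domain only.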

I am sure that I am not the only one who uses a URL without a filename
and type to refer to the home page of a site.

- Bob Gezelter, http://www.rlgsc.com


From: webado
Date: Thu, 30 Oct 2008 19:52:47 -0700 (PDT)
Local: Fri, Oct 31 2008 1:52 pm
Subject: Re: Errata: Googlebot erroneously appears to be adding "index.html" to URLs
Perhaps it had found that url in a previously cached copy of that or
other pages from that site.

On Oct 30, 7:37 pm, Bob Gezelter wrote:


From: Bob Gezelter
Date: Fri, 31 Oct 2008 00:30:50 -0700 (PDT)
Local: Fri, Oct 31 2008 6:30 pm
Subject: Re: Errata: Googlebot erroneously appears to be adding "index.html" to URLs
On Oct 30, 10:52 pm, webado wrote:

> Perhaps it had found that url in a previously cached copy of that or
> other pages from that site.

>> .. deleted in the interest of conserving bandwidth/space ...

webado,

Unlikely. I am familiar with most of these pages from earlier
curiosity, and they were never a problem until now, and never had any
filename/type in the URL. Also, there never was a http://www.rlgsc.com/index.html
page for them to have linked to in any event.

- Bob Gezelter, http://www.rlgsc.com


From: Bob Gezelter
Date: Fri, 31 Oct 2008 12:50:09 -0700 (PDT)
Local: Sat, Nov 1 2008 6:50 am
Subject: Re: Errata: Googlebot erroneously appears to be adding "index.html" to URLs
On Oct 31, 2:30 am, Bob Gezelter wrote:

Note to all,

As a temporary "patch" I have added a redirect page at http://www.rlgsc.com/index.html

- Bob Gezelter, http://www.rlgsc.com


From: acidfanatic
Date: Sun, 23 Nov 2008 20:12:24 -0800 (PST)
Local: Mon, Nov 24 2008 3:12 pm
Subject: Re: Errata: Googlebot erroneously appears to be adding "index.html" to URLs
I am having this same problem, and my ranking in Google has suddenly
tanked.
I found a bunch of http://www.acidfanatic.com//index.html "not found"
errors from Googlebot in Webmaster Tools.
You are killing my site, Google.
www.acidfanatic.com has been around since 2001. Do a search for the
word acidfanatic and there are over 5,000 references to my site, and
yet it is being buried in the search listings for the keywords that it
is most relevant for, "acid loops" and "acid music". Your search
algorithm is not working, and your results are not relevant if they
exclude one of the most popular sites for its genre from being found.

On Oct 30, 2:35 am, Bob Gezelter wrote:


JohnMu Google employee  
From: JohnMu
Date: Mon, 24 Nov 2008 03:05:41 -0800 (PST)
Local: Mon, Nov 24 2008 10:05 pm
Subject: Re: Errata: Googlebot erroneously appears to be adding "index.html" to URLs
Hi acidfanatic

Looking at your site, I don't see any technical issues which would
result in your site having trouble with regards to crawling, indexing
or ranking. In particular, the tests for /index.html absolutely do not
impact anything. Every site has lots of missing URLs (many external
links are broken for lots of sites, but we try to crawl those URLs
just in case). I wouldn't worry about us accessing /index.html; if
your site does not use it, we won't count it against you (that
wouldn't be very reasonable :-)).

Cheers
John


From: Phil Payne
Date: Mon, 24 Nov 2008 03:24:57 -0800 (PST)
Local: Mon, Nov 24 2008 10:24 pm
Subject: Re: Errata: Googlebot erroneously appears to be adding "index.html" to URLs

JohnMu wrote:

John - it would be nice to see something comprehensive on the
index.html issue.

For example, there's the comment in the Google sitemap generator
writeup about the use of the subdirectory's date-last-modified for
index.html if index.html is not explicitly entered in the sitemap.

I'd like to know, for instance, what value for index.html's lastmod is
assumed if there's no explicit specification in a sitemap.

Etc.

Be nice to have it all in one place.


From: luzie
Date: Mon, 24 Nov 2008 03:49:44 -0800 (PST)
Local: Mon, Nov 24 2008 10:49 pm
Subject: Re: Errata: Googlebot erroneously appears to be adding "index.html" to URLs

>>> I found a bunch of http://www.acidfanatic.com//index.html
>>> not found by googlebot in the webmaster tools
>>> You are killing my site google

I've read the same thing in your message in another recent thread here
and wonder why there should be any connection between a few 404 errors
(odd as they are, that's right) and the site being "killed"
(downgraded?) ... I don't know whether you really believe that
yourself, but if you do, think about it again.

-luzie-


From: luzie
Date: Mon, 24 Nov 2008 03:54:28 -0800 (PST)
Local: Mon, Nov 24 2008 10:54 pm
Subject: Re: Errata: Googlebot erroneously appears to be adding "index.html" to URLs
Having said this (see above), I admit that it IS a nasty behaviour of
Google to 'invent' addresses just to show them as "errors" again in
Webmaster Tools ... what is this good for ^^ (if 'index.html' is
not there, neither is 'foo.html' or foo_2.html or foo_3.html ...), where
does this stop?

-luzie-


From: webado
Date: Mon, 24 Nov 2008 06:04:37 -0800 (PST)
Local: Tues, Nov 25 2008 1:04 am
Subject: Re: Errata: Googlebot erroneously appears to be adding "index.html" to URLs
Your "temporary patch" is pretty worthless.
http://web-sniffer.net/?url=http%3A%2F%2Fwww.rlgsc.com%2Findex.html&s...

You are doing a meta refresh from index.html to default.html.

Since there is no html link in the body, matching the meta refresh
destination, Googlebot is left assuming (rightly) that index.html
exists.
So you have now replaced a 404 with a 200 for an empty page with no
connection to the rest of the site.
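
The pattern described here (a meta refresh with no matching link in the body) can be spotted mechanically. The sketch below is only an illustration: `RefreshChecker`, `refresh_without_link`, and the two sample pages are hypothetical, and it makes no claim about how Googlebot itself evaluates such pages.

```python
from html.parser import HTMLParser


class RefreshChecker(HTMLParser):
    """Record a meta-refresh target and every href found in <a> tags."""

    def __init__(self):
        super().__init__()
        self.refresh_target = None
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and (attrs.get("http-equiv") or "").lower() == "refresh":
            # Content looks like "0; url=default.html"; keep the part after "url=".
            content = attrs.get("content") or ""
            _, _, rest = content.partition(";")
            key, _, value = rest.strip().partition("=")
            if key.strip().lower() == "url":
                self.refresh_target = value.strip().strip("'\"")
        elif tag == "a" and attrs.get("href"):
            self.hrefs.append(attrs["href"])


def refresh_without_link(html):
    """True if the page has a meta refresh whose target is not linked from an <a>."""
    checker = RefreshChecker()
    checker.feed(html)
    target = checker.refresh_target
    return target is not None and target not in checker.hrefs


# Hypothetical pages: the first mimics the empty redirect page described above.
EMPTY_REDIRECT = (
    '<html><head><meta http-equiv="refresh" content="0; url=default.html">'
    "</head><body></body></html>"
)
REDIRECT_WITH_LINK = (
    '<html><head><meta http-equiv="refresh" content="0; url=default.html">'
    '</head><body><a href="default.html">Home</a></body></html>'
)

empty_flagged = refresh_without_link(EMPTY_REDIRECT)
linked_flagged = refresh_without_link(REDIRECT_WITH_LINK)
```

Only the first page is flagged. Adding the link does not turn the meta refresh into a real redirect; it only connects the page to the rest of the site.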

At best Googlebot might recognize that as a redirection to
default.html eventually, but now you also have http://www.rlgsc.com/default.html
as a duplicate of http://www.rlgsc.com/ .

Redirections need to be done server-side to be any good. And you must
not "fix" one problem (a 404, which is a natural response for something
that doesn't exist) by introducing another.
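
For illustration, a server-side redirect answers with a 301 status and a Location header instead of a 200 page. The sketch below uses Python's standard `http.server` as a stand-in; the site's real web server is not named in this thread, and `ALIASES`, `redirect_for`, and `Handler` are hypothetical names. Leaving /index.html to return a plain 404 is equally valid, so that entry is optional.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Assumption: the home page should be reachable under one URL only, "/".
# The "/index.html" entry is optional; a plain 404 for it is also fine.
ALIASES = {
    "/default.html": "/",
    "/index.html": "/",
}


def redirect_for(path):
    """Return the canonical path for an alias, or None if no redirect applies."""
    return ALIASES.get(path)


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        target = redirect_for(self.path)
        if target is not None:
            # Permanent redirect issued by the server, not by the page content.
            self.send_response(301)
            self.send_header("Location", target)
            self.send_header("Content-Length", "0")
            self.end_headers()
        elif self.path == "/":
            body = b"<html><body>Home page</body></html>"
            self.send_response(200)
            self.send_header("Content-Type", "text/html")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            # Anything that does not exist stays a 404.
            self.send_error(404)


# To try it locally (port 8000 is an arbitrary choice):
# HTTPServer(("127.0.0.1", 8000), Handler).serve_forever()
```

With this in place, a request for /default.html gets a 301 pointing at /, so only one URL for the home page returns a 200.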

On Oct 31, 2:50 pm, Bob Gezelter wrote:

