Webmaster Forum


Old 10-16-2003, 12:51 PM   #1 (permalink)
Rob
Inactive
 
Join Date: 10-13-03
Location: NL
Posts: 317
Googlebot trying to find old pages?

About 4 days ago, I redid my website with a new design and new content. The page structure has also changed. Today I noticed the first visit from Googlebot since the new site went up. All it did was try to fetch pages that no longer exist.

I guess that does no harm, but how come it's not crawling the new pages through the new links on the main page?

Also, for a few months now, when I look for inbound links to my website on Google, it shows only 1 result. Several hundred sites link to mine (including a lot with PR4 or higher), yet only 1 shows up. With that many pages linking to me, I would also expect Googlebot to crawl my site a little more often.

Any ideas would be great.
Old 10-16-2003, 01:06 PM   #2 (permalink)
Bubo
Inactive
Join Date: 10-12-03
Posts: 356
Can you post the link to your website so we can take a look at it?
Old 10-16-2003, 01:12 PM   #3 (permalink)
PhilC
Crap Bag
Join Date: 10-12-03
Posts: 1,727
Googlebot will try to spider the URLs that it thinks exist, so spidering your old URLs is to be expected. It'll come back again looking for them too, because there may have been a server error. It's good that Google doesn't drop URLs just because it can't find them one time.

Google shows links to pages, not to websites. Maybe most of the PR4+ links are linking to inner pages. Check them out.
Old 10-16-2003, 01:28 PM   #4 (permalink)
Rob
Site link: Free Online Games

I assume Googlebot will eventually crawl all new pages?

The sites with at least PR4 linking to mine are all linking to my domain, not to separate pages on it.

Here are a few examples:
http://www.freestuffcenter.com/sub/games.shtml
http://www.free-games.to (at the bottom)
http://www.actionflash.com

These links used to show up in Google as inbound links, but suddenly they all disappeared except for one.

http://www.google.com/search?sourcei...inegames%2Ecom

That's a weird page to show up as well. It has a PR3 and doesn't even have a real link to my site.

I'm confused :wink:
Old 10-16-2003, 01:36 PM   #5 (permalink)
Rob
To compare:
http://www.alltheweb.com/search?avkw...m&_sb_lang=any
Old 10-16-2003, 03:39 PM   #6 (permalink)
PhilC
Yes, Google will spider all your new stuff before long.

Alltheweb often shows vast numbers of IBLs (inbound links) when Google shows far fewer.

I can't see any reason why Google should have those pages in its index and not show them as links for your URL. I only looked at two of them. For the first one, Google can't display the cache for some reason, but the second one is ok, and Google's cache has your link on it - and the link is standard HTML, so it should be showing.

I can only suggest that it is one of Google's glitches and that it'll work out ok. Personally, I wouldn't worry about it, although it is frustrating while you're sitting there with PR3.
Old 10-16-2003, 03:45 PM   #7 (permalink)
Rob
Google probably can't cache it because of the frames. I'm not sure how caching works there.

Thanks for the help. I'll keep an eye on my logs for Googlebot visits. Hopefully it won't take too long.
Old 10-16-2003, 05:33 PM   #8 (permalink)
PhilC
Frames wouldn't matter. Google would cache the frameset page and, if the frame sources were absolute, the frame pages would display ok. If they were relative, the page not found messages would come up in the frames.
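
A rough illustration of the absolute versus relative frame source point, using Python's `urllib.parse.urljoin`. The cache address and the file names are made up for the example (the real cache URL format isn't given in this thread), and it assumes the cached frameset is resolved against the cache's own address, as described above.

```python
from urllib.parse import urljoin

# Hypothetical addresses, for illustration only.
ORIGINAL_PAGE = "http://www.example.com/games/index.html"
CACHED_COPY = "http://cache.example.net/search?q=cache:www.example.com/games/index.html"


def resolve_frame(base_url, frame_src):
    """Resolve a frame's src relative to the page being viewed, as a browser would."""
    return urljoin(base_url, frame_src)


absolute_src = "http://www.example.com/games/menu.html"
relative_src = "menu.html"

# An absolute src points at the original server wherever the frameset is served from.
print(resolve_frame(CACHED_COPY, absolute_src))
# A relative src resolves against the cache host, where the file does not exist.
print(resolve_frame(CACHED_COPY, relative_src))
# Served from the original site, the same relative src resolves correctly.
print(resolve_frame(ORIGINAL_PAGE, relative_src))
```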

At the moment Google is having a problem with its caches - I just came across another one.
Old 10-17-2003, 05:34 AM   #9 (permalink)
Johan007
v7n Mentor
Join Date: 10-15-03
Posts: 1,932
I have updated my site by removing a major directory and putting the files in the root. Google still has the old pages in its index and tries to visit them all the time. It's been over a month now and I want those pages removed, because I think Google has put a limit of 200 pages on my site for the time being because of the dynamic URLs.

I cannot teach him, the boy has no patience!!!
Old 10-17-2003, 05:44 AM   #10 (permalink)
PhilC
Why do you think there's a 200 URL limit on your site?
Old 10-17-2003, 07:30 AM   #11 (permalink)
Johan007
lol no, I am wrong. Google has added more pages now.

The old pages are still there, which is disappointing.

I am still waiting for PageRank to show in the toolbar, but the pages themselves show no improvement, which may prove the theory that toolbar PR is not what Google uses to rank pages on a site.
Old 10-17-2003, 07:32 AM   #12 (permalink)
PhilC
Quote:
Originally Posted by Johan007
... which may prove the theory that toolbar PR is not what Google uses to rank pages on a site.
It's not a theory. Google uses about 100 different factors to rank pages (their number) - PageRank is just one of them - an important one, but just one of 100.

The Toolbar PR figure isn't a PageRank value - it's a label. If you don't understand that, ask and it will be explained.
Old 10-18-2003, 05:07 AM   #13 (permalink)
Rob
A visit from Googlebot again, today:

64.68.84.139 - - [18/Oct/2003:16:54:51 -0500] "GET /robots.txt HTTP/1.0" 200 39 "-" "Googlebot/2.1 (+http://www.googlebot.com/bot.html)"

64.68.84.139 - - [18/Oct/2003:16:54:52 -0500] "GET / HTTP/1.0" 200 39946 "-" "Googlebot/2.1 (+http://www.googlebot.com/bot.html)"

Grr. Why does it not browse my pages?
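
For anyone who wants to watch for these visits without reading the raw log by eye, here is a minimal sketch that pulls Googlebot requests out of log lines in this format. The regular expression and the `googlebot_requests` helper are illustrative assumptions, not a standard tool, and matching on the user-agent string alone can be fooled by anything that fakes it.

```python
import re

# The two lines posted above, in Apache "combined" log format.
LOG_LINES = [
    '64.68.84.139 - - [18/Oct/2003:16:54:51 -0500] "GET /robots.txt HTTP/1.0" 200 39 "-" "Googlebot/2.1 (+http://www.googlebot.com/bot.html)"',
    '64.68.84.139 - - [18/Oct/2003:16:54:52 -0500] "GET / HTTP/1.0" 200 39946 "-" "Googlebot/2.1 (+http://www.googlebot.com/bot.html)"',
]

# One named group per field of the combined log format.
LINE_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def googlebot_requests(lines):
    """Return (path, status) for every line whose user agent mentions Googlebot."""
    hits = []
    for line in lines:
        match = LINE_RE.match(line)
        if match and "Googlebot" in match.group("agent"):
            hits.append((match.group("path"), int(match.group("status"))))
    return hits


for path, status in googlebot_requests(LOG_LINES):
    print(status, path)
```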
Old 10-18-2003, 05:15 AM   #14 (permalink)
PhilC
Assuming that your pages are crawlable and not excluded by the robots.txt file, they'll get crawled.
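
One way to check the robots.txt half of that assumption is Python's standard `urllib.robotparser`. This is only a sketch: the rules in `SAMPLE_ROBOTS_TXT` and the example.com URLs are invented for the example, since the real file's contents aren't quoted in this thread.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; substitute the text of the real file.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin/
"""


def is_crawlable(robots_txt, user_agent, url):
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# A normal page is allowed, a page under the disallowed directory is not.
print(is_crawlable(SAMPLE_ROBOTS_TXT, "Googlebot", "http://www.example.com/fowlwords.html"))
print(is_crawlable(SAMPLE_ROBOTS_TXT, "Googlebot", "http://www.example.com/cgi-bin/form.cgi"))
```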
Old 10-18-2003, 05:17 AM   #15 (permalink)
Rob
http://www.1001onlinegames.com/robots.txt

Looks okay to me. It's just strange to see Googlebot request only the robots.txt file. I don't understand why.
Old 10-18-2003, 08:27 AM   #16 (permalink)
PhilC
Yup. The robots.txt file is ok. Google does that sometimes, but it'll get crawled.

I've crawled the site and there aren't any problems with crawling but, just out of interest, the external link to media.fastclick.net/... is taking as much PageRank as your home page. It's on every page, and you might want to change the links to JavaScript to avoid the PR drain. Those two pages are the most important, PR-wise.

The next most important page is another external one at tafmaster.com. You might want to change the links to that URL into JavaScript as well.
Old 10-18-2003, 11:27 AM   #17 (permalink)
Rob
Hmm indeed, I'll look into it. Thanks.

Just one question. On the game pages (like fowlwords.html) there are 3 links: one to the main page, one to media.fastclick.net/.. and one to the TAFmaster form. Does the drain to these 2 outbound links have any effect on any page at all? I don't think a subpage like fowlwords.html can donate extra PR to the main page, or can it?
Old 10-18-2003, 11:30 AM   #18 (permalink)
Rob
FastClick does not allow code modification.

Edit: Googlebot is probably only able to follow the link within the NOSCRIPT tags. I guess I can edit that one.
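
To see which links a crawler that does not run JavaScript could find, a quick sketch with Python's standard `html.parser` is enough. The markup in `SAMPLE_HTML` and the ads.example addresses are invented stand-ins, not the real FastClick code, and the whole thing assumes the crawler simply reads `<a href>` tags from the HTML source.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag found in the HTML source."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


# Hypothetical page fragment: one JavaScript-only link, one NOSCRIPT link, one plain link.
SAMPLE_HTML = """
<script>function go() { window.location = "http://ads.example/js-only"; }</script>
<span onclick="go()">Sponsor</span>
<noscript><a href="http://ads.example/noscript">Sponsor</a></noscript>
<a href="/index.html">Home</a>
"""


def visible_links(html):
    """Return the links a non-JavaScript crawler would see in the source."""
    collector = LinkCollector()
    collector.feed(html)
    collector.close()
    return collector.links


print(visible_links(SAMPLE_HTML))
```

The address that appears only inside the script is never reported, while the NOSCRIPT link and the plain link both are.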
Old 10-18-2003, 11:35 AM   #19 (permalink)
Rob
Also, I have a few non-javascript outbound links on my main page (and most of the subpages) to some of my other websites and some of my friend's websites. What would you suggest, placing javascript on these links on the main page as well, or leave them as is?
Old 10-18-2003, 12:30 PM   #20 (permalink)
PhilC
Quote:
Originally Posted by Rob
Just one question. On the game pages (like fowlwords.html) there are 3 links: one to the main page, one to media.fastclick.net/.. and one to the TAFmaster form. Does the drain to these 2 outbound links have any effect on any page at all? I don't think a subpage like fowlwords.html can donate extra PR to the main page, or can it?
Whatever page(s) the external links are on, PR will leave the site. E.g. two thirds of the PR weight that fowlwords.html can pass on is being passed out of the site and only one third is being passed on inside the site, to the home page. So the 2 OBLs (outbound links) that you mentioned affect all pages within the site.
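
The one third / two thirds split can be worked through in a few lines. This sketch assumes the textbook PageRank model, where a page divides what it can pass on equally among its links; the `link_shares` helper is made up for the example and is not how Google necessarily weights links in practice.

```python
from fractions import Fraction


def link_shares(link_hosts, site_host):
    """Split a page's passable weight equally across its links.

    Returns (share kept inside the site, share leaving the site).
    """
    share = Fraction(1, len(link_hosts))
    internal = sum(share for host in link_hosts if host == site_host)
    external = sum(share for host in link_hosts if host != site_host)
    return internal, external


# The three links on fowlwords.html: home page, FastClick, TAFmaster.
links_on_page = ["www.1001onlinegames.com", "media.fastclick.net", "tafmaster.com"]
internal, external = link_shares(links_on_page, "www.1001onlinegames.com")
print("stays in site:", internal)
print("leaves site:", external)
```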

Quote:
Originally Posted by Rob
FastClick does not allow code modification..
There's no difference between a straight link to FastClick, like you have, and redoing the same links in JavaScript. FastClick can't tell the difference. Here are the external pages that your site links to, each followed by the PageRank value that your site is giving to it. Below them is your home page's [root] PageRank value, so you can see how draining the 2 outbound links are. The figures are based on no inbound links to the site.

media.fastclick.net/w/click.here?sid=13590&m=1&c=1 3.23935988585561
regman.freeze.com/survey/marine/index.asp?s=marine&f=robbie 0.281116947760822
www.all4humor.com 0.516656849587047
www.funny-pet-pictures.com 0.516656849587047
www.rudefun.com 0.516656849587047
www.top-free-games.com 0.516656849587047
aftrk.com/c/c?b=13932&h=3644&sh=307919&bt=1x1 0.385539901826224
aftrk.com/c/c?b=16311&h=3644&sh=307919&bt=1x1 0.385539901826224
aftrk.com/c/c?b=17326&h=3644&sh=307919&bt=1x1 0.385539901826224
service.bfast.com/bfast/click?bfmid=37925529&siteid=40705098&bfpage=gamesville 0.385539901826224
tafmaster.com/taf/2403/252088/ 2.8067041967862
www.alltheweb.com/?q=1001onlinegames.com 0.156144580634181

[root] 3.23935988585561

The rest of your internal pages have PageRank values ranging from 0.5 down to 0.1, so you can see how much, by comparison, is being given to those 2 external URLs that I mentioned.
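
For a feel of why links that sit on every page end up with so much, here is a toy calculation. It is a sketch only: it uses the simple unnormalised formula from the original PageRank paper, PR(A) = (1 - d) + d * sum(PR(T) / C(T)), with d = 0.85, on an invented site of one home page and ten game pages. It is not the tool that produced the figures above, and its numbers will not match them.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Iterate PR(A) = (1 - d) + d * sum(PR(T) / C(T)) over a link graph.

    links maps each page to the list of pages it links to.
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    ranks = {page: 1.0 for page in pages}
    for _ in range(iterations):
        new_ranks = {page: 1.0 - damping for page in pages}
        for source, targets in links.items():
            if not targets:
                continue
            # Each link gets an equal share of what the source page can pass on.
            share = damping * ranks[source] / len(targets)
            for target in targets:
                new_ranks[target] += share
        ranks = new_ranks
    return ranks


# Hypothetical site: the home page links to every game page and to two
# external URLs ("ad" and "taf"); every game page links to home, ad and taf.
GAMES = [f"game{i}" for i in range(10)]
SITE = {"home": GAMES + ["ad", "taf"]}
for game in GAMES:
    SITE[game] = ["home", "ad", "taf"]

ranks = pagerank(SITE)
for page in ("home", "ad", "taf", "game0"):
    print(page, round(ranks[page], 3))
```

In this toy layout the two external URLs come out at least as high as the home page and well above any single game page, which is the same pattern as in the list above.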

Quote:
Originally Posted by Rob
Also, I have a few non-javascript outbound links on my main page (and most of the subpages) to some of my other websites and some of my friend's websites. What would you suggest, placing javascript on these links on the main page as well, or leave them as is?
It's good to have a few outbound links and the others aren't too big a drain, although you may still want to javascript some of the ones that are not there for linkpop/pagerank purposes.