Webmaster Forum


03-31-2004, 10:41 AM   #1   spidersam
What directory does robots.txt need to be placed in?

What directory does robots.txt need to be placed in? I am asking with regard to allowing search engines to crawl my forum.

03-31-2004, 11:06 AM   #2   Andy
Your robots.txt only needs to be placed in the root directory; any references to lower-level directories are made from there.

However, robots.txt is an exclusion standard, so in terms of "allowing crawling" you shouldn't need to do anything except install your forum's search engine hack and link to the forum index.
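
To make the "root directory" point concrete: a crawler asks for robots.txt at the top level of the host, whichever page it starts from. A minimal Python sketch, assuming a placeholder host (www.example.com) and a hypothetical helper name, robots_txt_url, neither of which comes from this thread:

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url: str) -> str:
    """Return the robots.txt location for the host that serves page_url."""
    parts = urlsplit(page_url)
    # Keep scheme and host; replace the path and drop the query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


# A page deep inside the forum still maps to the host-level file.
print(robots_txt_url("http://www.example.com/forums/viewtopic.php?t=5"))
```

So a file placed at /forums/robots.txt would not be the one a crawler looks for; only the copy at the web root counts.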

03-31-2004, 02:02 PM   #3   spidersam
So, in the www/ folder?

I added this as a hack, but I am not even sure what it does:

Disallow: /forums/sutra*.html$
Disallow: /forums/ptopic*.html$
Disallow: /forums/ntopic*.html$
Disallow: /forums/ftopic*asc*.html$

03-31-2004, 02:07 PM   #4   MadHatter
This stops bots crawling any of the topics!
(I think, confirmation please...)

03-31-2004, 02:30 PM   #5   spidersam
Yes! Confirmation Please!

03-31-2004, 02:34 PM   #6   MadHatter
I am pretty sure that if you specify those rules for 'media partners google', Google cannot crawl any of those pages (the *'s mean anything, like any post number).
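
The wildcard reading can be checked with a rough sketch. The `*` and `$` characters are an extension honored by Googlebot rather than part of the original robots.txt standard; under that extension `*` matches any run of characters and a trailing `$` anchors the end of the URL path. The Python below translates the four rules from this thread into regular expressions. The helper names and the sample file names are hypothetical, made up for illustration:

```python
import re

# The four rules quoted in this thread (the part after "Disallow: ").
RULES = [
    "/forums/sutra*.html$",
    "/forums/ptopic*.html$",
    "/forums/ntopic*.html$",
    "/forums/ftopic*asc*.html$",
]


def pattern_to_regex(pattern: str) -> "re.Pattern[str]":
    """Translate a wildcard Disallow pattern into an anchored regex."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape literal text and let each '*' match any run of characters.
    body = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    # Without a trailing '$' the rule is a prefix match.
    return re.compile("^" + body + ("$" if anchored else ""))


def is_blocked(path: str) -> bool:
    """True if any of the rules above matches the given URL path."""
    return any(pattern_to_regex(rule).search(path) for rule in RULES)


# Hypothetical forum paths, for illustration only.
for path in ("/forums/sutra123.html",
             "/forums/ftopic45-0-asc-15.html",
             "/forums/ftopic45.html"):
    print(path, is_blocked(path))
```

Under this reading, a path such as /forums/ftopic45.html matches none of the four rules, so the rules would block only the sutra, ptopic, ntopic and "asc" variants rather than every topic page.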

03-31-2004, 02:48 PM   #7   spidersam
Can someone confirm MadHatter's suspicion? I'm totally clueless...

03-31-2004, 03:46 PM   #8   compar
The question is what was MadHatter trying to achieve with that entry?

03-31-2004, 03:49 PM   #9   spidersam
Disallow: /forums/sutra*.html$
Disallow: /forums/ptopic*.html$
Disallow: /forums/ntopic*.html$
Disallow: /forums/ftopic*asc*.html$

From my understanding, this was supposed to stop Google from crawling duplicate content, not stop it from crawling any content. But I am a supreme amateur at this, so I was just following directions.

03-31-2004, 03:50 PM   #10   MadHatter
Quote:
The question is what was MadHatter trying to achieve with that entry?
The guy said he added the hack but wasn't sure what it did. I was telling him what I thought it did, then I explained about the rule being specified for 'media partners google' to be in effect for Googlebot, and then explained what the *'s mean.
Thus answering the question of: not sure what it does.

03-31-2004, 03:52 PM   #11   MadHatter
Quote:
Originally Posted by spidersam
Disallow: /forums/sutra*.html$
Disallow: /forums/ptopic*.html$
Disallow: /forums/ntopic*.html$
Disallow: /forums/ftopic*asc*.html$

From my understanding, this was supposed to stop Google from crawling duplicate content, not stop it from crawling any content. But I am a supreme amateur at this, so I was just following directions.

Why are you bothered about Google crawling duplicate content? After all, with twice the content crawled, cached, and listed, you have twice the probability of your pages appearing in SERPs.

03-31-2004, 03:57 PM   #12   spidersam
I don't know. I was literally following step-by-step directions; that was one of the steps, and that's what it said that step did.

If I remove that, will it matter in terms of getting my forum pages indexed? That's all I'm looking to get done.

04-01-2004, 02:39 AM   #13   MadHatter
A Disallow: rule stops a bot from crawling that page, so if you remove them and allow Google to crawl all pages, then you're more likely to get ranked.
I only put disallow rules in for my images dir; otherwise your images end up showing up on Google image search.
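
An images-only rule like the one described here can be tried out with Python's standard urllib.robotparser. This is only a sketch: the two-line rules file, the /images/ path and the www.example.com URLs are assumptions for illustration, not the poster's actual file:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that blocks only the images directory.
rules = """\
User-agent: *
Disallow: /images/
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# Placeholder URLs: one inside the blocked directory, one outside it.
image_allowed = parser.can_fetch("Googlebot", "http://www.example.com/images/logo.jpg")
page_allowed = parser.can_fetch("Googlebot", "http://www.example.com/forums/index.php")

print("image:", image_allowed)
print("forum page:", page_allowed)
```

The image URL is refused while the forum page stays fetchable, which is the behaviour described above: everything not listed under Disallow remains open to the crawler.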

04-01-2004, 06:54 AM   #14   compar
Quote:
Originally Posted by MadHatter
I only put disallow rules in for my images dir... otherwise your images end up showing up on google image search.
Do you think that's where Google gets images from? What stops them from lifting right off the web pages?

04-01-2004, 11:53 AM   #15   Fruit & Veg
Eh? The images on a web page are referenced by their URL, e.g. www.domain.com/images/image1.gif. How else can you get an image without using this URL? There's no backdoor.

Google obeys robots.txt so you're kosher, MadHatter.

spidersam - I recommend you start again with your robots.txt file; you could be screwing things up without knowing it. Check out http://www.searchengineworld.com/cgi-bin/robotcheck.cgi and http://www.robotstxt.org/ to learn more.

Where did you copy that bit of code from? Doesn't that page explain what you are actually doing?

04-01-2004, 12:17 PM   #16   compar
Quote:
Originally Posted by Fruit & Veg
Eh? The images on a web page are referenced by their URL, e.g. www.domain.com/images/image1.gif. How else can you get an image without using this URL? There's no backdoor.
But anybody can lift an image right off a web site. Right click and save image. So do you think Google needs direct access to the image folder to get the file?

04-01-2004, 12:52 PM   #17   MadHatter
Yes it does!
You see, if I set a disallow rule saying that Google cannot use anything in the images dir, Google simply indexes the page without being allowed to fetch the images served on that page. It is as if it is getting the little red x.

Quote:
But anybody can lift an image right off a web site. Right click and save image.
Yes, but that actually downloads the image from /images/logo.jpg.
It is like right-clicking and choosing "Save Target As" for a zip file: you are still requesting the zip file from the downloads directory!

04-01-2004, 01:10 PM   #18   spidersam
This topic has drifted a little from my original intent, which is fine.

But let me simplify my question: do I have to have a robots.txt file in order to get all of my forum pages indexed?

04-01-2004, 01:55 PM   #19   Fruit & Veg
No, but your forum pages have to be able to be indexed by a search engine.

Robots.txt is mainly used for blocking rather than allowing.
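
As a rough check of the "blocking rather than allowing" point: when there are no rules at all, Python's standard urllib.robotparser permits every URL. The sketch below simulates an empty robots.txt; the forum URL is a placeholder, not a real address from this thread:

```python
from urllib.robotparser import RobotFileParser

parser = RobotFileParser()
parser.parse([])  # an empty robots.txt: no User-agent groups, no Disallow lines

# Placeholder forum URL, for illustration only.
forum_allowed = parser.can_fetch("Googlebot", "http://www.example.com/forums/viewtopic.php?t=1")
print("forum page:", forum_allowed)
```

Whether the pages then get indexed still depends on the crawler being able to reach them through ordinary links.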

04-01-2004, 02:01 PM   #20   spidersam
I did install a hack. By looking at my forum, could you tell if I should be good?

www.lordsrendezvous.com/forums/

All times are GMT -7.
© Copyright 2008 V7 Inc