Webmaster Forum

05-24-2007, 11:32 PM   #1
mitra (Contributing Member, Join Date: 12-23-06, Location: India, Posts: 87)
Need help with robots.txt

I've created my robots.txt and I need to stop some bad URLs from being indexed by Google. I can't work out where Google is finding those URLs. The pages are on a subdomain, and the URLs look something like 'http://xxx.mysite.com/1/2/3/4.php'.

Suppose I want to block the /1/ directory and all its contents. How should I block them through robots.txt?

Thanks
mitra
05-24-2007, 11:44 PM   #2
Costin Trifan (Meeow!, Join Date: 04-13-07, Location: Romania, Posts: 3,235)
User-Agent: *
Disallow: /1/

This tells all bots not to crawl that directory or anything inside it.
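
One way to sanity-check a rule like this before relying on it is Python's standard `urllib.robotparser`. The sketch below assumes the two lines above are the entire robots.txt served by the host being tested; `xxx.mysite.com` is the placeholder host from the first post, and `/other/page.php` is a made-up path added for contrast.

```python
from urllib.robotparser import RobotFileParser

# Assumed: this is the whole robots.txt served by the host being tested.
ROBOTS_TXT = """\
User-Agent: *
Disallow: /1/
"""


def is_allowed(robots_txt: str, url: str, agent: str = "*") -> bool:
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    for url in (
        "http://xxx.mysite.com/1/2/3/4.php",     # inside /1/
        "http://xxx.mysite.com/other/page.php",  # hypothetical path outside /1/
    ):
        verdict = "allowed" if is_allowed(ROBOTS_TXT, url) else "blocked"
        print(url, "->", verdict)
```

Note that `can_fetch` only compares the path part of the URL against the rules, so the check is meaningful only for the host that actually serves this file.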
__________________
...to be continued
05-24-2007, 11:47 PM   #3
Costin Trifan
For a more detailed example, see my robots.txt file: http://optimizaremaster.org/robots.txt
05-25-2007, 12:13 AM   #4
mitra
Thanks, Costin.

But I think I need another robots.txt for the subdomain. Where should I place it?

One more question: I have a Google Sitemaps account for the main site. Do I need to add another sitemap for my subdomain? I ask because I don't seem to get the URL removal option (a new feature in Google Sitemaps) for the subdomain.

mitra
05-25-2007, 12:49 AM   #5
Costin Trifan
Quote:
But I think I need another robots.txt for sub domain. where should I place it?
You don't need a new robots.txt file for your sub-domain.

Look at my robots.txt file and you'll see this:

Quote:
#
# FOLDERS
#
User-agent: *
Disallow: /App_Data/
User-agent: *
Disallow: /App_Code/
User-agent: *
Disallow: /costin/
Where costin is my sub-domain.

See my point here?
05-25-2007, 01:14 AM   #6
mitra
I've been going through http://www.webmasterworld.com/forum93/628.htm and found that we need a separate robots.txt for each subdomain.
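
Crawlers that follow the robots exclusion convention request `/robots.txt` from the same host as the page they want to fetch, so `xxx.mysite.com` and the main site are looked up separately. A minimal sketch of that lookup, using only Python's standard library, might look like this. The host `www.mysite.com` for the main site is an assumption, and the in-memory `ROBOTS_BY_URL` table with its file contents is hypothetical, standing in for real HTTP fetches.

```python
from urllib.parse import urlsplit
from urllib.robotparser import RobotFileParser


def robots_url_for(page_url: str) -> str:
    """Build the robots.txt URL a crawler would consult for `page_url`."""
    parts = urlsplit(page_url)
    return f"{parts.scheme}://{parts.netloc}/robots.txt"


# Hypothetical file contents, keyed by robots.txt URL (no network access here).
ROBOTS_BY_URL = {
    # Main site: blocks /1/.
    "http://www.mysite.com/robots.txt": "User-agent: *\nDisallow: /1/\n",
    # Subdomain: an empty Disallow value blocks nothing.
    "http://xxx.mysite.com/robots.txt": "User-agent: *\nDisallow:\n",
}


def is_allowed(page_url: str, agent: str = "*") -> bool:
    """Check `page_url` against the robots.txt of its own host only."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_BY_URL[robots_url_for(page_url)].splitlines())
    return parser.can_fetch(agent, page_url)


if __name__ == "__main__":
    for url in (
        "http://www.mysite.com/1/2/3/4.php",
        "http://xxx.mysite.com/1/2/3/4.php",
    ):
        print(url, "is checked against", robots_url_for(url), "->", is_allowed(url))
```

Under this model, a `Disallow: /1/` in the main site's file has no effect on `xxx.mysite.com`; the rule would have to go into a robots.txt served from the root of the subdomain itself.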
05-25-2007, 01:43 AM   #7
Costin Trifan
...wherever...

IMHO: I'm sure you don't need two different robots.txt files.

As I see it, a subdomain is part of the main domain (that is, the subdomain is treated as just a folder of the main domain), so it can be blocked from the main robots.txt.
05-25-2007, 02:48 AM   #8
mitra
OK, let me check once again with the robots.txt analysis tool in Google Sitemaps to see if it works.

My subdomain is like 'http://xxx.mysite.com' and I'm disallowing /1/.
05-26-2007, 11:16 AM   #9
mitra
I've checked with Google's robots.txt analysis tool and found that it isn't working for me.

Here is the url I tested

http://subdomain.mysite.com/sys/

I've put Disallow: /sys/

Could you please shed more light on this?
05-28-2007, 10:19 AM   #10
mitra
I can't find any info on how to deal with a subdomain in Google Webmaster Tools. Can anybody please help me?
05-28-2007, 04:28 PM   #11
Costin Trifan

Well, Mitra, I don't know an easy way to say it, but I'll try to explain it as best I can.
Let's say you have a web site called "MyWebSite", and its URL is, of course, "www.MyWebSite.com". In this web site you create two subdomains: one called "MySubdomainOne" and another called "MySubdomainTwo".
Your web site structure should look like this:
MyWebSite (this is the root folder)
> index.php
> page1.html
> page2.html
> robots.txt
> MySubdomainOne (this is a folder, but also the root folder for the subdomain 1 and contains the following files)
-- index.php
-- page1.html
-- page2.html
> MySubdomainTwo (this is another folder, but also the root folder for the subdomain 2 and contains the following files)
-- index.php
-- page1.html
-- page2.html

Now, as you said, you don't want the search engines' robots to access your subdomains' files, so your robots.txt file should look like this:

# One group applies to all spiders. Everything is allowed except
# the two subdomain folders listed below. (Rules for the same
# user agent are kept in a single group, since some parsers only
# honor the first "User-Agent: *" group they find.)
User-Agent: *
Disallow: /MySubdomainOne/
Disallow: /MySubdomainTwo/


Because you're blocking access to your subdomain folders, the spiders will be able to access only these files:
> index.php
> page1.html
> page2.html
> robots.txt
located in the root folder (see above).

Note:
You have to specify the "noindex,nofollow" values in the head section of every existing page within your blocked subdomain:
<head>
<meta name="robots" content="noindex,nofollow" />
.....
</head>
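
To confirm that a page actually carries the tag, something like the following sketch could be used. It relies only on Python's standard `html.parser`; the `RobotsMetaFinder` class, the `robots_directives` helper, and the sample markup are illustrative names, not part of any tool mentioned in this thread.

```python
from html.parser import HTMLParser


class RobotsMetaFinder(HTMLParser):
    """Collect the directives from every <meta name="robots"> tag."""

    def __init__(self) -> None:
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag and attribute names in lower case.
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() != "robots":
            return
        content = attr.get("content") or ""
        for part in content.split(","):
            if part.strip():
                self.directives.add(part.strip().lower())


def robots_directives(html: str) -> set:
    """Return the set of robots directives declared in `html`."""
    finder = RobotsMetaFinder()
    finder.feed(html)
    return finder.directives


if __name__ == "__main__":
    page = (
        "<html><head>"
        '<meta name="robots" content="noindex,nofollow" />'
        "</head><body></body></html>"
    )
    print(sorted(robots_directives(page)))
```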

This way, you don't need an individual robots.txt file for each subdomain.

I hope this helps you.
All times are GMT -7.