Webmaster Forum


04-11-2007, 09:42 PM   #1
peter_d
New Robots.txt Standard

From the V7N Search Blog:

Quote:
Google, Yahoo, MSN, and Ask have got together and announced a new robots.txt feature, sitemap auto-discovery.

“The new open-format autodiscovery allows webmasters to specify the location of their sitemaps within their robots.txt file, eliminating the need to submit sitemaps to each search engine separately.”

What are sitemaps?

A sitemap is an XML file that lists URLs for a site along with additional metadata about each URL (when it was last updated, how often it usually changes, and how important it is, relative to other URLs in the site) so that search engines can more intelligently crawl the site. More information here. Formatting guidelines are here.
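
For illustration, a minimal sitemap with a single URL entry might look like the sketch below. The example.com address and the values are placeholders and are not taken from the announcement; the element names and namespace follow the protocol published at sitemaps.org. Only <loc> is required for each entry; the other three elements are optional.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <!-- Placeholder URL: replace with a page on your own site -->
    <loc>http://www.example.com/</loc>
    <!-- Optional: date of last modification, YYYY-MM-DD -->
    <lastmod>2007-04-11</lastmod>
    <!-- Optional: always, hourly, daily, weekly, monthly, yearly or never -->
    <changefreq>monthly</changefreq>
    <!-- Optional: 0.0 to 1.0, relative to other URLs on the same site -->
    <priority>0.8</priority>
  </url>
</urlset>
```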

What is the robots.txt specification for a sitemap?

Sitemap: <sitemap_location>
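
As a sketch of how that line might sit in a complete robots.txt file (the example.com URL and the permissive User-agent rule are placeholders, not part of the announcement):

```
# Allow all crawlers everywhere (placeholder rule)
User-agent: *
Disallow:

# The sitemap location must be a full URL, not a relative path
Sitemap: http://www.example.com/sitemap.xml
```

According to the sitemaps.org protocol, the Sitemap line is independent of the User-agent line, so its position in the file should not matter.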
http://blog.v7n.com/2007/04/11/auto-...via-robotstxt/
04-12-2007, 06:13 AM   #2
Axial
Thanks, very interesting.
04-12-2007, 07:27 PM   #4
StupidScript
See also this announcement from Yahoo!.
Quote:
Since working with Google and Microsoft [JB: and Ask and IBM] to support a single format for submission with Sitemaps, we have continued to discuss further enhancements to make it easy for webmasters to get their content to all search engines quickly.
Below, please find the PHP code for generating these types of sitemaps semi-automatically. It's FREE! Enjoy! (Please post modifications.)
Code:
<?php
/*
 * Generates a sitemap per the specification found at:
 *   http://www.sitemaps.org/protocol.html
 * DOES NOT traverse directories.
 *
 * 20070712 James Butler, james at musicforhumans dot com
 * Based on opendir() code by mike at mihalism dot com
 *   http://us.php.net/manual/en/function.readdir.php#72793
 * Free for all: http://www.gnu.org/licenses/lgpl.html
 *
 * Usage:
 *   1) Save this file as sitemap_gen.php
 *   2) Change the variables noted below for your site
 *   3) Place this file in your site's root directory
 *   4) Run it from http://www.yourdomain.com/sitemap_gen.php
 *
 * Optional elements for each URL:
 *   <lastmod>     YYYY-MM-DD
 *   <changefreq>  always | hourly | daily | weekly | monthly | yearly | never
 *   <priority>    0.0-1.0 [default 0.5]
 *
 * Add the completed sitemap file to robots.txt:
 *   Sitemap: http://www.yourdomain.com/sitemap.xml
 */

######## CHANGE THESE FOR YOUR SITE #########
# IMPORTANT: Trailing slashes are REQUIRED!
$my_domain             = "http://www.yourdomain.com/";
$root_path_to_site     = "/root/path/to/site/";
$file_types_to_include = array('html', 'htm');
############## END CHANGES ##################

# The sitemap is written into the site root, next to this script.
$sitemap_file = $root_path_to_site . "sitemap.xml";

# Start the document with an entry for the home page.
$xml  = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n";
$xml .= "<urlset xmlns=\"http://www.sitemaps.org/schemas/sitemap/0.9\">\n";
$xml .= "  <url>\n";
$xml .= "    <loc>" . htmlspecialchars($my_domain) . "</loc>\n";
$xml .= "    <priority>1.0</priority>\n";
$xml .= "  </url>\n";

### START Modified mike at mihalism dot com Code ######
# Returns the lowercased extension of a file name ('' if it has none).
function file_type($file) {
    return strtolower(pathinfo($file, PATHINFO_EXTENSION));
}

$file_count = 0;
$files      = array();
$path       = opendir($root_path_to_site);
if ($path === false) {
    die("Cannot open directory: " . htmlspecialchars($root_path_to_site) . "<br>\n");
}
while (false !== ($filename = readdir($path))) {
    $files[] = $filename;
}
closedir($path);
sort($files);

foreach ($files as $file) {
    # Only regular files with a listed extension; this also skips '.' and '..'.
    if (is_file($root_path_to_site . $file)
        && in_array(file_type($file), $file_types_to_include)) {
        $file_count++;
### END Modified mike at mihalism dot com Code ######
        # URL-encode the file name, then escape the whole URL for XML.
        $xml .= "  <url>\n";
        $xml .= "    <loc>" . htmlspecialchars($my_domain . rawurlencode($file)) . "</loc>\n";
        # Use the full path so filemtime() works from any working directory.
        $xml .= "    <lastmod>" . date("Y-m-d", filemtime($root_path_to_site . $file)) . "</lastmod>\n";
        $xml .= "    <changefreq>monthly</changefreq>\n";
        $xml .= "    <priority>0.5</priority>\n";
        $xml .= "  </url>\n";
    }
}
$xml .= "</urlset>\n";

if ($file_count == 0) {
    echo "No files to add to the Sitemap\n";
} else {
    # Try a direct write first. If that fails, try to create the file with
    # relaxed permissions, retry, and then tighten the permissions again.
    $written = @file_put_contents($sitemap_file, $xml);
    if ($written === false) {
        @touch($sitemap_file);
        @chmod($sitemap_file, 0666);
        $written = @file_put_contents($sitemap_file, $xml);
        if ($written !== false) {
            @chmod($sitemap_file, 0644);
        }
    }
    if ($written !== false) {
        echo "DONE! <a href='sitemap.xml'>View sitemap.xml</a><br>\n";
        echo "Remove items you do not want included in the search engines.<br>\n";
        echo "Modify &lt;changefreq&gt; and &lt;priority&gt; to taste.<br>\n";
        echo "Add 'Sitemap: " . $my_domain . "sitemap.xml' to robots.txt.<br>\n";
    } else {
        echo "File is not writable.<br>\n";
    }
}
?>

Last edited by StupidScript : 04-12-2007 at 07:28 PM. Reason: PROPER ATTRIBUTION FOR CONTRIBUTED CODE
04-13-2007, 03:24 AM   #5
NothingButNet
This is a very good feature. I've always found it amazing that I have to submit my sitemaps to each search engine separately. This way I'll be able to focus on something else. Thank you.
04-15-2007, 08:25 PM   #6
Comenius
This is good news, and from the press releases I've been seeing, it seems like more and more search engines are jumping on the bandwagon.
04-15-2007, 09:22 PM   #7
solidghost
I was wondering, how often do search engine bots read the robots.txt file? Every time?
04-16-2007, 03:10 PM   #8
StupidScript
Every time.
04-16-2007, 04:24 PM   #9
ClassTopic.com
It's about time they implemented this. Great new standard to have. I know that Google has implemented it. Any word on Yahoo and the others?
04-16-2007, 05:06 PM   #10
Comenius
Yahoo, MSN, Ask, Google and others all came out in support of it.

I just added the line to my sites. No clue if it's working, but figured what the heck.