mod_rewrite
Introduction.
Welcome to mod_rewrite, the Swiss Army Knife of URL manipulation! Despite the tons of examples and docs, mod_rewrite is voodoo!
This module uses a rule-based rewriting engine (based on a regular-expression parser) to rewrite requested URLs on the fly. It supports an unlimited number of rules and an unlimited number of attached rule conditions for each rule to provide a really flexible and powerful URL manipulation mechanism. The URL manipulations can depend on various tests, for instance server variables, environment variables, HTTP headers, time stamps and even external database lookups in various formats can be used to achieve a really granular URL matching.
This module operates on the full URLs (including the path-info part) both in per-server context (httpd.conf) and per-directory context (.htaccess) and can even generate query-string parts on result. The rewritten result can lead to internal sub-processing, external request redirection or even to an internal proxy throughput.
This module was invented and originally written in April 1996.
[1]
How do I find out if the server supports mod_rewrite?
Put this in domain.com/.htaccess:
XBitHack Full
Options +FollowSymlinks
RewriteEngine on
RewriteBase /
RewriteRule ^index\.page$ index.html [L]
(If the index file isn't index.html, change that to whatever it is.)
Then go to domain.com/index.page
If the index page shows up, then you have mod_rewrite. If you're on a Windows server, it probably won't work.
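Here's a variation on the test above, as a rough sketch and not something from the original steps. It uses an external redirect, so the address bar visibly changes when the rule fires. The URL rewrite-test is a made-up name, and the sketch assumes the index file is index.html.

```apache
# Hypothetical probe for domain.com/.htaccess; "rewrite-test" is an invented URL.
RewriteEngine on
RewriteBase /

# [R=302] sends a temporary external redirect, so the browser's address bar
# changes to /index.html if mod_rewrite handled the request.
RewriteRule ^rewrite-test$ /index.html [R=302,L]
```

Going to domain.com/rewrite-test should land you on domain.com/index.html. An Internal Server Error usually means the module isn't loaded or the directives aren't allowed in .htaccess. A plain 404 usually means the .htaccess file is being ignored.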
How to change your URLs from dynamic to search engine friendly static URLs using mod_rewrite.
Get an example of the dynamic URL and of the static URL you want it to become. For example:
http://www.domain.com/cgi-bin/store.cgi?section=Nintendo&id=4867635&item=Pokemon
and
http://www.domain.com/store/Nintendo/4867635/Pokemon.html
Now that you have both URLs, make a .htaccess file starting with...
Options +Indexes
Options +FollowSymlinks
RewriteEngine on
RewriteBase /
RewriteRule ^
Depending on the server, you might not need the first two lines.
Right after RewriteRule ^ enter the static URL, then a $, a space, and then the original URL (without the http://www.domain.com/ part for both URLs).
You now have...
Options +Indexes
Options +FollowSymlinks
RewriteEngine on
RewriteBase /
RewriteRule ^store/Nintendo/4867635/Pokemon.html$ cgi-bin/store.cgi?section=Nintendo&id=4867635&item=Pokemon
In the first URL (the static URL), wherever the URL will change, replace that part with ([^.]+), which matches one or more characters that aren't a period (Nintendo, 4867635 and Pokemon in the example above).
Then add a \ before the period in .html, and make sure the $ still comes right after .html.
Any other period in the new static URL needs a \ before it too. A hyphen (-) only needs one if it's inside square brackets. For example...
RewriteRule ^store/([^.]+)/([^.]+)/([^.]+)\.html$ cgi-bin/store.cgi?section=Nintendo&id=4867635&item=Pokemon
If you don't add the \, the period matches any character, and you might get an Internal Server Error message, depending on the server's Apache version.
Now in the second URL (the dynamic URL), where the values change, change the first one to $1, the second to $2 and so on. Each $ number stands for whatever the matching ([^.]+) caught. Then add an [L] at the very end, with a space before the [L].
You now have...
Options +Indexes
Options +FollowSymlinks
RewriteEngine on
RewriteBase /
RewriteRule ^store/([^.]+)/([^.]+)/([^.]+)\.html$ cgi-bin/store.cgi?section=$1&id=$2&item=$3 [L]
Save the .htaccess file and upload it to domain.com/.htaccess and your static URLs will now work.
http://www.domain.com/store/Nintendo/4867635/Pokemon.html
instead of
http://www.domain.com/cgi-bin/store.cgi?section=Nintendo&id=4867635&item=Pokemon
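To recap how the pieces fit together, here's the same finished rule again with comments added. The comments are only a sketch of how the rule reads, not part of the original code.

```apache
# Pattern (first part): matched against the requested path, without the
# leading slash because this is a .htaccess file.
#   ^store/      the URL must start with store/
#   ([^.]+)      one or more characters that are not a period, captured
#   \.html$      a literal .html at the very end
#
# Substitution (second part): $1, $2 and $3 are filled with the first,
# second and third captured groups.
#
# [L] stops checking the remaining rules for this pass once this one matches.
RewriteRule ^store/([^.]+)/([^.]+)/([^.]+)\.html$ cgi-bin/store.cgi?section=$1&id=$2&item=$3 [L]
```

For store/Nintendo/4867635/Pokemon.html this gives $1 = Nintendo, $2 = 4867635 and $3 = Pokemon.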
Here are some other examples...
http://www.domain.com/cgi-bin/store.cgi?section=Nintendo&id=4867635
RewriteRule ^store/([^.]+)/([^.]+)\.html$ cgi-bin/store.cgi?section=$1&id=$2 [L]
http://www.domain.com/cgi-bin/store.cgi?section=Nintendo
RewriteRule ^store/([^.]+)\.html$ cgi-bin/store.cgi?section=$1 [L]
http://www.domain.com/cgi-bin/store.cgi
RewriteRule ^index\.html$ cgi-bin/store.cgi [L]
In this last example domain.com/index.html will show the index of the script. If domain.com/ itself shows nothing, try
RewriteRule ^$ cgi-bin/store.cgi [L]
With all the examples combined, you have...
Options +Indexes
Options +FollowSymlinks
RewriteEngine on
RewriteBase /
RewriteRule ^store/([^.]+)/([^.]+)/([^.]+)\.html$ cgi-bin/store.cgi?section=$1&id=$2&item=$3 [L]
RewriteRule ^store/([^.]+)/([^.]+)\.html$ cgi-bin/store.cgi?section=$1&id=$2 [L]
RewriteRule ^store/([^.]+)\.html$ cgi-bin/store.cgi?section=$1 [L]
RewriteRule ^index\.html$ cgi-bin/store.cgi [L]
Notice the order. If you list it as...
Options +Indexes
Options +FollowSymlinks
RewriteEngine on
RewriteBase /
RewriteRule ^index\.html$ cgi-bin/store.cgi [L]
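RewriteRule ^store/([^.]+)\.html$ cgi-bin/store.cgi?section=$1 [L]
RewriteRule ^store/([^.]+)/([^.]+)\.html$ cgi-bin/store.cgi?section=$1&id=$2 [L]
RewriteRule ^store/([^.]+)/([^.]+)/([^.]+)\.html$ cgi-bin/store.cgi?section=$1&id=$2&item=$3 [L]
then it won't work! [^.]+ also matches slashes, so the one-variable line catches store/Nintendo/4867635/Pokemon.html first (with section set to Nintendo/4867635/Pokemon), and the [L] stops the longer lines from ever being tried. List the line with the most variables first, then the second most and so on.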
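The order matters because [^.]+ matches slashes too, so a line with fewer variables can swallow a longer URL. One possible way around that, sketched here as an alternative and not as the method used in this post, is to use [^/.]+ instead, which can't match a slash. Each line then only matches URLs with its own number of parts, and the order stops mattering.

```apache
# Sketch: [^/.]+ cannot match a slash, so each rule only matches URLs with
# exactly the number of path segments it was written for.
RewriteEngine on
RewriteBase /
RewriteRule ^store/([^/.]+)\.html$ cgi-bin/store.cgi?section=$1 [L]
RewriteRule ^store/([^/.]+)/([^/.]+)\.html$ cgi-bin/store.cgi?section=$1&id=$2 [L]
RewriteRule ^store/([^/.]+)/([^/.]+)/([^/.]+)\.html$ cgi-bin/store.cgi?section=$1&id=$2&item=$3 [L]
```

The catch is that a value with a slash in it can no longer be part of the static URL.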
If you have more than one script, make sure you give each of them a unique directory name or a different extension. For example, you can't let two different scripts both use product/WHATEVER.html. Give them, for example, product/whatever.html and product/whatever.htm (different extension), or item/whatever.html and product/whatever.html (different directory name).
Now upload the .htaccess so it's at domain.com/.htaccess
Ack!!! Now it's messing up the rest of my site.
If you have domain.com/index.html for example and have
RewriteRule ^([^.]+)\.html$ store.cgi?section=$1 [L]
in the code, that line catches every .html URL on the site, index.html included. Make sure your mod_rewritten URLs use another extension, like .htm or .shtml, or a unique directory name, for example...
RewriteRule ^store/([^.]+)\.shtml$ store.cgi?section=$1 [L]
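Another option, sketched here as an assumption about your setup and not something from the steps above, is to put a RewriteCond in front of the rule so it's skipped whenever the requested URL is a real file on the server. That way index.html and any other real .html pages are left alone.

```apache
# Sketch: skip the rewrite when the requested URL is a real file on disk.
RewriteEngine on
RewriteBase /

# -f is true when REQUEST_FILENAME is an existing regular file; "!" negates it.
# A RewriteCond only applies to the RewriteRule that directly follows it.
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^([^.]+)\.html$ store.cgi?section=$1 [L]
```

If you have several rules, each one that needs this check gets its own RewriteCond line.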
I make the .htaccess changes, but the links still point to the same URLs.
You must edit the script to make the links point to the new URLs. mod_rewrite only makes the new URLs work; it doesn't change the links the script prints.
Can I have the .htaccess in a directory?
Yes.
In the above example, for having it at domain.com/store/.htaccess, change the code to...
Options +Indexes
Options +FollowSymlinks
RewriteEngine on
RewriteBase /store/
RewriteRule ^index\.html$ /cgi-bin/store.cgi [L]
RewriteRule ^([^.]+)/([^.]+)/([^.]+)\.html$ /cgi-bin/store.cgi?section=$1&id=$2&item=$3 [L]
RewriteRule ^([^.]+)/([^.]+)\.html$ /cgi-bin/store.cgi?section=$1&id=$2 [L]
RewriteRule ^([^.]+)\.html$ /cgi-bin/store.cgi?section=$1 [L]
You moved store/ up to the RewriteBase line and added / before cgi-bin. The index.html line also moved to the top, because without store/ in front, ^([^.]+)\.html$ would match index.html too. If the script was in /store/store.cgi, you would have had store/ instead of cgi-bin/ and then just gotten rid of it, to look like...
Options +Indexes
Options +FollowSymlinks
RewriteEngine on
RewriteBase /store/
RewriteRule ^index\.html$ store.cgi [L]
RewriteRule ^([^.]+)/([^.]+)/([^.]+)\.html$ store.cgi?section=$1&id=$2&item=$3 [L]
RewriteRule ^([^.]+)/([^.]+)\.html$ store.cgi?section=$1&id=$2 [L]
RewriteRule ^([^.]+)\.html$ store.cgi?section=$1 [L]
The URL to the index of the store will be domain.com/store/
Don't search engines already index dynamic URLs?
The biggest ones, like Yahoo, MSN, and Google, do, or it looks like they do. Here's a quote from someone who changed to mod_rewrite in November 2005...
Quote:
My site 771 pages was indexed by google and after implementing mod_write today google indexed over 9000 pages i would like to thanks Nintendo for starting such a thread that help me get better position. After doing all the optimization and with last google update i was ranking 80 for the most important keyword but today i am at 9th place visible on first page. Trafic is increasing from 1500 daily visitors to arround 4000 visitors.
The original script URLs don't have the product name in the URL. Can I add the product name to the URL?
Yes! If you can change the script to put the product names in the URL, or edit the links to link to them, you can. Here's an example. Notice there are two ([^.]+)'s and no $2.
RewriteRule ^([^.]+)/([^.]+)\.html$ cgi-bin/file.cgi?Item=$1 [L]
Just edit the script links, or the links in the static page, to link to domain.com/whatever/PRODUCT_NAME.html, and have the product name show up where the last ([^.]+) is in the .htaccess code. Since there's no $2, the product name is only there for looks and the script never sees it.
But how can I get rid of special characters or spaces?
For Perl, you can do search and replaces, for example...
$value =~ s/ /_/g;
$value =~ s/\?//g;
or
$value =~ s/[^\w\d\-_. ]//g;
which gets rid of everything but letters, numbers, hyphens, underscores, periods and spaces. Just make sure it only changes the URL and not the content. As for PHP or ASP, I don't know how to do it there.
Can I rewrite a sub-domain to a directory?
xxxxx.domain.com
to
www.domain.com/XXXXX/
RewriteCond %{HTTP_HOST} ^(www\.)?xxxxx\.domain\.com [NC]
RewriteCond %{REQUEST_URI} !^/XXXXX/
RewriteRule ^(.*)$ /XXXXX/$1 [L]
Adapted from code by mnemtsas on another message board.
Does .htaccess increase server load?
I have yet to see it increase server load on my dedicated server. IMO, that's just a rumor. I have about 30 domains with about 54 lines in the domain.com/.htaccess file and have yet to see it affect the server. The only effect I've ever had is getting GoogleBombed (Google chomping away at the static URLs so much that the server almost crashes or does crash!!!). Don't panic. This is why you have static URLs, to help search engines crawl your site.
If you ever see high server loads or a slow server, try optimizing Apache.
How do I optimize Apache?
You have to have access to the actual server through telnet as root.
Edit your httpd.conf file.
Here are the best settings I've found.
Timeout 50
KeepAlive On
MaxKeepAliveRequests 120
KeepAliveTimeout 10
MinSpareServers 10
MaxSpareServers 20
StartServers 16
MaxClients 125
MaxRequestsPerChild 5000
and then restart Apache. Even when I have massively HIGH server loads, the sites are fast. Once I had the server load above 100, which is EXTREMELY high, and the static pages loaded as if nothing was wrong!!
Don't ask me how to do it. If you don't know what you're doing, don't mess with it. Ask your web host. Mess up and your sites can 'die' until it gets fixed! For example, simply pressing return in the wrong place can crash your sites until you go back and undo the return, getting it back to how it was before.
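The directive names above are the ones the prefork MPM uses in older Apache versions. In Apache 2.4, MaxClients was renamed MaxRequestWorkers and MaxRequestsPerChild was renamed MaxConnectionsPerChild (the old names are still accepted). Assuming a 2.4 server running prefork, the same settings might look like this sketch. The numbers are the ones from this post, not a recommendation for any particular server.

```apache
# httpd.conf sketch for Apache 2.4 with the prefork MPM (an assumption).
Timeout 50
KeepAlive On
MaxKeepAliveRequests 120
KeepAliveTimeout 10

<IfModule mpm_prefork_module>
    StartServers            16
    MinSpareServers         10
    MaxSpareServers         20
    MaxRequestWorkers       125
    MaxConnectionsPerChild  5000
</IfModule>
```

Running apachectl configtest before the restart checks the file for syntax errors, which helps catch the kind of stray keystroke mentioned above.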
How can I do a 301 redirect?
At domain.com/.htaccess:
Options +Indexes
Options +FollowSymlinks
RewriteEngine on
RewriteBase /
RewriteRule ^whatever/([^.]+)$ http://www.domain.com/$1/ [R=301,L]
or
RewriteRule ^index\.htm$ http://www.domain.com/ [R=301,L]
The second example only changes one URL.
([^.]+) and $1 work the same way here as in the rewrites above, so you can easily change a lot of URLs with one line. The only change between a redirect and a rewrite is the R=301 (Redirect 301), which tells the browser to go to the new URL.
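As a concrete sketch of changing a lot of URLs with one line, here's a rule that moves every page in one directory to another. The directory name old-store is made up for illustration.

```apache
# Sketch: permanently redirect every old-store/NAME.html to store/NAME.html.
# "old-store" is a made-up directory name for illustration.
RewriteEngine on
RewriteBase /
RewriteRule ^old-store/([^/.]+)\.html$ http://www.domain.com/store/$1.html [R=301,L]
```

A request for domain.com/old-store/Nintendo.html would be answered with a 301 pointing at http://www.domain.com/store/Nintendo.html. If the same .htaccess also has rewrite lines for store/, the redirect line should probably go above them.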
How can I create a 404 error message?
ErrorDocument 404 /Error404.html
and whatever is in that file (domain.com/Error404.html) will show up.
You can do the same thing for other server errors if you have their number.
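For example, a sketch covering a few common error numbers might look like this. The file names are hypothetical, and each file has to exist on your site.

```apache
# Sketch: custom pages for other common status codes.
# The file names are hypothetical; each file must exist on the site.
ErrorDocument 403 /Error403.html
ErrorDocument 404 /Error404.html
ErrorDocument 500 /Error500.html
```

403 is Forbidden, 404 is Not Found and 500 is Internal Server Error.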
Conclusion.
Yes, mod_rewrite is voodoo, and it may look hard to learn, but it's not that hard. When I first tried to figure it out, I spent a day over at apache.org and hardly got anywhere (hence there is only one link there as the source to the introduction). I then posted over on the Amazon Associate board, someone gave me a few lines of code, I changed it a little and within a day I had a completely search engine friendly Amazon store using MrRat's script and my mod_rewrite hack, which, as you may know by now, completely revolutionized the Amazon AWS industry, until it drove Google insane! mod_rewrite rocks. If you have any URLs that have ?, =, or &, do mod_rewrite!
__________________________________________________
If you have trouble changing the URL and want help, post your current URL, the way you want it, and your .htaccess code and I can give you the correct code.