Retooling the URL: The Steps

In case you didn’t notice, I finally did a little housekeeping with my URL structure (after writing about it many, many times). Thanks in large part to a bunch of articles I’ve read recently about URLs, and an excellent conversion tutorial from Olivier Travers (which is where almost all of my tricks came from), I’m pleased to announce that my site now has a much better URL structure (in my mind), but it didn’t come without a lot of work.

The premise was to create a breadcrumb-trail URL scheme so that anyone could read a story:

http://www.inluminent.com/weblog/archives/2003/06/14/acuna/

and by deleting the directory (or crawling up the directory structure), they could read all of the stories for that day:

http://www.inluminent.com/weblog/archives/2003/06/14/

or month:

http://www.inluminent.com/weblog/archives/2003/06/

or, if they wanted to, browse the category archives more easily:

http://www.inluminent.com/weblog/archives/categories/leadership_management/
http://www.inluminent.com/weblog/archives/categories/marketing_advertising/
though I still need to build the master category page that should reside at http://www.inluminent.com/weblog/archives/categories/
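
To make the trail idea concrete, here is a minimal PHP sketch that lists the URLs a reader reaches by deleting one trailing segment at a time. The function name crumb_trail and its $root argument are made up for illustration; they are not part of Movable Type or of the site's templates. It also emits a year-level URL, which only resolves if a yearly index page exists (the post does not say one does).

```php
<?php
// Hypothetical helper: walk up the URL one segment at a time,
// stopping once the archive root itself is reached.
function crumb_trail(string $url, string $root): array
{
    $trail   = [];
    $current = rtrim($url, '/') . '/';
    $root    = rtrim($root, '/') . '/';

    // Keep climbing while we are still strictly below the archive root.
    while (strlen($current) > strlen($root) && strpos($current, $root) === 0) {
        $trail[] = $current;
        // Drop the last segment but keep the slash that precedes it.
        $trimmed = rtrim($current, '/');
        $current = substr($trimmed, 0, strrpos($trimmed, '/') + 1);
    }
    return $trail;
}

print_r(crumb_trail(
    'http://www.inluminent.com/weblog/archives/2003/06/14/acuna/',
    'http://www.inluminent.com/weblog/archives/'
));
```

Each element of the returned array is one "crumb": the entry, its day, its month, and then the year.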

After reading this article against file extensions on the web, I also didn’t want people to have to know that I was using PHP (though I don’t mind them knowing), so I wanted everything to look like it was sitting in a directory, even if it’s really a file or really sitting in its own directory. It should be transparent to the user, and still Google friendly and user friendly… so:

Here are the steps I used to get my URLs straight without losing any traffic from old links or from search engines (GoogleJuice) that haven’t updated their links yet (stolen largely from Olivier and improved in a few places):

1. With the individual entry path still set at <$MTEntryTitle dirify="1"$>, replaced the individual entry template with:

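<?php
// Movable Type fills in the date and the dirified title when it rebuilds this template.
$NewUrl = "<$MTArchiveDate format="%Y/%m/%d/"$><$MTEntryTitle dirify="1"$>/";
$NewUrl = "http://www.inluminent.com/weblog/archives/" . $NewUrl;
// Send a permanent redirect so browsers and search engines pick up the new address.
header("HTTP/1.1 301 Moved Permanently");
header("Location: $NewUrl");
exit();
?>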

This is so that all of the old inbound links will get redirected to the proper place which will be created in a few steps.
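
For anyone who wants to check what a rebuilt redirect file should send, here is a rough sketch in plain PHP. Both function names are invented for this example, and approx_dirify is only an approximation of what MT's dirify filter does (lowercase, strip punctuation, spaces to underscores); it is not MT's actual code. The functions return the header lines instead of sending them, so the sketch can run from the command line.

```php
<?php
// Rough stand-in for Movable Type's dirify filter. This is an assumption
// about its behaviour, not a copy of MT's implementation.
function approx_dirify(string $title): string
{
    $s = strtolower(strip_tags($title));
    $s = preg_replace('/[^\w\s]/', '', $s);       // drop punctuation
    return preg_replace('/\s+/', '_', trim($s));  // whitespace -> underscore
}

// Hypothetical helper: the two header lines the redirect template would
// send for an entry published on $date (YYYY-MM-DD) with the given title.
function redirect_headers(string $date, string $title): array
{
    $base = 'http://www.inluminent.com/weblog/archives/';
    $path = str_replace('-', '/', $date) . '/' . approx_dirify($title) . '/';
    return [
        'HTTP/1.1 301 Moved Permanently',
        'Location: ' . $base . $path,
    ];
}

print_r(redirect_headers('2003-06-14', 'Acuna'));
```

The 301 status is what matters for search engines: it signals that the move is permanent, as opposed to the temporary 302 that PHP sends by default when only a Location header is set.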

2. Rebuilt individual entries. (This took a bit of time, but not too long. As Olivier’s example states, at this point old links aren’t working anymore, but we’ll fix that in a few steps.)

3. Changed the individual entry path (in MT’s archiving settings) to:
<$MTArchiveDate format="%Y/%m/%d/"$><$MTEntryTitle dirify="1"$>.php

Note: This is different from Olivier’s approach. I didn’t want a whole lot of individual directories to maintain in the filesystem, but rather one directory per day in each month, containing however many posts were created that day.

Also changed the daily archive’s entry path:
<$MTArchiveDate format="%Y/%m/%d/index.php"$>

the monthly archive’s path:
<$MTArchiveDate format="%Y/%m/index.php"$>

and the category archive’s path:
categories/<$MTCategoryLabel dirify="1"$>.php
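
To see how those four path settings fit together, here is a small sketch that computes the files one entry touches, relative to the archive root. The function archive_files is hypothetical, and it assumes the entry basename and category label have already been dirified.

```php
<?php
// Hypothetical helper: files written for one entry, relative to the
// archive root, following the four archive paths configured above.
function archive_files(string $date, string $basename, string $category): array
{
    // Parse the date as UTC so the result does not depend on server timezone.
    $ts = strtotime($date . ' 00:00:00 UTC');
    return [
        'individual' => gmdate('Y/m/d/', $ts) . $basename . '.php',
        'daily'      => gmdate('Y/m/d/', $ts) . 'index.php',
        'monthly'    => gmdate('Y/m/', $ts) . 'index.php',
        'category'   => 'categories/' . $category . '.php',
    ];
}

print_r(archive_files('2003-06-14', 'acuna', 'leadership_management'));
```

The individual entry lands next to the daily index.php in the same day directory, which is the "one directory per day" layout described in the note above.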

4. Replaced the individual entry template with my old template.

5. Rebuilt individual entries. (At this point, old links almost work again: the redirects set up in step 1 now point to directory-style URLs that closely match the .php files created in this step, but not quite… I’ll fix that in a minute with a mod_rewrite trick I learned… read on.)

6. Added the following lines to my .htaccess file to redirect the old monthly and category archive pages, which were easy to handle with regular expressions thanks to their previous naming structure:

Options +FollowSymLinks
RewriteEngine On
RewriteBase /
RewriteRule weblog/archives/200([0-9])_([0-9])([0-9])_(.*)(\.php)$ http://www.inluminent.com/weblog/archives/200$1/$2$3/$4 [R=301,L]
RewriteRule weblog/archives/200([0-9])_(.*)(\.php)$ http://www.inluminent.com/weblog/archives/200$1/$2/ [R=301,L]
RewriteRule weblog/archives/cat_(.*)(\.php)$ http://www.inluminent.com/weblog/archives/categories/$1/ [R=301,L]

(Formatting note: each rule is a single line that starts with “RewriteRule”; there aren’t any line breaks within a rule in the real file on the server.)
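
Regexes like these are easy to get subtly wrong, so it can help to try them outside Apache first. The sketch below reuses the three patterns in PHP and reports where a given old path would be sent, stopping at the first match. It only imitates mod_rewrite's matching and back-reference expansion; the function old_to_new and the sample filenames are assumptions inferred from the patterns, not taken from the real site.

```php
<?php
// Hypothetical test harness: try each pattern in order and return the
// redirect target of the first one that matches, or null if none do.
function old_to_new(string $path): ?string
{
    $base  = 'http://www.inluminent.com/weblog/archives/';
    $rules = [
        '#weblog/archives/200([0-9])_([0-9])([0-9])_(.*)(\.php)$#' => '200$1/$2$3/$4',
        '#weblog/archives/200([0-9])_(.*)(\.php)$#'                => '200$1/$2/',
        '#weblog/archives/cat_(.*)(\.php)$#'                       => 'categories/$1/',
    ];

    foreach ($rules as $pattern => $target) {
        if (preg_match($pattern, $path, $m)) {
            // Expand $1..$9 in the target from the captured groups.
            $expanded = preg_replace_callback('/\$(\d)/', function ($ref) use ($m) {
                return $m[(int) $ref[1]] ?? '';
            }, $target);
            return $base . $expanded;
        }
    }
    return null;
}

// Sample old-style filenames (assumed, based on the patterns above).
foreach (['weblog/archives/2003_06.php',
          'weblog/archives/2003_06_14.php',
          'weblog/archives/cat_marketing_advertising.php'] as $old) {
    echo $old, ' => ', old_to_new($old), "\n";
}
```

Note that the first pattern's target has no trailing slash, so a path it matches would need one more redirect to pick up the slash.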

7. Added the following rules (taken from Keith’s “no extensions” entry) so that category pages (which are technically category_name.php), among other page types, can be delivered as directories:

RewriteRule ^([^.]+[^/])$ $1/ [R=permanent,L]
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME}\.php -f
RewriteRule ^(.+[^/]) $1.php
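
Here is one way to reason about what those two rules do, written as a plain PHP simulation. It is a simplification under a stated assumption: the file check is modelled as "the request path minus its trailing slash, plus .php", which is a guess at how Apache resolves REQUEST_FILENAME here, not a statement of its exact behaviour. The function resolve and the $files and $dirs lists are hypothetical stand-ins for the real filesystem.

```php
<?php
// Hypothetical stand-ins for what exists under the document root.
$files = [
    'weblog/archives/2003/06/14/acuna.php',
    'weblog/archives/categories/marketing_advertising.php',
];
$dirs = [
    'weblog/archives/2003/06/14',
    'weblog/archives/categories',
];

// Returns ['redirect', url] or ['serve', path] for a request path.
function resolve(string $path, array $files, array $dirs): array
{
    // First rule: no dot anywhere and no trailing slash -> add the slash.
    if (preg_match('#^([^.]+[^/])$#', $path, $m)) {
        return ['redirect', $m[1] . '/'];
    }
    // Second rule: not a real directory, but a matching .php file exists.
    $trimmed = rtrim($path, '/');
    if ($trimmed !== ''
        && !in_array($trimmed, $dirs, true)
        && in_array($trimmed . '.php', $files, true)) {
        return ['serve', $trimmed . '.php'];
    }
    // Otherwise leave the request alone (e.g. a real directory index).
    return ['serve', $path];
}

print_r(resolve('weblog/archives/categories/marketing_advertising', $files, $dirs));
print_r(resolve('weblog/archives/2003/06/14/acuna/', $files, $dirs));
```

So a visitor who types the category URL without a slash gets bounced to the slashed version, and the slashed version is quietly answered by the .php file.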

8. Added some code to all of the internal links (depending on which type they were) that replaces the ‘.php’ or ‘index.php’ with ‘/’ or nothing, as appropriate, so that all links on the site go to the correct place:

<?php echo str_replace("index.php","","<$MTArchiveLink$>"); ?>
<?php echo str_replace(".php","/","<$MTEntryLink$>"); ?>
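
One caveat: str_replace swaps every occurrence of the search string, not just the one at the end of the URL. That is harmless for the links on this site, but a host or path that happens to contain ".php" elsewhere would get mangled. Below is a sketch of a stricter variant that only touches the end of the URL; clean_link is a made-up name and the phpfan.example host is purely illustrative.

```php
<?php
// Hypothetical stricter variant: only rewrite the end of the URL.
function clean_link(string $url): string
{
    // ".../index.php" -> ".../"
    $url = preg_replace('#/index\.php$#', '/', $url);
    // ".../name.php"  -> ".../name/"
    return preg_replace('#\.php$#', '/', $url);
}

echo clean_link('http://www.inluminent.com/weblog/archives/2003/06/index.php'), "\n";
echo clean_link('http://www.inluminent.com/weblog/archives/2003/06/14/acuna.php'), "\n";
// A host containing ".php" is left intact, unlike with str_replace.
echo clean_link('http://www.phpfan.example/archives/2003/06/14/acuna.php'), "\n";
```

In a template it would wrap the MT tag the same way the str_replace calls do.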

Oh, and if you’re looking for a decent mod_rewrite primer, here’s one at Kuro5hin.

One last thing, Olivier: since you’re stuck on IIS as your dev platform of choice, you’re probably going to be looking at ISAPI_rewrite, which I pointed to back in November of ’02.

4 Responses to “Retooling the URL: The Steps”


  • It seems somewhat redundant to bother with mod_rewrite as well as PHP-based 301 redirects, IMO. I’d’ve just written out the PHP archives and been done with it.

    Well, that’s what I did do when I did this about a week ago. Though, I must admit, the PHP solution is better than my META refresh one. 😉

  • While I can understand (being a victim myself 🙂) the obsessive-compulsiveness that drives a person to do this, it seems that a much better solution is just to use link rel urls. That’s what they are there for.

  • Dear God in heaven, this looks awesome, and yet it makes my head hurt. I’d been thinking of doing exactly this for a while.. thanks for documenting the steps!

  • Thanks for the info. I’ll be doing this myself soon.
