feed.rss and sitemap.xml
posted by Stephan Brumme
Discovery
In November 2010 I saw space shuttle Discovery in Cape Canaveral. A completely different kind of discovery is widespread among internet sites, especially blogs: news feeds and sitemaps. Both are available now for create.stephan-brumme.com.
News feed - feed.rss
My RSS 2.0 feed is generated on-the-fly by a simple PHP script. Except for
<lastBuildDate> and <pubDate>,
the header (everything up to the first <item>) is static. All
<item> tags are filled while scanning the file system.
The optional <description> is missing at the moment but might
appear in the next few days.
A small excerpt is shown below:
feed.rss:
<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
<channel>
<title>create.stephan-brumme.com News Feed</title>
<description>blogging about some weird computer stuff</description>
<link>http://create.stephan-brumme.com</link>
<atom:link href="http://create.stephan-brumme.com/feed.rss" rel="self" type="application/rss+xml" />
<lastBuildDate>Sat, 1 Oct 2011 22:00:00 +0000</lastBuildDate>
<pubDate>Sat, 1 Oct 2011 22:00:00 +0000</pubDate>
<language>en-us</language>
<copyright>(C)2011 Stephan Brumme</copyright>
<generator>Homegrown And Environmentally Friendly</generator>
<image>
<url>http://create.stephan-brumme.com/favicon.png</url>
<title>create.stephan-brumme.com News Feed</title>
<link>http://create.stephan-brumme.com</link>
<width>16</width>
<height>16</height>
</image>
<item>
<title><![CDATA[Feed.rss and Sitemap.xml]]></title>
<link>http://create.stephan-brumme.com/misc/rss-and-sitemap.html</link>
<pubDate>Sat, 1 Oct 2011 22:00:00 +0000</pubDate>
<guid isPermaLink="true">http://create.stephan-brumme.com/misc/rss-and-sitemap.html</guid>
<source url="http://create.stephan-brumme.com/feed.rss">create.stephan-brumme.com</source>
</item>
<!-- several more items will follow here ... -->
</channel>
</rss>
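A minimal sketch of how the <item> tags might be filled while scanning the file system. This is not my actual script: the function names rssItem and rssItemsFromDirectory, the restriction to .html files and the use of the file modification time as publication date are all assumptions made for illustration. Note that DATE_RFC2822 produces a two-digit day ("01 Oct").

```php
<?php
// Hypothetical helper: builds one RSS <item> from a title, URL and Unix timestamp.
function rssItem(string $title, string $url, int $timestamp): string
{
    // CDATA protects characters like & or < in the title;
    // a literal "]]>" inside the title is split across two CDATA sections.
    $safeTitle = str_replace(']]>', ']]]]><![CDATA[>', $title);
    $safeUrl   = htmlspecialchars($url, ENT_XML1 | ENT_QUOTES, 'UTF-8');

    return "<item>\n"
         . "<title><![CDATA[{$safeTitle}]]></title>\n"
         . "<link>{$safeUrl}</link>\n"
         . "<pubDate>" . gmdate(DATE_RFC2822, $timestamp) . "</pubDate>\n"
         . "<guid isPermaLink=\"true\">{$safeUrl}</guid>\n"
         . "</item>\n";
}

// Hypothetical scan: every .html file below $root becomes one item,
// the file's modification time stands in for the publication date (assumption).
function rssItemsFromDirectory(string $root, string $baseUrl): string
{
    $root  = rtrim($root, '/\\');
    $items = '';
    $files = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($root, FilesystemIterator::SKIP_DOTS)
    );
    foreach ($files as $file) {
        if ($file->isFile() && $file->getExtension() === 'html') {
            // path relative to $root, with forward slashes for the URL
            $relative = substr($file->getPathname(), strlen($root));
            $relative = ltrim(str_replace('\\', '/', $relative), '/');
            $items   .= rssItem($file->getBasename('.html'),
                                "{$baseUrl}/{$relative}",
                                $file->getMTime());
        }
    }
    return $items;
}
```

The output of rssItemsFromDirectory would be printed between the static header and the closing </channel> tag.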
I had to run several validation passes with the W3C RSS validator until the feed passed all tests. My main problem was that the validator is quite picky about the date format. While getting everything right, I learnt about PHP's constant
DATE_RFC2822
(I had never heard of it before),
I learnt that I should use gmdate instead of date,
and I found at least 2 small bugs in my homegrown Content Management System.
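To illustrate the difference between date and gmdate, here is a small sketch. The time zone Europe/Berlin is an assumption (the +02:00 offsets in my sitemap suggest it), and the timestamp is simply the one from the feed excerpt.

```php
<?php
// Assumption: the server runs on Central European time.
date_default_timezone_set('Europe/Berlin');

$timestamp = 1317506400; // 2011-10-01 22:00:00 UTC

// date() formats in the server's local time zone ...
$local = date(DATE_RFC2822, $timestamp);   // Sun, 02 Oct 2011 00:00:00 +0200
// ... while gmdate() always formats in UTC, whatever the server settings are.
$utc   = gmdate(DATE_RFC2822, $timestamp); // Sat, 01 Oct 2011 22:00:00 +0000

echo $local, "\n", $utc, "\n";
```

Both strings describe the same moment, but gmdate gives the same text on every server, which makes the feed's output independent of the local time zone configuration.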
Search Engine Feed - sitemap.xml
Search engines like Google are responsible for the vast majority of my web site visitors.
Often it takes several days - and sometimes over a month - for their web spiders to discover new
additions to my web site. A sitemap.xml speeds up this discovery process by orders of magnitude.
Read more about sitemaps and their XML specification.
The code is almost the same as used for the RSS feed. This time,
date("c", $someTimestamp) works best for lastmod.
The changefreq is manually set to monthly for all blog entries and
weekly for the front page.
Priorities are manually set as well, to 0.5 or 0.8.
A small excerpt is shown below:
sitemap.xml:
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<url>
<loc>http://create.stephan-brumme.com/</loc>
<lastmod>2011-10-02T09:35:09+02:00</lastmod>
<changefreq>weekly</changefreq>
<priority>0.8</priority>
</url>
<url>
<loc>http://create.stephan-brumme.com/misc/rss-and-sitemap.html</loc>
<lastmod>2011-10-02T00:00:00+02:00</lastmod>
<changefreq>monthly</changefreq>
<priority>0.5</priority>
</url>
<!-- several more URLs will follow here ... -->
</urlset>
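A rough sketch of how one such <url> entry might be produced. The function name sitemapUrl and its $isFrontPage parameter are hypothetical, and the time zone is again an assumption. The mapping of 0.8 to the front page and 0.5 to blog entries follows the excerpt above.

```php
<?php
// Assumption: Central European time, matching the +02:00 offsets in the excerpt.
date_default_timezone_set('Europe/Berlin');

// Hypothetical helper: builds one <url> entry of the sitemap.
function sitemapUrl(string $url, int $timestamp, bool $isFrontPage = false): string
{
    // manually chosen values: front page changes weekly, blog entries monthly
    $changefreq = $isFrontPage ? 'weekly' : 'monthly';
    $priority   = $isFrontPage ? '0.8'    : '0.5';

    return "<url>\n"
         . "<loc>" . htmlspecialchars($url, ENT_XML1 | ENT_QUOTES, 'UTF-8') . "</loc>\n"
         // "c" is PHP's ISO 8601 format, e.g. 2011-10-02T00:00:00+02:00
         . "<lastmod>" . date('c', $timestamp) . "</lastmod>\n"
         . "<changefreq>{$changefreq}</changefreq>\n"
         . "<priority>{$priority}</priority>\n"
         . "</url>\n";
}

echo sitemapUrl('http://create.stephan-brumme.com/misc/rss-and-sitemap.html', 1317506400);
```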
Configuring Apache
Most Apache web servers only send .php files to the PHP interpreter. There are two options to generate the feeds with PHP:
1. mod_rewrite
2. parse .rss and .xml with PHP
If mod_rewrite is available, you can add these rules (regular expressions)
to .htaccess:
RewriteEngine On
RewriteRule ^sitemap\.xml$ /sitemap.php [last]
RewriteRule ^feed\.rss$ /feed.php [last]
Requests for sitemap.xml and feed.rss
are rewritten internally to PHP files which in turn must contain the necessary code to generate the proper feeds.
The mod_rewrite technique works very well but this time I went for method 2 and
added this single line to .htaccess:
AddType x-mapp-php5 .xml .rss
sitemap.xml and feed.rss now actually exist
on my server and contain all required PHP code.
Note: Be careful when adding other
.rss or .xml
files because this might produce undesired results, especially misinterpretation of the
<? and ?> in the first line: with PHP's short_open_tag setting enabled, the XML declaration is treated as the start of a PHP block.
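One common way around the first-line problem is to let PHP print the XML declaration itself instead of writing it as plain text. The following is only a sketch of that idea, not necessarily what my files look like. The function name xmlDeclaration is hypothetical, and the content type is taken from the type attribute of the atom:link in the feed excerpt.

```php
<?php
// Hypothetical helper: returns the XML declaration as an ordinary string,
// so it never appears as raw text that PHP could mistake for a short open tag.
function xmlDeclaration(string $encoding = 'UTF-8'): string
{
    return '<?xml version="1.0" encoding="' . $encoding . '"?>' . "\n";
}

// Tell the client that an RSS feed follows (only possible before any output).
if (!headers_sent()) {
    header('Content-Type: application/rss+xml; charset=UTF-8');
}

echo xmlDeclaration();
// ... the <rss> element and its content would be generated here
```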
