feed.rss and sitemap.xml
posted by Stephan Brumme
Discovery
In November 2010 I saw space shuttle Discovery in Cape Canaveral. A completely different kind of discovery is widespread among internet sites, especially blogs: news feeds and sitemaps. Both are available now for create.stephan-brumme.com.
News feed - feed.rss
My RSS 2.0 feed is generated on-the-fly by a simple PHP script. Except for <lastBuildDate> and <pubDate>, the header (everything up to the first <item>) is static. All <item> tags are filled while scanning the file system. The optional <description> is missing at the moment but might appear in the next few days.
A small excerpt is shown below:
feed.rss:
<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
<channel>
<title>create.stephan-brumme.com News Feed</title>
<description>blogging about some weird computer stuff</description>
<link>http://create.stephan-brumme.com</link>
<atom:link href="http://create.stephan-brumme.com/feed.rss" rel="self" type="application/rss+xml" />
<lastBuildDate>Sat, 1 Oct 2011 22:00:00 +0000</lastBuildDate>
<pubDate>Sat, 1 Oct 2011 22:00:00 +0000</pubDate>
<language>en-us</language>
<copyright>(C)2011 Stephan Brumme</copyright>
<generator>Homegrown And Environmentally Friendly</generator>
<image>
<url>http://create.stephan-brumme.com/favicon.png</url>
<title>create.stephan-brumme.com News Feed</title>
<link>http://create.stephan-brumme.com</link>
<width>16</width>
<height>16</height>
</image>
<item>
<title><![CDATA[Feed.rss and Sitemap.xml]]></title>
<link>http://create.stephan-brumme.com/misc/rss-and-sitemap.html</link>
<pubDate>Sat, 1 Oct 2011 22:00:00 +0000</pubDate>
<guid isPermaLink="true">http://create.stephan-brumme.com/misc/rss-and-sitemap.html</guid>
<source url="http://create.stephan-brumme.com/feed.rss">create.stephan-brumme.com</source>
</item>
<!-- several more items will follow here ... -->
</channel>
</rss>
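Rendering a single <item> could look roughly like the PHP sketch below. It is not the site's actual script: the function name buildRssItem and the choice to reuse the link as the <guid> are assumptions made for illustration.

```php
<?php
// Hypothetical helper, not the site's real code: renders one <item> of the feed.
function buildRssItem(string $title, string $link, int $timestamp): string
{
    // Split any "]]>" in the title so it cannot end the CDATA section early.
    $safeTitle = str_replace(']]>', ']]]]><![CDATA[>', $title);
    // Escape &, <, > and quotes in the URL.
    $safeLink = htmlspecialchars($link, ENT_QUOTES | ENT_XML1, 'UTF-8');
    // gmdate() formats the timestamp in UTC, DATE_RFC2822 is the date layout used in the feed.
    $pubDate = gmdate(DATE_RFC2822, $timestamp);

    return "<item>\n"
         . "<title><![CDATA[{$safeTitle}]]></title>\n"
         . "<link>{$safeLink}</link>\n"
         . "<pubDate>{$pubDate}</pubDate>\n"
         . "<guid isPermaLink=\"true\">{$safeLink}</guid>\n"
         . "</item>\n";
}

// Example call with the values of the item shown in the excerpt.
echo buildRssItem(
    'Feed.rss and Sitemap.xml',
    'http://create.stephan-brumme.com/misc/rss-and-sitemap.html',
    1317506400 // 2011-10-01 22:00:00 UTC
);
```

A script that scans the file system might call such a helper once per blog entry, for example with the file's modification time from filemtime() as the timestamp. That part is an assumption as well.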
I had to run several validation tests with the W3C RSS validator until the feed passed all of them. My main problem was that the validator is quite picky about the date format. While getting everything right, I learnt about PHP's constant DATE_RFC2822 (I had never heard of it before), I learnt that I should use gmdate instead of date, and I found at least two small bugs in my homegrown Content Management System.
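The difference between date and gmdate is easiest to see side by side. The following is a minimal sketch; the Europe/Berlin time zone is an assumption standing in for whatever the server is configured to use.

```php
<?php
// Assumption for this demo: the server runs on Central European time.
date_default_timezone_set('Europe/Berlin');

// 2011-10-01 22:00:00 UTC as a Unix timestamp
$timestamp = 1317506400;

// date() formats the instant in the server's local time zone
$local = date(DATE_RFC2822, $timestamp);
// gmdate() formats the same instant in UTC
$utc = gmdate(DATE_RFC2822, $timestamp);

echo $local, "\n"; // Sun, 02 Oct 2011 00:00:00 +0200
echo $utc, "\n";   // Sat, 01 Oct 2011 22:00:00 +0000
```

Both strings describe the same moment. One plausible reason to prefer gmdate is that its output does not change when the server's time zone setting does.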
Search Engine Feed - sitemap.xml
Search engines like Google are responsible for the vast majority of my web site visitors.
Often it takes several days - and sometimes over a month - for their web spiders to discover new
additions to my web site. A sitemap.xml speeds up this discovery process by orders of magnitude.
Read more about sitemaps and their XML specification.
The code is almost the same as used for the RSS feed. This time, date("c", $someTimestamp) works best for <lastmod>. The <changefreq> is manually set to monthly for all blog entries and weekly for the front page. Priorities are manually set as well, to 0.5 or 0.8.
A small excerpt is shown below:
sitemap.xml:
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<url>
<loc>http://create.stephan-brumme.com/</loc>
<lastmod>2011-10-02T09:35:09+02:00</lastmod>
<changefreq>weekly</changefreq>
<priority>0.8</priority>
</url>
<url>
<loc>http://create.stephan-brumme.com/misc/rss-and-sitemap.html</loc>
<lastmod>2011-10-02T00:00:00+02:00</lastmod>
<changefreq>monthly</changefreq>
<priority>0.5</priority>
</url>
<!-- several more URLs will follow here ... -->
</urlset>
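A single <url> entry could be produced along these lines. This is only a sketch: the function name buildSitemapUrl is hypothetical, and the Europe/Berlin time zone is an assumption inferred from the +02:00 offsets in the excerpt.

```php
<?php
// Hypothetical helper, not the site's real code: renders one <url> entry.
function buildSitemapUrl(string $loc, int $timestamp, string $changefreq, string $priority): string
{
    // Escape &, <, > and quotes in the URL.
    $safeLoc = htmlspecialchars($loc, ENT_QUOTES | ENT_XML1, 'UTF-8');
    // "c" is PHP's ISO 8601 format and includes the local UTC offset.
    $lastmod = date('c', $timestamp);

    return "<url>\n"
         . "<loc>{$safeLoc}</loc>\n"
         . "<lastmod>{$lastmod}</lastmod>\n"
         . "<changefreq>{$changefreq}</changefreq>\n"
         . "<priority>{$priority}</priority>\n"
         . "</url>\n";
}

// Assumption: the server runs on Central European time.
date_default_timezone_set('Europe/Berlin');

// Blog entries get monthly/0.5, the front page would get weekly/0.8.
echo buildSitemapUrl(
    'http://create.stephan-brumme.com/misc/rss-and-sitemap.html',
    1317506400, // 2011-10-01 22:00:00 UTC
    'monthly',
    '0.5'
);
```

The priority is passed as a string so that a value such as 0.5 is written exactly as given.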
Configuring Apache
Most Apache web servers only send .php files to the PHP interpreter. There are two options to generate the feeds with PHP:
1. mod_rewrite
2. parse .rss and .xml with PHP
If mod_rewrite is available, you can add these rules (regular expressions) to .htaccess:
RewriteEngine On
# the dots are escaped so that they match a literal "." only
RewriteRule ^sitemap\.xml$ /sitemap.php [last]
RewriteRule ^feed\.rss$ /feed.php [last]
With these rules, requests for sitemap.xml and feed.rss are internally rewritten to PHP files, which in turn must contain the necessary code to generate the proper feeds.
The mod_rewrite technique works very well, but this time I went for method 2 and added this single line to .htaccess:
AddType x-mapp-php5 .xml .rss
sitemap.xml and feed.rss now actually exist on my server and contain all required PHP code.
Note: Be careful when adding other .rss or .xml files because this might produce undesired results, especially misinterpretation of the <? and ?> in the first line's XML declaration as PHP tags.
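One common way around this problem in a PHP-parsed feed.rss or sitemap.xml is to let PHP print the XML declaration as a string instead of writing it literally in the first line. The sketch below assumes that approach. The helper name xmlDeclaration is hypothetical, and whether the literal first line actually causes trouble depends on the server's short_open_tag setting.

```php
<?php
// Sketch of the first lines of a PHP-parsed feed.rss (assumed layout, not the site's real file).

// Returns the XML declaration as an ordinary string, assembled from pieces,
// so the PHP parser never treats it as a PHP tag.
function xmlDeclaration(string $encoding = 'UTF-8'): string
{
    return '<' . '?xml version="1.0" encoding="' . $encoding . '"?' . '>';
}

// Tell clients this is a feed, not an HTML page (skipped if output has already started).
if (!headers_sent()) {
    header('Content-Type: application/rss+xml; charset=UTF-8');
}

echo xmlDeclaration(), "\n";
// ... the <rss> element and its <channel> would follow here ...
```

Plain XML files that are not meant to run through PHP are better kept outside the directory covered by the AddType line.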