Making RSS Feeds

You’ve heard the saying, “It’s better to give than to receive.” If that’s true, we’re doing great today, because we’re going to be on the giving end of RSS feeds. Being on the receiving end is easy: just fire up your feed reader/aggregator, like google.com/reader, and go down the list of new stories. Giving can be easy too, but sometimes it means getting into the code to produce exactly what you want.

Maybe you’re wondering what a feed is. Well, feeds are all over the internet. They are how the internet syndicates its content. Feeds are flying out of blogs, news web sites, and any other tuned-in site with frequent content updates.

A feed is basically just an XML file on a web server, the same way a web page or an image is a file on a web server. People use a dedicated tool to view the file contents. For a web site, you use a web browser; for a feed, you use a feed reader. I prefer the free web-based reader at google.com/reader. Some email programs and web browsers are capable of being readers as well. As usual, a tool dedicated to a task gives a better experience, and that’s why I use a dedicated feed reader.

The Internet has lost a lot of its “geeks-only” feel since the web is full of stuff everyone can use. Web feeds still feel geek-ish, but that doesn’t make them any less useful. Reading feeds is like reading email. A message with a time stamp comes in and you read it. An email is addressed to a specific person or group, while a feed is open to the world. Just subscribe. Your feed reader becomes the starting point for all your internet news and updates.

So let’s go deeper. What’s in a feed technically? It’s just a text file with readable content organized using markup called XML, like a web page uses HTML. It’s editable with any basic text editor. Notepad, TextEdit, everyone’s got an editor. Word to the wise: don’t use Word. Word wants to add its own formatting, and you don’t want anything changing the formatting.

Editing XML means getting familiar with tags so you can give the content meaning. What’s a title? What’s a link? Tags distinguish these things. There are specific tags for feeds, and there are different sets of these tags depending on which feed format you’re using. The one I always see is RSS 2, but there is also Atom and the RDF-based RSS 1. An RSS 2 file uses the tags rss, channel, item, title, link, description, pubDate, and guid.

Everything is wrapped in an rss tag. Inside that is the channel tag, and inside the channel are the feed’s own details and a bunch of item tags. The channel describes the feed. The feed has a title and copy, for example, which go into the title and description tags respectively. Other tags hold the feed’s publication date and the URL of the feed’s associated web page.

The feed is hopefully full of stories and site updates. Each of these gets its own item tag. Some of the tags inside an item are the same as the ones in the channel: title, link, description, pubDate. For every new blog entry or news story, a new item tag is made. The title goes in the title tag, the body copy goes in the description tag, the web site link goes in the link tag, the date the item should appear in readers goes in the pubDate tag, etc.
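
To make that layout concrete, here is a minimal sketch in Python that builds the same structure with the standard library: one rss tag, one channel, and the items nested inside the channel, as RSS 2 expects. The function name, the site name, and the example.com URLs are all made up for illustration, and reusing each item's link as its guid is just one reasonable choice.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.utils import format_datetime


def build_feed(title, link, description, entries):
    """Return an RSS 2 document as a string (without the XML declaration).

    entries is a list of dicts with the keys title, link, description
    and published (a timezone-aware datetime).
    """
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    # The channel's own tags describe the feed as a whole.
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = description
    for entry in entries:
        # One item per story, nested inside the channel.
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = entry["title"]
        ET.SubElement(item, "link").text = entry["link"]
        ET.SubElement(item, "description").text = entry["description"]
        # pubDate uses the email-style date format.
        ET.SubElement(item, "pubDate").text = format_datetime(entry["published"])
        # Assumption: the item's link doubles as its unique id.
        ET.SubElement(item, "guid").text = entry["link"]
    return ET.tostring(rss, encoding="unicode")


# Hypothetical sample data.
example_feed = build_feed(
    "Example Site",
    "http://www.example.com/",
    "Updates from an example site",
    [{
        "title": "First post",
        "link": "http://www.example.com/first-post",
        "description": "Body copy of the first post.",
        "published": datetime(2009, 1, 1, 12, 0, tzinfo=timezone.utc),
    }],
)
print(example_feed)
```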

Hey, you know this podcast? You may have received it because you’re subscribed to its feed using your podcast feed reader, iTunes. Well, this podcast feed has got some extra tags in the channel and in each item. Apple Inc. has declared its own feed namespace. These are tags starting with the word itunes, followed by a colon, followed by a descriptive word like owner, name, email, image, category, author, subtitle, summary, duration, or keywords. Some of these are a little redundant with the main feed tags, but they’re used by iTunes to get the info it needs from a feed.

<itunes:image href="http://www.davidvanvickle.com/images/seriesface2_300.png"/>

Podcasts may not have survived if it weren’t for Apple. Podcasts certainly wouldn’t be on their way to thriving as they are now. Apple is making the complex simple. So buy an iPod, connect it to iTunes, get podcasts for free. Feeds are knitting all that together. So we’ll do what Apple says and include these tags in our podcast feed along with the others.

Anyway, we actually have one more tag to look at: the enclosure tag, which says where the audio file will come from. The file’s URL, its media type, and its size in bytes are attributes of this tag.
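
Here is a rough Python sketch of one podcast item carrying both an enclosure tag and an itunes tag. The namespace URL is the one Apple publishes for its podcast tags; the function name, the episode URL, the file size, and the duration are invented for the example.

```python
import xml.etree.ElementTree as ET

# Namespace behind the "itunes:" prefix.
ITUNES_NS = "http://www.itunes.com/dtds/podcast-1.0.dtd"
ET.register_namespace("itunes", ITUNES_NS)


def build_episode(title, audio_url, size_bytes, duration, mime_type="audio/mpeg"):
    """Return an <item> element for one podcast episode."""
    item = ET.Element("item")
    ET.SubElement(item, "title").text = title
    # enclosure has no text, only attributes: url, length (bytes) and type.
    ET.SubElement(item, "enclosure",
                  url=audio_url, length=str(size_bytes), type=mime_type)
    # Namespaced tags are written as {namespace}name in ElementTree.
    ET.SubElement(item, "{%s}duration" % ITUNES_NS).text = duration
    return item


# Hypothetical episode details.
episode = build_episode(
    "Making RSS Feeds",
    "http://www.example.com/audio/episode1.mp3",
    12345678,
    "12:34",
)
episode_xml = ET.tostring(episode, encoding="unicode")
print(episode_xml)
```

In a full feed the itunes namespace would normally be declared once on the rss tag, and ElementTree takes care of that when the whole tree is serialized together.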

So let’s say we’ve just Notepadded the heck out of an RSS 2 XML file using all these tags. What are the chances something isn’t right? The chances are high. So let’s upload the file to a web site, then wander over to a feed validator to debug it. feedvalidator.org is an awesome free service for debugging feeds. Go there, type the URL of your feed, and it will tell you how to fix any issues. Doing this beats the possibility of upsetting your feed reader and not having your content syndicated. Better to broadcast something that works.
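
Before uploading, a quick local check can catch the dumbest mistakes. The sketch below is no substitute for feedvalidator.org. It only confirms that the file parses as XML and that the channel has the title, link, and description tags RSS 2 requires. The function name and the messages are my own invention.

```python
import xml.etree.ElementTree as ET

# RSS 2 requires these three tags inside <channel>.
REQUIRED_CHANNEL_TAGS = ("title", "link", "description")


def quick_check(feed_text):
    """Return a list of problems; an empty list means nothing obvious."""
    try:
        root = ET.fromstring(feed_text)
    except ET.ParseError as err:
        return ["not well-formed XML: %s" % err]
    if root.tag != "rss":
        return ["root tag is <%s>, expected <rss>" % root.tag]
    channel = root.find("channel")
    if channel is None:
        return ["missing <channel>"]
    problems = []
    for tag in REQUIRED_CHANNEL_TAGS:
        if channel.find(tag) is None:
            problems.append("channel is missing <%s>" % tag)
    return problems


# A deliberately incomplete feed: no link and no description.
print(quick_check("<rss><channel><title>t</title></channel></rss>"))
```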

feedvalidator.org even tells you if the HTTP headers are wrong. If they are, configure your server to send files with your feed’s extension with the header “Content-Type: text/xml”. Similarly, since you’re declaring the served content as XML, the first line of the file should be the XML declaration that starts every good XML file.

<?xml version="1.0" encoding="iso-8859-1"?>

Okay, eventually we survive validation. We just manually edited a feed.

Sometimes we can edit the feed directly like this, sometimes we can’t. Sometimes someone else is producing the feed. If you have a blog, you already have a feed. Just keep on bloggin’, and that will automatically update your feed.

Let’s go back to situations where we make it ourselves. What are other methods besides Notepad? Do we want automated generation or manual generation? For automated, we could use a server-side scripting language to crank through a database and produce our XML. For manual generation, we could get an off-the-shelf desktop application that outputs feeds. At least we’d get away from Notepad that way.

It is good when feeds are automatic – the product of some other activity, like a blog. Or maybe I’m a system admin and I want to get notifications from my servers when something goes wrong. The last thing I want is more email, especially from a non-human. Maybe I could write a program that puts system logs into a feed. Then anyone with access could opt in to see what’s up with the server.

Maybe I want to convert my feed from one with MP3 files to one with M4A files? I could read the feed and replace the enclosure tags, maybe add some iTunes tags.
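
As a sketch of that conversion in Python: read the feed, find every enclosure whose URL ends in .mp3, and point it at an .m4a file instead. This assumes the M4A files already exist at the same paths with only the extension changed, and it uses audio/x-m4a as the media type. The function name and sample feed are made up. The length attribute is left alone here, so it would still need updating with the real size of each new file.

```python
import xml.etree.ElementTree as ET


def switch_enclosures_to_m4a(feed_text):
    """Return the feed with .mp3 enclosures rewritten to .m4a."""
    root = ET.fromstring(feed_text)
    for enclosure in root.iter("enclosure"):
        url = enclosure.get("url", "")
        if url.lower().endswith(".mp3"):
            # Assumption: the M4A lives at the same path, new extension.
            enclosure.set("url", url[:-4] + ".m4a")
            enclosure.set("type", "audio/x-m4a")
            # Note: length still holds the old MP3 size.
    return ET.tostring(root, encoding="unicode")


# Hypothetical one-item feed.
sample = (
    '<rss version="2.0"><channel><item>'
    '<enclosure url="http://www.example.com/a.mp3" length="1" type="audio/mpeg" />'
    '</item></channel></rss>'
)
print(switch_enclosures_to_m4a(sample))
```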

Maybe I upload the school newsletter every week, and every time I do, the system updates the feed with a link to the new newsletter for subscribers.

Maybe I’m a viagra salesman, gaaa, and I want to crap on a feed by inserting advertising. If I was that kind of person and a programmer – highly unlikely – I could suck in a feed, and make every fifth item be an ad, or at the end of every item description I could put an ad. At least I could make it a targeted ad based on the subject matter of the feed.

The tools to make these things happen might be a web scripting language and a database. There are many approaches. PHP MySQL anyone? Each entry is a new database record. The feed is generated every time someone visits a URL by a web script which loops through the database records to produce the feed items. It serves xml instead of html content, as we learned.
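
Here is the same idea sketched in Python, with the built-in sqlite3 module standing in for MySQL so the example runs on its own. The table name, column names, and site details are all hypothetical. A real web script would run something like this on each request and send the result with a text/xml content type.

```python
import sqlite3
import xml.etree.ElementTree as ET


def feed_from_database(conn):
    """Loop through the entries table and return the feed as UTF-8 bytes."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = "Example Site"
    ET.SubElement(channel, "link").text = "http://www.example.com/"
    ET.SubElement(channel, "description").text = "Updates from an example site"
    # Newest record first; each database record becomes one item.
    rows = conn.execute("SELECT title, link, body FROM entries ORDER BY id DESC")
    for title, link, body in rows:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = title
        ET.SubElement(item, "link").text = link
        ET.SubElement(item, "description").text = body
    declaration = b'<?xml version="1.0" encoding="utf-8"?>\n'
    return declaration + ET.tostring(rss, encoding="utf-8")


# Hypothetical in-memory database with two entries.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE entries (id INTEGER PRIMARY KEY, title TEXT, link TEXT, body TEXT)"
)
conn.executemany(
    "INSERT INTO entries (title, link, body) VALUES (?, ?, ?)",
    [
        ("First post", "http://www.example.com/1", "Hello."),
        ("Second post", "http://www.example.com/2", "Hello again."),
    ],
)
feed_xml = feed_from_database(conn)
print(feed_xml.decode("utf-8"))
```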

What if I’m a sys admin? I am concerned with the feed being somewhat private. First I might deliver the feed via SSL, using https instead of http. That would encrypt the transmission, and most feed readers are okay with this. After I get the SSL going, I make sure users have to call the URL with an MD5 feed key instead of a guessable numeric database primary key. This would also facilitate multiple feeds from the same web script, based on which MD5 key was passed in.
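
One way the feed key could work, sketched in Python: hash a secret string together with the numeric feed id, hand out the resulting hex string in the URL, and have the script match the incoming key back to a feed. The secret, the function names, and the ids are hypothetical. Mixing in a secret is my own assumption, added because the MD5 of a bare number would be just as guessable as the number.

```python
import hashlib

# Hypothetical secret known only to the server.
SECRET = "change-me"


def feed_key(feed_id, secret=SECRET):
    """Return the 32-character MD5 hex key for one feed id."""
    raw = "%s:%s" % (secret, feed_id)
    return hashlib.md5(raw.encode("utf-8")).hexdigest()


def find_feed(key, feed_ids, secret=SECRET):
    """Return the feed id whose key matches, or None if nothing matches."""
    for feed_id in feed_ids:
        if feed_key(feed_id, secret) == key:
            return feed_id
    return None


# The key for feed 7 is what would go into the subscription URL.
key_for_seven = feed_key(7)
print(find_feed(key_for_seven, [5, 6, 7]))
```

In practice the keys would more likely be stored in the database next to each feed record and looked up directly, rather than recomputed on every request.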

OK, let’s say I’m not a programmer. I want a desktop program that can make the feed for me. Adding an entry is a manual process. I use a program called Feeder for this podcast feed. Learn more at reinventedsoftware.com/feeder.

Here are some features you’ll want. Of course you want fast programmer-less edits. You want it to use iTunes tags when podcasting. You want it to publish the feed for you, for example via FTP.

You want it to “ping” services that have to load your feed. iTunes, technorati and feedburner are three examples of services you may want to ping if you have accounts set up with them. If you want them to always have your latest and greatest, you’ll want to ping them. iTunes can be pinged with a basic GET request that includes your podcast ID; however, most pinging happens via an XML POST that contains the ping command. Of course, a good desktop app will keep that behind the scenes.
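
For the curious, here is roughly what that XML POST body looks like, built with Python's standard xmlrpc module. It uses the common weblogUpdates.ping convention, which takes a site name and a URL. Whether a given service accepts exactly this call is an assumption to check against that service's own docs, and the name and URL below are made up.

```python
import xmlrpc.client


def build_ping_body(site_name, site_url):
    """Return the XML-RPC request body for a weblogUpdates.ping call."""
    return xmlrpc.client.dumps(
        (site_name, site_url), methodname="weblogUpdates.ping"
    )


# Hypothetical podcast name and feed URL.
ping_body = build_ping_body("Example Podcast", "http://www.example.com/feed.xml")
print(ping_body)
```

To actually send it, xmlrpc.client.ServerProxy pointed at the service's ping endpoint would build and POST this same body when you call weblogUpdates.ping on it.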

So I mentioned feedburner.com. They are good for a couple of things: reporting and redirecting. They have other services, but let’s focus on these two free things.

Basically you give people a feedburner URL instead of your feed’s direct URL. By doing that, you are free to change your feed’s direct URL without leaving subscribers with a dead subscription. So now feedburner is getting a bunch of requests that they can analyze and produce usage reports for you. Great reporting.

Without feedburner we’re responsible for returning http redirect headers when the feed’s direct URL changes.

When you are forced to put your feed at a different URL, you want to think about redirecting existing subscribers to the new URL. You may not care and your subscribers will get a 404 response. You may care a little and give your subscribers a 410 “gone” response. At least you’re leaving a note, even if it’s “screw you guys”. Don’t leave nasty notes.

A good response is a 301 with a Location header pointing at the new URL. Since subscribers ultimately get the feed they were looking for, they may not notice that you delivered it from left field instead of from straight ahead.
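
If your feed is served by a script rather than a plain file, the 301 can come from the script itself. Here is a minimal sketch as a Python WSGI app; the old path and new URL in the table are hypothetical, and anything not in the table simply gets a 404.

```python
# Hypothetical table of feeds that have moved: old path -> new URL.
MOVED_FEEDS = {
    "/oldfeed.xml": "http://www.example.com/newfeed.xml",
}


def app(environ, start_response):
    """WSGI app that answers moved feed paths with a 301 redirect."""
    path = environ.get("PATH_INFO", "")
    new_url = MOVED_FEEDS.get(path)
    if new_url is not None:
        # 301 tells the feed reader the move is permanent.
        start_response("301 Moved Permanently", [("Location", new_url)])
        return [b""]
    start_response("404 Not Found", [("Content-Type", "text/plain")])
    return [b"Not found"]
```

For a quick local try, wsgiref.simple_server.make_server("", 8000, app).serve_forever() from the standard library will serve it on port 8000.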

If you’re on Apache, you can use .htaccess files for redirecting. In the folder with the old feed file, put a file named .htaccess and insert a “Redirect permanent” line like

Redirect permanent /oldfeed.xml http://www.domain.com/newfeed.xml

If you just can’t get enough XML, you could return an XML redirect. That would be a file in the old location that contains only a redirect tag with a newLocation tag inside it.

Don’t be afraid to dig into the XML flying all around us.
