A ton of services are popping up that take all of your RSS feeds (Twitter, Google Reader, Netflix, your blog, etc.) and turn them into a “stream” of your online activity. My favorite is FriendFeed. Last night I decided to see how fast I could whip one up in Django, and the result is http://stream.benjamingolub.com. So far I’ve got it crawling my blog, Flickr, Google Reader, and Twitter (the services I use the most). It only took me a few hours to write, and most of that time was spent on templates. Some things I learned about RSS feeds:
- Universal Feed Parser saved me a lot of time.
- Not all RSS feeds are created equal. Come on, Google Reader, why don’t you tell me when I shared an item? All I get is the time the item was published, so I have to make sure my crawl cycle is fast enough that I can use the current time as the time I shared it. At most I’ll be off by a minute or two, but Google could easily include this in the feed.
- Tagging was dead simple to get working. I just pull in the tags (using Universal Feed Parser), create each one if it doesn’t already exist, and add them to the event. The more events that are pulled into my stream, the more tags I get. FriendFeed doesn’t have tags visible at the moment, but I’m sure they are collecting this data. Here is all it takes to do in Django:
    from django.template.defaultfilters import slugify

    if 'tags' in entry:
        for feed_tag in entry.tags:
            # Normalize the tag text once, then reuse it for the title and the slug.
            title = feed_tag.term.strip()
            tag, created = Tag.objects.get_or_create(title=title, slug=slugify(title))
            event.tags.add(tag)
- This one’s obvious, but don’t rely on receiving valid data: you can bet someone will have malformed HTML that will mess with your site. So before I display any data I strip the tags, urlize the URLs (if the feed contains http://www.google.com, it gets turned into a link), and then truncate to 100 words. In Django that looks like this:
{{ event.content|striptags|urlize|truncatewords:100 }}
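For anyone curious what that filter chain amounts to, here is a rough standard-library sketch of the same idea. It is not Django's implementation: the helper names (`strip_tags`, `truncate_words`, `linkify`, `render_content`) are made up for illustration, the regexes are deliberately crude, and it truncates before linkifying (a different order from the template line above) so that a cut can never land inside a generated link.

```python
import html
import re

# Crude patterns, good enough for a sketch; real feeds will find the edge cases.
TAG_RE = re.compile(r"<[^>]*>")
URL_RE = re.compile(r"https?://[^\s<>\"']+")


def strip_tags(text):
    """Drop anything that looks like an HTML tag."""
    return TAG_RE.sub("", text)


def truncate_words(text, limit):
    """Keep at most `limit` whitespace-separated words; mark a cut with ' ...'."""
    words = text.split()
    if len(words) <= limit:
        return " ".join(words)
    return " ".join(words[:limit]) + " ..."


def linkify(text):
    """HTML-escape the text and wrap bare http(s) URLs in anchor tags."""
    parts = []
    last = 0
    for match in URL_RE.finditer(text):
        parts.append(html.escape(text[last:match.start()]))
        url = html.escape(match.group(0))
        parts.append('<a href="%s" rel="nofollow">%s</a>' % (url, url))
        last = match.end()
    parts.append(html.escape(text[last:]))
    return "".join(parts)


def render_content(raw, limit=100):
    """Strip tags, truncate to `limit` words, then linkify what is left."""
    return linkify(truncate_words(strip_tags(raw), limit))
```

Calling `render_content(entry_text)` on a feed item returns text that is safe to drop into a page, with any bare URLs turned into links. Text that already contains HTML entities will be escaped a second time, which is one of the rough edges of doing this by hand.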
I’ve got more ideas for crawlers in the works. The great thing about doing it on my own server is that I can store usernames and passwords I use for various accounts. So FriendFeed can’t get my Facebook feed (because they’d need my credentials) but I’ll be able to.
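The Google Reader workaround from the list above (using crawl time as the share time) boils down to remembering which entries have already been seen and stamping only the new ones. Below is a minimal sketch of that idea, not the code running on the site: `stamp_new_entries`, the `seen` dict, and the assumption that each entry is a dict with an `"id"` key are all hypothetical.

```python
from datetime import datetime, timezone


def stamp_new_entries(entries, seen, now=None):
    """Return (entry_id, shared_at) pairs for entries not seen on earlier crawls.

    `entries` is a list of dicts with an "id" key (assumed shape).
    `seen` maps entry id -> first-seen time and is updated in place.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    fresh = []
    for entry in entries:
        entry_id = entry["id"]
        if entry_id in seen:
            continue  # recorded on an earlier crawl; keep its original time
        seen[entry_id] = now
        fresh.append((entry_id, now))
    return fresh
```

With this approach the recorded share time is late by at most one crawl interval, which is why a short cycle matters.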
Update: I used to be able to correctly space out my Python code but can’t anymore, oh well.
