
    This is a remarkably simple trick which I've found very handy. With a few lines of Perl you can take any RSS feed and format it to your liking.

    Get the Feed

    You can do this using LWP::Simple:

        use LWP::Simple;

        my $feed_url = 'http://feeds.bbci.co.uk/news/rss.xml';

        # get() returns undef on failure, so die if nothing came back.
        my $feed = get($feed_url)
                or die ("Failed to fetch feed.");
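
    LWP::Simple keeps things short, but when a fetch fails it doesn't tell you why. If that matters, something along these lines with LWP::UserAgent (from the same libwww-perl distribution) reports the status line. This is only a sketch: the fetch_feed helper name and the 10 second timeout are my own assumptions, not part of the original script.

```perl
use strict;
use warnings;
use LWP::UserAgent;

# Hypothetical helper: return the body of $url, or die with the status line.
sub fetch_feed {
    my ($url) = @_;

    # The 10 second timeout is an arbitrary choice for illustration.
    my $ua       = LWP::UserAgent->new( timeout => 10 );
    my $response = $ua->get($url);

    die "Failed to fetch $url: " . $response->status_line . "\n"
        unless $response->is_success;

    return $response->decoded_content;
}

# Pass a feed URL on the command line to try it out.
print length( fetch_feed( $ARGV[0] ) ), " characters fetched\n" if @ARGV;
```

    The value returned can be handed to the parser in the same way as the result of get().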

    Process the Raw Result

    Using XML::RSS, parse the raw feed into an object that can be used like a hash, with the channel details under channel and the entries under items.

        use XML::RSS;

        my $rss = XML::RSS->new();
        $rss->parse($feed);
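
    To see what the parser hands back, it helps to poke at the result directly before involving a template. The sketch below parses a tiny hand-written RSS 2.0 document (made up for illustration, standing in for the downloaded feed) and walks the channel and items entries.

```perl
use strict;
use warnings;
use XML::RSS;

# A made-up RSS 2.0 document standing in for the downloaded feed.
my $feed = <<'RSS';
<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
  <channel>
    <title>Example Feed</title>
    <link>http://example.com/</link>
    <description>Made-up feed for illustration</description>
    <item>
      <title>First headline</title>
      <link>http://example.com/1</link>
      <pubDate>Fri, 14 Sep 2012 12:15:22 GMT</pubDate>
    </item>
    <item>
      <title>Second headline</title>
      <link>http://example.com/2</link>
      <pubDate>Fri, 14 Sep 2012 11:49:10 GMT</pubDate>
    </item>
  </channel>
</rss>
RSS

my $rss = XML::RSS->new();
$rss->parse($feed);

# Channel metadata lives under 'channel'; each entry is a hash under 'items'.
print "Channel: $rss->{channel}{title}\n";
for my $item ( @{ $rss->{items} } ) {
    print "$item->{pubDate}  $item->{title}\n";
}
```

    These are the same keys the template refers to as channel.title, item.pubDate and item.title.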

    Format to Your Liking

    Template Toolkit (the Template module) takes a template and a hash reference of values to substitute into it.

        # Define a template. The heredoc marker is double-quoted so that
        # Perl expands \t into a tab character.
        my $template = <<"TEMPLATE";
        [% channel.title %]

        Headlines:
        [% FOREACH item = items %]
        [% item.pubDate %]\t[% item.title %]
        [% END %]
        TEMPLATE

    This simple template will take the BBC news feed from above and print out a list of headlines with publication dates.

        use Template;

        my $tt = Template->new()
                or die ("Failed to load template: $Template::ERROR\n");

        # Combine the template with the processed RSS feed.
        $tt->process ( \$template, $rss )
                or die $tt->error();
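
    The template step can be tried on its own with a plain hash, which is handy when tweaking the layout. The sketch below uses made-up data in the same shape as the parsed feed and captures the result in a scalar instead of printing it straight away. The -%] endings are Template Toolkit's chomp flag, which swallows the newline after a directive so the FOREACH and END lines don't leave blank lines behind. The " - " separator is my own choice here because the heredoc is single-quoted.

```perl
use strict;
use warnings;
use Template;

# Made-up data with the same shape as the parsed feed.
my $vars = {
    channel => { title => 'Example Feed' },
    items   => [
        { pubDate => 'Fri, 14 Sep 2012 12:15:22 GMT', title => 'First headline' },
        { pubDate => 'Fri, 14 Sep 2012 11:49:10 GMT', title => 'Second headline' },
    ],
};

# Single-quoted heredoc: Perl leaves the text alone and only
# Template Toolkit interprets the [% ... %] directives.
my $template = <<'TEMPLATE';
[% channel.title %]

Headlines:
[% FOREACH item = items -%]
[% item.pubDate %] - [% item.title %]
[% END -%]
TEMPLATE

my $tt = Template->new()
    or die "Failed to create Template object: $Template::ERROR\n";

# A scalar reference as the third argument collects the output.
my $output = '';
$tt->process( \$template, $vars, \$output )
    or die $tt->error();

print $output;
```

    Collecting the output in a variable makes it easy to write it to a file or send it somewhere else later.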

    Putting it All Together

        #!/usr/bin/perl
        use strict;
        use warnings;

        use XML::RSS;
        use LWP::Simple;
        use Template;

        ##################
        # Configuration:
        #
        ##################
        my $feed_url = 'http://feeds.bbci.co.uk/news/rss.xml';
        my $template = <<"TEMPLATE";
        [% channel.title %]

        Headlines:
        [% FOREACH item = items %]
        [% item.pubDate %]\t[% item.title %]
        [% END %]
        TEMPLATE

        ##################
        ##################

        my $tt = Template->new()
                or die ("Failed to load template: $Template::ERROR\n");
        my $feed = get($feed_url)
                or die ('Failed to fetch feed.');
        my $rss = XML::RSS->new();
        $rss->parse($feed);

        $tt->process ( \$template, $rss )
                or die $tt->error();

    Running the script prints the channel title followed by the headlines:

        rob@arrakis:~/public_html/rss-reader$ perl rss-reader.pl
        BBC News - Home

        Headlines:

        Fri, 14 Sep 2012 12:15:22 GMT   Kate privacy invasion 'grotesque'

        Fri, 14 Sep 2012 11:49:10 GMT   US missions on film protest alert

        Fri, 14 Sep 2012 12:25:25 GMT   Alps attack girl returning to UK

        Fri, 14 Sep 2012 10:17:03 GMT   Woman is held after car body find

    In this example the feed URL and the template are defined in code, but they can just as easily be kept in files or a database, allowing for changes and additions without deploying new code.
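
    As a rough sketch of the file-based approach, Template Toolkit can be pointed at a directory with INCLUDE_PATH and given a file name instead of a string reference. The headlines.tt file name, the temporary directory and the sample data below are all made up so that the example runs on its own; in practice the template file would already sit alongside the script.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Spec;
use Template;

# Write a template file to a temporary directory so the demo is self-contained.
my $dir  = tempdir( CLEANUP => 1 );
my $path = File::Spec->catfile( $dir, 'headlines.tt' );    # hypothetical file name
open my $fh, '>', $path or die "Cannot write $path: $!";
print {$fh} "[% channel.title %]: [% items.size %] headline(s)\n";
close $fh or die "Cannot close $path: $!";

# INCLUDE_PATH tells Template Toolkit where to look for template files.
my $tt = Template->new( { INCLUDE_PATH => $dir } )
    or die "Failed to create Template object: $Template::ERROR\n";

# Made-up data in the same shape as the parsed feed.
my $vars = {
    channel => { title => 'Example Feed' },
    items   => [ { title => 'First headline' }, { title => 'Second headline' } ],
};

# A plain string is treated as a file name relative to INCLUDE_PATH.
my $output = '';
$tt->process( 'headlines.tt', $vars, \$output )
    or die $tt->error();

print $output;
```

    Changing the layout then only means editing headlines.tt; the script itself stays as it is.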


    3 comments

    Comment from: Dave Cross [Visitor] · http://perlhacks.com/
    Hi,

    One little bug in your code and one suggestion for an improvement.

    The bug: You're using single quotes around the end marker for your heredoc. That means that the heredoc will be treated as a single-quoted string and the \t won't be expanded to a tab character. Better to use double quotes there.

    The improvement: If you use XML::Feed instead of XML::RSS then very similar code will be able to cope with Atom feeds as well as RSS feeds.

    [Interesting. When I included examples of the heredoc syntax in this comment it caused a problem in your blog software. I've had to remove the examples in order to post this comment.]
    09/15/12 @ 09:47
    Comment from: Rob Earl [Member]
    That'll teach me to make formatting changes at the last minute, thanks for pointing it out!
    09/20/12 @ 11:33
    Comment from: Ngan Tengyuen [Visitor] · https://plus.google.com/115133286769644313447/
    This tutorial is just what I've been searching the internet for, thanks for writing. Cheers
    12/10/12 @ 04:46
