Posted by karlhorky on 09/29/09


Perl: Modify XML Feed Items' pubDate


Published in: Perl
 

Reads a remote XML feed, rewrites each item's pubDate in a different date format, and writes the result to a specified local file.
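To try the date change on its own, here is a minimal sketch that converts a single pubDate string. It swaps in the core Time::Piece module instead of Date::Manip, which the script itself uses. The helper name `reformat_pubdate` and the fall-back to the original string are illustrative assumptions, and the month names assume an English (C) locale.

```perl
use strict;
use warnings;
use Time::Piece;

# Hypothetical helper: turn an RSS-style date such as
# "Tue, 29 Sep 2009 17:47:42 GMT" into "Month DD, YYYY".
# Returns the input unchanged when it does not match the expected layout.
sub reformat_pubdate {
    my ($pubdate) = @_;
    my $formatted = eval {
        # strptime croaks on input it cannot parse, so it is wrapped in eval
        Time::Piece->strptime($pubdate, '%a, %d %b %Y %H:%M:%S GMT')
                   ->strftime('%B %d, %Y');
    };
    return defined $formatted ? $formatted : $pubdate;
}

print reformat_pubdate('Tue, 29 Sep 2009 17:47:42 GMT'), "\n";
```

The format string only accepts dates ending in a literal `GMT`, as in the sample input; feeds that use numeric offsets would need a different pattern.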

#!/usr/local/bin/perl
# Modify XML Feed Items' pubDate to different date format
# Author: Karl Horky
# Date: 29 September 2009
#
# Sample Input:
# <?xml version="1.0" encoding="UTF-8"?>
# <rss version="2.0">
# <channel>
# <title>News</title>
# <link>http://www.news.com/</link>
# <description>The latest headlines</description>
# <language>en-us</language>
# <copyright>Copyright © 2009</copyright>
# <ttl>5</ttl>
# <item>
# <title>News Item 1</title>
# <description>Item Description 1</description>
# <link>http://www.news.com/news_release_1.htm</link>
# <pubDate>Tue, 29 Sep 2009 17:47:42 GMT</pubDate>
# </item>
#
# <item>
# <title>News Item 2</title>
# <description>Item Description 2</description>
# <link>http://www.news.com/news_release_2.htm</link>
# <pubDate>Mon, 24 Aug 2009 07:00:00 GMT</pubDate>
# </item>
# </channel>
# </rss>
#
#
#
# Sample Output file:
# <rss version="2.0">
# <channel>
# <title>News</title>
# <link>http://www.news.com/</link>
# <description>The latest headlines</description>
# <language>en-us</language>
# <copyright>Copyright © 2009</copyright>
# <ttl>5</ttl>
# <item>
# <title>News Item 1</title>
# <description>Item Description 1</description>
# <link>http://www.news.com/news_release_1.htm</link>
# <pubDate>September 29, 2009</pubDate>
# </item>
#
# <item>
# <title>News Item 2</title>
# <description>Item Description 2</description>
# <link>http://www.news.com/news_release_2.htm</link>
# <pubDate>August 24, 2009</pubDate>
# </item>
# </channel>
# </rss>
#

use strict;
use warnings;

use LWP::UserAgent;
use XML::Simple;
use Date::Manip;

my $output = 'feed.xml'; # The location of your output file

my $ua = LWP::UserAgent->new;
$ua->timeout(10);
$ua->env_proxy;
my $response = $ua->get('http://www.example.com/feed.xml'); # The location of the input file

# Stop with a non-zero exit status if the feed could not be fetched
unless ($response->is_success) {
    warn "Cannot fetch feed: ", $response->status_line, "\n";
    exit(1);
}
my $xml = $response->content;

# forcearray => 1 makes every element an array reference, hence the ->[0] below
my $xs = XML::Simple->new(keeproot => 1, searchpath => ".", forcearray => 1, keyattr => ['key', 'tag']);

my $ref = $xs->XMLin($xml, KeepRoot => 1);

foreach my $item (@{ $ref->{rss}->[0]->{channel}->[0]->{item} }) {
    # Take a reference to the date string so it can be rewritten in place
    my $currDate = \$item->{pubDate}->[0];
    $$currDate = UnixDate($$currDate, "%B %d, %Y");
}

my $out_xml = $xs->XMLout($ref, KeepRoot => 1);

open(my $out, '>', $output) or die "Cannot open file $output: $!\n";
print {$out} $out_xml;
close($out) or die "Cannot close file $output: $!\n";
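The loop in the script depends on the nested structure that XML::Simple builds when arrays are forced for every element. The sketch below uses a hand-written stand-in for that structure, so it runs without XML::Simple; the exact shape is an assumption inferred from the script's own `->{rss}->[0]->{channel}->[0]->{item}` path, trimmed to the keys the loop touches. The helper name `pubdate_slots` and the placeholder edit are illustrative only.

```perl
use strict;
use warnings;

# Assumed shape of the parsed feed with root kept and arrays forced:
# each element is an array reference, attributes are plain hash values.
my $ref = {
    rss => [ {
        version => '2.0',
        channel => [ {
            title => ['News'],
            item  => [
                { title => ['News Item 1'], pubDate => ['Tue, 29 Sep 2009 17:47:42 GMT'] },
                { title => ['News Item 2'], pubDate => ['Mon, 24 Aug 2009 07:00:00 GMT'] },
            ],
        } ],
    } ],
};

# Hypothetical helper: return a reference to each item's pubDate string.
# Writing through such a reference changes the structure in place.
sub pubdate_slots {
    my ($feed) = @_;
    my @slots;
    for my $item (@{ $feed->{rss}->[0]->{channel}->[0]->{item} }) {
        push @slots, \$item->{pubDate}->[0];
    }
    return @slots;
}

for my $slot (pubdate_slots($ref)) {
    # Placeholder edit; the script calls UnixDate at this point instead
    ${$slot} =~ s/ GMT\z/ +0000/;
}

print "$_->{pubDate}->[0]\n" for @{ $ref->{rss}->[0]->{channel}->[0]->{item} };
```

Because each slot is a reference into the original structure, no value has to be copied back before the structure is serialized again.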
