Kevin Reid (kpreid) wrote,

pipermail2rss

I wanted to be able to read a mailing list as a feed, so I wrote this tool yesterday. It converts Pipermail or Hypermail mailing list archives (at least the one example of each I've tried) to RSS 2.0.

(I'd have used Atom except that <updated> is required, and that information is not easily available in bulk in a Mailman Pipermail archive.)

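It uses curl to fetch the archive pages, TagSoup (run with java) to turn their HTML into well-formed XHTML, and xsltproc to apply the stylesheets below.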

darcs repository: http://switchb.org/kpreid/2006/pipermail2rss/

p2r

#!/bin/sh

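set -e

# Directory containing this script, the stylesheets, and tagsoup.jar.
here=$(dirname "$0")
soup="java -jar $here/tagsoup.jar --nodefaults"

# Find the relative URL of the first by-date index linked from the archive's front page.
monthrel=$(curl -s "$1" | $soup | xsltproc "$here/month.xsl" -)

# Convert that index to RSS; prefix makes the item links absolute.
curl -s "$1/$monthrel" | $soup | xsltproc --stringparam prefix "$1$(dirname "$monthrel")/" "$here/messages.xsl" -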
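As a rough illustration of how the prefix argument in the last line is put together, here is a minimal sketch. The function name item_prefix and the example.org URLs are made up for this example and are not part of p2r; the second call assumes an archive whose by-date index sits directly in the archive root.

```shell
#!/bin/sh
# Hypothetical helper (not part of p2r): builds the link prefix the same way
# the last line of p2r does, from the archive URL and the relative path of
# the by-date index.
item_prefix() {
  archive=$1    # archive URL, with trailing slash
  monthrel=$2   # relative path of the by-date index, as printed by month.xsl
  printf '%s%s/\n' "$archive" "$(dirname "$monthrel")"
}

# Index inside a per-month directory:
item_prefix 'http://lists.example.org/pipermail/demo/' '2006-March/date.html'
# prints http://lists.example.org/pipermail/demo/2006-March/

# Index directly in the archive root: dirname yields ".".
item_prefix 'http://lists.example.org/archive/' 'date.html'
# prints http://lists.example.org/archive/./
```

The "./" in the second result is harmless in a URL, which is why the script can get away without special-casing it.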

month.xsl

<?xml version="1.0" standalone="yes"?>
<t:stylesheet
  xmlns:t="http://www.w3.org/1999/XSL/Transform"
  xmlns:h="http://www.w3.org/1999/xhtml"
  xmlns=""
  version="1.0"
>
  <t:output method="text"/>
  <t:template match="/">
    <month>
      <t:value-of select="//h:td/h:a[text()='[ Date ]']/@href
                          | //h:tr[position()=last()]/h:td/h:a[text()='By Date']/@href"/>
    </month>
  </t:template>
</t:stylesheet>

messages.xsl

<?xml version="1.0" standalone="yes"?>
<t:stylesheet
  xmlns:t="http://www.w3.org/1999/XSL/Transform"
  xmlns:h="http://www.w3.org/1999/xhtml"
  xmlns=""
  version="1.0"
>
  <!-- Base URL for item links; p2r supplies it with -&#45;stringparam. -->
  <t:param name="prefix"/>

  <t:template match="/">
    <rss version="2.0"><channel>
    
    <title><t:value-of select="/h:html/h:head/h:title"/></title>
    
    <t:for-each select="//h:li/h:a/@name/../..">
    <t:sort order="descending" data-type="number" select="position()"/>
    <item>
      <t:for-each select="h:em">
        <pubDate><t:value-of select="substring(., 2, string-length(.) - 2)"/></pubDate>
      </t:for-each>
    
      <title><t:value-of select="h:a"/> - <t:value-of select="h:i|h:a/h:em"/></title>
      <link><t:value-of select="$prefix"/><t:value-of select="h:a/@href"/></link>
    </item>
    </t:for-each>
    
    </channel></rss>
  </t:template>
</t:stylesheet>

CGI script to serve the RSS:

#!/bin/sh

PATH=/bin:/usr/bin:/usr/local/bin

echo Status: 200 OK
echo Content-Type: application/rss+xml
echo

/path/p2r <address of mailing list archive, with trailing slash>
Tags: programs, web, xslt