
Snatch - Working Code examples


Snatch XML Samples

These work for me. They should work for you also, assuming a dozen things don't go wrong.
The LWP stuff is down below. I've been playing with RSS which makes fumbling with regexes obsolete! (well maybe :-)

RSS News Feeds via LWP

RSS is very simple. Looks like this:
<item>
     <title>O'Reilly Labs Review: Object Design's eXcelon 1.1</title>
     <link>http://xml.com/pub/1999/08/excelon/index.html?wwwrrr_rss</link>
     <description>Jon Udell takes a look at eXcelon...(SNIP)</description>
</item>
You can use Eisenzopf's XML::RSS to pull the data out. But our Small Parser can handle this easily: just prepend a DTD declaring the title, link, and description elements to the top of the feed. The rest of the code uses a small HereDoc to format each item as a table row.
RSS has more information available but the essentials are simple and portable.
<item>
<name>MotherOfPerl</name>
<method>LWP all</method>
<url>http://www.perlxml.com/rdf/moperl.rdf</url>
<sub><!--
my $dtd=<< "DTD";
<?xml version="1.0"?>
<!DOCTYPE rss [
<!ELEMENT item (title, link, description)>
]>
DTD
$s= $dtd . $_;
my @ll=parseXML($s,"item");
die "Parse Failed" if not @ll;
my ($temp, $text, $bg, $bgcolor);
print "<BR><A HREF=\"http://xml.com/\">MotherOfPerl RSS feed</A>\n";
print "<table> \n";
foreach $temp (@ll) {
    if($bg) {$bgcolor='#CCFF99'; $bg=0;}
    else {$bgcolor='#CCFFCC';$bg=1;}
    $text= << "ARTICLE";
<tr bgcolor=\"$bgcolor\">
<td><font size='-1'><A HREF=\"$temp->{'link'}\">$temp->{'title'}</a></font></td>
ARTICLE
    print $text;
}
print "</table>\n";
undef;
-->
</sub> </item>
This produces something like this:
MotherOfPerl RSS feed
Tutorial 7: URL redirects with CGI.pm
Tutorial 6: Monitoring Internet services by name with Moniker
phpHoo: A strong competitor to PerlHoo
Tutorial 5: PerlHoo, Part III
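
If you don't have Snatch handy, the same flow can be sketched in plain Perl. This is only an illustration under assumptions: extract_items is a hypothetical regex-based stand-in for the Small Parser (it is not parseXML), rows_html is a made-up name that mirrors the alternating-colour HereDoc above, and the links in the sample feed are invented. It will only cope with tidy feeds (no CDATA, no nested tags).

```perl
use strict;
use warnings;

# Hypothetical stand-in for the Small Parser: pulls title, link and
# description out of each <item> with regexes. Tidy feeds only.
sub extract_items {
    my ($rss) = @_;
    my @items;
    while ($rss =~ m{<item>(.*?)</item>}gs) {
        my $body = $1;
        my %item;
        for my $tag (qw(title link description)) {
            ($item{$tag}) = $body =~ m{<$tag>\s*(.*?)\s*</$tag>}s;
        }
        push @items, \%item;
    }
    return @items;
}

# Formats the items as table rows, alternating two background colours.
sub rows_html {
    my ($items, @colors) = @_;
    my $html = "<table>\n";
    my $n = 0;
    for my $item (@$items) {
        my $bgcolor = $colors[$n++ % 2];
        $html .= <<"ARTICLE";
<tr bgcolor="$bgcolor">
<td><font size='-1'><A HREF="$item->{link}">$item->{title}</a></font></td>
</tr>
ARTICLE
    }
    return $html . "</table>\n";
}

# Made-up sample feed; only the titles come from the listing above.
my $feed = <<'RSS';
<item>
  <title>Tutorial 7: URL redirects with CGI.pm</title>
  <link>http://example.com/t7</link>
  <description>Made-up sample data</description>
</item>
<item>
  <title>Tutorial 5: PerlHoo, Part III</title>
  <link>http://example.com/t5</link>
  <description>Made-up sample data</description>
</item>
RSS

my @items = extract_items($feed);
print rows_html(\@items, '#CCFFCC', '#CCFF99');
```

For real feeds XML::RSS or a proper parser is the safer choice; the regexes are here only to keep the sketch self-contained.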

There is a minor quibble about whether to show the description tag; sometimes it's merely a duplicate. But this is such a simple and obvious thing to do with RSS that I put it in as a new method...
<item>
<name>MotherOfPerl</name> <method>LWP RSS</method> <options>#CCFFCC #CCFF99</options> <url>http://www.perlxml.com/rdf/moperl.rdf</url> <sub><!-- print $_; -->
</sub> </item>
That's nicely shorter :-))
You don't have to print it right there, of course. Wrap it up in other code, or
store it with $sys{"Mother's RSS"}=$_; and use it later.
See scanrss.xml for a full page.

LWP

LWP has chunk, all and file variations. LWP chunk and plain LWP are equivalent because chunk is the default.
The data turns up in $_ ready for the <sub> to massage it.
Note: Make sure that <sub> returns false when chunking till you have what you want or it won't work right!
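
Why the return value matters when chunking can be shown with a toy model. This is a sketch under assumptions, not Snatch's code: feed_chunks is a hypothetical function, and it assumes the chunks are accumulated, handed to the sub in $_, and that fetching stops as soon as the sub returns true.

```perl
use strict;
use warnings;

# Hypothetical model of a chunked fetch (NOT Snatch's actual code).
# Each chunk is appended to a buffer, the buffer is handed to the
# callback in $_, and feeding stops once the callback returns true.
sub feed_chunks {
    my ($chunks, $callback) = @_;
    my $buffer = '';
    my $used   = 0;
    for my $chunk (@$chunks) {
        $buffer .= $chunk;
        $used++;
        local $_ = $buffer;
        my $result = $callback->();
        return ($result, $used) if $result;
    }
    return (undef, $used);
}

# Made-up page, split at awkward places the way a network read might.
my @chunks = ('junk <!--Begin', '-->Sunny 33C<!--', 'End--> more junk', 'never read');

my ($found, $used) = feed_chunks(\@chunks, sub {
    # False until both markers have arrived, then the text between them.
    return /<!--Begin-->\s*(.*?)\s*<!--End-->/s ? $1 : '';
});

print "found '$found' after $used chunks\n";
```

If the sub returned true on the first chunk, the model would stop before the closing marker had arrived, which is the "won't work right" case.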

This example gets weather from Yahoo. I grab 4 of these for various locations and have a picture of the whole country. It's in a <TR> tag and ready to insert.

Weather from Yahoo


<item>
<name>weatherBKK</name>
<method>LWP chunk</method>
<url>http://weather.yahoo.com/forecast/Bangkok_TH_f.html</url>
<sub>
my $rebegin='<!--Begin Extended Forecast-->';  #Yahoo uses these
my $reend='<!--End forecast table-->';         # making extraction easy
m/^.*?$rebegin\s*(.*?)\s*$reend/is;
my $s=$1;
if ($s) {
    $s=~ s!/?graphics/new_icons/!!sg;
    $s= "<TR><TD><b>Bangkok</b></TD><TD>" . $s . "</TD></TR>";
}
</sub> </item>
<method>LWP all</method> is available for shorter pages. It just grabs everything in one shot and returns it in a scalar. <method>LWP file=/perl/somefile.htm</method> is also available. Note that this last one returns only the filename (or an error message if something went wrong). The Perl sub must open the file to do anything useful with it, unless you only want to store the file somewhere, like this example, which fetches a GIF.

Bank exchange Rates


<item>
<name>BangkokBank</name> <method>LWP file=/localweb/images/bankrates.gif</method> <url>http://www.bbl.co.th/cgi-bin/cgiwrap/nabbl/bankrates.cgi</url> <sub>$_
</sub> </item>
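
When the sub does need the contents, it has to open the file itself. A minimal sketch, assuming (as described above) that $_ holds either the saved filename or an error message. slurp_saved_file is a hypothetical helper name, the temp file stands in for something LWP file= would have written, and the error text is invented.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: returns the file's bytes, or undef when the
# argument is not a readable file (e.g. it is really an error message).
sub slurp_saved_file {
    my ($name) = @_;
    return unless defined $name && -f $name;
    open(my $fh, '<', $name) or return;
    binmode($fh);     # GIFs and other binaries must not be translated
    local $/;         # slurp mode: read the whole file at once
    my $data = <$fh>;
    close($fh);
    return $data;
}

# Stand-in for a file that the LWP file= method would have saved.
my ($tmp_fh, $tmp_name) = tempfile(UNLINK => 1);
binmode($tmp_fh);
print {$tmp_fh} "GIF89a-fake-bytes";
close($tmp_fh);

# One good filename, one made-up error message.
for ($tmp_name, 'some error message instead of a filename') {
    my $data = slurp_saved_file($_);
    print defined $data
        ? "read " . length($data) . " bytes\n"
        : "not a file: $_\n";
}
```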

Bank Rates with Sockets

LWP doesn't handle everything. Here's some sample socket code.
<item>
<name>BankRates</name>
<sub><!--
use IO::Socket;
my ($request_string,$rate,$reply,$conn,$len);
my $base="THB";
my @quotes=("USD","MYR","GBP","AUD");
$s ='';
foreach (@quotes) {
    $s= $s . sprintf "%s %s ",$_, fxp($base,$_);
}
#$s=~ s/\n\r//;
$_="<b>Thai Baht to $s</B>";

sub fxp {
    my ($base,$quotecurrency) = @_;
    $conn=IO::Socket::INET->new(
        PeerAddr => "www.oanda.com",
        PeerPort => 5011,
        Proto    => 'tcp');
    die "Couldn't connect to host www.oanda.com\n" unless $conn;
    $request_string="fxp/1.1\nbasecurrency: $base\nquotecurrency: $quotecurrency\n\n";
    $len = length($request_string);
    unless (syswrite($conn,$request_string,$len) == $len) {
        print "www.oanda.com closed connection\n";
        $conn->close();
        die "No connection to www.oanda.com\n";
    }
    while ($reply=<$conn>) {
        if ($reply=~/^\d+\.+\d*/) {
            $rate=$reply;
            last;
        }
    }
    $rate;
}
-->
</sub> </item>

Set the PC Clock

I like using my ISP for this one. It's usually faster and close enough unless you need to sync to a particular host.
<item>
<name>SetTime</name>
<sub>
use Net::Time qw(inet_time);
#set the pc clock
my $host="mozart.inet.co.th";
my $t=inet_time($host,'udp');
my ($sec, $min, $hour) = (localtime($t))[0,1,2];
my($old_sec, $old_min, $old_hour) = (localtime(time))[0,1,2];
my $min_diff = $min - $old_min;
$hour = $old_hour;    # reuse the PC's hour; only minutes and seconds get corrected
if (abs($min_diff) < 30 ) {    #if more than 30 minutes out set by hand!
    $sec_diff = 60*(60*$hour + $min) + $sec - (60*(60*$old_hour + $old_min) + $old_sec);
    if (abs($sec_diff) > 3) {    # set if more than 3 seconds off
        my $time_new = "$hour:$min:$sec";
        my $rc=system("time $time_new");
    }
}
$timestr=sprintf("%s %02d:%02d:%02d %+d",substr($now,0,16),$hour,$min,$sec,$sec_diff);
</sub> </item>

------------

flush

Normally the footer code will run and the page will be closed when Snatch runs out of items to run.
Sometimes you want to close the page manually. flush runs any <sub> code, finishes with anything in the footer, closes the output file, and puts STDOUT back. At this point Snatch needs another page definition, or you can do something else before linking off into the unknown; see the IE examples.
<item>
<name>flush</name> <method>flush</method> <sub><!-- print "<BR>", checkTimer('report')/1000," seconds for the report<BR>\n"; undef; -->
</sub> </item>
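
"Puts STDOUT back" can be pictured with the usual Perl idiom for redirecting and restoring a filehandle. Snatch's own code isn't shown on this page, so treat this as a sketch of the general technique and not as what flush really does; the temp file is just a stand-in for the output page.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Stand-in for the output page.
my ($tmp_fh, $page) = tempfile(UNLINK => 1);
close($tmp_fh);

# Save the real STDOUT, then point STDOUT at the page file.
open(my $saved_stdout, '>&', \*STDOUT) or die "Can't dup STDOUT: $!";
open(STDOUT, '>', $page)               or die "Can't redirect STDOUT: $!";

print "<p>body written by the items</p>\n";
print "<p>footer</p>\n";               # the footer goes out last

# The flush step: close the page and put STDOUT back.
close(STDOUT)                          or die "Can't close page: $!";
open(STDOUT, '>&', $saved_stdout)      or die "Can't restore STDOUT: $!";

print "back on the original STDOUT\n";
```

Anything printed after the restore goes to the original STDOUT again, so a following page definition can start a fresh redirect.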

Link

Specifying multiple XML files
The link method stuffs the XML files into a list and calls them. The %sys hash can be used to pass along information.
<item>
<name>do all these</name> <method>link news.xml weather.xml</method> </item>