Snatch - Working Code examples
XML files are HERE
Snatch XML Samples
These work for me. They should work for you also, assuming a dozen things don't go wrong.
The LWP stuff is down below. I've been playing with RSS which makes fumbling with regexes obsolete! (well maybe :-)
RSS News Feeds via LWP
RSS is very simple. Looks like this:
<item>
<title>O'Reilly Labs Review: Object Design's eXcelon 1.1</title>
<link>http://xml.com/pub/1999/08/excelon/index.html?wwwrrr_rss</link>
<description>Jon Udell takes a look at eXcelon...(SNIP)</description>
</item>
You can use Eisenzopf's XML::RSS to pull the data out.
But our Small Parser can handle this easily: just prepend a DTD that declares the title, link, and description elements to the top of the feed. The rest of the code uses a small HereDoc to format each item as a table row.
RSS has more information available but the essentials are simple and portable.
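To make the shape of the data concrete, here is a minimal stand-alone sketch of what "pulling the items out" amounts to. It is neither Snatch's parseXML nor XML::RSS: rss_items is a hypothetical helper that uses a naive regex and assumes flat tags with no attributes or CDATA, and the feed text is made up. It returns a list of hash references keyed by element name, which is the form the Snatch item below expects back from parseXML.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Snatch: return one hash ref per <item>,
# keyed by child element name. Assumes flat tags, no attributes, no CDATA.
sub rss_items {
    my ($xml) = @_;
    my @items;
    while ($xml =~ m{<item>(.*?)</item>}gs) {
        my $body = $1;
        my %field;
        while ($body =~ m{<(\w+)>(.*?)</\1>}gs) {
            $field{$1} = $2;
        }
        push @items, \%field;
    }
    return @items;
}

# Made-up sample feed, only to exercise the helper.
my $feed = <<'RSS';
<item>
<title>First story</title>
<link>http://example.com/1</link>
<description>Made-up sample text</description>
</item>
<item>
<title>Second story</title>
<link>http://example.com/2</link>
</item>
RSS

for my $item (rss_items($feed)) {
    print "$item->{title} -> $item->{link}\n";
}
```

A real feed deserves a real parser; the point here is only the list-of-hashes result.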
<item>
<name>MotherOfPerl</name>
<method>LWP all</method>
<url>http://www.perlxml.com/rdf/moperl.rdf</url>
<sub><!--
my $dtd=<< "DTD";
<?xml version="1.0"?>
<!DOCTYPE rss [
<!ELEMENT item (title, link, description)>
]>
DTD
$s= $dtd . $_;
my @ll=parseXML($s,"item");
die "Parse Failed" if not @ll;
my ($temp, $text, $bg, $bgcolor);
print "<BR><A HREF=\"http://xml.com/\">MotherOfPerl RSS feed</A>\n";
print "<table> \n";
foreach $temp (@ll) {
if($bg) {$bgcolor='#CCFF99'; $bg=0;}
else {$bgcolor='#CCFFCC';$bg=1;}
$text= << "ARTICLE";
<tr bgcolor=\"$bgcolor\">
<td><font size='-1'><A HREF=\"$temp->{'link'}\">$temp->{'title'}</a></font></td>
ARTICLE
print $text;
}
print "</table>\n";
undef;
-->
</sub>
</item>
This produces a table of linked titles in alternating row colors.
There is a minor quibble about whether to show the description tag; sometimes it's merely a duplicate. But this is such a simple and obvious thing to do with RSS that I put it in as a new method...
<item>
<name>MotherOfPerl</name>
<method>LWP RSS</method>
<options>#CCFFCC #CCFF99</options>
<url>http://www.perlxml.com/rdf/moperl.rdf</url>
<sub><!--
print $_;
-->
</sub>
</item>
That's nicely shorter :-))
You don't have to print it right there, of course. Wrap it up in other code, or
store it with $sys{"Mother's RSS"}=$_; and use it later.
See scanrss.xml for a full page.
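The <options> line carries two colors, presumably the alternating row backgrounds used in the longer version. Snatch's own code for the RSS method isn't shown here, so the following is only a guess at the equivalent formatting step. rss_rows is a hypothetical name and the items are made-up sample data.

```perl
use strict;
use warnings;

# Hypothetical sketch: format items as table rows, cycling through the colors.
sub rss_rows {
    my ($colors, @items) = @_;
    my $html = "<table>\n";
    my $n = 0;
    for my $item (@items) {
        my $bgcolor = $colors->[ $n++ % @$colors ];
        $html .= <<"ROW";
<tr bgcolor="$bgcolor">
<td><font size='-1'><a href="$item->{link}">$item->{title}</a></font></td>
</tr>
ROW
    }
    return $html . "</table>\n";
}

# Made-up items in the same shape as the parsed feed (hash refs).
my @items = (
    { title => 'First story',  link => 'http://example.com/1' },
    { title => 'Second story', link => 'http://example.com/2' },
    { title => 'Third story',  link => 'http://example.com/3' },
);
print rss_rows([ '#CCFFCC', '#CCFF99' ], @items);
```

Returning the string instead of printing it keeps the choice open: print it on the spot or tuck it into %sys for later.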
LWP
LWP has chunk, all and file variations. LWP chunk and plain LWP are equivalent because chunk is the default.
The data turns up in $_ ready for the <sub> to massage it.
Note: When chunking, make sure that <sub> returns false until you have what you want, or it won't work right!
This example gets weather from Yahoo. I grab 4 of these for various locations
and have a picture of the whole country. It's in a <TR> tag and ready to insert.
Weather from Yahoo
<item>
<name>weatherBKK</name>
<method>LWP chunk</method>
<url>http://weather.yahoo.com/forecast/Bangkok_TH_f.html</url>
<sub>
my $rebegin='<!--Begin Extended Forecast-->'; #Yahoo uses these
my $reend='<!--End forecast table-->'; # making extraction easy
my ($s) = m/^.*?$rebegin\s*(.*?)\s*$reend/is; # list context, so $s is undef (not a stale $1) when there is no match
if ($s) {
$s=~ s!/?graphics/new_icons/!!sg;
$s= "<TR><TD><b>Bangkok</b></TD><TD>" . $s . "</TD></TR>";
}
</sub>
</item>
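The chunk rule noted earlier (return false until you have what you want) is easier to see in isolation. The sketch below makes no network request: feed_chunks is a hypothetical stand-in for the caller, which is assumed to append each chunk to a buffer, hand the buffer to the sub in $_, and stop at the first true return. How Snatch actually drives the sub may differ. The sample page is made up; only the two marker comments come from the example above.

```perl
use strict;
use warnings;

my $rebegin = '<!--Begin Extended Forecast-->';
my $reend   = '<!--End forecast table-->';

# The extraction sub: true (the captured text) once both markers have
# arrived, false while the buffer is still incomplete.
my $extract = sub {
    my ($found) = m/\Q$rebegin\E\s*(.*?)\s*\Q$reend\E/is;
    return $found;
};

# Hypothetical driver: grow the buffer chunk by chunk and stop early.
sub feed_chunks {
    my ($sub, @chunks) = @_;
    my ($buffer, $used) = ('', 0);
    for my $chunk (@chunks) {
        $buffer .= $chunk;
        $used++;
        local $_ = $buffer;
        my $result = $sub->();
        return ($result, $used) if $result;
    }
    return (undef, $used);
}

# Made-up page, split at awkward places on purpose.
my @chunks = (
    "<html><body><!--Begin Extended ",
    "Forecast--> <td>Sunny 33C</td> <!--End fore",
    "cast table--><p>footer</p>",
    "</body></html>",
);
my ($forecast, $used) = feed_chunks($extract, @chunks);
print "got '$forecast' after $used of ", scalar(@chunks), " chunks\n";
```

Because the sub only returns true once both markers are in the buffer, the last chunk is never needed.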
<method>LWP all</method> is available for shorter pages. It
just grabs everything in one shot and returns it in a scalar.
<method>LWP file=/perl/somefile.htm</method> is also available.
Note that this last variant returns only the filename (or an error message if
something went wrong). The Perl sub must open the file to do anything useful
with it, unless you only want to store the file somewhere, like this example
that saves a GIF.
Bank exchange Rates
<item>
<name>BangkokBank</name>
<method>LWP file=/localweb/images/bankrates.gif</method>
<url>http://www.bbl.co.th/cgi-bin/cgiwrap/nabbl/bankrates.cgi</url>
<sub>$_
</sub>
</item>
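Since the file variant hands the sub a filename rather than the data, the sub has to open the file itself if it wants more than the name. A minimal sketch, assuming $_ holds a path as described above. looks_like_gif is a hypothetical helper, and the demo writes its own small temporary file instead of fetching anything.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: open the saved file in binary mode and check
# whether it starts with a GIF signature ("GIF87a" or "GIF89a").
sub looks_like_gif {
    my ($path) = @_;
    open my $fh, '<', $path or return 0;   # e.g. $path was an error message
    binmode $fh;
    my $read = read($fh, my $magic, 6);
    close $fh;
    return (defined $read && $read == 6 && $magic =~ /^GIF8[79]a$/) ? 1 : 0;
}

# Stand-in for the downloaded file: a temp file with a GIF header.
my ($out, $saved) = tempfile(UNLINK => 1);
binmode $out;
print {$out} "GIF89a", "\0" x 10;
close $out;

for ($saved) {                             # puts the filename in $_
    print looks_like_gif($_) ? "$_ is a GIF\n" : "$_ is not a GIF\n";
}
```

A failed open doubles as a cheap test for the error-message case, since an error string is unlikely to name a readable file.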
Bank Rates with Sockets
LWP doesn't handle everything. Here's some sample socket code.
<item>
<name>BankRates</name>
<sub><!--
use IO::Socket;
my ($request_string,$rate,$reply,$conn,$len);
my $base="THB";
my @quotes=("USD","MYR","GBP","AUD");
$s ='';
foreach (@quotes) {
$s= $s . sprintf "%s %s ",$_, fxp($base,$_);
}
#$s=~ s/\n\r//;
$_="<b>Thai Baht to $s</B>";
sub fxp {
my ($base,$quotecurrency) = @_;
$conn=IO::Socket::INET->new(
PeerAddr => "www.oanda.com",
PeerPort => 5011,
Proto => 'tcp');
die "Couldn't connect to host www.oanda.com\n" unless $conn;
$request_string="fxp/1.1\nbasecurrency: $base\nquotecurrency: $quotecurrency\n\n";
$len = length($request_string);
unless (syswrite($conn,$request_string,$len) == $len) {
print "www.oanda.com closed connection\n";
$conn->close();
die "No connection to www.oanda.com\n";
}
while ($reply=<$conn>) {
if ($reply=~/^\d+\.\d*/) { # first line that starts with a decimal number
$rate=$reply;
last;
}
}
$rate;
}
-->
</sub>
</item>
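The two halves of fxp() that don't need a live connection, building the request and picking the rate out of the reply, can be tried on their own. The request layout is copied from the code above. fxp_request and first_rate are hypothetical helper names, the reply lines are invented for the demo, and nothing here has been checked against the real server.

```perl
use strict;
use warnings;

# Build the request exactly as the code above does: a version line,
# two header lines, then a blank line.
sub fxp_request {
    my ($base, $quote) = @_;
    return "fxp/1.1\nbasecurrency: $base\nquotecurrency: $quote\n\n";
}

# Return the first reply line that looks like a decimal number,
# with its line ending removed, or nothing if there is none.
sub first_rate {
    my @lines = @_;
    for my $line (@lines) {
        (my $clean = $line) =~ s/[\r\n]+\z//;
        return $clean if $clean =~ /^\d+\.\d*$/;
    }
    return;
}

# Invented reply, only to exercise the parser.
my @reply = ("fxp/1.1 200 ok\r\n", "\r\n", "0.0263\r\n");
print fxp_request('THB', 'USD');
print "rate: ", first_rate(@reply), "\n";
```

Stripping the line ending here means the rate can go straight into sprintf without dragging a newline into the output.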
Set the PC Clock
I like using my ISP for this one. It's usually faster and close enough, unless you need to sync to a particular host.
<item>
<name>SetTime</name>
<sub>use Net::Time qw(inet_time); #set the pc clock
my $host="mozart.inet.co.th";
my $t=inet_time($host,'udp');
my ($sec, $min, $hour) = (localtime($t))[0,1,2];
my($old_sec, $old_min, $old_hour) = (localtime(time))[0,1,2];
my $min_diff = $min - $old_min;
$hour = $old_hour; # keep the PC's hour; only minutes and seconds are corrected
if (abs($min_diff) < 30 ) { #if more than 30 minutes out set by hand!
$sec_diff = 60*(60*$hour + $min) + $sec - (60*(60*$old_hour + $old_min) + $old_sec);
if (abs($sec_diff) > 3) { # set if more than 3 seconds off, fast or slow
my $time_new = "$hour:$min:$sec";
my $rc=system("time $time_new");
}
}
$timestr=sprintf("%s %02d:%02d:%02d %+d",substr($now,0,16),$hour,$min,$sec,$sec_diff);
</sub>
</item>
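The drift arithmetic is the part most worth checking by hand. A small sketch, assuming both times are given as hour/minute/second on the same day. sec_of_day and drift_seconds are hypothetical names and the sample times are made up. It does not touch the system clock.

```perl
use strict;
use warnings;

# Seconds since midnight for an (hour, minute, second) triple.
sub sec_of_day {
    my ($hour, $min, $sec) = @_;
    return 60 * (60 * $hour + $min) + $sec;
}

# Signed drift: positive when the reference time is ahead of the PC.
sub drift_seconds {
    my ($ref, $pc) = @_;          # each an array ref [hour, min, sec]
    return sec_of_day(@$ref) - sec_of_day(@$pc);
}

# Made-up times: reference 14:30:12, PC clock 14:29:58.
my $drift = drift_seconds([14, 30, 12], [14, 29, 58]);
printf "drift %+d seconds, %s\n", $drift,
    abs($drift) > 3 ? "worth correcting" : "close enough";
```

Keeping the sign lets the report show which way the clock was off, while abs() decides whether to act.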
------------
flush
Normally the footer code runs and the page is closed when Snatch runs out of items.
Sometimes you want to close the page manually. flush runs any <sub>
code, finishes with anything in footer, closes the output file and puts STDOUT back.
At this point Snatch needs another page definition, or you can do something else before linking off to the unknown; see the IE examples.
<item>
<name>flush</name>
<method>flush</method>
<sub><!--
print "<BR>", checkTimer('report')/1000," seconds for the report<BR>\n";
undef;
-->
</sub>
</item>
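"Puts STDOUT back" implies that the page is written by pointing STDOUT at the output file. Snatch's own code isn't shown, so this is only a sketch of the usual Perl idiom for that, with a hypothetical with_page helper and a temporary file standing in for the page.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: run $body with STDOUT pointed at $file, then
# close the file and restore the original STDOUT.
sub with_page {
    my ($file, $body) = @_;
    open my $saved, '>&', \*STDOUT or die "can't dup STDOUT: $!";
    open STDOUT, '>', $file        or die "can't write $file: $!";
    $body->();
    close STDOUT                   or die "can't close $file: $!";
    open STDOUT, '>&', $saved      or die "can't restore STDOUT: $!";
    close $saved;
    return;
}

# Temporary file standing in for the output page.
my ($tmp, $page) = tempfile(UNLINK => 1);
close $tmp;

with_page($page, sub { print "<p>report body</p>\n"; print "<p>footer</p>\n" });
print "back on the real STDOUT\n";
```

Anything printed after with_page returns goes to the original STDOUT again, which is the state flush is described as leaving behind.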
Link
Specifying multiple XML files
The link method stuffs the XML files into a list and calls them. The %sys hash can be used to pass along information.
<item>
<name>do all these</name>
<method>link news.xml weather.xml</method>
</item>
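A rough picture of the link idea, not Snatch's implementation: the method string is split into a list of files, each is handed to a runner, and one shared hash carries values from one file to the next. run_page is a hypothetical stand-in that just records what it saw.

```perl
use strict;
use warnings;

my %sys;                       # shared between pages, as %sys is in Snatch

# Hypothetical stand-in for processing one XML file.
sub run_page {
    my ($file) = @_;
    push @{ $sys{seen} }, $file;           # leave a note for later pages
    return scalar @{ $sys{seen} };
}

my $method = 'link news.xml weather.xml';
my ($name, @files) = split ' ', $method;   # 'link', then the file list

run_page($_) for @files;
print "$name ran: @{ $sys{seen} }\n";
```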