Perl - modules and programs

Perl
    grabbing chunks of pages  Snatch
    neat applications - In praise of Perl
    collapsible menus TOC for ActivePerl
    discover what you've got Indexer

    I've been mucking around with Perl since I overloaded my DOS system with it some years ago. That was frustrating, so I wrote a C interpreter with built-in regexes, automatic line splitting, list manipulation and a dumb terminal (in 56kb :-). Aside from playing about in Linux I didn't use Perl much.
    With Perl 5, the ActiveState release on Windows platforms and its utility on the Internet, Perl has become much more usable and useful. I cheerfully abandoned my interpreter; Perl is better now :-).
Managing ActiveState HTML documentation
    Any Perl hacker on Win32 grabs every ppm module that looks halfway interesting, and the docs for each one end up in the Table of Contents, which gets very, very long.
    With a little help from some collapsible-menu JavaScript by Matt Kruse and a bit of work with HTML::Parser we can create something a bit more manageable.
    This does cookies so it will keep the sections you are interested in open.
gentoc.zip (about 7k)
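
To give a feel for the grouping step, here is a minimal sketch, assuming the TOC entries have already been pulled out as a flat list of module names. It is not the code in gentoc.zip: the function name group_by_namespace is made up for illustration, and the HTML::Parser and JavaScript parts are left out entirely.

```perl
use strict;
use warnings;

# Group module names by their top-level namespace (the part before the
# first "::"), so each group can become one collapsible section.
# Hypothetical helper, not taken from gentoc.
sub group_by_namespace {
    my @modules = @_;
    my %groups;
    for my $name (@modules) {
        my ($top) = split /::/, $name;
        push @{ $groups{$top} }, $name;
    }
    return \%groups;
}

# Print one heading per namespace with its modules indented below it.
my $groups = group_by_namespace(
    qw(HTML::Parser HTML::Tree Win32::OLE Win32::Internet strict));
for my $top (sort keys %$groups) {
    print "$top\n";
    print "    $_\n" for sort @{ $groups->{$top} };
}
```

Each top-level heading would then be the part that opens and closes in the menu.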

Indexing your hard disk
    This has been showing me lots of HTML pages that I've carefully filed away as useful but then forgotten.
    Give indexdb.pl a path and a db name and let it run. Then look.pl takes a keyword and produces a framed view with the hits list on the left and the first page open on the right. look.pl uses Win32::OLE to call Internet Explorer (but you don't need to do that if you don't want to), and the style sheet is courtesy of ActiveState (they haven't complained), so it looks comfortably familiar.
    indexdb.pl can take a stop list argument: a regex for whatever is on the path that you don't want to index (sometimes very important!). If you need an especially complex regex here, it will be better to hack the program than to try to get it through the command line unmangled. If you want more than one path, modify indexdb.pl as needed too.
    I use a junk.db just to poke around the undusted files from 6 months ago. Indexing is fast enough to make a one-time database useful, so playing with it is fun.
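
The general idea can be sketched like this. It is only an illustration, not the code in indexer.zip: the names build_index and look are invented here, the index lives in memory rather than in a db file, and the tag stripping is deliberately crude.

```perl
use strict;
use warnings;
use File::Find;

# Build a word => { file => 1 } index for every .htm/.html file under
# $root. $stop is an optional regex; any path matching it is skipped.
sub build_index {
    my ($root, $stop) = @_;
    my %index;
    find({
        no_chdir => 1,
        wanted   => sub {
            my $path = $File::Find::name;
            return unless -f $path && $path =~ /\.html?$/i;
            return if defined $stop && $path =~ $stop;
            open my $fh, '<', $path or return;   # skip unreadable files
            local $/;                            # slurp the whole file
            my $text = <$fh>;
            close $fh;
            $text = '' unless defined $text;
            $text =~ s/<[^>]*>/ /g;              # crude tag stripping
            $index{lc $_}{$path} = 1 for $text =~ /\b(\w{3,})\b/g;
        },
    }, $root);
    return \%index;
}

# Return the sorted list of files that contain $word (case-insensitive).
sub look {
    my ($index, $word) = @_;
    return sort keys %{ $index->{lc $word} || {} };
}

# Usage: perl sketch.pl C:/web camel
if (@ARGV == 2) {
    my ($root, $word) = @ARGV;
    my $index = build_index($root, qr/junk|Archive/i);
    print "$_\n" for look($index, $word);
}
```

Keeping the stop list as a compiled regex inside the program sidesteps the command-line quoting trouble mentioned above.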
Featuritis:
    What about my text files? I want to be able to delete trash easily! I want a new query to go in the same window. Can we get neat dropdown menus and a folder button for files in the same directory? (yeah, maybe)
indexer.zip (about 4k)

Ftpmirror.pm
Yes, it's yet another upload utility. This seemed to be a good starting place for learning about the Internet and dealing with it in Perl. I have half a dozen sites or so, and besides the HTML there are backups of critical local data to consider. This has a couple of features I needed, like a way to ignore certain files and directories, which works this way:
$ignore='\.log|\.bak|\.psp|\.zip|Archive|junk.*';
Don't do anything with log files, backups, huge photoshop stuff, zip files, the Archive directory or anything named junk. This regex is matched against the file list so I keep the garbage out, or some of it anyway.
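
A minimal sketch of that kind of filtering, assuming the pattern is simply matched against each path. The helper name keep_files and the sample file names are invented for the example; this is not lifted from Ftpmirror.pm.

```perl
use strict;
use warnings;

# Single quotes keep the backslashes, so "\." reaches the regex engine
# as a literal dot instead of "any character".
my $ignore = '\.log|\.bak|\.psp|\.zip|Archive|junk';

# Return only the paths that do NOT match the ignore pattern.
# Hypothetical helper for illustration.
sub keep_files {
    my ($pattern, @paths) = @_;
    my $re = qr/$pattern/;
    return grep { $_ !~ $re } @paths;
}

my @files = qw(index.html notes.bak Archive/old.html junk1.txt pics/cat.jpg site.zip);
print "$_\n" for keep_files($ignore, @files);
# prints index.html and pics/cat.jpg
```

With the dot escaped properly a file such as catalog.html is not mistaken for a log file.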
For the rest of it that tends to accumulate when you automate something like backups there is a delete list:
$deletefirst=2; # 0-delete after upload, 1-delete before, 2 no delete
The delete file is 'delete.txt', and it gets renamed to 'delete.bak' after every operation. That's hardwired in since I forgot to change the switch one time and wiped out a megabyte's worth of newly uploaded zip files. It takes a deliberate effort to get the delete function going each time. There's also a switch, $gimmeftplistonly=1;, which fetches a complete file list, and I pick what I don't want from that.
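
The rename-after-use safeguard could look roughly like this. It is a sketch under assumptions: one file name per line in delete.txt, and a made-up helper name take_delete_list. The FTP deletion itself is left out.

```perl
use strict;
use warnings;

# Read the names listed in $list (one per line), then rename the list
# to $backup so the same deletions cannot run twice by accident.
# Returns an empty list when no delete file exists.
sub take_delete_list {
    my ($list, $backup) = @_;
    return () unless -e $list;
    open my $fh, '<', $list or die "Cannot read $list: $!";
    chomp(my @names = <$fh>);
    close $fh;
    rename $list, $backup or die "Cannot rename $list: $!";
    return grep { /\S/ } @names;    # drop blank lines
}

my @doomed = take_delete_list('delete.txt', 'delete.bak');
print "would delete: $_\n" for @doomed;
```

Because the list is gone after one run, deleting again means deliberately creating a fresh delete.txt.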

The module assumes you have a local directory that updates remote sites you can get to using FTP. Setting it up is easy: give it a directory name and the proper params and away you go. It will create new directories as needed. I'm updating several sites from the same local directory and use one small file per destination. It looks like this:

use strict;
use Ftpmirror;
$gimmeftplistonly=0;
$localbase="C:/web";
$localname="BangkokWizard"; #this is your starting directory
$site="ftp.xoom.com";
$user="username";
$pass="password";
$ignore='\.log|\.bak|\.psp|test\.html|junk.*';
$logfile="bwwx.log";
$deletefirst=2; # 0-delete after upload, 1-delete before, 2 no delete
ftpmirror();

Make up a file like the above, drop Ftpmirror.pm in perl/lib and you're ready to go. It uses Win32::Internet so you will need that. It also uses TeeOutput for the log file but you can comment out that part easily. There are a couple of variables you might want to play with if you need to debug something.

$gimmeLocallistonly=0;
$showFTPresponse=0;
$sitebase="public_html"; # might need this on some sites, handles it properly

Ftpmirror.pm decides whether to upload based on file size. I was originally using the date, but that can fail when you're making a lot of fast changes, and size rarely does. The date hooks are still in there if you want to do something about that. I've been using it successfully for several months, but do consider it beta till you've wrung it out for yourself. Comments are welcome, but remember this was my first Perl 5 project and is my first module, so be nice. ;-)
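
The size comparison boils down to something like the following sketch. It assumes the local and remote listings have already been collected into path-to-size hashes; the name files_to_upload is invented and the FTP side is not shown.

```perl
use strict;
use warnings;

# Given { path => size } hashes for the local tree and the remote
# listing, return the local paths that are missing remotely or whose
# size differs. Hypothetical helper, not the module's own code.
sub files_to_upload {
    my ($local, $remote) = @_;
    my @changed = grep {
        !exists $remote->{$_} || $remote->{$_} != $local->{$_}
    } keys %$local;
    return sort @changed;
}

my %local  = ('index.html' => 2048, 'about.html' => 900, 'new.html' => 120);
my %remote = ('index.html' => 2048, 'about.html' => 850);
print "$_\n" for files_to_upload(\%local, \%remote);
# prints about.html and new.html
```

An edit that leaves the byte count unchanged slips through, which is the trade-off of comparing sizes only.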
Todo
It could be a bit faster, and an obvious way to do that is to check local file times against the log file and ignore directories with no changes. It could also be more modular, but I don't feel like fiddling with that yet.
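
A rough sketch of that speed-up idea, which is not in the module yet. It assumes the log file's modification time marks the last run; the helper name dir_changed_since is made up, and it only looks at files directly inside one directory.

```perl
use strict;
use warnings;

# True if any plain file directly inside $dir was modified after $since
# (an epoch time, for example the modification time of the last log).
sub dir_changed_since {
    my ($dir, $since) = @_;
    opendir my $dh, $dir or die "Cannot open $dir: $!";
    my @files = grep { -f } map { "$dir/$_" } readdir $dh;
    closedir $dh;
    for my $file (@files) {
        return 1 if (stat $file)[9] > $since;
    }
    return 0;
}

# Compare the current directory against the last log, if there is one.
my $logfile  = 'bwwx.log';
my $last_run = -e $logfile ? (stat $logfile)[9] : 0;
print "changed\n" if dir_changed_since('.', $last_run);
```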
Get it here: Ftpmirror.pm (8.1 kb).

tracker.cgi
I wanted to track who was hitting my pages, which is straightforward using $ENV{'REMOTE_HOST'} and similar variables available to CGI. With the addition of a little JavaScript borrowed :-) from Extreme you get more useful information, like a better referrer and the screen resolution, if the browser supports JavaScript 1.2.
I've adapted it so a 'Who' variable in the query string determines the log file, so I can have several web sites running off the same script.
Tracker outputs an image, a 1x1 transparent GIF by default, but the 'pic' variable changes the image.
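
For the curious, the core of such a script can be sketched as below. This is not tracker.cgi itself: the helper names, the "who.log" naming scheme and the log line layout are assumptions made for the example, and the 'pic' switch is left out.

```perl
use strict;
use warnings;

# A common 1x1 transparent GIF, 43 bytes, built from hex.
my $PIXEL = pack 'H*', join '',
    '474946383961',          # "GIF89a" signature
    '01000100',              # width 1, height 1
    '800000',                # flags, background, aspect
    '000000ffffff',          # two-colour palette
    '21f9040100000000',      # graphic control: colour 0 transparent
    '2c000000000100010000',  # image descriptor
    '0202440100',            # one pixel of LZW data
    '3b';                    # trailer

# Split "who=bwwx&pic=logo" into a hash, decoding %XX and "+".
sub parse_query {
    my ($qs) = @_;
    my %param;
    for my $pair (split /[&;]/, defined $qs ? $qs : '') {
        my ($key, $value) = split /=/, $pair, 2;
        next unless defined $key && length $key;
        $value = '' unless defined $value;
        for ($key, $value) {
            tr/+/ /;
            s/%([0-9A-Fa-f]{2})/chr hex $1/ge;
        }
        $param{lc $key} = $value;
    }
    return %param;
}

# Turn the "who" value into a log file name, keeping only word
# characters so a crafted query cannot point outside the log directory.
sub log_name {
    my ($who) = @_;
    $who = '' unless defined $who;
    $who =~ s/\W//g;
    return length($who) ? "$who.log" : 'tracker.log';
}

if ($ENV{GATEWAY_INTERFACE}) {    # only when run by a web server
    my %q    = parse_query($ENV{QUERY_STRING});
    my $host = $ENV{REMOTE_HOST} || $ENV{REMOTE_ADDR} || '-';
    my $ref  = $ENV{HTTP_REFERER} || '-';
    if (open my $log, '>>', log_name($q{who})) {
        print $log join("\t", scalar(localtime), $host, $ref), "\n";
        close $log;
    }
    binmode STDOUT;
    print "Content-Type: image/gif\r\nContent-Length: ", length($PIXEL),
          "\r\n\r\n", $PIXEL;
}
```

Stripping everything but word characters from the 'who' value matters because that value comes straight from the URL.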
Here's the Javascript:

<script language="javascript"></script> <noscript><img src="/cgi-bin/tracker.cgi?who=bwwx" height=1 width=1></noscript>
and here is tracker.cgi
It's been working reliably for several months.
The script to output tables from the data is ugly but minimally functional.
Another version is in the works and will replace this one shortly. Right now the problem is that it works nicely using my local version of Apache but dies on the remote system. :-(((.
Comments, suggestions and requests go here: robert@bangkokwizard.com

Images and Icons
I use the Apache as an icon for my CGI files and this as an icon for PM files. Yes, I know it's lopsided; the original is the Ring Nebula from the Hubble Space Telescope and is much prettier... Look here or on the Hubble Deep Space Wallpaper page.

These were done by Matt Kruse and they came out better than my own camel, so here, take them all: icons.zip (5 kb).


Page by BangkokWebWizard Any problems please contact robert@bangkokwizard.com Last Modified