Administration

How to fix the meetup.com broken exported calendars.

I’m a big fan of meetup.com, but they’re so tragically unhip when it comes to mashups/integration/web 2.0.  One of my biggest gripes until about 6 months ago was that they had no facility (besides API) for exporting a calendar of meetups to my calendar app (I use Google Calendar), or any other calendar app for that matter.

They introduced an export feature recently, but it’s pretty useless.  Here’s why: they offer two calendars

  • [Calendar A] contains all upcoming items in all your meetup groups
  • [Calendar B] contains upcoming items which you have RSVP’d with “yes” or “maybe”.

That’s it.  The calendars exported don’t even contain links that allow you to RSVP from directly inside your calendar — you have click through to the meetup.com site, log in, then RSVP.  Ugh.

 

Come on, product guys.  What’s really called for is 4 separate calendars.

  • [Calendar "yes"] All groups, “yes” events
  • [Calendar "maybe"] All groups, “maybe” events
  • [Calendar "no"] All groups, “no” events
  • [Calendar "none"] All groups, events to which I have not yet submitted an RSVP.

I was finally just pissed off enough about the status quo that I fixed it for myself, and below I share the code.  You can try it out here: http://spicylogic.com/allenday/cgi-bin/mu.cgi?key=<your_api_key>&cal=<calendar> 

where <your_api_key> can be found here and <calendar> is one of “yes”, “no”, “none”, “maybe”.

Okay, here’s the code.  Install it on your own machine if possible, my ISP will appreciate it.  If you find fuckups, let me know and I’ll update the post.

#!/usr/bin/perl
use strict;
use CGI qw(:standard);
use Date::Manip qw(ParseDate ParseDateString ParseDateDelta DateCalc UnixDate);
use Date::Parse;
use HTML::Entities;
use LWP::Simple qw(get);
use XML::DOM;
 
use constant URL_EVENTS =&gt; 'http://api.meetup.com/events?key=%s&amp;member_id=%d&amp;format=xml';
 
print header(q(text/calendar));
 
my $parser = new XML::DOM::Parser ();
 
my $mode = param( 'cal' );
my $key  = param( 'key' );
my $user = param( 'user' );
 
if ( ! $mode || ! $key || ! $user ) {
  die
}
 
my $events_url = sprintf( URL_EVENTS, $key, $user );
#warn $events_url;
my $events_txt = get( $events_url );
#warn $events_txt;
my $events_dom = $parser-&gt;parse( $events_txt );
#warn $events_dom;
 
print qq(BEGIN:VCALENDAR\nPRODID:-//Meetup Inc//RemoteApi//EN\nVERSION:2.0\nMETHOD:PUBLISH\nCALSCALE:GREGORIAN\nX-ORIGINAL-URL:http://www.meetup.com/\nX-WR-CALNAME:mu $mode\n);
 
my $events = $events_dom-&gt;getElementsByTagName( 'item' );
for ( my $i = 0 ; $i &lt; $events-&gt;getLength() ; $i++ ) {
  my $event = $events-&gt;item( $i );
  my $n_id    = $event-&gt;getElementsByTagName( 'id'             )-&gt;item( 0 )-&gt;getFirstChild();
  my $n_rsvp  = $event-&gt;getElementsByTagName( 'myrsvp'         )-&gt;item( 0 )-&gt;getFirstChild();
  my $n_addr0 = $event-&gt;getElementsByTagName( 'venue_name'     )-&gt;item( 0 )-&gt;getFirstChild();
  my $n_addr1 = $event-&gt;getElementsByTagName( 'venue_address1' )-&gt;item( 0 )-&gt;getFirstChild();
  my $n_addr2 = $event-&gt;getElementsByTagName( 'venue_address2' )-&gt;item( 0 )-&gt;getFirstChild();
  my $n_addr3 = $event-&gt;getElementsByTagName( 'venue_address3' )-&gt;item( 0 )-&gt;getFirstChild();
  my $n_addr4 = $event-&gt;getElementsByTagName( 'venue_city'     )-&gt;item( 0 )-&gt;getFirstChild();
  my $n_addr5 = $event-&gt;getElementsByTagName( 'venue_state'    )-&gt;item( 0 )-&gt;getFirstChild();
  my $n_addr6 = $event-&gt;getElementsByTagName( 'venue_zip'      )-&gt;item( 0 )-&gt;getFirstChild();
  my $n_desc  = $event-&gt;getElementsByTagName( 'description'    )-&gt;item( 0 )-&gt;getFirstChild();
  my $n_link  = $event-&gt;getElementsByTagName( 'event_url'      )-&gt;item( 0 )-&gt;getFirstChild();
  my $n_name  = $event-&gt;getElementsByTagName( 'name'           )-&gt;item( 0 )-&gt;getFirstChild();
  my $n_lat   = $event-&gt;getElementsByTagName( 'venue_lat'      )-&gt;item( 0 )-&gt;getFirstChild();
  my $n_lon   = $event-&gt;getElementsByTagName( 'venue_lon'      )-&gt;item( 0 )-&gt;getFirstChild();
  my $n_start_time  = $event-&gt;getElementsByTagName( 'time'           )-&gt;item( 0 )-&gt;getFirstChild();
 
  my $start_time;
  my $end_time;
 
  #my $dummy_time = "20000101T000000Z";
  my ($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) = localtime(time());
  my $dummy_time = sprintf( q(%04d%02d%02dT%02d%02d%02dZ), $year + 1900, $mon + 1, $mday, $hour, $min, $sec );
 
  if ( $n_start_time ) {
    my ($ss,$mm,$hh,$day,$month,$year,$zone);
 
    ($ss,$mm,$hh,$day,$month,$year,$zone) = strptime( $n_start_time-&gt;toString() );
    $start_time = sprintf( q(%04d%02d%02dT%02d%02d%02dZ), $year + 1900, $month + 1, $day, $hh, $mm, $ss );
 
    my $eday = $day;
    if ( $hh == 23 ) {
      $eday = $day + 1;
    }
    my $ehh = ($hh + 1) % 24;
    $end_time   = sprintf( q(%04d%02d%02dT%02d%02d%02dZ), $year + 1900, $month + 1, $eday, $ehh, $mm, $ss );
  }
  else {
    $start_time = '';
    $end_time = '';
  }
 
  if ( $mode eq $n_rsvp-&gt;toString() ) {
    my $id   = $n_id-&gt;toString();
    my $name = $n_name ? $n_name-&gt;toString() : "";
    my $desc = $n_desc ? $n_desc-&gt;toString() : "";
    my $addr = ( $n_addr0 ? $n_addr0-&gt;toString().', ' : "" )
             . ( $n_addr1 ? $n_addr1-&gt;toString().', ' : "" )
             . ( $n_addr2 ? $n_addr2-&gt;toString().', ' : "" )
             . ( $n_addr3 ? $n_addr3-&gt;toString().', ' : "" )
             . ( $n_addr4 ? $n_addr4-&gt;toString().', ' : "" )
             . ( $n_addr5 ? $n_addr5-&gt;toString().', ' : "" )
             . ( $n_addr6 ? $n_addr6-&gt;toString() : "" );
    #$desc =~ s/(.)/(ord($1) &gt; 127) ? "" : $1/egs;
 
    $name = HTML::Entities::decode_entities( $name );
    $desc = HTML::Entities::decode_entities( $desc );
    $addr = HTML::Entities::decode_entities( $addr );
    $name =~ s/,/\\,/g;
    $desc =~ s/,/\\,/g;
    $addr =~ s/,/\\,/g;
 
    $desc =~ s#
#\\n#gs;
    $desc .= "\\n\\n\\nGoing?\\n\\n";
    foreach my $response ( qw( yes no maybe ) ) {
      $desc .= uc($response).qq(: http://api.meetup.com/rsvp?event_id=$id&amp;key=$key&amp;rsvp=$response\\n);
    }
 
    my $geo = $n_lat &amp;&amp; $n_lon ? "GEO:" . $n_lat-&gt;toString() . ";" . $n_lon-&gt;toString() . "\n" : undef;
 
    #print sprintf( qq(BEGIN:VEVENT\nSUMMARY:%s\nDESCRIPTION:%s\nLAST-MODIFIED:%s\nUID:%s\nCLASS:%s\nCREATED:%s\nDTSTAMP:%s\nDTSTART:%s\nDTEND:%s\nLOCATION:%s\n\nURL:%s\nEND:VEVENT\n),
    print sprintf( qq(BEGIN:VEVENT\nSUMMARY:%s\nDESCRIPTION:%s\nLAST-MODIFIED:%s\nUID:%s\nCLASS:%s\nCREATED:%s\nDTSTAMP:%s\nDTSTART:%s\nDTEND:%s\n%sLOCATION:%s\nURL:%s\nEND:VEVENT\n),
      $name,
      $desc,
      $start_time,
      "event_$id\@meetup.com",
      "PUBLIC",
      $dummy_time,
      $dummy_time,
      $start_time,
      $end_time,
      $geo,
      $addr,
      $n_link ? $n_link-&gt;toString() : "",
    );
  }
}
 
print qq(END:VCALENDAR\n);

Administration
Life
Networking
Perl
Software

Comments (1)

Permalink

EveryDNS – free DNS service

http://www.everydns.com

Found this today, and it works as advertised. Need to look more closely, but with this I think I can stop paying dyndns.com $30/year/domain for custom DNS.

Administration
Business

Comments (0)

Permalink

WordPress – collapse redundant tags

I’ve been experimenting with automation of WordPress posts. Probably I’m doing something wrong with the way I make the XML RPC calls, but I find that I end up with redundant tags in my database. For instance, if I tagged two separate, RPC-posted posts with “orange”, I get two different tags both called “orange”. Until I figure out how to fix this properly, here’s a little script that will clean up the database by consolidating all redundantly named tags to one tag. You might want to back up your database before using this…

#!/usr/bin/perl
use strict;
use DBI;
 
######configuration
my $PREFIX = 'wp_h5otpn_';
my $DB = '';
my $HOST = '';
my $USER = '';
my $PASS = '';
######
my $dbh = DBI->connect(qq(dbi:mysql:database=$DB;host=$HOST), $USER, $PASS) or die $!;
 
my $term_sth   = $dbh->prepare(qq(SELECT * FROM (SELECT name, count(name) AS c FROM ${PREFIX}terms GROUP BY name) AS d WHERE d.c > 1));
my $name_sth   = $dbh->prepare(qq(SELECT term_id FROM ${PREFIX}terms WHERE name = ?));
my $update_sth = $dbh->prepare(qq(UPDATE ${PREFIX}term_relationships SET term_taxonomy_id = (SELECT term_taxonomy_id FROM ${PREFIX}term_taxonomy WHERE term_id = ?) WHERE term_taxonomy_id = (SELECT term_taxonomy_id FROM ${PREFIX}term_taxonomy WHERE term_id = ?)));
my $delete1_sth = $dbh->prepare(qq(DELETE FROM ${PREFIX}term_taxonomy WHERE term_id = ?));
my $delete2_sth = $dbh->prepare(qq(DELETE FROM ${PREFIX}terms WHERE term_id = ?));
$term_sth->execute();
 
while ( my ( $name, $count ) = $term_sth->fetchrow_array() ) {
  $name_sth->execute( $name );
  my $new = undef;
  while ( my ( $term_id ) = $name_sth->fetchrow_array() ) {
    if ( ! $new ) {
      $new = $term_id;
      next;
    }
    warn "$name\t$term_id\t->\t$new";
    $update_sth->execute( $new, $term_id );
    $delete1_sth->execute( $term_id );
    $delete2_sth->execute( $term_id );
  }
}
 
__DATA__
SELECT t.term_id, t.name, r.*, s.* FROM wp_h5otpn_terms AS t, wp_h5otpn_term_taxonomy AS r, wp_h5otpn_term_relationships AS s WHERE s.term_taxonomy_id = r.term_taxonomy_id AND r.term_id = t.term_id AND r.taxonomy = 'post_tag' AND t.name = 'whatever';

Administration
WordPress

Comments (0)

Permalink

aggregate – report event counts from a stream

Another shell utility. This one is useful for, e.g. counting 404, 500, 200, 302 HTTP codes from a log file.

#!/usr/bin/perl
$|++;
use strict;
use Getopt::Long;
 
my $mode = 'line';
my $tick = 100;
my $help = undef;
my $keysfile = undef;
my %keys = ();
 
GetOptions(
  'mode|m=s' => \$mode,
  'tick|t=i' => \$tick,
  'help|h'   => \$help,
  'keys|k=f' => \$keysfile,
);
 
if ( $help || ( $mode ne 'line' && $mode ne 'time' ) || $tick <= 0 || ( defined($keysfile) && !-f $keysfile ) ) {
  my $USAGE = join '', <DATA>;
  print STDERR $USAGE and exit(1);
}
 
if ( $keysfile ) {
  open(K, $keysfile) or die "Couldn't open keys file '$keysfile': $!";
  while ( my $line = <K> ) {
    chomp $line;
    $keys{ $line }++;
  }
  close(K);
}
 
my %count = %keys;
my $offset = 0;
my $mark = 0;
my $offset = 0;
 
if ( $mode eq 'time' ) {
  $mark = time();
}
 
while ( my $element = <> ) {
  chomp $element;
  if ( scalar( %keys ) ) {
    $count{ $element }++ if $keys{ $element };
  }
  else {
    $count{ $element }++;
  }
 
  if ( $mode eq 'line' ) {
    $offset++;
    $mark++;
    if ( $mark >= $tick ) {
      $mark = 0;
      flush();
    }
  }
  elsif ( $mode eq 'time' ) {
    if ( $mark + $tick < time() ) {
      $offset = time();
      $mark = time();
      flush();
    }
  }
}
flush();
 
sub flush {
  print "summary/$tick @ $offset\n";
  foreach my $k ( sort keys %count ) {
    print "\t", $count{ $k }, "\t", $k, "\n";
  }
  %count = %keys;
}
 
__DATA__
Usage: aggregate [-h] [-m <time|line>] [-t <# of seconds or lines>] [-k <keys file>]
 
Read lines from STDIN.  Print lines by frequency per input lines or time.
 
  -h    show help (this message)
  -m    mode.  one of 'time' or 'line'.  defaults to 'line'.
  -t    aggregation size.  an integer.  value is # of lines ('line' mode) or # of
        seconds ('time' mode) after which an aggregation is triggered.  defaults to 100.
  -k    keys file.  a text file of strings to *exactly* match in the input, one per line.
        if a keys file is provided, lines not present in the keys file will be silently
        ignored.

Administration
Analytics
Perl

Comments (0)

Permalink

shuffle – randomize a stream of data

Here’s another little shell utility I’ve been sitting on for a while. This one shuffles the line-oriented data read from a pipe. It has the notion of buffering and partial flushing so we can handle streams / very large data sets.

#!/usr/bin/perl
$|++;
use strict;
use Getopt::Long;
 
my $USAGE = join '', <DATA>;
 
my $B = 0;
my $D = 1;
my $H = 0;
 
GetOptions ("buffer|b=i"   => \$B,
            "draw|d=i"     => \$D,
            "help|h"       => \$H,
           ); 
 
if ( $D == 1 && $B > 0 ) {
  $D = $B;
}
 
if (
  ($B < 0) ||
  ($D < 1) ||
  ($B > $D) ||
  ($H)
) {
  print $USAGE and exit(1);
}
 
 
my @buf = ();
 
while ( my $element = <> ) {
  #buffer whole stream
  if ( $B == 0 ) {
    push @buf, $element;
  }
  #no-op
  elsif ( $B == 1 ) {
    print $element;
  }
  #buffer window
  else {
    push @buf, $element;
    if ( scalar( @buf ) >= $D && scalar( @buf ) > $B ) {
      flush();
    }
  }
}
flush();
 
sub flush {
  for ( my $j = scalar( @buf ) - 1 ; $j >= 0 ; $j-- ) {
    my $swap = int(rand($j));
    if ( $swap != $j ) {
      ($buf[ $j ], $buf[ $swap ]) = ($buf[ $swap ], $buf[ $j ]);
    }
  }
  while ( scalar( @buf ) - 1 > $B - $D ) {
    print shift @buf;
  }
}
 
 
__DATA__
Usage: shuffle [-h] [-b <buffer size>] [-d <draw size>]
 
Shuffle lines from a stream on STDIN.  Write lines to STDOUT.
 
  -h    show help (this message)
  -b    buffer size
        (default 0.  indicates shuffle whole stream, then write)
        range: 1..
  -d    draw size
        (defaults to value of -b.  number of items to remove from the
        buffer when it fills)
        range: 1..buffer size
 
You have to parameters available (besides -h for help).
 
* buffer size (-b).  Determines how many elements to temporarily hold
before shuffling.  The advantage of this buffer is to allow shuffling on
very long streams that would not fit into system memory.  The
disadavantage is that it is not a truly random shuffle, as each input
element can appear at most buffer-size positions away from the original
position.  Buffer size defaults to zero, so make sure to set it if your
data set size is large.
 
* draw size (-d).  Determines how frequently the buffer is shuffled and
flushed.  Rather than shuffling/flushing all elements in the buffer, only
do D elements.  The advantage here is elements can appear more than
buffer-size positions away from the original position.  The disadvantage
is that shuffling is done B/D times more frequently.  Draw size defaults
to buffer size, and has no effect.  Set it to 1 to maximize randomness.
 
Copyright/License:
 
  Allen Day <allenday@ucla.edu>, licensed under GPL 2006-2008

Administration
Analytics
Perl

Comments (0)

Permalink

sample – probabilistic sampling from a stream of lines

I’m frequently monitoring webservers, cache servers, database servers, etc by tailing their log files. See my previous post on making logs easier to monitor by color.

Sometimes you also have too much data, and you don’t want to look at all of it. Use this to sample.

sample source:

#!/usr/bin/perl
$|++;
use strict;
use Getopt::Long;
 
my $USAGE = join '', <DATA>;
 
my $T = 0;
my $K = 0;
my $P = 1;
my $H = 0;
my $N = 0;
my $S = 0;
 
GetOptions ("time|t=i"     => \$T,
            "number|n=i"   => \$N,
            "count|k=i"    => \$K,
            "prob|p=f"     => \$P,
            "shuffle|s"    => \$S,
            "help|h"       => \$H,
           ); 
 
if (
  ($T > 0 && $P != 1) ||
  ($K > 0 && $P != 1) ||
  ($K < 0 || $P < 0 || $T < 0 || $N < 0 || $P > 1 ) ||
  ($T > 0 && $N > 0) ||
  ($H)
) {
  print $USAGE and exit(1);
}
 
my $position = 0;
my @buf = ();
my $before = time();
 
while ( my $element = <> ) {
  # sample full stream, report at the end
  # sample K elements every T seconds
  if ( $K > 0 ) {
    if ( scalar( @buf ) < $K ) {
      push @buf, [$position, $element];
    }
    elsif ( $K/$position < rand() ) {
      my $index = int(rand($K));
      $buf[ $index ] = [$position, $element]; #save position for sort
    }
    #time-based K-sampling
    if ( $T > 0 && time() > $before + $T ) {
      flush();
    }
    #event-based K-sampling
    elsif ( $N > 0 && $position > $N ) {
      flush();
    }
  }
  # sample with probability
  elsif ( $P < 1 && rand() < $P ) {
    print $element;
  }
  $position++;
}
flush();
 
sub flush {
  $before = time();
  #Knuth shuffle
  if ( $S ) {
    for ( my $j = scalar( @buf ) - 1 ; $j >= 0 ; $j-- ) {
      my $swap = int(rand($j));
      if ( $swap != $j ) {
        ($buf[ $j ], $buf[ $swap ]) = ($buf[ $swap ], $buf[ $j ]);
      }
      print $buf[ $j ]->[ 1 ];
    }
  }
  else {
    foreach my $b ( sort {$a->[0] <=> $b->[0]} @buf ) {
      print $b->[1];
    }
  }
  @buf = ();
  $position = 0;
}
 
 
__DATA__
Usage: sample -[[h][p][t[k[n]]]]
 
Sample lines from a stream on STDIN.  Write lines to STDOUT.
 
  -h    show help (this message)
  -k    sample K elements from stream
        (default 0)
        range: 0..
  -p    sample elements from stream with probability
        (default 1)
        range: 0 <= p <= 1
  -n    sample over windows of N elements
        (default 0)
        range: 0..
  -t    sample over windows of T seconds
        (default 0, instantaneous with -p, infinity with -k)
        range: 0..
  -s    shuffle outputs
        (default false)
 
There are two modes of sampling:
 
  * sample with probability (-p)
  * sample a fixed number of elements (-k)
 
Both modes sample over a given time interval in seconds (-t).
-t defaults to zero (process full stream).  -p can only be
used alone.  -n can only be used with -k
 
Examples:
 
  * sample K elements from a stream:
    cat /etc/passwd | sample -k 5
 
  * sample 1% of elements from a stream:
    tail -f /var/logs/httpd/access_log | sample -p 0.01
 
  * sample K elements from a stream every 30 seconds:
    tail -f /var/logs/httpd/access_log | sample -k 5 -t 30
 
  * sample K elements from a stream every 30 seconds, shuffled:
    tail -f /var/logs/httpd/access_log | sample -k 5 -t 30 -s
 
  * sample K elements from a stream every 100 elements:
    tail -f /var/logs/httpd/access_log | sample -k 5 -n 100
 
Copyright/License:
 
  Allen Day <allenday@ucla.edu>, licensed under GPL 2006-2008

Administration
Analytics
Perl

Comments (0)

Permalink

Sun Grid Engine SGE state letter symbol codes meanings

Adapted from here.

Category State SGE Letter Code
Pending pending qw
pending, user hold qw
pending, system hold hqw
pending, user and system hold hqw
pending, user hold, re-queue hRwq
pending, system hold, re-queue hRwq
pending, user and system hold, re-queue hRwq
Running running r
transferring t
running, re-submit Rr
transferring, re-submit Rt
Suspended job suspended s, ts
queue suspended S, tS
queue suspended by alarm T, tT
all suspended with re-submit Rs, Rts, RS, RtS, RT, RtT
Error all pending states with error Eqw, Ehqw, EhRqw
Deleted all running and suspended states with deletion dr, dt, dRr, dRt, ds, dS, dT, dRs, dRS, dRT

Administration
Distributed Systems

Comments (0)

Permalink

MogileFS + FUSE + Bigfile support

Jordan and I have been migrating all the TinyTube data over to a MogileFS storage system.

Seems to be a very easy-to-use solution for scalable storage of lots of little files. Basically just works out of the box, including the alpha-quality mount-filepaths FUSE adapter available for MogileFS in the SixApart SVN.

I have two problems with mount-filepaths though:

  1. Doesn’t recognize data loaded outside FUSE. It assumes I want to see/read files that were loaded via its mapping of FUSE directory/file names to MogileFS keys, and it doesn’t show other keys at all. I have lots of preloaded data, so this is a dealbreaker.
  2. Doesn’t support large files. MogileFS has a 64MB limit per file, and if you want to load a bigger file, it splits it into chunks. This FUSE adapter is not aware of how to deal with big files.

So I did some heavy modification of the script. Here’s my version. No directory support, but it lets me read my big files by the original key (i.e. a stat on “bigfile1.mpg” may trigger a stat on “_big_info:bigfile1.mpg”), and any other file I’ve loaded outside of FUSE (e.g. with mogtool).

You can name it as myfuse.pl and then, assuming the script is in the current directory, use it to mount a filesystem like so: mkdir ./myfuse; perl ./myfuse.pl ./myfuse.

Maybe I’ll get around to re-formatting this as a patch for SixApart — but probably not.

#!/usr/bin/perl
use strict;
use warnings;
use threads;
use threads::shared;
use Fuse;
use LWP::UserAgent;
use LWP::Simple;
use List::MoreUtils qw(uniq);
use MogileFS::Client;
use Path::Class;
use POSIX qw(ENOENT EISDIR EINVAL);
my $DEBUG = 0;
our (%FILE_CACHE, $CACHE_SIZE, $CACHE_AGE) = (('/' => {size => 0, age => 0}));
 
# create client object w/ server-configured namespace and IPs of trackers
my $ua = LWP::UserAgent->new;
my $class = 'default';
my $mogilefs = MogileFS::Client->new(
  domain => 'ifap',
  hosts  => [ '10.10.0.100:6001', '10.10.0.101:6001' ],
);
 
my ($mountpoint) = "";
$mountpoint = shift(@ARGV) if @ARGV;
Fuse::main(
    debug      => $DEBUG,
    mountpoint => $mountpoint,
    threaded   => 1,
 
    getattr    => "main::e_getattr",
    getdir     => "main::e_getdir",
    mknod      => "main::e_mknod",
    open       => "main::e_open",
    read       => "main::e_read",
    rename     => "main::e_rename",
    statfs     => "main::e_statfs",
    unlink     => "main::e_unlink",
    write      => "main::e_write",
);
 
sub e_getattr {
    my $filename = shift;
    $filename =~ s#^.*/##;
    warn "main::e_getattr $filename\n" if $DEBUG;
 
    my ( $size, $modes );
    my ( $dev, $ino, $rdev, $blocks, $gid, $uid, $nlink, $blksize )
        = ( 0, 0, 0, 1, 0, 0, 1, 1024 );
    my ( $atime, $ctime, $mtime ) = ( time, time, time );
 
    if ( $filename !~ m{\.} ) {
        #        warn "directory!";
        $size  = 0;
        $modes = ( 0040 << 9 ) + 0755;
    } else {
        #        warn "file!";
        $size  = 123;
        $modes = ( 0100 << 9 ) + 0644;
 
        my @paths = $mogilefs->get_paths( $filename, { noverify => 1 } );
        my ( $content_type, $document_length, $modified_time, $expires, $server );
 
        if ( scalar( @paths ) ) {
            ( $content_type, $document_length, $modified_time, $expires, $server ) = head( $paths[0] );
            $size = $document_length;
            ( $atime, $ctime, $mtime ) = ($modified_time) x 3;
        }
        else {
            @paths = $mogilefs->get_paths( '_big_info:' . $filename, { noverify => 1 } );
            if ( scalar( @paths ) ) {
                my $data = $mogilefs->get_file_data( '_big_info:' . $filename );
                my ( $_des, $_type, $_compress, $_filename, $_chunks, $_size, undef, @_parts ) = split /\n/, $$data;
                ( $size ) = $_size =~ m#^size (\d+)$#;
                OUTER: foreach my $_part ( @_parts ) {
                    $_part =~ m#paths: (.+?)$#;
                    my @_paths = split ', ', $_part;
                    foreach my $_path ( @_paths ) {
                        ( $content_type, $document_length, $modified_time, $expires, $server ) = head( $paths[0] );
                        ( $atime, $ctime, $mtime ) = ($modified_time) x 3;
                        last OUTER if $server;
                    }
                }
            }
        }
        return -ENOENT() unless @paths;
    }
 
    warn(
        join(
            ",",
            (   $dev,   $ino,     $modes, $nlink, $uid,
                $gid,   $rdev,    $size,  $atime, $mtime,
                $ctime, $blksize, $blocks
            )
        ),
        "\n"
    ) if $DEBUG;
 
    return (
        $dev,  $ino,   $modes, $nlink, $uid,     $gid, $rdev,
        $size, $atime, $mtime, $ctime, $blksize, $blocks
    );
}
 
sub e_getdir {
    my $prefix = shift;
    warn "main::e_getdir $prefix\n" if $DEBUG;
    my @filenames;
    $mogilefs->foreach_key(
#        prefix => $prefix,
        sub {
          my $filename = shift;
          push @filenames, $filename;
          push @filenames, file( $filename )->parent();
        }
    );
 
    @filenames = uniq @filenames;
    warn "returning: @filenames\n" if $DEBUG;
    return ( @filenames, 0 );
}
 
sub e_mknod {
    my $filename = shift;
    $filename =~ s#.*/##;
 
    warn "main::e_mknod $filename\n" if $DEBUG;
 
    my $fh = $mogilefs->new_file($filename, undef);
    if ( $fh ) {
        print $fh "\n";
 
        unless ($fh->close) {
            my ($code, $str) = ($mogilefs->errcode || -1, $mogilefs->errstr || '');
            warn "Error creating file:$code: $str" if $DEBUG;
            $! = $str;
            $? = $code;
            return -1;
        }
        return 0;
    }
    else {
        return -1;
    }
}
 
sub e_open {
    my $filename = shift;
    $filename =~ s#.*/##;
    warn "main::e_open $filename\n" if $DEBUG;
 
    return -EISDIR() unless $filename =~ m{\.};
    my @paths = $mogilefs->get_paths( $filename, { noverify => 1 } );
    if ( ! scalar( @paths ) ) {
        @paths = $mogilefs->get_paths( '_big_info:' . $filename, { noverify => 1 } );
    }
    return -ENOENT() unless @paths;
    return 0;
}
 
sub e_read {
    my ( $filename, $length, $offset ) = @_;
    $filename =~ s#.*/##;
    warn "main::e_read $filename $length $offset\n" if $DEBUG;
 
    return -EISDIR() if $filename =~ m{/$};
 
 
 
    my $maxoff = $offset + ( $length - 1 );
warn "requested offset=$offset length=$length bytes=$maxoff" if $DEBUG;
    my @paths = $mogilefs->get_paths( $filename, { noverify => 1 } );
    my $size;
    my ( $content_type, $document_length, $modified_time, $expires, $server );
 
    if ( scalar( @paths ) ) {
warn "it's a small file" if $DEBUG;
        ( $content_type, $document_length, $modified_time, $expires, $server ) = head( $paths[0] );
        $size = $document_length;
        return 0 if $offset == $document_length;
 
        $maxoff = $document_length if $maxoff > $document_length;
        my $range = $offset . "-" . $maxoff;
        warn "  Range: bytes=$range\n" if $DEBUG;
        my $response = $ua->get( $paths[0], "Range" => "bytes=$range" );
        if ( $response->is_success ) {
            return $response->content;
        } else {
            warn $response->as_string if $DEBUG;
        }
    }
    else {
        @paths = $mogilefs->get_paths( '_big_info:' . $filename, { noverify => 1 } );
        return -ENOENT() unless @paths;
warn "it's a big file" if $DEBUG;
 
        my $data = $mogilefs->get_file_data( '_big_info:' . $filename );
        my ( $_des, $_type, $_compress, $_filename, $_chunks, $_size, undef, @_parts ) = split /\n/, $$data;
        ( $size ) = $_size =~ m#^size (\d+)$#;
        return 0 if $offset == $size;
 
        $maxoff = $size if $maxoff > $size;
 
        my $part_min = -1;
        my $part_max = -1;
#part 2 bytes=23080964 md5=af45f7ac80ca34328db3c90de1db1ab0 paths: http://10.10.0.100:7500/dev8/0/000/119/0000119969.fid, http://10.10.0.101:7500/dev2/0/000/119/0000119969.fid
 
        my $buf = '';
        foreach my $_part ( @_parts ) {
            my ( $_bytes, $_paths ) = $_part =~ m#bytes=(\d+) .+? paths: (.+?)$#;
 
            $part_min = $part_max > 0 ? $part_max + 1 : 0;
            $part_max += $_bytes;
 
 
warn "examining part $part_min -> $part_max" if $DEBUG;
 
            #chunk too early
            next if ( $part_max < $offset );
            #chunk too late;
            next if ( $part_min > $maxoff );
 
warn "using part $part_min -> $part_max" if $DEBUG;
 
            my @_paths = split ', ', $_paths;
            foreach my $_path ( @_paths ) {
warn "offset=$offset part_min=$part_min maxoff=$maxoff part_max=$part_max" if $DEBUG;
                my $range = ($offset-$part_min) . "-" . ($maxoff-$part_min);
warn "getting Range: bytes=$range" if $DEBUG;
                my $response = $ua->get( $_path, "Range" => "bytes=$range" );
                if ( $response->is_success() ) {
                    $buf .= $response->content();
warn "data length=".length($buf) if $DEBUG;
                    if ( length( $buf ) == $length ) {
warn "got all the data (1)!" if $DEBUG;
                        return $buf;
                    }
                    elsif ( $offset + length( $buf ) == $maxoff ) {
warn "got all the data (2)!" if $DEBUG;
                        return $buf;
                    }
                    $offset += length( $buf );
                    last;
                }
            }
        }
        if ( length( $buf ) != $length ) {
            return -ENOENT();
        }
        else {
            return $buf;
        }
    }
}
 
sub e_rename {
    my ( $old, $new ) = @_;
    $old =~ s#.*/##;
    $new =~ s#.*/##;
 
    warn "main::e_rename: $old -> $new" if $DEBUG;
 
    # Rename this file
    $mogilefs->rename($old, $new);
 
    return 0;
}
 
sub e_statfs { return 255, 1, 1, 1, 1, 2 }
 
sub e_unlink {
    my $filename = shift;
    $filename =~ s#.*/##;
 
    warn "main::e_unlink: $filename" if $DEBUG;
 
    $mogilefs->delete($filename);
 
    return 0;
}
 
sub e_write {
    my ( $filename, $buf, $offset ) = @_;
    $filename =~ s#.*/##;
 
    warn("main::e_write: $filename pos=$offset len=".length($buf)) if $DEBUG;
 
    my $finfo = get_file_info($filename);
 
    return -ENOENT() unless $finfo;
 
    my $cont = get_file_data($filename);
 
    substr($$cont, $offset, length($buf), $buf);
 
    $mogilefs->store_content($filename, undef, $cont);
    rm_file_cache($filename);
 
    return length($buf);
}
 
#################################################
sub logmsg { warn(join "\t",@_) if $DEBUG }
 
sub get_file_data {
    my ($file) = @_;
    my $entry = $FILE_CACHE{$file};
    my $meta  = $FILE_CACHE{'/'};
 
    if ($entry) {
        # See if this data is too old
        if ((time - $entry->{created}) < $CACHE_AGE) {
            logmsg(1, "-- get_file_data: hit");
 
            # If its still valid, return it
            return $entry->{data};
        } else {
            logmsg(1, "-- get_file_data: miss - expired");
 
            rm_file_cache($file);
        }
    }
 
    my $cont = $mogilefs->get_file_data($file);
    my $size = length($$cont);
 
    if ($meta->{size} + $size > $CACHE_SIZE) {
        # If adding this would go beyond our max cache size, delete things until
        # we can fit it
        foreach my $f (sort {$a->{age} <=> $b->{age}} keys %FILE_CACHE) {
            next if $f eq '/';
 
            my $rm_size = rm_file_cache($f);
 
            logmsg(1, "-- get_file_data: purging - $rm_size bytes");
 
            last if $meta->{size} + $size < $CACHE_SIZE;
        }
    }
 
    logmsg(1, "-- get_file_data: added - $size bytes");
 
    # Create a new entry
    $FILE_CACHE{$file} = {created => time,
                          size    => $size,
                          data    => $cont};
    $meta->{size} += $size;
 
    return $cont;
}
 
sub get_file_info {
    my ($path) = @_;
 
    if ($path eq '/') {
        return {name         => '/',
                is_directory => 1};
    }
    else {
      foreach my $f ( $mogilefs->list_keys( $path ) ) {
        return $f if $f eq $path;
      }
    }
    return undef;
}
 
sub rm_file_cache {
    my ($file) = @_;
    my $entry = delete $FILE_CACHE{$file};
    return unless $entry;
 
    # Decrement how large our cache size is
    my $size = $entry->{size};
    $FILE_CACHE{'/'}->{size} -= $size;
 
    return $size;
}
 
__DATA__
#des no description
#type file
#compressed 0
#filename somefile.mpeg
#chunks 2
#size 90189828
#
#part 1 bytes=67108864 md5=8066369552b71fd49cfbe9ccdce74051 paths: http://10.10.0.100:7500/dev6/0/000/119/0000119968.fid, http://10.10.0.101:7500/dev2/0/000/119/0000119968.fid
#part 2 bytes=23080964 md5=af45f7ac80ca34328db3c90de1db1ab0 paths: http://10.10.0.100:7500/dev8/0/000/119/0000119969.fid, http://10.10.0.101:7500/dev2/0/000/119/0000119969.fid

Administration
Distributed Systems
Perl
Scalability

Comments (2)

Permalink

pcoc – Piped Command Output Colorizer

I’m frequently monitoring webservers, cache servers, database servers, etc by tailing their log files, e.g.

tail -f /etc/httpd/logs/access_log

I like the –color option provided by grep, but found it to be too limited (only one allowed, no wildcard support). After a bit of searching to see if a tool existed for doing arbitrary colorizing, I found
acoc, the Arbitrary Command Output Colourer.

…which almost did what I needed, but couldn’t read from a pipe. So I wrote pcoc, the Piped Command Output Colorizer. I’m only publishing this because I’ve been using it for about 1 1/2 years, and still find it useful.

Source code at the end of this post. Here’s an example that highlights iPhone/iPod user agents and requests with a 500/400/404 HTTP response:

tail -f ./logs/access_log | pcoc -f '(iPod)=bold cyan' -f '(iPhone)=bold magenta' -f '\b(500|404|400)\b=red on_black'

Sorry, no screenshots :( .

pcoc source:

#!/usr/bin/perl
use strict;
use Getopt::Long;
use Term::ANSIColor qw(colored);
$|++;
 
my %format = ();
GetOptions( "format|f=s" => \%format);
 
if ( ! keys %format ) {
  print <<"EOF";
Synopsis:
        pcoc - Piped Command Output Colorizer.  Inspired by acoc.
 
Usage:
 
        $0 -f '<regex1>=<color1>' -f '<regex2>=<color2>'
 
$0 reads from a pipe and colorizes each line based on format (-f) parameters.
 
Arguments:
 
-f '<regex>=<color>'  Required, multiple values okay. 
 
        <regex>: A regular expression from which \$1 will be colorized
 
        <color>: One or more colorization keywords, see perldoc
        Term::ANSIColor, but briefly they are:
 
        boldness:
                bold
        foreground:
                red yellow green blue magenta cyan black white
        background:
                on_red on_yellow on_green on_blue on_magenta on_cyan
                on_black on_white
 
Examples:
 
        #highlight the account's shell in bold green
        cat /etc/passwd | $0 -f '.+:([^:]+)\$=bold green'
 
        #... and the username in red with black background
        cat /etc/passwd | $0 -f '([^:]+)=red on_black' -f '.+:([^:]+)\$=bold green'
 
Copyright/License:
 
        Allen Day <allenday\@ucla.edu>, licensed under GPL 2006-2008
 
EOF
  exit(1);
}
 
while ( my $line = <> ) {
  chomp( $line );
  foreach my $f ( keys %format ) {
    my @c = split ',', $format{ $f };
 
    if ( $line =~ qr/$f/ ) {
      while ( my ( $s, $t ) = $f =~ m/^(.*?)\(+(.+?)\)+/ ) {
        my $c = pop @c || last;
        $line =~ s/($s)($t)/$1.colored($2,$c)/e;
        $f =~ s/^(.*?)\((.+?)\)/$1$2/;
      }
    }
  }
  print "$line\n";
}

Administration
Analytics
Perl

Comments (0)

Permalink