Adventures with DB_File::Lock

I like the Unix DBM file format (a.k.a. BerkeleyDB). I use it for static data (like the zip code-to-latitude/longitude database for the Hebcal Interactive Jewish Calendar) and for dynamic data (such as the subscriber database for the Mountain View High School Alumni Internet Directory).

BerkeleyDB is also great because it has many language interfaces. I can access the same DB files in both Perl and PHP.

My high school alumni directory subscriber database has experienced corruption a few times recently. It’s a good thing I also keep a daily text backup of the database in RCS because it makes it easy to rebuild the DB.

But it’s obvious to me that the underlying cause of the problem is concurrent access that isn’t protected by mutual exclusion. Heck, I wrote the code back in 1995 when I didn’t know better.

So I’ve gotta go add some locking code to the 25 scripts that manage the site.

However, older versions of BerkeleyDB (such as the one installed at my ISP) don’t natively support locking, so I’ve gotta use flock for concurrency. No problem; it’s relatively easy to turn every occurrence of this:


    use DB_File;

    my(%DB);
    tie(%DB, 'DB_File', $file, O_RDWR|O_CREAT, 0644, $DB_HASH);
    $DB{'foo'} = 'bar';
    untie(%DB);

into something that looks like this:


    use DB_File;
    use Fcntl qw(:DEFAULT :flock);

    my(%DB);
    my($db) = tie(%DB, 'DB_File', $file, O_RDWR|O_CREAT, 0644, $DB_HASH);
    defined($db) || die "Can't tie $file: $!\n";
    my($fd) = $db->fd;
    open(DB_FH, "+<&=$fd") || die "dup $!";
    unless (flock (DB_FH, LOCK_EX)) { die "flock: $!" }

    $DB{'foo'} = 'bar';

    flock(DB_FH, LOCK_UN);
    undef $db;
    undef $fd;
    untie(%DB);
    close(DB_FH);

Bingo. Problem seems to be fixed. No more DB corruption.

But then, a few weeks later, I get DB corruption again. Ugh. Turns out that I managed to fix 24 of the scripts, but there’s one that I occasionally run by hand (the one that removes someone from the directory) that I forgot to add locking code to. With flock, it only takes one script to screw it up.

So last night I was about to go through the scripts and update them, but while reading the DB_File manpage I noticed that it points out a possible problem with the classic “tie the db, dup the fd, then flock” approach: tie opens the file (and can read and cache part of it) before the lock is acquired, so a script may end up working from stale data. So fixing the 25th script to use the same locking scheme won’t necessarily solve the problem either. Doh!

Reading a little further down the manpage, I see a reference to a simple CPAN module called DB_File::Lock that transparently does flocking when you tie and untie the DB. It’s perfect for what I need.

Now I can simply do a search-and-replace throughout the entire codebase and change all DB_File references to DB_File::Lock, and get rid of whatever dup/flock stuff I used to use.


    use DB_File::Lock;

    my(%DB);
    tie(%DB, 'DB_File::Lock', $file, O_RDWR|O_CREAT, 0644, $DB_HASH, 'write');
    $DB{'foo'} = 'bar';
    untie(%DB);
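For the curious, the general idea behind this kind of wrapper can be sketched by hand: grab a lock on a separate file before tie ever opens the database, and release it only after untie has flushed everything. This is my own rough sketch, not DB_File::Lock's actual implementation; the helper name with_locked_db and the "$file.lock" naming are assumptions made for illustration.

```perl
use strict;
use warnings;
use DB_File;
use Fcntl qw(:DEFAULT :flock);

# Hypothetical helper (not part of DB_File): run $code against the DBM
# file while holding an exclusive lock on a separate "$file.lock" file.
sub with_locked_db {
    my ($file, $code) = @_;

    # Take the lock BEFORE tie() so nothing is read or cached unlocked.
    open(my $lock_fh, '>>', "$file.lock")
        or die "Can't open $file.lock: $!\n";
    flock($lock_fh, LOCK_EX)
        or die "Can't flock $file.lock: $!\n";

    my %db;
    tie(%db, 'DB_File', $file, O_RDWR|O_CREAT, 0644, $DB_HASH)
        or die "Can't tie $file: $!\n";

    # Run the caller's code, remembering any error so we can clean up first.
    my $ok  = eval { $code->(\%db); 1 };
    my $err = $@;

    untie(%db);        # flush and close while the lock is still held
    close($lock_fh);   # closing the handle releases the lock

    die $err unless $ok;
    return;
}

# Usage (every script that touches the file has to go through the helper):
# with_locked_db('/path/to/subscribers.db', sub { $_[0]{'foo'} = 'bar' });
```

The catch is the same as before: flock is advisory, so a single script that skips the helper can still clobber the file.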

I’ve also considered moving the code from DBM files into MySQL. My ISP started offering limited MySQL access for an additional buck a month, and relational DBs tend to solve the concurrent access problem in a much more elegant (and consistent) way.

Unfortunately, it would be too much work. I don’t want to rewrite all of my 8-year-old Perl code that serializes an alumni record (just a bunch of key=value pairs) into a delimited string. And the DB access parts of the code aren’t very well abstracted, so switching from a simple hash DB format to a more structured multi-column format is going to be trickier than it seems.
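To give a feel for what that serialization looks like, here is a minimal sketch of the key=value approach. It is not the actual directory code: the function names, the NUL separator, and the sample field names in the usage comment are all assumptions for illustration.

```perl
use strict;
use warnings;

# Assumption: fields are joined with a separator that never appears in the
# data. The real code's delimiter may well be different.
my $SEP = "\0";

# Turn a hash of fields into one string suitable for a DBM value.
sub pack_record {
    my ($rec) = @_;
    return join($SEP, map { "$_=$rec->{$_}" } sort keys %$rec);
}

# Turn the string back into a hash reference.
sub unpack_record {
    my ($str) = @_;
    my %rec;
    for my $pair (split(/\Q$SEP\E/, $str, -1)) {
        # Split on the first '=' only, so values may contain '='.
        my ($key, $value) = split(/=/, $pair, 2);
        $rec{$key} = defined($value) ? $value : '';
    }
    return \%rec;
}

# Usage with made-up fields:
# $DB{'pat@example.com'} = pack_record({ name => 'Pat Doe', year => 1990 });
# my $rec = unpack_record($DB{'pat@example.com'});
```

Every field lives inside one opaque string, which is exactly why mapping this onto separate SQL columns means touching every place that packs or unpacks a record.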

Someday when I find the time to do a complete rewrite I’ll use MySQL as the backing store. And I’ll use that opportunity to get rid of all of my perl4-isms and replace them with appropriate perl5 constructs. Heck, if I delay long enough, perhaps I can go straight from perl4 to perl6! :-)

For now, DB_File::Lock is good enough.

Dave Jeske: Linux on the Desktop

Dave Jeske, one of the many brilliant Yahoo! alumni I know, is new to the blogging world. Here’s his second entry:

Linux on the Desktop is a long way off, here’s why. I’m a UNIX developer and proud of it. I love the stability, scriptability, and remote administration capabilities of UNIX. I’ve built everything from small scale scripts to large web-applications running on hundreds of machines. However, I’ve never run UNIX/X as… [unsolicitedDave]

I’m looking forward to reading more from him.

=rand(2,5)

Reading the most recent copy of PC Magazine, I stumbled across an article called Generating Dummy Text in Word. I gave it a shot this morning, and sure enough, it worked:

The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.

The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.

Although it’s nifty to have such a tool on your desktop, I much prefer the web-based Lorem Ipsum Generator, which uses a time-tested formula for page-filling text. Anyone who has ever worked for Adobe knows that “Lorem Ipsum” has been a favorite of typographic professionals for the past 500 years.

While we’re on the loosely-related subject of pangrams:

  • How razorback-jumping frogs can level six piqued gymnasts!
  • Cozy lummox gives smart squid who asks for job pen.

Got any other favorites?
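If you want to check a candidate before submitting it, a few lines of Perl will do. This is just a quick sketch (the function name is_pangram is my own); it only looks at the 26 unaccented English letters and ignores case, punctuation, and everything else.

```perl
use strict;
use warnings;

# True if $text uses every letter from a to z at least once.
sub is_pangram {
    my ($text) = @_;
    # Collect the distinct lowercase letters that appear in the text.
    my %seen = map { $_ => 1 } lc($text) =~ /[a-z]/g;
    return scalar(keys %seen) == 26;
}

for my $sentence (
    'The quick brown fox jumps over the lazy dog.',
    'How razorback-jumping frogs can level six piqued gymnasts!',
    'Cozy lummox gives smart squid who asks for job pen.',
) {
    printf "%-3s %s\n", (is_pangram($sentence) ? 'yes' : 'no'), $sentence;
}
```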

Palm Zire

I bought a Palm Zire at Fry’s Electronics last night. It was on sale for $89.

I lost my trusty old Palm V in November and have been miserable without it for the past 6 weeks. One of the reasons it took me so long to get a replacement was that I saw it as an opportunity to purchase some hot new technology. I toyed with the idea of getting a Dell Axim X5 for a while since I’ve been meaning to play with a handheld Microsoft OS, but I already know the Palm Desktop user interface and I’m too lazy to learn how to use Outlook. Hard to teach an old dog new tricks and all.

In the end, there’s nothing hot or new about the Zire. It’s a monochrome device with a wimpy 2MB of RAM and has absolutely no expansion capability whatsoever. The other extreme would’ve been to go with the Tungsten, but it’s kinda absurd to spend more on a handheld than you do on a desktop computer.