Ask Yahoo! RSS release

Ask Yahoo!, a daily column that features Q&A with Yahoo!’s expert team of Surfers, is officially syndicating its content via RSS.

As reported here back in April, RSS support for Ask Yahoo! had previously been available as a Beta release only.

This week, Ask Yahoo! marks five wonderful, question-filled years. To celebrate this momentous occasion, we’ve given the site a fresh look and some cool new features. So click around, read up, and ask away!

The Most Popular Questions page is my favorite of the new features.

To subscribe to Ask Yahoo! with your favorite aggregator, click the green XML Sub button.

Ploni ben Ploni

John Doe.

Joe Schmoe.

Ploni ben Ploni.

When the Gemara needs a placeholder name but doesn’t want to use a real one, apparently it uses the name פלוני (transliterated here as Ploni). Ploni can be used both as a person’s name and as the name of a place.

In mock contracts, sometimes the formula פלוני בן פלוני במקום פלוני (Ploni ben Ploni b’Makom Ploni) is used. This translates roughly as “Ploni, son of Ploni, from the city of Ploni.”

Delightful. I remember finding it equally amusing a few years ago when I learned that French programmers don’t name temporary variables foo or bar, but rather toto and tata.

snork.

Adventures with DB_File::Lock

I like the Unix DBM file format (a.k.a. BerkeleyDB). I use it for static data (like the zip code-to-latitude/longitude database for the Hebcal Interactive Jewish Calendar) and for dynamic data (such as the subscriber database for the Mountain View High School Alumni Internet Directory).

BerkeleyDB is also great because it has many language interfaces. I can access the same DB files in both Perl and PHP.

My high school alumni directory subscriber database has experienced corruption a few times recently. It’s a good thing I also keep a daily text backup of the database in RCS because it makes it easy to rebuild the DB.

But it’s obvious to me that the underlying cause of the problem is concurrent access that isn’t protected by mutual exclusion. Heck, I wrote the code back in 1995 when I didn’t know better.

So I’ve gotta go add some locking code to the 25 scripts that manage the site.

However, older versions of BerkeleyDB (such as the one installed at my ISP) don’t natively support locking, so I’ve gotta use flock for concurrency. No problem; it’s relatively easy to turn every occurrence of this:


    use DB_File;

    my(%DB);
    tie(%DB, 'DB_File', $file, O_RDWR|O_CREAT, 0644, $DB_HASH);
    $DB{'foo'} = 'bar';
    untie(%DB);

into something that looks like this:


    use DB_File;
    use Fcntl qw(:DEFAULT :flock);

    my(%DB);
    my($db) = tie(%DB, 'DB_File', $file, O_RDWR|O_CREAT, 0644, $DB_HASH);
    defined($db) || die "Can't tie $file: $!\n";

    # dup the DB's file descriptor so there is a filehandle to flock
    my($fd) = $db->fd;
    open(DB_FH, "+<&=$fd") || die "dup $!";
    unless (flock (DB_FH, LOCK_EX)) { die "flock: $!" }

    $DB{'foo'} = 'bar';

    $db->sync;              # flush pending writes before giving up the lock
    flock(DB_FH, LOCK_UN);
    undef $db;
    undef $fd;
    untie(%DB);
    close(DB_FH);

Bingo. Problem seems to be fixed. No more DB corruption.

But then, a few weeks later, I get DB corruption again. Ugh. Turns out that I managed to fix 24 of the scripts, but there’s one that I occasionally run by hand (the one that removes someone from the directory) that I forgot to add locking code to. With flock, it only takes one script to screw it up.
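A minimal sketch of why one forgotten script is enough: flock is advisory, so it only restrains code that also asks for the lock. The scratch file and the three "scripts" below are hypothetical stand-ins, not the directory's real code, and the result assumes a platform where two separate opens of the same file contend for a flock lock (as on Linux).

```perl
use strict;
use warnings;
use Fcntl qw(:flock);
use File::Temp qw(tempfile);

# Hypothetical scratch file standing in for the subscriber database.
my ($tmp_fh, $path) = tempfile(UNLINK => 1);
close($tmp_fh);

# "Script A" cooperates: it takes an exclusive lock before writing.
open(my $locked, '+<', $path) or die "open $path: $!";
flock($locked, LOCK_EX) or die "flock: $!";

# "Script B" also cooperates: its non-blocking lock attempt is refused
# for as long as script A holds the lock.
open(my $polite, '+<', $path) or die "open $path: $!";
my $got_lock = flock($polite, LOCK_EX | LOCK_NB) ? 1 : 0;

# "Script C" never calls flock, so nothing stops its write.
open(my $rogue, '>>', $path) or die "open $path: $!";
print {$rogue} "unprotected write\n";
close($rogue) or die "close: $!";

my $size = -s $path;
print "second lock granted: $got_lock\n";
print "bytes written without a lock: $size\n";

flock($locked, LOCK_UN);
close($locked);
close($polite);
```

The cooperating handle is turned away, but the handle that skips flock writes straight through the lock.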

So last night I was about to go through the scripts and update them, but while reading the DB_File manpage I saw that it points out a possible problem with the classic “tie the db, dup the fd, then flock” approach. So fixing the 25th script to use the same locking scheme won’t necessarily solve the problem either. Doh!
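The problem the manpage describes is that tie can read and cache part of the database before the flock is acquired. One commonly suggested way around it is to lock a separate lockfile first and only then tie. The sketch below illustrates that idea under stated assumptions: the paths, the ".lock" suffix, and the with_locked_db helper are all hypothetical, not code from the directory.

```perl
use strict;
use warnings;
use DB_File;
use Fcntl qw(:DEFAULT :flock);
use File::Temp qw(tempdir);

# Hypothetical paths; the ".lock" suffix is just a convention chosen here.
my $dir      = tempdir(CLEANUP => 1);
my $file     = "$dir/alumni.db";
my $lockfile = "$file.lock";

# Run $code against the tied hash while holding an exclusive lock on a
# separate lockfile, so the tie itself happens under the lock.
sub with_locked_db {
    my ($code) = @_;

    open(my $lock_fh, '>>', $lockfile) or die "open $lockfile: $!";
    flock($lock_fh, LOCK_EX) or die "flock $lockfile: $!";

    my %DB;
    tie(%DB, 'DB_File', $file, O_RDWR|O_CREAT, 0644, $DB_HASH)
        or die "Can't tie $file: $!";
    my @result = $code->(\%DB);
    untie(%DB);    # flush and close the DB while the lock is still held

    close($lock_fh) or die "close $lockfile: $!";    # closing releases the lock
    return @result;
}

with_locked_db(sub { $_[0]{'foo'} = 'bar' });
my ($value) = with_locked_db(sub { $_[0]{'foo'} });
print "foo => $value\n";
```

Because the lock is taken before tie and released after untie, nothing is read or flushed outside the locked region. As with any flock scheme, it only works if every script goes through the same helper.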

Reading a little further down the manpage, I see a reference to a simple CPAN module called DB_File::Lock that transparently does flocking when you tie and untie the DB. It’s perfect for what I need.

Now I can simply do a search-and-replace throughout the entire codebase and change all DB_File references to DB_File::Lock, and get rid of whatever dup/flock stuff I used to use.


    use DB_File::Lock;

    my(%DB);
    tie(%DB, 'DB_File::Lock', $file, O_RDWR|O_CREAT, 0644, $DB_HASH, 'write');
    $DB{'foo'} = 'bar';
    untie(%DB);

I’ve also considered moving the code from DBM files into MySQL. My ISP started offering limited MySQL access for an additional buck a month, and relational DBs tend to solve the concurrent access problem in a much more elegant (and consistent) way.

Unfortunately, it would be too much work. I don’t want to rewrite all of my 8-year-old Perl code that serializes an alumni record (just a bunch of key=value pairs) into a delimited string. And the DB access parts of the code aren’t very well abstracted, so switching from a simple hash DB format to a more structured multi-column format is going to be trickier than it seems.
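For a sense of what that kind of serialization looks like, here is a rough sketch. The tab delimiter, the field names, and both function names are assumptions made for illustration; the real code's separator and record layout aren't shown here.

```perl
use strict;
use warnings;

# Assumed delimiter; the original code's actual separator is not shown.
my $SEP = "\t";

# Flatten a record into "key=value" pairs joined by the delimiter.
# Keys are sorted so the output is stable from run to run.
sub serialize_record {
    my (%rec) = @_;
    return join($SEP, map { "$_=$rec{$_}" } sort keys %rec);
}

# Rebuild the hash, splitting each pair on its first "=" only so that
# values may themselves contain "=".
sub deserialize_record {
    my ($str) = @_;
    my %rec;
    for my $pair (split(/\Q$SEP\E/, $str)) {
        my ($key, $value) = $pair =~ /^([^=]*)=(.*)$/s or next;
        $rec{$key} = $value;
    }
    return %rec;
}

# Hypothetical record with placeholder values.
my %alum = (
    name  => 'Ploni ben Ploni',
    year  => '1995',
    email => 'ploni@example.com',
);

my $packed = serialize_record(%alum);
my %copy   = deserialize_record($packed);
print "$packed\n";
```

A format like this fits a hash DB, where each record is a single string value. It also shows why a move to MySQL is more than a driver swap: every field packed into that string would need its own column.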

Someday when I find the time to do a complete rewrite I’ll use MySQL as the backing store. And I’ll use that opportunity to get rid of all of my perl4-isms and replace them with appropriate perl5 constructs. Heck, if I delay long enough, perhaps I can go straight from perl4 to perl6! :-)

For now, DB_File::Lock is good enough.

Mikel Maron: Reactive Links

A superb idea today from Yahoo! alumnus Mikel Maron:

Reactive Links. Anytime someone click-thrus on these redirect links, the service records that action… more active links could be big and red and quiet links could be small and blue, or whatever you like. These links change their character depending on their usage. [Brain Off]

It reminds me a little bit of an internal visualization our data mining group did, where a modified version of the Yahoo! homepage showed a click-percentage count next to each hyperlink on the page. You could pretty easily see that people were always interested in clicking on certain elements on the page (such as the word “Free”) and that you could also induce users to try different Yahoo! services by occasionally highlighting one of them (by displaying it in bold or with a background color).

Changing the size of the links is another interesting visualization technique, but it can throw off the page layout so much that it becomes distracting and less helpful.

Jerry’s Guide to the World Wide Web

At lunch today we were talking about trademarks and whether Yahoo! is a brand name or a generic term. Since it’s used in Chapter 1 of Gulliver’s Travels, it clearly pre-dates the web company. And the first use with an exclamation point probably comes from the Erasure song, which was released in 1988 on The Innocents album.

We never quite sorted it out, but the discussion morphed into the history of the company. We wondered how many links there are still pointing at akebono.Stanford.Edu.

Now there’s one more. :-)

Hebrew Computing on Mac OS X

We’re thinking about buying a Mac.

One of the things that has been holding us up is lack of support for Hebrew software. Until Mac OS X 10.2 was released, the operating system didn’t even offer native support for Hebrew. However, we’re still waiting for some important applications (such as NisusWriter) to come out with OS X native releases.

Last week I saw an email to the hebrewcomputing Y! group that listed some good Hebrew software for “real Hebrew computing” on Mac OS X.

  • Mellel for word processing (full Hebrew support)
  • OS X Mail app for Hebrew email
  • Safari and Camino for Hebrew web browsing
  • iChat and icy juice for instant messaging in Hebrew
  • iCal for calendar with Hebrew support
  • OS X address book with its built-in Hebrew support
  • Keynote with the Hebrew template and direction services for Hebrew presentations

Now all we need are OS X editions of the Gemara and Tanach.