Category Archives: Open Source

Paternity leave coding update

Three and a half weeks into my sabbatical, and I’ve actually found some time to code on Hebcal!

Some Hebcal accomplishments in July:

  1. Moved Hebcal for Unix code from SourceForge to GitHub.
  2. Added two new user-requested features and released hebcal-3.13
  3. Added Printable PDF support for Hebcal.com
  4. Forked a copy of Hebcal for Unix especially for hebcal.com
    • Replaced sunset calculation engine with PHP’s datelib
    • Replaced timezone/daylight saving time with PHP’s datelib
    • Updated world city database from geonames.org to include 300+ new cities
  5. Updated the USA ZIP code database

Ariella has been incredibly supportive. I’ve done about half of the work at home, and half of it at Hacker Dojo.

On paternity leave, hope to write some code

I’m taking 3 months off to hang out with the family! Baby Emma is doing great, and the big kids are, well, big. It’s been about 7 years since I’ve had a sabbatical from work, so this seems like a great time to do it.

One of my first projects was to move hebcal for Unix from SourceForge to GitHub.

Hebcal for Unix has been around for 20+ years. Danny Sadinoff wrote 98% of the code, and Michael has been fixing bugs and adding features here and there.

SourceForge had been providing hosting for the GPL code for 14+ years. We even converted from CVS to Mercurial about 3 years ago. However, with the recent changes to SourceForge code hosting, Hebcal got stuck in some sort of limbo-land. Lots of 500 Internal Server Errors.

So… we’ve decided to join the cool kids and make the transition from hg to git. And while making that transition we’ve also moved to GitHub, which is where all of the open source developers are hanging out these days.

Over the coming month we’ll be cleaning up the code and the hebcal.com website, removing references to the old sourceforge.net URL.

And then we’ll get back to fixing bugs and adding new features.

MySQL User Defined Functions for FNV (Fowler/Noll/Vo) Hash

Sometimes you only need a 32- or 64-bit hash function. One of my favorites at Yahoo and something we’re using at Fraudwall Technologies is the FNV (Fowler/Noll/Vo) Hash.

If you’d like to use FNV inside of MySQL, you might find our udf_fnv.c useful. For example:


mysql> select FNV1A_64('The quick brown fox jumps over the lazy dog.');
+----------------------------------------------------------+
| FNV1A_64('The quick brown fox jumps over the lazy dog.') |
+----------------------------------------------------------+
| 75c4d4d9092c6c5a                                         |
+----------------------------------------------------------+
1 row in set (0.00 sec)

mysql> select FNV1A_32('The quick brown fox jumps over the lazy dog.');
+----------------------------------------------------------+
| FNV1A_32('The quick brown fox jumps over the lazy dog.') |
+----------------------------------------------------------+
| ecaf981a                                                 |
+----------------------------------------------------------+
1 row in set (0.00 sec)

mysql>

The functions behave similarly to the MySQL built-ins MD5() and SHA1() in the sense that they return hex strings. The module defines 32- and 64-bit versions of all three variants of the FNV hash: FNV-0, FNV-1, and FNV-1a. Enjoy.
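For anyone who wants to sanity-check a hash outside of MySQL, here is a minimal Python sketch of FNV-1a built from the published FNV offset basis and prime constants. It is not the code from udf_fnv.c: the names fnv1a and fnv1a_hex are hypothetical, and zero-padding the hex string to the full width is an assumption about formatting rather than something the module is known to do.

```python
# Published FNV parameters (offset basis, prime) for each hash width.
FNV_PARAMS = {
    32: (0x811C9DC5, 0x01000193),
    64: (0xCBF29CE484222325, 0x100000001B3),
}


def fnv1a(data: bytes, bits: int = 32) -> int:
    """Return the FNV-1a hash of data as an unsigned integer (hypothetical helper)."""
    offset_basis, prime = FNV_PARAMS[bits]
    mask = (1 << bits) - 1
    h = offset_basis
    for byte in data:
        h ^= byte               # FNV-1a: XOR the byte in first...
        h = (h * prime) & mask  # ...then multiply, wrapping to the hash width
    return h


def fnv1a_hex(data: bytes, bits: int = 32) -> str:
    """Format the hash as lowercase hex, zero-padded to the full width (an assumption)."""
    return format(fnv1a(data, bits), "0{}x".format(bits // 4))


if __name__ == "__main__":
    text = b"The quick brown fox jumps over the lazy dog."
    print(fnv1a_hex(text, 32))
    print(fnv1a_hex(text, 64))
```

The other variants differ only slightly: FNV-1 multiplies before the XOR, and FNV-0 follows the FNV-1 order but starts from an offset basis of zero.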

php.ini hacks: --with-config-file-scan-dir and ini variable expansion

I whipped up a quick 3-minute presentation entitled php.ini hacks for today’s PHP Lightning Talks session at the O’Reilly Open Source Convention. It demonstrates two features:

  1. The --with-config-file-scan-dir option to ./configure

  2. ini variable expansion (“open_basedir = ${open_basedir}:/tmp”)

Why? Because George and Laura asked me to, and this is all I could think of with 20 minutes’ notice. And because the ini variable expansion feature isn’t documented anywhere on the php.net website except for a passing reference in the PHP 5 ChangeLog:

Added possibility to access INI variables from within .ini file. (Andrei)

Tomorrow I’ll be giving a talk entitled Hacking Apache HTTP Server at Yahoo! It’s a repeat performance of the well-attended presentation I gave at ApacheCon 2005.

Migrating MVHS Alumni Directory data from Berkeley DB to MySQL

I recently rewrote large parts of the MVHS Alumni Directory to use MySQL instead of Berkeley DB. I’ve been on paternity leave from Yahoo! for 7 weeks now, and this is one of the few projects on my todo list that I have actually completed.

I’ve been maintaining this list of alumni for over 10 years. It began as a bunch of Perl 4 scripts and a single text file (colon-delimited, a la /etc/passwd) back when I was an undergraduate in college, and has morphed over the years as I have moved from ISP to ISP.

I was forced to port it to Perl 5 at one point when one of my ISPs did an OS upgrade, and although I got it to work, there was no way I was going to go through the pain to make it use strict. Later, I rewrote all of the DBM access routines to use DB_File::Lock to avoid race conditions that occasionally corrupted the data.

At the end of last year, my ISP (DreamHost) upgraded their Linux distro from Perl 5.6 to Perl 5.8 and everything broke again. Plus, the Berkeley DB file format on their new distro was incompatible with the old files, so I had to recreate the files from a text dump. I got it working again with a little hackery, but still wasn’t ready to spend the time to dump Berkeley DB for MySQL.

Well, it’s finally done. The only new functionality is an RSS feed for each graduating class. It was fun to do a little bit of hacking.

The new version is about 7,000 lines of code, and it’s still very ugly, largely because I have tried to adhere to the Principle of Least Change, and I wasn’t such a great coder back in 1995. Download it if you so desire; it is released under the BSD License. The README needs a little updating, but the Makefile should actually work.

OSCON 2004 Sessions

I haven’t attended too many tutorials or sessions this year. Yesterday I saw Jim Winstead’s Practical I18N with PHP and MySQL and David Sklar’s Cleaning Up SOAP.

Right now I’m sitting in Adam Trachtenberg’s PHP 5 + MySQL 5 = A Perfect 10. He quipped that it really should’ve been called PHP 5 + MySQL 4.1 = A Perfect 9.1, but the O’Reilly folks didn’t think the title was sexy enough.

Initially we looked at the mysqli (“MySQL Improved”) extension which offers prepared statements, an Object-Oriented interface, and the ability to query the database over SSL.

Next, Adam started speaking about new MySQL 4.1 features. He gave some tips on how to use the new subselect functionality, reminding the audience to think carefully about whether to use = or IN, depending on whether the subselect returns a single row or multiple rows. Then he spoke about MySQL 5.0 features such as Stored Procedures, Cursors and Views.
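To make the = versus IN tip concrete, here is a small sketch that is not from Adam’s talk. It uses Python’s built-in sqlite3 module as a stand-in so it runs without a MySQL server, and the authors and books tables are invented for illustration. In MySQL itself, comparing with = against a subselect that yields several rows fails with “Subquery returns more than 1 row”, which is why IN is the safer choice whenever more than one row is possible.

```python
import sqlite3

# In-memory database with two made-up tables (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT, country TEXT);
    CREATE TABLE books (title TEXT, author_id INTEGER);
    INSERT INTO authors VALUES (1, 'Ann', 'US');
    INSERT INTO authors VALUES (2, 'Bob', 'US');
    INSERT INTO authors VALUES (3, 'Cy', 'CA');
    INSERT INTO books VALUES ('A1', 1);
    INSERT INTO books VALUES ('B1', 2);
    INSERT INTO books VALUES ('C1', 3);
""")

# The subselect matches exactly one author, so = is appropriate.
single = conn.execute(
    "SELECT title FROM books "
    "WHERE author_id = (SELECT id FROM authors WHERE name = 'Cy')"
).fetchall()

# The subselect can match several authors, so IN is required.
multi = conn.execute(
    "SELECT title FROM books "
    "WHERE author_id IN (SELECT id FROM authors WHERE country = 'US') "
    "ORDER BY title"
).fetchall()

print(single)
print(multi)
```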

HTTP Caching and Cache-busting for Content Publishers

Slides are now online (HTML, PPT) for today’s talk on HTTP Caching and Cache-busting for Content Publishers.

Abstract: A user’s web experience can often be improved by the proper use of HTTP caches. Radwin discusses when to use and when to avoid caching, and how to employ cache-busting techniques most effectively. Radwin also explains the top 5 caching and cache-busting techniques for large content publishers.
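As a rough illustration of the two ideas in the abstract, and not something taken from the slides, the Python sketch below shows one common cache-busting approach (putting a content fingerprint in the URL so a changed file gets a new address) next to a pair of response-header choices for cacheable and non-cacheable content. The function names, the v query parameter, and the one-day default max-age are all assumptions made for the example.

```python
import hashlib


def cache_busted_url(path: str, content: bytes) -> str:
    """Append a short content fingerprint so changed content gets a new URL."""
    digest = hashlib.sha256(content).hexdigest()[:8]
    separator = "&" if "?" in path else "?"
    return "{}{}v={}".format(path, separator, digest)


def cache_headers(cacheable: bool, max_age: int = 86400) -> dict:
    """Return illustrative response headers for cacheable or uncacheable content."""
    if cacheable:
        # Static assets: let browsers and proxies keep a copy for max_age seconds.
        return {"Cache-Control": "public, max-age={}".format(max_age)}
    # Personalized or fast-changing pages: ask caches not to store or reuse them.
    return {
        "Cache-Control": "no-cache, no-store, must-revalidate",
        "Pragma": "no-cache",
    }


if __name__ == "__main__":
    print(cache_busted_url("/img/logo.gif", b"old image bytes"))
    print(cache_busted_url("/img/logo.gif", b"new image bytes"))
    print(cache_headers(cacheable=True))
    print(cache_headers(cacheable=False))
```

Because the fingerprint changes only when the content does, the asset can be cached for a long time while a new version is still picked up immediately.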

OSCON 2004

I just arrived in Portland, Oregon. I’ll be speaking about HTTP caching and cache-busting at the O’Reilly Open Source Convention tomorrow. If the talk goes well, I’ll propose it for ApacheCon this fall.

The conference hotel was all booked up by the time I made my travel arrangements, so I’m staying at the closest available hotel (which is about a mile away). Not sure if there’s something else going on here in Portland this week or if OSCON’s attendance spiked this year.

stubgen 2.06

Today I released stubgen 2.06, the first release since 1998.

stubgen is a C++ development tool that keeps code files in sync with their associated headers. When it finds a member function declaration in a header file that doesn’t have a corresponding implementation, it creates an empty skeleton with descriptive comment headers.

Last week Raphael Assenat sent me a message suggesting two new command-line flags to customize the output to his liking. He included a very clean patch to implement the feature and his code worked perfectly. He even included manpage updates in his diff! This is exactly the way Open Source is supposed to work.

I took the opportunity to remove copies of getopt() and basename() that were bundled with the distribution since they’re found in any modern libc. Doing so also let me change the license from GNU to BSD, since I no longer want to contribute to RMS’s zealotry.

stubgen’s parser does not conform to the latest C++ standard. It’s a gigantic hack that I created when I was teaching myself lex/yacc. Hacking the yacc grammar further probably isn’t a good idea, since C++ isn’t an LALR(1) language anyways. It really oughta be rewritten to use a real C++ parser library.