Banzai on FOX

Last night I saw the premiere of Banzai, a mock Japanese game show of sorts where the contestants do a bunch of wacky things and the audience is supposed to bet on the outcome.

Banzai - Sundays at 8:30pm on FOX

I imagine that if you’re drunk it makes for a very entertaining experience.

MySQL Scaling Pains

Jeremy Zawodny spoke Friday morning about MySQL Scaling Pains.

I’m still just waking up, so here are some abbreviated notes.

  • Security administration (don’t just GRANT ALL PRIVILEGES ON *.* TO
    someuser, but think seriously about delegating privileges to
    separate users)

  • Size Limits (MyISAM default 4GB limit can be modified, you just need
    to know the magic incantation)

  • Lock Contention – consider using InnoDB instead of MyISAM if you
    have as many readers as writers. MyISAM tends to work fine when you’ve
    got 90-95% readers and just a few writers (or vice-versa) but you can
    run into lock contention when there are lots of both. InnoDB doesn’t
    fix every locking problem, though; it introduces some problems of its
    own.

  • ALTER TABLE is slow. It requires an exclusive write lock on the
    entire table, so all queries will back up until it finishes. Plan
    ahead.

  • Disks often tend to be the bottleneck. You can add all of the CPU
    power in the world and it won’t matter if it’s waiting on a slow disk.
    Low seek times are more important than high transfer rates. RAID can
    help. If you have time, benchmark different disk combinations
    (suggested a tool called Bonnie++).

  • Load balancers. If you use one, choose the correct algorithm.
    Sometimes the “least connections” algorithm can make things worse.
    Often a simple “round-robin” algorithm works just great.

  • Handling many connections. Setting wait_timeout to a
    lower value will force idle connections to disconnect. Sometimes this
    can improve overall efficiency.

  • Data partitioning by servers (i.e. putting 1/Nth of your data on
    each of N clusters of servers). Instead of a single “users” table, you
    have 4 different tables (“users_abcdefg”, “users_hijklmn”,
    “users_opqrstu”, “users_vwxyz”) and the application needs to look at the
    first letter of the key to figure out which table to query.

  • Full-Text search is neat, but it has its limits. First, be sure to
    use 4.x, not 3.23. Also, it’s not as flexible as other software.
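
The first-letter lookup from the partitioning bullet can be sketched in a few lines of Ruby. This is only an illustration and not something shown in the talk: the table names come from the notes above, but the PARTITIONS constant and the table_for helper are hypothetical names, and keys that do not start with a letter are simply rejected here.

```ruby
# Hypothetical mapping from first-letter ranges to the four table names
# mentioned in the notes.
PARTITIONS = {
  ("a".."g") => "users_abcdefg",
  ("h".."n") => "users_hijklmn",
  ("o".."u") => "users_opqrstu",
  ("v".."z") => "users_vwxyz"
}.freeze

# Return the name of the table that should hold the row for +key+.
# Assumption: keys are partitioned case-insensitively on their first letter.
def table_for(key)
  first = key.to_s[0].to_s.downcase
  PARTITIONS.each do |letters, table|
    return table if letters.cover?(first)
  end
  raise ArgumentError, "no partition for key #{key.inspect}"
end

puts table_for("jeremy")  # users_hijklmn
puts table_for("Zak")     # users_vwxyz
```

A real deployment would also have to decide what to do with keys that start with digits or punctuation, and how to rebalance when one letter range grows much faster than the others.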

Zawodny also inserted a small Yahoo! advertisement in his slides; Yahoo! is hiring engineers. His incentive is twofold. (1) Smart folks tend to go to OSCON, so it’s a targeted audience, and (2) if you send him (or me) your resume we can get the employee referral bonus if you end up getting hired.

Apache 1.3.28 next week?

I don’t have time to read most of the Apache mailing lists, but I do keep an eye on the low-traffic cvs commit list.

There’s been a lot of discussion over the past month or so about the upcoming 1.3.28 release, and even a couple of dates proposed. The most recent message suggests that we’ll see a 1.3.28 release next week.

Taking a look at the CHANGES file, there’s not too much that I really need in this release. The past year has been pretty slow for Apache 1.3 development, in large part because folks are starting to move to 2.0.

Why XML Hasn’t Cured Our Ills or Saved the World


After lunch and a little bit of work-related email, I went to Randy Ray’s Why XML Hasn’t Cured Our Ills or Saved the World (slides).

The talk centered around five things Ray thinks we do wrong with XML:

  1. People are too quick to use XML.
    You have to ask yourself if it’s really necessary. Is it just for
    • If there is no reason other than the fact that there are XML
      parsers, then there probably is a simpler solution.
    • If there is only a single consumer, there may be a more economical
      solution.
  2. People are too slow to use XML.
    • Should you plan ahead for more than one consumer of the data?
    • If another part of the system is already using XML for a more
      “legitimate” task, why not use XML for other things, too (e.g.
      configuration data)?
    • It isn’t always an extra cost. If the data format (and therefore
      the parser) would be sufficiently complex, maybe using an XML parser
      would be easier?

  3. Lack of cooperation or sharing.
    • Not often due to malice, perhaps a lack of central authority. Who
      moderates DTD repositories? The existing registries contain
      outdated information, and UDDI is too business-centric.

    • Example: difficult to find a schema for recipes. Had to wade through
      3 pages of Google results to eventually find RecipeML.
    • Intellectual Property issues. For example, Microsoft hasn’t
      opened up the XML formats for Office 2003. Compare to open formats
      like DocBook.

  4. Misunderstanding the application of XML.
    • XML is the “NetPBM” of generic data. (NetPBM broke new ground in
      image file format transformations by reducing an N * M problem
      to N + M.)
    • People think that XML is only for “document” data.
  5. People want to make XML hard.
    • Tough topics make money. How can businesses sell
      books/tools/software/training/services when customers think that XML is
      “easy”? Vested interest in making it complicated.
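
The N * M versus N + M point from item 4 is easier to see with a toy example. The sketch below is not from Ray’s slides and has nothing to do with NetPBM’s actual code: the three “formats” (kv, csv, tsv), the helper names and the use of a plain Hash as the shared intermediate form are all assumptions made for illustration.

```ruby
# Hypothetical toy formats: each holds one key/value pair per line.
# Every reader parses text into a plain Hash (the shared intermediate form)
# and every writer serializes a Hash back into text.
def parse_pairs(text, separator)
  text.split("\n").map { |line| line.split(separator, 2) }.to_h
end

def format_pairs(data, separator)
  data.map { |key, value| "#{key}#{separator}#{value}" }.join("\n")
end

READERS = {
  "kv"  => ->(text) { parse_pairs(text, "=") },
  "csv" => ->(text) { parse_pairs(text, ",") },
  "tsv" => ->(text) { parse_pairs(text, "\t") }
}.freeze

WRITERS = {
  "kv"  => ->(data) { format_pairs(data, "=") },
  "csv" => ->(data) { format_pairs(data, ",") },
  "tsv" => ->(data) { format_pairs(data, "\t") }
}.freeze

# Any input format reaches any output format by going through the Hash.
def convert(text, from, to)
  WRITERS.fetch(to).call(READERS.fetch(from).call(text))
end

direct_needed = READERS.size * WRITERS.size  # one converter per format pair
pivot_needed  = READERS.size + WRITERS.size  # one reader plus one writer each

puts "direct: #{direct_needed}, via intermediate: #{pivot_needed}"
puts convert("name=ruby\nyear=1995", "kv", "csv")
```

With three formats on each side that is six small pieces instead of nine pairwise converters, and adding a fourth format costs one reader and one writer rather than a converter for every existing format.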

In conclusion, Ray mused that no one technology is (yet) a universal solution and XML is no different when it comes to data formats. His charge to the audience: just think about XML before using (or not using) it. Self-described experts don’t necessarily have all the answers.

Ruby for Perl Programmers

I stuck around for local software guy Phil Tomson’s Ruby for Perl Programmers talk. This session was more technical, with the first code example showing up on the 4th slide.

Phil’s slides are online, so I won’t attempt to replicate them here.

Something listed as a “gotcha” actually seems to be a feature to me. Since all variables hold references to objects, you have to explicitly call .dup to clone an object. It’s more like Java than Perl, but it probably ends up being higher performance since you only make copies when you explicitly want them.
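
Here is a small sketch of what that looks like in practice. It is my own example rather than one of Phil’s slides, and the variable names are arbitrary. Note that .dup makes a shallow copy: the new array is a separate object, but its elements are still shared with the original.

```ruby
original = ["perl", "python"]
same     = original      # a second reference to the same Array object
copy     = original.dup  # an explicit (shallow) copy

same << "ruby"           # mutates the one object that both names refer to

p original               # ["perl", "python", "ruby"]
p copy                   # ["perl", "python"]
p original.equal?(same)  # true  (same object)
p original.equal?(copy)  # false (different object)
```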

The Power and Philosophy of Ruby

Yukihiro Matsumoto spoke about The Power and Philosophy of Ruby on Thursday morning. The talk was all philosophy, no code. Very entertaining.

We started off by discussing natural languages and the Tower of Babel, with a comparison of Japanese and its use of ideograms versus English. Matsumoto said that he was heavily influenced by the science fiction novel Babel-17. In some part, the power of the “super-language” in this book inspired him to create the Ruby programming language.

He spoke about the importance of choosing good names; those that are short and well-chosen usually convey meaning very easily. He also spoke about the importance of the machine making it easier for humans (Moore’s Law, evolution of programming languages to higher-level concepts). He feels it’s important for programming languages to cause the programmer as little stress as possible, and pointed out that one metric of a good programming language is that the programmer still has time to go out and have fun.

However, Matsumoto made it clear that simplicity is not a goal of Ruby. After all, human thoughts are not simple, and programs are essentially complex things. Rather, the design adheres to the principle of least surprise. If some aspect of the language meets your expectation, then it’s achieving its goal. Succinctness is highly valued because Matsumoto believes it leads to productivity and efficiency.

In Ruby, like in Perl, There’s More Than One Way To Do It, but the language can encourage one way. For example, Ruby does allow global variables, but you have to put a $ character before globals. Since too many $ are considered ugly, it discourages use of globals. “Dangerous” methods in Ruby have a ! in their name, for example sort and sort!. The “dangerous” methods might be faster, but they have side-effects, and the ! character reminds you to be careful.
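
The talk itself had no code, so the following is only my own sketch of those two conventions. The names $request_count and handle_request are made up for the example.

```ruby
$request_count = 0  # the leading $ marks a global variable

def handle_request
  $request_count += 1  # every use of the global carries the $ with it
end

2.times { handle_request }

langs  = ["ruby", "perl", "php"]
sorted = langs.sort  # returns a new sorted array
untouched = (langs == ["ruby", "perl", "php"])  # the receiver was not modified

langs.sort!          # the "dangerous" variant sorts the receiver in place

p $request_count  # 2
p sorted          # ["perl", "php", "ruby"]
p untouched       # true
p langs           # ["perl", "php", "ruby"]
```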

Perl Lightning Talks

Wandering around after lunch, I stopped by the Perl Lightning Talks (slides) session. I was delighted to hear Autrijus Tang’s five-minute rap These are 1% of my favourite CPAN… in Chinese, followed by an English translation sung to the tune of These are a few of my favorite things… from The Sound of Music.

It was incredible. Standing ovation.

Allison Randal’s lightning talk was a parody of Arlo Guthrie’s Alice’s Restaurant. “You can get anything you want / in Perl 6 development.” Clever, but Autrijus is a hard act to follow.

Also notable was Dave Rolsky’s talk on DateTime. Dave, like my friends Gabriel and Rachel, is from Minnesota.

OSCON Wednesday morning

I bounced around on Wednesday between a bunch of different sessions. In the morning, I did some last-minute touch-ups on my slides, then caught the tail end of John Coggeshall’s Interfacing Java / COM with PHP. After my talk on One Year of PHP at Yahoo! (slides), I grabbed some lunch in the speaker’s room. Shane asked me to collect some feedback from my co-workers about Komodo since they’re starting to think about what might go into their 3.0 release.

I showed up a little bit late for Adam Trachtenberg’s Introduction to Web Services in PHP: SOAP versus REST talk, and the room was so packed that I couldn’t find a seat. So I stuck my head inside Zak and Monty’s Guided Tour of the MySQL Source Code to catch up on what had changed since the users conference in April.

I also checked out Shane’s Introduction to PEAR talk, but the conference room had run out of seats again. Too bad they didn’t pick a bigger room for the PHP talks this year.

Tim O’Reilly: Paradigm Shift

Tim O’Reilly gave this morning’s keynote address, “The Open Source Paradigm Shift”. The talk was reminiscent of last year’s Watching The Alpha Geeks keynote at ApacheCon, although now he is able to say the phrase “paradigm shift” with a straight face.

The talk largely made the case that we shouldn’t think about Open Source software in terms of the traditional commercial software business model. Instead, we should recognize that the software (to some extent) has become a commodity, just like hardware has become a commodity. The true value in Open Source is the businesses that grow up around it. For example, nobody pays for Sendmail and Apache, yet thousands of ISPs make money from providing web/email hosting services for their customers.

His charge to the audience was to embrace the fact that Open Source software has become a commodity, and to start to think of it (and all of the services that have grown up around it) as a platform. If we can develop services that support collaboration and end-user customization, and the data flows freely enough, we’ll somehow find a way to feed our families.