ApacheCon: Scalable Internet Architectures 2


High Availability and Load Balancing

Theo challenged the audience to recognize the difference between replicable and non-replicable data. Again, the theme of the right tool for the job came up. Replicable data needs only marginal protection, so you can use commodity hardware. Non-replicable data needs single-point reliability, so you should consider “Enterprise” hardware.

He then put up a picture of the typical two-tier model: load balancers, content web servers, image web servers and Master/Slave DBs. I’m glad he pointed out the idea of splitting images out onto a separate web server; it’s a simple trick that can help you scale. The typical three-tier architecture looks mostly the same, but adds Application Servers and some more load balancers. This picture looks a lot more like the Yahoo! architecture.

Theo described hardware- and software-based load-balancing alternatives, and the tradeoffs. I was tickled to see DNS Round Robin as one of the software load-balancing choices. OmniTI seems to really like wackamole and mod_backhand. In general, he seemed to prefer free software solutions over the expensive black-box hardware solutions. I guess this is probably because he makes money off his consulting practice by customizing all of that complicated software.
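For readers who haven’t seen DNS Round Robin before, here is a minimal sketch of the idea in Python. It is not taken from the talk: the `RoundRobinZone` class and the addresses (from the documentation range 192.0.2.0/24) are made up for illustration. The name server hands back the same list of addresses rotated by one position on every query, and the sketch assumes each client simply connects to the first address it receives.

```python
from collections import deque


class RoundRobinZone:
    """Toy model of a name server that rotates its answers (hypothetical)."""

    def __init__(self, addresses):
        self._addresses = deque(addresses)

    def resolve(self):
        # Return the current ordering, then rotate so the next query
        # sees a different address in first position.
        answer = list(self._addresses)
        self._addresses.rotate(-1)
        return answer


zone = RoundRobinZone(["192.0.2.10", "192.0.2.11", "192.0.2.12"])

# Assumption: every client uses the first address in the answer.
picks = [zone.resolve()[0] for _ in range(6)]
print(picks)
```

Six queries spread evenly across the three servers, which is the whole trick. Note that nothing here checks whether a server is actually alive, which is one of the usual tradeoffs of this approach.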

Diving into the software load-balancing alternatives, he described a project called Walrus. Walrus tries to pick the right server cluster by taking advantage of something in the DNS RFC which says that clients are supposed to measure DNS latency and pick the “closest” DNS server. Eventually users migrate towards one DNS server, and those servers (east coast vs. west coast) return disparate sets of IP addresses. Walrus is great in theory, but DNS isn’t implemented consistently on all clients, so it doesn’t work universally.
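The selection logic described here can be sketched roughly as follows. This is only an illustration of the idea as I understood it, not Walrus code: the name server hostnames, the addresses, and the latency numbers are all hypothetical, and a real resolver measures latency over many queries rather than from a fixed table.

```python
# Hypothetical name servers and the web-cluster addresses each one returns.
CLUSTER_BY_NAMESERVER = {
    "ns-east.example.com": ["192.0.2.10", "192.0.2.11"],
    "ns-west.example.com": ["198.51.100.10", "198.51.100.11"],
}


def closest_nameserver(rtt_ms):
    """Pick the name server with the lowest measured round-trip time."""
    return min(rtt_ms, key=rtt_ms.get)


def resolve(rtt_ms):
    """Return the addresses served by whichever name server is closest."""
    return CLUSTER_BY_NAMESERVER[closest_nameserver(rtt_ms)]


# Assumed measurements for a client on the west coast.
west_coast_client = {"ns-east.example.com": 82.0, "ns-west.example.com": 11.0}
addresses = resolve(west_coast_client)
print(addresses)
```

A client that never measures latency, or that sticks with whichever server answered first, ends up on an arbitrary cluster, which is the inconsistency that keeps the approach from working universally.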

Theo proposed using Shared IP for DNS servers (but not for Web clusters): you assign the same IP address to your DNS servers in different locations. This only really works well if your network provider is the same in both places and willing to work with you to make it happen.

Log collection

George felt that the rsync/scp/ftp method of collecting logs was terrible. He doesn’t like the fact that it uses unicast, so if you need to copy the logs to more than one place you need to do it multiple times, and he really disliked the fact that you can’t run real-time analysis on the logs.

He examined syslog as an alternative that supports real-time logging to a loghost, but because it’s built on top of UDP, delivery is unreliable (which might not meet your business requirements). Also, the syslog implementations on many hosts are inefficient.

Database logging solves the reliability, real-time and centralization problems, but with all of that relational database overhead it is substantially slower than writing to a file. And all of those rows start to add up quickly. Imagine a website like Yahoo! with over 1.5 billion pageviews a day.
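To put that number in perspective, here is the back-of-the-envelope arithmetic. It assumes, generously, just one logged row per pageview and perfectly even traffic across the day; real sites log several requests per page and have peaks well above the average.

```python
PAGEVIEWS_PER_DAY = 1_500_000_000
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

# Assumption: one database row per pageview, spread evenly over the day.
rows_per_second = PAGEVIEWS_PER_DAY / SECONDS_PER_DAY

summary = f"{rows_per_second:,.0f} inserts per second on average"
print(summary)  # prints: 17,361 inserts per second on average
```

Even under those friendly assumptions, that is more than seventeen thousand inserts every second, around the clock.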

mod_log_spread takes a reliable multicast approach, which allows for real-time processing of log data. George pointed out that real-time processing is fantastic for notifying you of things like 500 Internal Server Errors, so you know when to do some on-the-fly debugging in your production environment.
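As an illustration of what real-time processing buys you, here is a minimal Python sketch that watches a stream of access-log lines and picks out the 5xx responses. It is not mod_log_spread code and does not talk to Spread: it assumes the lines arrive in Apache’s Common Log Format from some iterable (a socket, a pipe, or the sample list below), and the names `server_errors` and `Hit` are made up for this example.

```python
import re
from typing import Iterable, Iterator, NamedTuple

# Common Log Format: host ident user [time] "request" status size
CLF = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+)'
)


class Hit(NamedTuple):
    host: str
    request: str
    status: int


def server_errors(lines: Iterable[str]) -> Iterator[Hit]:
    """Yield one Hit for every log line with a 5xx status code."""
    for line in lines:
        match = CLF.match(line)
        if match is None:
            continue  # skip lines that are not in the expected format
        status = int(match.group("status"))
        if 500 <= status <= 599:
            yield Hit(match.group("host"), match.group("request"), status)


# Sample input standing in for a live log stream (hypothetical data).
sample = [
    '192.0.2.1 - - [10/Oct/2000:13:55:36 -0700] "GET /index.html HTTP/1.0" 200 2326',
    '192.0.2.2 - - [10/Oct/2000:13:55:37 -0700] "GET /cart HTTP/1.0" 500 512',
    'not a log line',
]
errors = list(server_errors(sample))
print(errors)
```

Because `server_errors` is a generator, it handles each line as it arrives instead of waiting for a complete file, which is the property the batch rsync/scp approach lacks.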

Finally, Theo demo’d a cool Cocoa app for seeing real-time web statistics using the Spread Daemon.

2 thoughts on “ApacheCon: Scalable Internet Architectures 2”

  1. While I wish we made a lot of money configuring mod_backhand for clients, we in fact make almost none. In general we like to push backhand, especially at ApacheCon, because it’s free, extremely configurable, and seems apropos for ApacheCon.

    All of our large clients run hardware load balancers, and Theo and I are both pretty big fans of them. Sometimes people can’t really afford it though. If you’re putting together a low-budget site on a small cluster of machines, why spend more than all your machines combined on a load balancer?

    Personally (as a non-author but close bystander to backhand), I see it as an extremely smart and functional mod_proxy-esque solution. I think for a large, high-traffic environment it has a solid role as a ‘smart’ alternative for mod_proxy/squid as a reverse proxy accelerator.

    Just my $0.02.

    Oh… and it’s not so complicated at all!

  2. First, thanks for the review. About paragraph three. It was not my intention to promote mod_backhand/wackamole over other alternatives. Of course, I am biased as I am a creator of both. Many of the architectures that OmniTI deploys use hardware load balancers, some use both hardware and software and very few use “software-only” solutions.

    The point of the load balancing discussion was supposed to be that these software techniques are powerful and often orthogonal to hardware. One can implement a site and gain the benefits of both by using a combination of hardware and software load-balancing tools. I see you didn’t get that point, which means others didn’t get it either, which means I didn’t make it very clearly. I will make sure to refactor my talk to make this point more clearly in the future. Thanks for the comments.

    OmniTI religiously uses the “right tool for the job” paradigm. Software load balancing is just another cog that may or may not be used depending on the machine being built.
