ApacheCon: Apache 2.0 Filters – Michael J. Radwin

I went to the talk on Apache 2.0 Filters by Greg Ames. I already knew a little bit about the Apache 2.0 Filtered I/O model from a session at the O’Reilly Open Source conference, but since I’m going to sit down and write one Real Soon Now, I oughta learn a little more about ’em.

Ames gave an example of the bucket brigade API by showing snippets of the mod_case_filter code. Looks pretty elegant and simple.

He then went into details about why the naive approach fails miserably from a performance perspective, and showed some examples of how to do a filter the Right Way. It turns out that this is really complex.

Aside from all sorts of error conditions, there are lots of things to worry about with resource management. Do you allocate too much virtual memory? How often do you flush? Looks like you need to regularly flip between blocking and non-blocking I/O to do this right.

Filters that need to examine every single byte of the input (such as things that parse HTML or other tags) are even more complicated because you need to allocate private memory when a tag spans more than one bucket. Bleh. My mod_highlight_filter idea is going to be difficult to implement.

Ames then talked about the mod_ext_filter module from the Apache Directive perspective. I would’ve rather seen some slides about the implementation of this rather complex filter, but perhaps that would have been too technical for the audience.

He also discussed some tricks about how to debug Apache more easily with gdb and using 2 Listen statements (as a way to avoid starting with the -X option), and some useful gdb macros for your ~/.gdbinit file which make examining the bucket brigade easier. Cool tips. I guess I misjudged the technical level here; he probably skipped the implementation of mod_ext_filter because it would’ve taken too much time.