Announcing mod_proxy_content 1.0

I’m pleased to announce the availability of mod_proxy_content, an Apache 2.x module that is essential for reverse proxying modern web content. It is a fork of mod_proxy_html 3.0.1, with bug fix patches through 3.1.2, enhanced to better handle CSS and JavaScript content without the need to create messy regular expressions in the Apache configuration files. It does not depend on mod_xml2enc, so it has the same problems with content in encodings other than ASCII or UTF-8 that mod_proxy_html has without mod_xml2enc.

Major enhancements include:

  • Automatic creation of regular expressions specific to CSS and JavaScript for use in <style> and <script> tags, as well as in external files marked with appropriate content types. This avoids false matches, in particular for JavaScript when a private server’s entire URL space is mapped to a subdirectory on the public server.
  • Support for remapping URLs in style attributes.
  • A new flag allows a URL mapping to be ignored for styles.
  • Fixed a bug that silently discarded HTML comments when ProxyHTMLExtended was on. IE conditional comments look like HTML comments, and so were lost prior to this fix.
  • HTML comments are scanned for URL matches, again to fix up resources referenced inside an IE conditional comment.
  • URLs are always mapped in the order they are defined. Global mappings (outside of a <Location> directive) are always applied first.
  • Fixed the HTML 4.01 DOCTYPEs so that they always emit a complete URI.
  • ProxyHTMLExtended is more fine-grained and can be enabled for just styles or scripts, as well as both (the old behavior when set to “on”).
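
To illustrate why context-specific expressions avoid false matches, here is a minimal Python sketch. It is not the module's code and does not reproduce its actual regular expressions. The mapping from "/" to "/app/", the CSS_URL pattern, and the function names remap_css and remap_naive are all assumptions made up for this example. It contrasts a blind substitution with one that only touches URLs inside a CSS url(...) token.

```python
import re

# Hypothetical mapping: the private server's root "/" is published
# under "/app/" on the public server.
FROM_PREFIX = "/"
TO_PREFIX = "/app/"

# Illustrative pattern only: matches url(...) in CSS, with optional
# single or double quotes, and captures the URL in group 2.
CSS_URL = re.compile(r"""url\(\s*(['"]?)([^'")\s]+)\1\s*\)""")


def remap_css(css: str) -> str:
    """Rewrite only URLs that appear inside a CSS url(...) token."""
    def fix(match: re.Match) -> str:
        quote, url = match.group(1), match.group(2)
        # Skip protocol-relative URLs such as //cdn.example.com/x.png.
        if url.startswith(FROM_PREFIX) and not url.startswith("//"):
            url = TO_PREFIX + url[len(FROM_PREFIX):]
        return f"url({quote}{url}{quote})"
    return CSS_URL.sub(fix, css)


def remap_naive(text: str) -> str:
    """Blind substitution: rewrites every occurrence of the prefix."""
    return text.replace(FROM_PREFIX, TO_PREFIX)


css = 'body { background: url("/img/bg.png"); } /* half = 1/2 */'
print(remap_css(css))    # only the url(...) value is rewritten
print(remap_naive(css))  # the comment and the path interior are damaged too
```

With a mapping as short as "/", the blind version also rewrites the slashes in the comment and inside the path, which is the kind of false match the first enhancement above is meant to prevent.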

The development of this module was funded by the Council of Better Business Bureaus in late 2008. Some internal discussion held up the release of the code until now, but I’m pleased it is being made available to the Apache community. Note that because my work on this module took place two years ago, I can offer only very limited support for it.

The source code is available on GitHub.