This module enables smart, context-sensitive configuration of
output content filters. For example, Apache can be configured to
process different content-types through different filters, even
when the content-type is not known in advance (e.g. in a proxy).
mod_filter works by introducing indirection into
the filter chain. Instead of inserting filters in the chain, we insert
a filter harness which in turn dispatches conditionally
to a filter provider. Any content filter may be used as a provider
to mod_filter; no change to existing filter modules is
required (although it may be possible to simplify them).
In the traditional filtering model, filters are inserted unconditionally
using AddOutputFilter and family.
Each filter then needs to determine whether to run, and server
administrators have little flexibility to configure the chain
dynamically.
mod_filter by contrast gives server administrators a
great deal of flexibility in configuring the filter chain. In fact,
filters can be inserted based on any request header, response header
or environment variable. This generalises the limited flexibility offered
by AddOutputFilterByType, and makes
it work correctly with dynamic content, regardless of the
content generator. The ability to dispatch on environment
variables offers the full flexibility of configuration with
mod_rewrite to anyone who needs it.
In the traditional model, output filters are a simple chain
from the content generator (handler) to the client. This works well
provided the filter chain can be correctly configured, but presents
problems when the filters need to be configured dynamically based on
the outcome of the handler.
mod_filter addresses this with the filter harness described above:
for each request, the harness dispatches conditionally to a filter
provider rather than running a fixed filter. There can be
multiple providers for one filter, but no more than one provider will
run for any single request.
A filter chain comprises any number of instances of the filter
harness, each of which may have any number of providers. A special
case is that of a single provider with unconditional dispatch: this
is equivalent to inserting the provider filter directly into the chain.
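As a sketch of that special case, the following registers a single provider that always runs. It assumes the FilterProvider syntax documented below, that mod_include is loaded to supply the INCLUDES filter, and that a match string of * means unconditional; the filter name always_ssi is hypothetical:

```
# One provider, dispatched unconditionally: intended to be equivalent to
# inserting INCLUDES directly into the chain
FilterProvider always_ssi INCLUDES resp=Content-Type *
FilterChain +always_ssi
```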
There are three stages to configuring a filter chain with
mod_filter. For details of the directives, see below.
Declare Filters
The FilterDeclare directive
declares a filter, assigning it a name and filter type. This is required
only if the filter is not of the default type AP_FTYPE_RESOURCE.
Register Providers
The FilterProvider
directive registers a provider with a filter. The filter may have
been declared with FilterDeclare; if not, FilterProvider will implicitly
declare it with the default type AP_FTYPE_RESOURCE. The provider
must have been
registered with ap_register_output_filter by some module.
The remaining arguments to FilterProvider are a dispatch criterion and a match string.
The former may be an HTTP request or response header, an environment
variable, or the Handler used by this request. The latter is matched
to it for each request, to determine whether this provider will be
used to implement the filter for this request.
Configure the Chain
The above directives build components of a smart filter chain,
but do not configure it to run. The FilterChain directive builds a filter chain from the smart
filters declared, offering the flexibility to insert filters at the
beginning or end of the chain, remove a filter, or clear the chain.
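The three stages can be sketched as follows. This is a minimal illustration, assuming that mod_include is loaded so that its INCLUDES output filter is available as a provider. The filter name SSI is arbitrary, and the leading $ in the match string is assumed here to request a substring match:

```
# 1. Declare a smart filter named SSI (optional here, since it uses
#    the default type)
FilterDeclare SSI
# 2. Register mod_include's INCLUDES filter as a provider, dispatched
#    on the Content-Type response header
FilterProvider SSI INCLUDES resp=Content-Type $text/html
# 3. Add the smart filter to the end of the chain so that it can run
FilterChain +SSI
```

Under those assumptions, INCLUDES would run only for responses whose Content-Type contains text/html.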
Historically, each filter is responsible for ensuring that whatever
changes it makes are correctly represented in the HTTP response headers,
and that it does not run when it would make an illegal change. This
imposes a burden on filter authors to re-implement some common
functionality in every filter:
Many filters will change the content, invalidating existing content
tags, checksums, hashes, and lengths.
Filters that require an entire, unbroken response as input need to
ensure they don't get byteranges from a backend.
Filters that transform output need to ensure they don't
violate a Cache-Control: no-transform header from the
backend.
Filters may make responses uncacheable.
mod_filter aims to offer generic handling of these
details of filter implementation, reducing the complexity required of
content filter modules. This is a work in progress; the
FilterProtocol directive implements
some of this functionality for backward compatibility with Apache 2.0
modules. For httpd 2.1 and later, the
ap_register_output_filter_protocol and
ap_filter_protocol APIs enable filter modules to
declare their own behaviour.
At the same time, mod_filter should not interfere
with a filter that wants to handle all aspects of the protocol. By
default (i.e. in the absence of any FilterProtocol directives), mod_filter
will leave the headers untouched.
At the time of writing, this feature is largely untested,
as modules in common use are designed to work with 2.0.
Modules using it should test it carefully.
The FilterChain directive configures an actual filter chain from
declared filters. It takes any number of arguments,
each optionally preceded by a single-character control that
determines what to do:
+filter-name
Add filter-name to the end of the filter chain
@filter-name
Insert filter-name at the start of the filter chain
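As an illustration of the two controls listed above, the following sketch assumes two smart filters, hypothetically named unpack and repack, that have been given providers elsewhere in the configuration:

```
# Insert unpack at the start of the chain and add repack at the end
FilterChain @unpack +repack
```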
The FilterDeclare directive declares an output filter together with a
header or environment variable that will determine runtime
configuration. The first argument is a filter-name
for use in FilterProvider,
FilterChain and
FilterProtocol directives.
The final (optional) argument
is the type of filter, and takes one of the values of ap_filter_type:
RESOURCE (the default), CONTENT_SET,
PROTOCOL, TRANSCODE, CONNECTION
or NETWORK.
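For example, a filter that belongs at the content-set stage rather than the default resource stage might be declared as in the sketch below. The filter names gzip and SSI are only illustrations:

```
# Declare a smart filter named gzip of type CONTENT_SET
FilterDeclare gzip CONTENT_SET
# With no type argument, the filter gets the default type, RESOURCE
FilterDeclare SSI
```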
The FilterProtocol directive directs mod_filter to ensure that the
filter doesn't run when it shouldn't, and that the HTTP response
headers are set correctly to reflect the effects of the
filter.
There are two forms of this directive. With three arguments, it
applies specifically to a filter-name and a
provider-name for that filter.
With two arguments it applies to a filter-name whenever the
filter runs any provider.
proto-flags is one or more of
change=yes
The filter changes the content, including possibly the content
length
change=1:1
The filter changes the content, but will not change the content
length
byteranges=no
The filter cannot work on byteranges and requires complete input
proxy=no
The filter should not run in a proxy context
proxy=transform
The filter transforms the response in a manner incompatible with
the HTTP Cache-Control: no-transform header.
cache=no
The filter renders the output uncacheable (e.g. by introducing randomised
content changes)
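A brief sketch of both forms follows. The filter and provider names (downsample, unpack, jpeg_unpack) are hypothetical and would need to be declared and registered elsewhere; each directive is given a single flag to keep the illustration simple:

```
# Two-argument form: applies whenever the downsample filter runs any provider
FilterProtocol downsample "change=yes"
# Three-argument form: applies only when unpack runs the jpeg_unpack provider
FilterProtocol unpack jpeg_unpack "byteranges=no"
```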
The FilterProvider directive registers a provider for the smart filter.
The provider will be called if and only if the match declared
here matches the value of the header or environment variable declared
as dispatch.
provider-name must have been registered by loading
a module that registers the name with
ap_register_output_filter.
The dispatch argument is a string with optional
req=, resp= or env= prefix
causing it to dispatch on (respectively) the request header, response
header, or environment variable named. In the absence of a
prefix, it defaults to a response header. A special case is the
word handler, which causes mod_filter
to dispatch on the content handler.
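The dispatch forms can be sketched as follows. The filter names, the EXAMPLE_PROVIDER provider and the example-filter variable are hypothetical; INCLUDES and DEFLATE are assumed to be registered by mod_include and mod_deflate respectively. The match strings are assumptions kept deliberately simple: a leading $ is taken to mean a substring match, and a plain string an exact match:

```
# Request header: compress only if the client advertises gzip support
FilterProvider compress DEFLATE req=Accept-Encoding $gzip
# Response header, with the explicit resp= prefix
FilterProvider SSI INCLUDES resp=Content-Type $text/html
# Response header, relying on the default when no prefix is given
FilterProvider SSI2 INCLUDES Content-Type $text/html
# Environment variable (hypothetical name and value)
FilterProvider by_env EXAMPLE_PROVIDER env=example-filter on
# Content handler
FilterProvider by_handler EXAMPLE_PROVIDER handler cgi-script
```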
The match argument specifies a match that will be applied to
the filter's dispatch criterion. The match may be
a string match (exact match or substring), a regex, an integer (greater than, less than or equal to), or
unconditional. The first characters of the match argument
determine this:
First, if the first character is an exclamation mark
(!), this reverses the rule, so the provider will be used
if and only if the match fails.
Second, it interprets the first character excluding
any leading ! as follows:
The FilterTrace directive generates debug information from
mod_filter.
It is designed to help test and debug providers (filter modules), although
it may also help with mod_filter itself.
The debug output depends on the level set:
0 (default)
No debug information is generated.
1
mod_filter will record buckets and brigades
passing through the filter to the error log, before the provider has
processed them. This is similar to the information generated by
mod_diagnostics.
2 (not yet implemented)
Will dump the full data passing through to a tempfile before the
provider. For single-user debug only; this will not
support concurrent hits.
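A minimal sketch of enabling level 1, assuming the directive is FilterTrace and that it takes a filter name followed by the level; SSI is a hypothetical smart filter name:

```
# Log buckets and brigades passing through the SSI smart filter
# to the error log, before the provider processes them
FilterTrace SSI 1
```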