If you are Serving XHTML with Apache MultiViews, you may want to be careful about how MultiViews interacts with ErrorDocument. Configuring error documents with content negotiation can lead to compound errors when the client does not accept any of the types available for the error document. This results in both unexpected behavior and a suboptimal user experience. This post describes how to avoid such errors while still negotiating the returned content type.

The Issue

Let's assume that you have a website with the following .htaccess file:

Options +MultiViews
ErrorDocument 404 /404

Along with 404.html:

<!DOCTYPE html>
<html>
<head><title>404 HTML</title></head>
<body><h1>404 HTML</h1></body>
</html>

and 404.xhtml:

<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml">
<head><title>404 XHTML</title></head>
<body><h1>404 XHTML</h1></body>
</html>

Accessing a non-existent URL from your browser will result in a 404 response containing the file of the preferred content type, as expected. But what happens to a user agent which requests an unsupported type, such as a bot collecting favicons? Let's examine the result of running curl -i -H "Accept: image/vnd.microsoft.icon, image/x-icon" http://localhost/favicon.ico:

HTTP/1.1 404 Not Found
Date: Thu, 01 Oct 2015 03:57:33 GMT
Server: Apache/2.4.16 (Debian)
Alternates: {"404.html" 1 {type text/html} {length 99}}, {"404.xhtml" 1 {type application/xhtml+xml} {length 155}}
Vary: negotiate,accept
TCN: list
Content-Length: 403
Content-Type: text/html; charset=iso-8859-1

<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>404 Not Found</title>
</head><body>
<h1>Not Found</h1>
<p>The requested URL /favicon.ico was not found on this server.</p>
<p>Additionally, a 404 Not Found
error was encountered while trying to use an ErrorDocument to handle the request.</p>
<hr>
<address>Apache/2.4.16 (Debian) Server at localhost Port 80</address>
</body></html>

Which is accompanied by the following error in the Apache log:

[negotiation:error] [pid XXXX] [client ::1:XXXXX] AH00690: no acceptable variant: /path/to/404

The problem is that the client has requested /favicon.ico, which doesn’t exist, and negotiating the ErrorDocument for /404 has failed because the client only accepts icon types while the ErrorDocument is only available in HTML types. The result is a server-generated error page announcing the compound error. (Oddly, it describes the second error as another 404 rather than a 406.) Not ideal.

A Solution

Ideally Apache would provide a configuration option to specify a fallback content type when none is accepted, analogous to what ForceLanguagePriority does for content language. This could then be scoped to error pages using <Directory> or <Files>. Unfortunately, I could not find any way to specify such a fallback.
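
For comparison, the language fallback that this wish is modeled on looks roughly like the following. This is a minimal sketch: the language codes are placeholders and are not part of the configuration discussed in this post.

```apache
# When several language variants are equally acceptable, or when none of the
# available variants matches the client's Accept-Language, pick the first
# available language from LanguagePriority instead of returning an error.
LanguagePriority en fr
ForceLanguagePriority Prefer Fallback
```

An equivalent of the Fallback behavior for the Accept header is what appears to be missing.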

The solution that I came up with is to set ErrorDocument conditionally by matching against the Accept header. This is basically a poor-man’s content negotiation, but works reasonably well when there are few types to choose between and quality comparison isn’t required. To negotiate between the HTML and XHTML versions of the 404 page, modify .htaccess as follows:

Options +MultiViews
<If "%{HTTP_ACCEPT} =~ m#application/xhtml\+xml#i">
	ErrorDocument 404 /404.xhtml
</If>
<Else>
	ErrorDocument 404 /404.html
</Else>

This sends the XHTML version whenever application/xhtml+xml is present in the Accept header (which, in practice, is only true for browsers which support it and prefer it equally to text/html) and otherwise sends the HTML version. The same curl command now returns:

HTTP/1.1 404 Not Found
Date: Thu, 01 Oct 2015 04:16:13 GMT
Server: Apache/2.4.16 (Debian)
Vary: Accept
Last-Modified: Thu, 01 Oct 2015 03:54:21 GMT
ETag: "63-5210300883cb3;521034eaa61f4"
Accept-Ranges: bytes
Content-Length: 99
Content-Type: text/html

<!DOCTYPE html>
<html>
<head><title>404 HTML</title></head>
<body><h1>404 HTML</h1></body>
</html>

Another drawback of this approach, which you can see from the response above, is that the response omits the Transparent Content Negotiation (TCN) headers. Although RFC 2295 does not specify 404 behavior explicitly, my reading is that since the representation of the 404 is negotiated, the headers should indicate the chosen representation and expose the negotiation process. But, as with the above limitations, it has little practical effect since the response includes the preferred type and I am not aware of any user agents that would want to dynamically negotiate error documents.
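
The same pattern can presumably be extended to other status codes by listing additional ErrorDocument directives in each branch. The following is a sketch only: the 403 pages are hypothetical files that would need to exist alongside the 404 pages, and it is untested beyond the 404 case shown above.

```apache
Options +MultiViews
<If "%{HTTP_ACCEPT} =~ m#application/xhtml\+xml#i">
	# Client lists application/xhtml+xml: use the XHTML variants.
	ErrorDocument 403 /403.xhtml
	ErrorDocument 404 /404.xhtml
</If>
<Else>
	# Everything else, including clients that accept no HTML type at all.
	ErrorDocument 403 /403.html
	ErrorDocument 404 /404.html
</Else>
```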

Additional Considerations for Language Negotiation

Although the above solution works well when only one dimension, with few alternatives, is being negotiated, it quickly becomes unwieldy when multiple dimensions (e.g. type and language) are under consideration. My attempts to use MultiViews to negotiate only the language have so far been unsuccessful. The Apache documentation includes an example of a language-negotiated ErrorDocument configuration using a type-map in httpd-multilang-errordoc.conf. Although I expected that combining this technique with a type-map that only negotiates on language (such as in Apache: The Definitive Guide Section 6.4) would achieve the desired effect, it does not appear to. I am still seeing the same AH00690 error as before.
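
To illustrate how the conditional approach grows, here is a sketch of the same technique covering two types and two languages. The file names (404.en.html and so on) and the choice of English and French are hypothetical. The simple pattern match also ignores quality values entirely, so a client sending fr;q=0 would still receive French.

```apache
Options +MultiViews
# One branch per (type, language) combination: 2 types x 2 languages = 4 branches.
<If "%{HTTP_ACCEPT} =~ m#application/xhtml\+xml#i && req('Accept-Language') =~ m#\bfr\b#i">
	ErrorDocument 404 /404.fr.xhtml
</If>
<ElseIf "%{HTTP_ACCEPT} =~ m#application/xhtml\+xml#i">
	ErrorDocument 404 /404.en.xhtml
</ElseIf>
<ElseIf "req('Accept-Language') =~ m#\bfr\b#i">
	ErrorDocument 404 /404.fr.html
</ElseIf>
<Else>
	# Fallback when neither XHTML nor French is requested.
	ErrorDocument 404 /404.en.html
</Else>
```

Each additional language or type multiplies the number of branches, which is why this only stays manageable for a single dimension.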

Since I am currently only negotiating the content type, this is not an issue for me. However, if anyone is able to solve this problem, I would be very curious about how you did it, and more than happy to post the solution here for others to use. Until then, best of luck with the type-only solution!