Philip Taylor's Header Data/README.txt
author Geoffrey Sneddon <geoffers@gmail.com>
Thu Mar 27 22:34:17 2008 +0000 (2008-03-27)
changeset 74 2f867e70fcfc
Terminology! Exciting!
This gives the response headers of ~15k pages as parsed by HttpClient,
specifically by HttpMethod.getResponseHeaders():
<http://jakarta.apache.org/httpcomponents/httpclient-3.x/apidocs/org/apache/commons/httpclient/HttpMethod.html#getResponseHeaders()>

Non-200 responses are not included. Any US-ASCII control character other than
0x09 (tab), 0x0A (line feed), and 0x0D (carriage return) is replaced by a 0x20
(space) character.
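The replacement rule above can be sketched in Python as follows. This is only
an illustration, not the code that produced the data: the function names are
hypothetical, and it assumes "US-ASCII control characters" means 0x00-0x1F plus
0x7F (DEL), which this README does not state.

```python
# Code points that are left untouched: tab, line feed, carriage return.
KEPT = {0x09, 0x0A, 0x0D}


def is_replaced_control(code_point: int) -> bool:
    """Return True if this code point would be replaced by a space.

    Assumption: the control range is 0x00-0x1F plus 0x7F (DEL).
    """
    is_control = code_point <= 0x1F or code_point == 0x7F
    return is_control and code_point not in KEPT


def sanitize_header_text(text: str) -> str:
    """Replace every affected control character with 0x20 (space)."""
    return "".join(" " if is_replaced_control(ord(ch)) else ch for ch in text)
```

Replacing rather than deleting the characters keeps each value at its original
length, so character offsets into a header value stay the same.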

The headers may not be fully grouped by URI, because they were processed by
more than one thread.
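If grouped output is needed, the records can be regrouped after loading. The
sketch below is a minimal example under a stated assumption: the data has
already been parsed into (uri, header name, header value) tuples. That record
shape and the function name are hypothetical; this README does not describe the
file layout.

```python
from collections import defaultdict


def group_by_uri(records):
    """Collect (name, value) pairs per URI, keeping first-seen URI order.

    `records` is an iterable of (uri, name, value) tuples; this shape is an
    assumption made for illustration only.
    """
    grouped = defaultdict(list)
    for uri, name, value in records:
        grouped[uri].append((name, value))
    return dict(grouped)
```

Headers for the same URI keep the relative order in which they appeared, which
matters when a header name is repeated.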