Our log files are based on the Apache-standard "combined" log file format, which is a superset of the "common" format. Most web log analysis software can read this format directly. The fields, in order, are:
clientip - username [time] "request" status bytes "referer" "useragent"
1.2.3.4 - - [22/Nov/2013:22:32:01 -0700] "GET /index.html HTTP/1.0" 200 0 "-" "Wget/1.9.1"
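If you would rather read the logs with your own script than with an analyzer, the layout above can be matched with a regular expression. The following is a minimal sketch, not an official parser: the names LOG_PATTERN and parse_line are made up for this example, and it assumes the quoted fields never contain an escaped double quote.

```python
import re

# Hypothetical pattern for the layout shown above:
# clientip - username [time] "request" status bytes "referer" "useragent"
LOG_PATTERN = re.compile(
    r'(?P<clientip>\S+) (?P<dash>\S+) (?P<username>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<bytes>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<useragent>[^"]*)"'
)

def parse_line(line):
    """Return a dict mapping field name to its raw string, or None if the line does not match."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None

sample = '1.2.3.4 - - [22/Nov/2013:22:32:01 -0700] "GET /index.html HTTP/1.0" 200 0 "-" "Wget/1.9.1"'
fields = parse_line(sample)
print(fields["clientip"], fields["status"], fields["request"])
# prints: 1.2.3.4 200 GET /index.html HTTP/1.0
```

Every value comes back as a string, so a script that wants to add up bytes or compare status codes would still need to convert those fields itself.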
The meaning of each field is shown below:
- clientip
- The IP address of the machine that requested the page. If you get a "-" in this field, it may mean that your site performs HTTP accesses on itself, or that we had to test your site with some internal application.
- dash
- This field historically held the remote user's identity as reported by an ident lookup (RFC 1413), which is almost never used anymore. Many log analyzers still expect the field to be present, so it always appears as a dash (-).
- username
- If you use Basic HTTP Authentication, the authenticated username will appear here. If you don't, it will be a dash (-).
- time
- The time of the request, in the format DD/Mon/Year:HH:MM:SS tzone, where tzone is the offset from UTC (for example, 22/Nov/2013:22:32:01 -0700).
- request
- The request method, requested URI, and protocol, like GET /index.html HTTP/1.0.
- status
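- The HTTP status code returned to the client. The most common codes are 200 (OK), 304 (not modified - the client's cached copy is still current, so the content was not sent again), 404 (file not found), and 500 (internal server error - a script error or a problem with our server).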
- bytes
- The number of raw bytes of content served. This number does not include HTTP headers, transfer encoding, or TCP/IP protocol overhead.
- referer
- The page that contained the link to the requested resource, or a dash (-) if the client did not send one. You can use this to see who is linking to you, and weblog analyzers use it to map how visitors move through your site. The spelling "referer" is not a typo on our part: the word is misspelled in the published HTTP standard itself.
- useragent
- The string that the web browser (or other client, such as Wget in the example above) sends to identify itself. Many log analyzers can graph this.
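Several of the fields described above are easier to work with once converted from text. The sketch below shows one possible way to do that in Python; the helper names parse_time, parse_bytes, and split_request are invented for this example. It assumes English month abbreviations (the default C locale) and treats a dash in the bytes field as zero, in case your analyzer or another server writes it that way.

```python
from datetime import datetime

def parse_time(value):
    """Parse 'DD/Mon/Year:HH:MM:SS tzone' into a timezone-aware datetime."""
    return datetime.strptime(value, "%d/%b/%Y:%H:%M:%S %z")

def parse_bytes(value):
    """Convert the bytes field to an int, treating a dash as zero (an assumption)."""
    return 0 if value == "-" else int(value)

def split_request(value):
    """Split 'METHOD URI PROTOCOL' into its three parts.

    Raises ValueError if the request does not have exactly three parts.
    """
    method, uri, protocol = value.split(" ", 2)
    return method, uri, protocol

when = parse_time("22/Nov/2013:22:32:01 -0700")
print(when.isoformat())
# prints: 2013-11-22T22:32:01-07:00

print(split_request("GET /index.html HTTP/1.0"))
# prints: ('GET', '/index.html', 'HTTP/1.0')
```

Malformed requests do show up in real logs, so a script that processes many lines may want to catch ValueError around split_request and skip those entries.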
More information about Apache log file formats is available on the Apache site.