官术网_书友最值得收藏!

  • Nginx Troubleshooting
  • Alex Kapranoff
  • 2240字
  • 2021-07-09 19:38:16

Configuring Nginx logging

There are two types of logs that Nginx may write. One could also say that there are infinite types because of the log_format directive that allows you to create your own types of logs.

To refresh your memory about what directives are used to configure Nginx logging, here they are:

  • The error_log directive configures the logging of exceptional events that the developers of Nginx consider worth noting. Usually, this is all kinds of errors.

    The format of the directive is this:

    error_log <destination> <log level>;
    Note

    The first parameter is usually a path to the file with the log. Recent versions of Nginx starting with 1.7.1 also support logging via syslog, to a local or remote syslog server. There is also a rarely used misnamed special value stderr, which, by the way, does not redirect logging to stderr (the third standard stdio stream or &2 in shell terms) because it does not make much sense to log to stderr from a daemon—daemonization involves closing all standard file descriptors. The stderr value means "log into the file that was configured during compilation time" and that depends on the package or even the OS distribution you use. You will mostly want to specify an actual file instead of stderr just to be sure where it ends. By the way, to make things more confusing, there is a way to specify logging to actual stderr during compile time. It is not very useful for daemons; you will not probably ever use it.

    You often want several error logs. Remember that in Chapter 1, Searching for Problems in Nginx Configuration, we discussed multiline configuration directives named contexts. They provide a topic, that is, a narrow scope for the directives inside them. You may (and that is usually a very good idea) have different log files in different contexts. And using stderr prevents that because everything will get written to the same place.

    The log level parameter of the error_log directive is a way of specifying a threshold of severity of events that end up in the log. Most of the time, you will want to set this to warn, but feel free to increase up to debug whenever you have a reproducible problem that you want more information about.

    The debug level requires a special compile-time switch. The reason for this is that debug logging makes some performance compromises and the code for it should not be included in production systems, ideally. Unless Nginx is really your bottleneck (a rare situation), you may safely use --with-debug when compiling Nginx. See a little more about it at http://nginx.org/en/docs/debugging_log.html.

  • The other logging directive is access_log. And it includes much more functionality than error_log and also more potential for mistakes. Let's look at it more closely.

    This is how access logs are configured:

    access_log <destination> <log format> <misc arguments>

    The idea of access log is to have a journal of all request-response pairs processed by Nginx. As opposed to the error log, the records in access logs have a thoroughly specified format, usually a chain of whitespace-delimited values that contain some information about the current request-response pair or general state of Nginx. All access log records have this format. Access logs and error logs work together. In case Nginx has something unusual to say about a request it processes, you will find a strictly formatted line of data in your access log and then some warnings or errors of mostly free text nature in your error log.

    The destination parameter takes the same values as the respective parameter of the error_log directive. You may still log to syslog or a file.

    Note

    Modern Nginx also has an interesting performance feature of buffered access logging. You will find more information about turning buffered logging on with flush or gzip arguments at http://nginx.org/en/docs/http/ngx_http_log_module.html#access_log. Do understand what buffering means before turning it on. One of the expected features of all error reporting mechanisms is being real time and buffered logs are exactly the opposite, that is, log records are not written to disk and not made available for inspection immediately. They are held in the buffer for some time. You will need this only in high-load scenarios where writing logs starts to take noticeable time because of the disk waits.

    The log format parameter of the access_log directive is the heart of access logging. It expects a name of a template that models each record in the log. You create such templates with the log_format directive. There is a predefined format named combined, which is also a good example to show here:

    log_format combined '$remote_addr - $remote_user [$time_local] '
                        '"$request" $status $body_bytes_sent '
                        '"$http_referer" "$http_user_agent"';

As you can see, the log_format directive's second argument is a long line of variables with "talking" names. All characters between and around variables will be included in the log. Variables will be evaluated at the time of logging, and their values will take their places.

Let's look at a real example of a log record generated with this very template:

85.90.193.224 - - [01/Feb/2016:12:01:34 +0400] "GET / HTTP/1.0" 200 137426 "http://example.com/" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:33.0) Gecko/20100101 Firefox/33.0"

You are probably very familiar with the combined log format from previous experience with Nginx, Apache, or some other web server software. Still, going through the inpidual items of each combined log line with us may provide you with some nonobvious insights. Let's parse the example record and learn some facts about those variables along the way:

There is a lot of information that you can add to your access logs using different variables that Nginx provides during processing of each request.

The whole list of them is at http://nginx.org/en/docs/varindex.html.

There are also several variables that are available only during log record generation and are listed in the description of the log_format directive at http://nginx.org/en/docs/http/ngx_http_log_module.html#log_format.

The recommendation is to keep saving logs in the combined format to be able to use a huge number of tools that community has created over the years. In addition to these, you may create some extended logs with more data to help you debug problems.

Here is a list of variables that are often useful but not included in the default combined format:

Logging POST requests

Once you start debugging a problem with a web application that runs behind one of your Nginx instances by tracing user requests and application responses via access logs, you will see that GET/HEAD requests are logged fully while POST request log records lack any information except the URI to which the data was posted. This is one of the questions that many system administrators ask, especially after trying to get away with tcpdumps only. tcpdump is a wonderful Swiss army knife of protocol tracing, but it requires active participation during the events that need to be traced. And tracing HTTPS with tcpdump is very hard.

Nginx is able to log POST request bodies and many more. You should already be fully equipped to at least try to implement such a logging yourself.

Remember that we talked about custom log formats and using variables to record the state of requests and responses. If you search through the list of variables available during request processing, you will see the variable named $request_body. See http://nginx.org/en/docs/http/ngx_http_core_module.html#var_request_body.

Let's invent a simple log format including $request_body:

log_format request_body_log 'body: "$request_body"';

Now we enable logging with this format by adding this directive:

access_log /var/log/nginx/requests.log request_body_log;

Remember that the log_format directive should be used in one of the higher contexts, for example, the http context. Multiple access_log directives may be in effect for all the requests, and because of this, we do not need to specify the rest of the variables in the template for the request_body_log format. Your usual preconfigured combined-formatted logs will still get written to.

What would we see in requests.log for some simple GET requests to download a couple of static files?

body: "-"
body: "-"
body: "-"

Make sure that you understand the result before proceeding.

Now, we need POST requests. And POST requests to static files are useless. They never happen in real life. Clients POST data to web applications, and for Nginx administrators, a web application is an upstream to which Nginx proxies the requests and from which it proxies back the responses.

Suppose that we build something like this. It will be a very simple Dancer application in Perl accepting a simple POST and responding with a piece of dynamic HTML.

The source code is at http://kapranoff.ru/~kappa/nginx-troubleshooting/simple-post.pl:

Now we will set up a proxy inside our Nginx instance:

location /simple-post {
    proxy_pass http://localhost:3000/;
}

We will point our browser to http://localhost/simple-post.

If the Dancer app is running, you will see a simple form of one field and a button. Type in something, click on the button and rush to your requests.log:

body: "-"
body: "a=Nginx+rules%21"

The first line is the empty body of the GET request for the form, whereas the second contains the body of the POST that the form generated with the help of your browser. There are two ways an HTML form may be encoded into a POST body; this one is the default application/x-www-form-urlencoded. The other one is multipart/form-data; it is widely used for forms that allow file uploads. This is a little bit out of scope of this book already. We should add that form encodings are quickly becoming a thing of the past because more and more POST bodies are constructed by the client-side JavaScript and the browsers themselves.

What is important here is that you now have a simple way to log what is coming your way via POST requests.

Conditional logging

This example will also allow demonstration of one of the more recent Nginx logging features named conditional logging.

The directive access_log has a number of optional parameters and among them is a parameter if that specifies a condition on which a record is appended to this particular access log. When we configured request body logging in the previous section, we still ended up with a log full of "-"; those are empty bodies of all the non-POST requests. Let's fix that. First, we add a condition to our access_log directive:

access_log /var/log/nginx/requests.log request_body_log if=$method_is_post;

The condition that we use is a simple custom variable. We intentionally show this technique using syntax very similar to what is documented in the official documentation at http://nginx.org/en/docs/http/ngx_http_log_module.html#access_log.

So the next step for us is to create this variable. There are several ways to create a variable in Nginx. The most straightforward is using the set directive inside an if context. But it is a good habit to cringe any time you see an if directive in Nginx configuration. if should always be the last choice. Remember that there is no programming inside configuration files; everything should be as declarative as possible.

And there is a good declarative way to create a variable:

map $request_method $method_is_post {
    POST 1;
    default 0;
}

This is everything you need to do to enable conditional logging. If your Nginx version is modern enough, you will get only bodies of POST requests in your requests.log from now on.

There is a probability that your Nginx is not modern enough (at least 1.7.0 is required). Use nginx -t to test the configuration. Can you think of a way to work around the problem without upgrading Nginx? This is not a hypothetical question. Running Nginx installed from packages provided by your distribution is highly recommended, and they are notoriously not up to date.

Logging big request bodies

There is one more thing to tell you about logging request bodies. Actually, two things that will manifest in exactly the same way while having different reasons.

The variable $request_body is not guaranteed to have any content even in the case of a good POST request with data inside. The first possible reason for an empty $request_body is a situation where Nginx has decided that parsing the body is not needed and optimized it away. That is a documented behavior that still strikes in the least expected moments. The documentation says clearly:

"The variable's value is made available in locations processed by the proxy_pass, fastcgi_pass, uwsgi_pass, and scgi_pass directives."

See for yourself: http://nginx.org/en/docs/http/ngx_http_core_module.html#var_request_body.

These are the only four cases in which Nginx populates the $request_body variable. Fortunately, POST requests to locations that do not contain any of those directives are very rare. POSTs are intended to accept data from clients and feed to server applications for which Nginx is acting as a proxy.

Be careful not to harm yourself debugging empty request bodies for some uncommon configuration with POST requests and no proxying directives in that context.

The other reason for empty $request_body is the request being too large. If the size of the request body exceeds the value set up by the client_body_buffer_size directive, it is not available via $request_body variable. Instead, the whole body is saved to a temporary file on the file system, and its name is written into the new $request_body_file variable.

There is also another very interesting directive named client_body_in_file_only that provides a way to always save requests to files. It may be used instead of the mechanism that we showed earlier altogether! You will add $request_body_file to one of your log formats and turn on client_body_in_file_only. After this, Nginx will create an ever-growing store of files containing all your request bodies. Do not forget to clean them up from a crontab or they will fill the filesystem.

主站蜘蛛池模板: 静宁县| 沿河| 麻阳| 新蔡县| 宜宾县| 新晃| 罗定市| 兴城市| 仁化县| 清远市| 弋阳县| 鹤岗市| 寻甸| 抚宁县| 永胜县| 石阡县| 綦江县| 修水县| 玉龙| 惠东县| 讷河市| 湘潭县| 三都| 三台县| 福海县| 新巴尔虎左旗| 子洲县| 新昌县| 谢通门县| 金堂县| 迭部县| 吉木萨尔县| 天门市| 含山县| 安化县| 梁平县| 历史| 原平市| 沙湾县| 临澧县| 皋兰县|