Debugging Gateway Errors


August 14th, 2023

You'll sometimes hit Gateway errors, usually 502 Bad Gateway or 504 Gateway Timeout.

These are errors Nginx returns when it sends a request to PHP but PHP returns an error (or no response in time) instead of processing the request. Typically these are NOT errors occurring in your application. Instead, they are (usually) errors hit before the application even processes a request.

What is a Gateway

A Gateway is the layer sitting between the web server (usually Nginx) and your application. For most of us, this is PHP-FPM. Nginx uses the FastCGI protocol to convert a web request into something PHP-FPM can understand. PHP-FPM then runs your application, setting up PHP with the information it needs (setting superglobals $_GET, $_POST, $_SESSION, $_SERVER, etc).

If PHP-FPM returns an error, Nginx gives us a Gateway error.
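
To make the hand-off a bit more concrete, here is a rough Python sketch of the kind of key/value parameters a web server passes to PHP-FPM for a request. It is a simplification, not the FastCGI wire format, and the helper name fastcgi_params, the document root, and the index.php front controller are all assumptions for illustration. The keys themselves are standard CGI-style variable names, which PHP exposes through $_SERVER.

```python
from urllib.parse import urlsplit


def fastcgi_params(method, url, document_root, script="index.php"):
    """Build a simplified set of CGI-style parameters for one request.

    Hypothetical helper: a real server sends more parameters than this.
    """
    parts = urlsplit(url)
    request_uri = parts.path or "/"
    if parts.query:
        request_uri += "?" + parts.query
    return {
        "REQUEST_METHOD": method,
        "REQUEST_URI": request_uri,
        "QUERY_STRING": parts.query,
        # The script PHP-FPM is asked to execute (assumed front controller).
        "SCRIPT_FILENAME": document_root.rstrip("/") + "/" + script,
        "SERVER_NAME": parts.hostname or "",
    }


if __name__ == "__main__":
    params = fastcgi_params("GET", "https://example.com/posts?page=2", "/var/www/html/public")
    for key, value in params.items():
        print(f"{key}={value}")
```

PHP-FPM uses values like QUERY_STRING to fill in $_GET before your code runs. If that hand-off fails, your application never sees the request at all.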

Bad Gateway

Bad Gateway is returned when PHP-FPM returns an error. This usually is one of:

  1. PHP-FPM is not running (perhaps due to too many errors)
  2. PHP-FPM has reached its max_children limit (the pm.max_children setting) and cannot process any more requests
  3. Some sort of PHP error such as a segfault
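
A quick way to rule the first cause in or out is to check whether anything is listening on PHP-FPM's socket. The sketch below assumes PHP-FPM listens on a Unix socket at /var/run/php-fpm.sock (your path will likely differ, and some setups use a TCP port instead). The function name fpm_socket_status is made up for this example.

```python
import os
import socket


def fpm_socket_status(path, timeout=1.0):
    """Return 'missing', 'refusing', or 'accepting' for a Unix socket path."""
    if not os.path.exists(path):
        # No socket file at all: PHP-FPM is probably not running,
        # or it is configured to listen somewhere else.
        return "missing"
    client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    client.settimeout(timeout)
    try:
        client.connect(path)
    except OSError:
        # The file exists but nothing accepted the connection in time.
        return "refusing"
    finally:
        client.close()
    return "accepting"


if __name__ == "__main__":
    # Assumed path; check the "listen" setting in your pool config.
    print(fpm_socket_status("/var/run/php-fpm.sock"))
```

An "accepting" result only tells you that something is listening. A pool that has hit its max_children limit can still accept connections and leave them queued, so this does not rule out the second cause.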

Gateway Timeout

Gateway timeout errors typically occur when your app is handling too much traffic. This can correspond to PHP-FPM max_children errors (more requests than it's configured to handle) but mostly occurs when your database is overloaded and can't handle additional connections making queries, so queries take too long to return.

This can also occur if any network connection your application makes is not returning responses in time, but the database is the most common bottleneck.
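
If you suspect a slow dependency, one cheap first check is timing how long it takes just to open a connection to it. This is a minimal sketch: the function name time_tcp_connect is made up, and the host and port in the usage example (127.0.0.1 and 3306, MySQL's default port) are assumptions you'd swap for your own.

```python
import socket
import time


def time_tcp_connect(host, port, timeout=2.0):
    """Return seconds taken to open a TCP connection, or None on failure or timeout."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return time.monotonic() - start
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return None


if __name__ == "__main__":
    elapsed = time_tcp_connect("127.0.0.1", 3306)  # assumed database address
    if elapsed is None:
        print("could not connect within the timeout")
    else:
        print(f"connected in {elapsed * 1000:.1f} ms")
```

A fast connect doesn't prove queries are fast, since an overloaded database may accept connections and still answer slowly. A slow or failed connect is a strong hint, though.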

Debugging Gateway Errors

The tl;dr for debugging gateway errors is the logs. I start from the top of the network request stack and move down. This means the order in which I check things is:

  1. Nginx
  2. PHP-FPM
  3. Server resource usage
  4. Application logs

The Nginx logs typically have the least useful data, although they might clue you into issues of PHP-FPM not running (if Nginx can't find PHP-FPM's socket file, such as /var/run/php-fpm.sock, for example).
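
As a rough illustration of what to look for, the sketch below scans Nginx error log lines for a few upstream failure phrases and attaches a likely explanation to each. The sample lines are modeled on typical Nginx messages rather than copied from a real server, and the exact wording can vary by version and setup, so treat the phrases as a starting point. The names UPSTREAM_HINTS and explain_nginx_errors are made up for this example.

```python
# Phrase to look for -> what it usually suggests (assumed, not exhaustive).
UPSTREAM_HINTS = {
    "No such file or directory": "socket file missing; PHP-FPM may not be running",
    "Connection refused": "nothing is listening on the socket or port",
    "upstream timed out": "PHP-FPM took the request but did not answer in time",
    "Connection reset by peer": "PHP-FPM dropped the connection; a worker may have crashed",
}

# Illustrative lines only, shortened from the usual format.
SAMPLE_NGINX_LINES = [
    '2023/08/14 10:00:00 [crit] 123#123: *1 connect() to unix:/var/run/php-fpm.sock '
    'failed (2: No such file or directory) while connecting to upstream',
    '2023/08/14 10:00:05 [error] 123#123: *2 upstream timed out '
    '(110: Connection timed out) while reading response header from upstream',
    '2023/08/14 10:00:09 [error] 123#123: *3 open() "/var/www/favicon.ico" failed',
]


def explain_nginx_errors(lines):
    """Return (phrase, hint) pairs for lines that mention an upstream failure."""
    findings = []
    for line in lines:
        if "upstream" not in line:
            continue  # not about the gateway at all
        for phrase, hint in UPSTREAM_HINTS.items():
            if phrase in line:
                findings.append((phrase, hint))
                break
    return findings


if __name__ == "__main__":
    for phrase, hint in explain_nginx_errors(SAMPLE_NGINX_LINES):
        print(f"{phrase}: {hint}")
```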

The FPM logs are usually the most beneficial, since PHP-FPM is the gateway returning an error! Often you'll see an error about hitting (or being close to) the max_children limit. Less often you might see a segfault error (if you see that, you probably have some runaway recursion in your code somewhere).
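
Here's a small sketch of picking those two messages out of a PHP-FPM log. The sample lines follow the wording PHP-FPM commonly uses, but they are illustrative and the format may differ between versions, so adjust the patterns to what your log actually contains. The function name summarize_fpm_log is made up for this example.

```python
import re

# Patterns based on commonly seen PHP-FPM warnings (assumed wording).
MAX_CHILDREN = re.compile(r"server reached (?:pm\.)?max_children setting \((\d+)\)")
SEGFAULT = re.compile(r"child (\d+) exited on signal 11 \(SIGSEGV")

# Illustrative lines only.
SAMPLE_FPM_LINES = [
    "[14-Aug-2023 10:00:00] WARNING: [pool www] server reached pm.max_children setting (5), consider raising it",
    "[14-Aug-2023 10:00:03] WARNING: [pool www] child 4242 exited on signal 11 (SIGSEGV) after 12.345678 seconds from start",
    "[14-Aug-2023 10:00:04] NOTICE: [pool www] child 4250 started",
]


def summarize_fpm_log(lines):
    """Count max_children warnings and collect PIDs of segfaulted workers."""
    summary = {"max_children_hits": 0, "max_children_limit": None, "segfault_pids": []}
    for line in lines:
        match = MAX_CHILDREN.search(line)
        if match:
            summary["max_children_hits"] += 1
            summary["max_children_limit"] = int(match.group(1))
            continue
        match = SEGFAULT.search(line)
        if match:
            summary["segfault_pids"].append(int(match.group(1)))
    return summary


if __name__ == "__main__":
    print(summarize_fpm_log(SAMPLE_FPM_LINES))
```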

Server resource usage is what I'd check next. You can use htop or similar to check CPU/RAM usage, and what processes are using them. You should also check disk usage via df -h to check if the disk is out of space.
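
If you'd rather script these checks than eyeball them, Python's standard library covers the basics. This is a minimal sketch for Unix-like systems. The function names are made up, and the percentage can differ slightly from what df -h prints because df accounts for reserved blocks.

```python
import os
import shutil


def disk_usage_percent(path="/"):
    """Return the percentage of disk space used on the filesystem holding path."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        return 0.0  # some pseudo-filesystems report no size
    return 100.0 * usage.used / usage.total


def load_per_cpu():
    """Return the 1-minute load average divided by the CPU count (Unix only)."""
    one_minute = os.getloadavg()[0]
    return one_minute / (os.cpu_count() or 1)


if __name__ == "__main__":
    print(f"disk used on /: {disk_usage_percent('/'):.1f}%")
    print(f"load per CPU: {load_per_cpu():.2f}")
```

A load per CPU that stays well above 1.0 suggests the server has more work queued than it can keep up with.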

You may also run out of inodes! Inodes are "index nodes", the structures Linux filesystems use to track files. Since everything is a file (including how Linux handles open network connections!), running out of inodes can be an issue. You can run df -i to see inode usage per filesystem.
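
The same numbers df -i reports are available from Python through os.statvfs, which is handy if you want to alert on them. A minimal sketch, assuming a Unix-like system. The function name inode_usage is made up, and some filesystems report zero total inodes, which the code treats as 0% used.

```python
import os


def inode_usage(path="/"):
    """Return inode totals for the filesystem that contains path."""
    stats = os.statvfs(path)
    total = stats.f_files   # total inodes on the filesystem
    free = stats.f_ffree    # inodes still free
    used = total - free
    percent = 100.0 * used / total if total else 0.0
    return {"total": total, "used": used, "free": free, "percent": percent}


if __name__ == "__main__":
    info = inode_usage("/")
    print(f"inodes used on /: {info['used']} of {info['total']} ({info['percent']:.1f}%)")
```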

Finally, I check application logs. These MIGHT show errors related to timeouts or database errors, but sometimes the issue is not specific to the application code base. How useful these logs are will vary for Gateway errors.

Chris Fidao

Teaching coding and servers at CloudCasts and Servers for Hackers. Co-founder of Chipper CI.