Wednesday, September 4, 2013

REMOTE_ADDR and HTTP_X_FORWARDED_FOR : the bad idea

Logging IP addresses is generally a good idea, whether for security purposes or for debugging. It's easy to spot a faulty request by its IP and then grep all the logs for that string.

However, many projects need a more complex setup with load balancers or proxies, and this becomes a problem: the usual REMOTE_ADDR no longer holds the client's address but that of another component of your infrastructure. This renders logging the IP that connects directly to your Web server next to useless.

In some cases, IP addresses are also used to prevent brute-force attempts or to enforce some sort of access control. In that case, a correct configuration affects far more than just logging.

The widely proposed solution is to use the X-Forwarded-For header in order to fetch the IP of the real client accessing the Web server.

Many people forget that X-Forwarded-For is actually a list that can hold a chain of multiple proxies, not just a single IP address, so saying that you can simply replace the remote IP with it is wrong.

So basically, you need to make sure that you are using the IP that connected to the outer edge of your infrastructure, that is, the last address before your own proxies.

If you have only one proxy, you could do this in Python with Django:
request.META['HTTP_X_FORWARDED_FOR'].split(",")[-1].strip()
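
The one-liner above raises a KeyError when the header is absent. Here is a slightly more defensive sketch, still assuming exactly one trusted proxy in front of the application. The helper name client_ip_single_proxy is hypothetical, and meta stands in for Django's request.META dictionary so the function can be tried without Django:

```python
def client_ip_single_proxy(meta):
    """Return the address appended by the single proxy in front of the app.

    `meta` is a mapping shaped like Django's request.META.
    Assumption: exactly one trusted proxy sits in front of the application.
    """
    forwarded = meta.get("HTTP_X_FORWARDED_FOR", "")
    # The last entry is the one our own proxy appended.
    last = forwarded.split(",")[-1].strip()
    # Fall back to the direct peer when the header is missing or empty.
    return last or meta.get("REMOTE_ADDR", "")
```

In a Django view this would be called as client_ip_single_proxy(request.META).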

You might be tempted to say that taking whatever is at the beginning of the string would be good enough, no matter how many proxies you have installed. But this is wrong again, because you have to keep in mind that any HTTP header like this one can be forged.
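
To see why the left-most entry cannot be trusted, here is a small simulation. It assumes that a proxy appends the address of the peer it received the request from to whatever header value that peer sent. The helper proxy_append is hypothetical and the addresses are made-up examples:

```python
def proxy_append(incoming_header, peer_ip):
    """Mimic a proxy that appends its direct peer to X-Forwarded-For."""
    if incoming_header:
        return incoming_header + ", " + peer_ip
    return peer_ip

# An attacker at 203.0.113.7 sends a forged header claiming to be 127.0.0.1.
header_at_app = proxy_append("127.0.0.1", "203.0.113.7")

entries = [ip.strip() for ip in header_at_app.split(",")]
leftmost = entries[0]    # attacker-controlled value
rightmost = entries[-1]  # value written by our own proxy
```

Only the entries written by your own proxies are reliable, and those are on the right-hand side of the list.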

So the right thing to do is to keep a list of your legitimate proxies, go through the chain from right to left, and take the first IP that is not one of them. Be aware that this IP might itself be another proxy, but since it's not part of your architecture you can't be sure.
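
The right-to-left walk described above can be sketched in Python as follows. This is a minimal illustration, not a hardened implementation: the function name client_ip_from_chain and its parameters are hypothetical, and the trusted proxies are assumed to be listed as exact IP strings.

```python
def client_ip_from_chain(remote_addr, forwarded_for, trusted_proxies):
    """Walk the proxy chain from right to left, skipping our own proxies.

    Returns the first address that is not in `trusted_proxies`,
    or None if every hop belongs to our own infrastructure.
    """
    chain = [ip.strip() for ip in forwarded_for.split(",") if ip.strip()]
    # The direct peer is the most recent hop, so it goes last.
    chain.append(remote_addr)
    for ip in reversed(chain):
        if ip not in trusted_proxies:
            return ip
    return None


# Example values (made up): two proxies of ours, one forged left-most entry.
trusted = {"192.168.1.190", "192.168.1.191"}
ip = client_ip_from_chain(
    "192.168.1.190",
    "127.0.0.1, 203.0.113.7, 192.168.1.191",
    trusted,
)
```

Note that when the direct peer is not one of your proxies, the walk stops immediately and the header is never consulted.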

Some people suggest removing the X-Forwarded-For header at your front-facing server, but if you do that you'll lose a way to troubleshoot the request.

Don't forget that using the X-Forwarded-For header when you don't have any proxy is bad, because then someone could set it to an arbitrary value and spoof their way in easily.
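
One way to guard against this is to look at the header only when the direct peer is a proxy you know about. The sketch below assumes a single proxy hop; client_ip_guarded is a hypothetical name and meta again stands in for Django's request.META:

```python
def client_ip_guarded(meta, trusted_proxies=frozenset()):
    """Use X-Forwarded-For only if the request came through a known proxy."""
    remote_addr = meta.get("REMOTE_ADDR", "")
    if remote_addr not in trusted_proxies:
        # No proxy configured, or the request bypassed it: the header is
        # entirely client-controlled, so ignore it.
        return remote_addr
    forwarded = meta.get("HTTP_X_FORWARDED_FOR", "")
    # Single-hop assumption: the last entry was written by our proxy.
    last = forwarded.split(",")[-1].strip()
    return last or remote_addr
```

With the default empty set of proxies, the function always returns REMOTE_ADDR, whatever the client put in the header.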

A good example of an implementation is the way Drupal 7 does it:
        // If an array of known reverse proxy IPs is provided, then trust
        // the XFF header if request really comes from one of them.
        $reverse_proxy_addresses = variable_get('reverse_proxy_addresses', array());

        // Turn XFF header into an array.
        $forwarded = explode(',', $_SERVER[$reverse_proxy_header]);

        // Trim the forwarded IPs; they may have been delimited by commas and spaces.
        $forwarded = array_map('trim', $forwarded);

        // Tack direct client IP onto end of forwarded array.
        $forwarded[] = $ip_address;

        // Eliminate all trusted IPs.
        $untrusted = array_diff($forwarded, $reverse_proxy_addresses);

        // The right-most IP is the most specific we can trust.
        $ip_address = array_pop($untrusted);
This requires a configuration file with the list of proxies, which needs to be updated each time the infrastructure changes. This is uncommon in standard procedures where system administration is separated from development, but any DevOps team should be OK. Either way, it should be noted that changes to the servers should be reflected in the application.

Last but not least, you could just as well use a field other than X-Forwarded-For, one that identifies only the outside IP address, especially if you want to ease the process with your servers. For example, it's much easier in an Apache log config to replace %h with another field that is sure not to be a list. Replacing %h with X-Forwarded-For can lead to non-standard log files because it can contain multiple IPs.

Drupal again handles this well with a configuration variable:
$reverse_proxy_header = variable_get('reverse_proxy_header', 'HTTP_X_FORWARDED_FOR');
Another advantage of this method shows up when the number of proxies varies from time to time, since a single-value header does not depend on the length of the chain.
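
A Python analogue of that configuration variable might look like the sketch below. The setting name REVERSE_PROXY_HEADER, the function name, and the choice of an X-Real-IP style header are all assumptions for illustration. It also assumes your front proxy overwrites that header with the address of its direct peer rather than passing through a client-supplied value.

```python
# Hypothetical setting, analogous to Drupal's reverse_proxy_header variable.
REVERSE_PROXY_HEADER = "HTTP_X_REAL_IP"


def client_ip_from_header(meta, header=REVERSE_PROXY_HEADER):
    """Read the client address from a dedicated single-value header.

    `meta` is a mapping shaped like Django's request.META.
    """
    value = meta.get(header, "").strip()
    # A dedicated header is expected to hold one address, never a list.
    if value and "," not in value:
        return value
    # Header missing or malformed: fall back to the direct peer.
    return meta.get("REMOTE_ADDR", "")
```

As with X-Forwarded-For, such a header should only be honoured when the request actually arrived through your own proxy.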

So next time you see an access log like this:
192.168.1.190 - - [04/Feb/2013:12:24:47 -0400] "POST /admin/login HTTP/1.1" 200
and all the other requests also come from 192.168.1.190, ask yourself whether you are behind a proxy and whether everything is properly configured.
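
A quick way to check for this symptom is to count requests per client address in the access log. The sketch below assumes the client address is the first whitespace-separated field of each line, as in the example above. The function name is hypothetical and the second sample line is invented for illustration.

```python
from collections import Counter


def client_address_counts(log_lines):
    """Count requests per client address (first field of each log line)."""
    counts = Counter()
    for line in log_lines:
        fields = line.split()
        if fields:
            counts[fields[0]] += 1
    return counts


sample = [
    '192.168.1.190 - - [04/Feb/2013:12:24:47 -0400] "POST /admin/login HTTP/1.1" 200',
    # Invented second line, only to illustrate the pattern.
    '192.168.1.190 - - [04/Feb/2013:12:24:48 -0400] "GET / HTTP/1.1" 200',
]
counts = client_address_counts(sample)
# A single private address for all traffic hints at a proxy in front.
single_source = len(counts) == 1
```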

