NFS attribute caching performance impact on web applications

A couple of days ago, I had some issues with NFS consistency: not every server was up to date. Some servers had the current version of a file, some didn’t. However, running ls -l seemed to temporarily fix the problem after each update (where a plain ls didn’t). Indeed, ls with the -l option calls stat() on each file where a plain ls doesn’t, because file attributes are retrieved through the stat() function. I needed to investigate…


I. Story

We noticed that some of our virtual web servers didn’t deliver the same content: some pages weren’t updated properly. We quickly figured out that the issue came from NFS attribute caching, whose default timeouts were too long, at least for our setup. The first change we made was to enable the noac option on the client mounts. However, while this improved the consistency of the NFS data, the performance impact was pretty high. The impact was easy to detect and reproduce: basically, every time a page was requested from the web server, the client had to ask the NFS server to be sure to deliver the latest version. As I said, it’s quite easy to notice, especially in the Time To First Byte value. You will mainly notice that the website takes a long time to start loading, and then all of a sudden the content is delivered quickly.

Quick definition of TTFB, stolen from Wikipedia: Time To First Byte (TTFB) is a measurement that is often used as an indication of the responsiveness of a web server or other network resource.

A bad TTFB can be related to a lot of things:

  • HTTP KeepAlive
  • Slow storage backend
  • Limited web server connections

To measure the Time To First Byte, I always use the following command (example against my own website):

$ curl -o /dev/null -w "Connect: %{time_connect} TTFB: %{time_starttransfer} Total time: %{time_total} \n" -s http://sebastien-han.fr
Connect: 0,042 TTFB: 0,223 Total time: 0,224 

Quite fast :)
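
A single request can be misleading, so it may help to take several samples and look at the median. The following is only a sketch: the function names ttfb, median and sample_ttfb are made up for this example, and it assumes curl, sort and awk are available. Forcing the C locale is meant to give a dot instead of a comma as decimal separator, which makes the numbers easier to post-process.

```shell
# Print the TTFB in seconds for one request to the given URL.
# LC_ALL=C forces the C locale so decimals use a dot, not a comma.
ttfb() {
    LC_ALL=C curl -o /dev/null -s -w '%{time_starttransfer}\n' "$1"
}

# Print the median of the numbers read from stdin, one per line.
median() {
    LC_ALL=C sort -n | LC_ALL=C awk '
        { v[NR] = $1 }
        END {
            if (NR == 0) exit
            if (NR % 2) print v[(NR + 1) / 2]
            else print (v[NR / 2] + v[NR / 2 + 1]) / 2
        }'
}

# Usage: sample_ttfb URL [COUNT]; COUNT defaults to 5.
sample_ttfb() {
    i=0
    while [ "$i" -lt "${2:-5}" ]; do
        ttfb "$1"
        i=$((i + 1))
    done | median
}
```

For example, sample_ttfb http://sebastien-han.fr 5 would send five requests and print one number, the median TTFB in seconds.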


II. Tests

Now let’s see the performance impact of the different options for caching file attributes on an NFS mount (client side).
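
Before comparing options, it can be useful to check which ones a client is really using. On Linux, /proc/mounts lists every mounted filesystem with its options; the sketch below filters it for NFS entries. The function name nfs_mount_options is invented for this example, and the optional argument only exists so the function can be pointed at another file with the same layout.

```shell
# Print "mount point: options" for every NFS mount in a mounts table.
# Reads /proc/mounts unless another file is given as first argument.
nfs_mount_options() {
    awk '$3 ~ /^nfs4?$/ { print $2 ": " $4 }' "${1:-/proc/mounts}"
}

nfs_mount_options
```

On a client without any NFS mount, this prints nothing.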

II.1. NOAC option

Stolen from the NFS man page.

NOAC: Use the noac mount option to achieve attribute cache coherence among multiple clients. Almost every file system operation checks file attribute information. The client keeps this information cached for a period of time to reduce network and server load. When noac is in effect, a client’s file attribute cache is disabled, so each operation that needs to check a file’s attributes is forced to go back to the server. This permits a client to see changes to a file very quickly, at the cost of many extra network operations.

$ for i in `seq 5`; do curl -o /dev/null -w "Connect: %{time_connect} TTFB: %{time_starttransfer} Total time: %{time_total} \n" -s http://one-of-the-website-I-host ; done
Connect: 0,018 TTFB: 9,265 Total time: 9,322 
Connect: 0,009 TTFB: 7,150 Total time: 7,195 
Connect: 0,012 TTFB: 7,172 Total time: 7,220 
Connect: 0,010 TTFB: 7,082 Total time: 7,156 
Connect: 0,019 TTFB: 10,663 Total time: 10,743 
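
For reference, this is roughly how such a mount can be set up. The export nfs-server:/export/www and the mount point /var/www are placeholders, not the values from my setup, and the helper nfs_mount_cmd is invented for this example. It only prints the command, so it is safe to run as a normal user.

```shell
# Hypothetical names: replace with your own server export and mount point.
NFS_EXPORT="nfs-server:/export/www"
MOUNT_POINT="/var/www"

# Print the mount command for a given option string, e.g. "noac".
nfs_mount_cmd() {
    printf 'mount -t nfs -o %s %s %s\n' "$1" "$NFS_EXPORT" "$MOUNT_POINT"
}

nfs_mount_cmd noac
# prints: mount -t nfs -o noac nfs-server:/export/www /var/www
```

The printed command has to be run as root, after unmounting the share if it is already mounted.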

II.2. lookupcache=none

Stolen from the NFS man page.

If the client ignores its cache and validates every application lookup request with the server, that client can immediately detect when a new directory entry has been either created or removed by another client. You can specify this behavior using lookupcache=none. The extra NFS requests needed if the client does not cache directory entries can exact a performance penalty. Disabling lookup caching should result in less of a performance penalty than using noac, and has no effect on how the NFS client caches the attributes of files.

$ for i in `seq 5`; do curl -o /dev/null -w "Connect: %{time_connect} TTFB: %{time_starttransfer} Total time: %{time_total} \n" -s http://one-of-the-website-I-host ; done
Connect: 0,011 TTFB: 3,654 Total time: 3,696 
Connect: 0,011 TTFB: 3,350 Total time: 3,392 
Connect: 0,010 TTFB: 3,535 Total time: 3,581 
Connect: 0,009 TTFB: 3,416 Total time: 3,460 
Connect: 0,009 TTFB: 3,312 Total time: 3,356 
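
To see the effect of lookup caching directly, one option is to create a file from one client and time how long another client takes to notice it. This is a minimal sketch: the function name wait_for_entry and the probe path used below are made up, and the one-second polling interval limits the precision.

```shell
# Poll until the given path exists or the timeout (seconds) is reached.
# Prints the number of seconds waited; returns 1 on timeout.
wait_for_entry() {
    path=$1
    timeout=${2:-60}
    waited=0
    while [ ! -e "$path" ]; do
        if [ "$waited" -ge "$timeout" ]; then
            echo "$waited"
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    echo "$waited"
}
```

For example, run wait_for_entry /var/www/probe on one client, then touch /var/www/probe on another. With lookupcache=none the printed delay should be close to zero.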

II.3. actimeo=3

Stolen from the NFS man page.

Using actimeo sets all of acregmin, acregmax, acdirmin, and acdirmax to the same value.

  • acregmin=n, The minimum time (in seconds) that the NFS client caches attributes of a regular file before it requests fresh attribute information from a server. If this option is not specified, the NFS client uses a 3-second minimum.
  • acregmax=n, The maximum time (in seconds) that the NFS client caches attributes of a regular file before it requests fresh attribute information from a server. If this option is not specified, the NFS client uses a 60-second maximum.
  • acdirmin=n, The minimum time (in seconds) that the NFS client caches attributes of a directory before it requests fresh attribute information from a server. If this option is not specified, the NFS client uses a 30-second minimum.
  • acdirmax=n, The maximum time (in seconds) that the NFS client caches attributes of a directory before it requests fresh attribute information from a server. If this option is not specified, the NFS client uses a 60-second maximum.
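
In other words, actimeo=3 is shorthand for setting the four options above to 3. The small helper below only spells out that equivalence; the name expand_actimeo is invented for this example.

```shell
# Expand actimeo=N into the four options it stands for.
expand_actimeo() {
    printf 'acregmin=%s,acregmax=%s,acdirmin=%s,acdirmax=%s\n' "$1" "$1" "$1" "$1"
}

expand_actimeo 3
# prints: acregmin=3,acregmax=3,acdirmin=3,acdirmax=3
```

Setting the four options separately is still useful when files and directories need different timeouts.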

$ for i in `seq 5`; do curl -o /dev/null -w "Connect: %{time_connect} TTFB: %{time_starttransfer} Total time: %{time_total} \n" -s http://one-of-the-website-I-host ; done
Connect: 0,010 TTFB: 2,592 Total time: 2,639 
Connect: 0,010 TTFB: 1,592 Total time: 1,636 
Connect: 0,010 TTFB: 1,679 Total time: 1,727 
Connect: 0,010 TTFB: 1,592 Total time: 1,656 
Connect: 0,010 TTFB: 1,695 Total time: 1,740 

II.4. actimeo=1

$ for i in `seq 5`; do curl -o /dev/null -w "Connect: %{time_connect} TTFB: %{time_starttransfer} Total time: %{time_total} \n" -s http://one-of-the-website-I-host ; done
Connect: 0,010 TTFB: 1,726 Total time: 1,769 
Connect: 0,009 TTFB: 1,739 Total time: 1,782 
Connect: 0,009 TTFB: 1,704 Total time: 1,750 
Connect: 0,009 TTFB: 3,136 Total time: 3,212 
Connect: 0,014 TTFB: 1,730 Total time: 1,770 

Use this mount option only if you have a really good reason… In my experience, the actimeo option did the job pretty well.
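
To compare the four runs at a glance, the TTFB samples can be averaged. The numbers below are copied from the results above, with commas written as dots; the helper name mean is made up for this example. Five samples per run is a very small set, so treat the result as a rough indication only.

```shell
# Print the arithmetic mean of the numbers given as arguments.
mean() {
    printf '%s\n' "$@" |
        LC_ALL=C awk '{ sum += $1 } END { if (NR) printf "%.3f\n", sum / NR }'
}

# TTFB samples in seconds, one line per mount option tested.
echo "noac:             $(mean 9.265 7.150 7.172 7.082 10.663)"
echo "lookupcache=none: $(mean 3.654 3.350 3.535 3.416 3.312)"
echo "actimeo=3:        $(mean 2.592 1.592 1.679 1.592 1.695)"
echo "actimeo=1:        $(mean 1.726 1.739 1.704 3.136 1.730)"
```

This prints 8.266, 3.453, 1.830 and 2.007 seconds respectively. The actimeo=1 average is pulled up by a single 3.136 s sample.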