Effect of Interrupt coalescence on NTP timing

These are scatter plots of ntp packet round trip time versus offset for a client talking to a server with an Intel Gigabit ethernet card [Intel Corporation 82541PI Gigabit Ethernet Controller (rev 05)]. The Gigabit card started off running at 100 Mbps with rx-usecs: 3, which corresponds to interrupt coalescence and to a specific dynamic heuristic for packet receipt running on the card. It was then changed to run at 1000 Mbps, still with rx-usecs: 3. Finally it was changed to 1000 Mbps with rx-usecs: 0 (no interrupt coalescence). The client is running on a 100 Mbps card [Intel Corporation 82562EZ 10/100 Ethernet Controller (rev 01)], but other clients running other cards, including Gbps cards, behave in the same way. That is, the clients' cards cannot be told apart by the behaviour of ntp on them.
In all cases the ntp program used was chrony, both 1.24 and 1.26 (http://chrony.tuxfamily.org/), an implementation of ntp version 3. Its main difference from ntpd (www.ntp.org), the reference implementation of ntp, is the use of a least squares linear fit over the past N offset measurements (where N is adjusted depending on the goodness of that linear fit) instead of a simple Markovian feedback algorithm. This is largely irrelevant to the issue discussed here, although in many situations chrony tends to produce a much smaller offset scatter than ntpd does (by factors of 2-3).
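As a rough illustration of what a least squares linear fit over past offset measurements means, here is a minimal Python sketch. It is not chrony's code: the function name and the sample values are hypothetical, N is fixed by the length of the input, and the adjustment of N according to the goodness of fit is left out.

```python
def fit_offset_drift(times, offsets):
    """Ordinary least squares line through (time, offset) samples.

    Returns (offset_at_last_sample, drift), where drift is the slope of
    the fitted line in seconds of offset per second of elapsed time.
    """
    n = len(times)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(times) / n
    mean_o = sum(offsets) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (o - mean_o) for t, o in zip(times, offsets))
    slope = sxy / sxx
    intercept = mean_o - slope * mean_t
    # Evaluate the fitted line at the most recent sample time.
    return intercept + slope * times[-1], slope


# Hypothetical samples taken 16 s apart: a 10 us offset drifting at 1 ppm.
times = [0.0, 16.0, 32.0, 48.0, 64.0]
offsets = [10e-6 + 1e-6 * t for t in times]
current_offset, drift = fit_offset_drift(times, offsets)
```

The fit yields both an estimate of the present offset and of the rate at which the clock is drifting, so a single noisy measurement has less influence than it would if it were used on its own.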

This first graph shows the round trip time from the client to the server as a function of time in days (the calendar days of June 2012 in UTC). The client was at poll interval 4 throughout. Note the decrease in the round trip time at each change to the settings of the server's Gbps ethernet card.

A scatter plot of the above data from the initial time to t=10.8. This was with the Gbps card set at 100 Mbps and with rx-usecs: 3. Note the strong correlation between the round trip time and the offset. This indicates that the round trip changes are maximally asymmetric, with the extra delay occurring on the trip from the client to the server.
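Why a one-sided delay shows up as a correlation between round trip time and offset can be seen from the standard NTP on-wire calculation. The Python sketch below uses hypothetical path delays (100 us each way, plus 50 us of extra delay on one leg) and a helper function named here only for illustration. The sign of the offset shift depends on the sign convention used for the offset, so only the magnitude (half the extra delay) should be read from it.

```python
def ntp_offset_and_delay(t1, t2, t3, t4):
    """Standard NTP on-wire calculation.

    t1: client transmit time (client clock)
    t2: server receive time  (server clock)
    t3: server transmit time (server clock)
    t4: client receive time  (client clock)
    """
    offset = ((t2 - t1) + (t3 - t4)) / 2.0
    delay = (t4 - t1) - (t3 - t2)
    return offset, delay


def exchange(out_delay, back_delay):
    """One exchange between perfectly synchronised clocks (hypothetical)."""
    t1 = 0.0
    t2 = t1 + out_delay   # the server timestamps the request on receipt
    t3 = t2               # assume the server replies immediately
    t4 = t3 + back_delay
    return ntp_offset_and_delay(t1, t2, t3, t4)


base = exchange(100e-6, 100e-6)        # symmetric path
out_late = exchange(150e-6, 100e-6)    # extra 50 us, client to server leg
back_late = exchange(100e-6, 150e-6)   # extra 50 us, server to client leg

for name, (offset, delay) in [("symmetric", base),
                              ("client->server late", out_late),
                              ("server->client late", back_late)]:
    print(f"{name:20s} delay = {delay * 1e6:6.1f} us  offset = {offset * 1e6:+6.1f} us")
```

In both delayed cases the round trip grows by the full extra delay while the apparent offset moves by half of it, in opposite directions for the two legs. A delay that always falls on the same leg therefore puts the points on a line of slope 1/2 in magnitude, whereas delays split evenly between the legs would leave the offset unchanged.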

The scatter plot of the above data from t=11.0 to 11.7. The Gbps card was running at 1000 Mbps but with rx-usecs: 3. There is still a strong correlation between the round trip time and the offset, suggesting once again that the round trip time is asymmetric, with the delay occurring in the client to server leg.

The scatter plot to the server when the interrupt coalescence has been turned off (rx-usecs: 0). Note the absence of correlation between offset and round trip time. The offset standard deviation is about 5us. This is close to the standard deviation of the server offsets with respect to the gps pps source (1.8usec). This is an amazing degree of accuracy over an ethernet between server and client.
Note that a client whose signals go through three switches on the way to the server, instead of just one, and which is located on the opposite side of the building, also has an offset standard deviation of about 5us.
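For readers who want to put numbers on "correlation" and "standard deviation" for their own measurements, a minimal Python sketch follows. The two small data sets are made up for illustration and are not the measurements plotted here; the function names are likewise hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy)


def sample_std(xs):
    """Sample standard deviation (n - 1 in the denominator)."""
    n = len(xs)
    mean = sum(xs) / n
    return math.sqrt(sum((x - mean) ** 2 for x in xs) / (n - 1))


# Made-up values in microseconds, for illustration only.
# Case 1: every extra microsecond of round trip moves the offset by half
# a microsecond, as when all the extra delay sits on one leg.
rtt_coalesced = [200.0, 220.0, 240.0, 260.0]
offset_coalesced = [0.5 * (rtt - 200.0) for rtt in rtt_coalesced]

# Case 2: the offset scatters independently of the round trip time.
rtt_plain = [200.0, 210.0, 220.0, 230.0]
offset_plain = [5.0, -5.0, -5.0, 5.0]

r_coalesced = pearson_r(rtt_coalesced, offset_coalesced)
r_plain = pearson_r(rtt_plain, offset_plain)
std_plain = sample_std(offset_plain)
```

A coefficient near 1 in magnitude corresponds to the tilted streak seen with rx-usecs: 3, and a coefficient near 0 to the round blob seen with rx-usecs: 0.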

The offset of the server from the GPS PPS signal as a function of time. There is no change in the server's behaviour with respect to the GPS over the course of the experiment.


For comments on Interrupt coalescence on Gb NIC cards by Rick Jones, see here