1. Client using public servers
A common configuration of chronyd is a client using public servers from the pool.ntp.org project. It is the default configuration included in many packages of chrony.
The configuration file could be:
pool pool.ntp.org iburst
driftfile /var/lib/chrony/drift
makestep 0.1 3
rtcsync
The servers used by the client are selected randomly by the pool DNS servers from the country of the client (according to IP geolocation data, which are not always accurate). The polling interval is automatically adjusted between the default minimum of 64 and maximum of 1024 seconds. As the client is running, it should slowly increase its polling interval to the maximum and reduce the load on the servers.
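The polling limits are configured as powers of two (the minpoll and maxpoll options take an exponent, with defaults of 6 and 10). A minimal sketch of the conversion, where the helper name poll_interval_seconds is only an illustration and not part of chrony:

```python
def poll_interval_seconds(poll: int) -> float:
    """Return the polling interval in seconds for a poll exponent (2**poll)."""
    return 2.0 ** poll


# Default limits used by the pool configuration above.
for name, poll in [("minpoll", 6), ("maxpoll", 10)]:
    print(f"{name} {poll}: {poll_interval_seconds(poll):g} s")
```

Negative exponents give sub-second intervals, e.g. an exponent of -4 corresponds to 1/16 of a second.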
Accuracy of the system clock is usually within a few milliseconds, but it can be significantly worse, depending on how symmetric the network routes between the servers and client are, how stable the network delay and the client’s clock are, and how accurate the servers themselves are.
The set of servers can change on each restart of the client. There can be a significant offset between different clients in a local network using the same configuration.
Example reports from chronyc on a client using the configuration:
$ chronyc -n tracking
Reference ID    : D91E4B93 (217.30.75.147)
Stratum         : 2
Ref time (UTC)  : Fri Jan 21 12:41:47 2022
System time     : 0.000483869 seconds fast of NTP time
Last offset     : +0.000763419 seconds
RMS offset      : 0.000790034 seconds
Frequency       : 0.310 ppm fast
Residual freq   : +0.215 ppm
Skew            : 1.199 ppm
Root delay      : 0.012714397 seconds
Root dispersion : 0.001104208 seconds
Update interval : 522.2 seconds
Leap status     : Normal

$ chronyc -n sources
MS Name/IP address         Stratum Poll Reach LastRx Last sample
===============================================================================
^+ 82.113.53.41                  3  10   377    92  -1044us[-1044us] +/-   12ms
^* 217.30.75.147                 1   9   377   347   +442us[+1206us] +/- 6357us
^- 89.221.218.101                2  10   377   683   +749us[+1232us] +/-   36ms
^+ 81.25.28.124                  1  10   377    57    -68us[  -68us] +/- 7681us

$ chronyc -n sourcestats
Name/IP Address            NP  NR  Span  Frequency  Freq Skew  Offset  Std Dev
==============================================================================
82.113.53.41                6   4   77m     +0.143      0.806   -829us   408us
217.30.75.147               7   7   51m     +0.422      0.966   +520us   334us
89.221.218.101              6   3   68m     +0.048      2.032   -323us   896us
81.25.28.124                6   4   77m     +0.226      1.850   +598us   764us
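For an IPv4 source, the Reference ID in the tracking report is the address written as eight hexadecimal digits. A small sketch of the conversion, with refid_to_ipv4 being a name chosen here for illustration:

```python
import ipaddress


def refid_to_ipv4(refid_hex: str) -> str:
    """Interpret a 32-bit reference ID (hex string) as a dotted IPv4 address."""
    return str(ipaddress.IPv4Address(int(refid_hex, 16)))


print(refid_to_ipv4("D91E4B93"))  # 217.30.75.147
```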
The offset from the client’s tracking log (the 7th column in the file) is shown in the graph below as dots. That is the error of the system clock that chronyd estimates from its measurements and corrects. The red line is the actual error of the clock measured by a separate monitoring instance of chronyd, which used a GPS reference clock and did not adjust the system clock.
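The offsets can be pulled out of the tracking log with a few lines of code. The sketch below assumes whitespace-separated columns with the date and time counted as the first two, so that the offset is the 7th field; the sample lines are made up for illustration and are not taken from the test.

```python
def tracking_offsets(lines):
    """Return the offsets (7th column, in seconds) from tracking log lines."""
    offsets = []
    for line in lines:
        fields = line.split()
        # Skip separators, banner lines and anything that is not a data row.
        if len(fields) < 7 or not fields[0][:1].isdigit():
            continue
        offsets.append(float(fields[6]))
    return offsets


# Hypothetical log excerpt (illustrative values only).
sample_log = [
    "=" * 60,
    "   Date (UTC) Time     IP Address   St   Freq ppm   Skew ppm     Offset",
    "2022-01-21 12:33:05 217.30.75.147    2      0.095      1.310 -3.120e-04 N  4",
    "2022-01-21 12:41:47 217.30.75.147    2      0.310      1.199  7.634e-04 N  4",
]
print(tracking_offsets(sample_log))
```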
In this test the clock was off by about 2-3 milliseconds most of the time, but there were some large excursions in the error, one reaching about 8 milliseconds. They could be reduced by limiting the maximum polling interval (e.g. to 64 seconds), but that would increase the load on the public servers.
2. Client using local server and software timestamping
A local server with its own reference clock (e.g. a GPS receiver) is needed if better accuracy is required on clients. The clients should be configured with a shorter polling interval and have the interleaved mode enabled if it is supported on the server (e.g. if it runs chronyd). Multiple servers should be used for reliability.
In this example the client uses a single server with a polling interval of 4 seconds and software timestamping (the default).
server 192.168.123.1 iburst minpoll 2 maxpoll 2 xleave
driftfile /var/lib/chrony/drift
makestep 0.1 3
rtcsync
Reports from chronyc:
$ chronyc -n tracking
Reference ID    : C0A87B01 (192.168.123.1)
Stratum         : 2
Ref time (UTC)  : Wed Jan 19 16:59:22 2022
System time     : 0.000000075 seconds slow of NTP time
Last offset     : -0.000000146 seconds
RMS offset      : 0.000000970 seconds
Frequency       : 24.067 ppm fast
Residual freq   : -0.004 ppm
Skew            : 0.029 ppm
Root delay      : 0.000061748 seconds
Root dispersion : 0.000024459 seconds
Update interval : 4.0 seconds
Leap status     : Normal

$ chronyc -n sources
MS Name/IP address         Stratum Poll Reach LastRx Last sample
===============================================================================
^* 192.168.123.1                 1   2   377     8  +1002ns[ +864ns] +/-   47us

$ chronyc -n sourcestats
Name/IP Address            NP  NR  Span  Frequency  Freq Skew  Offset  Std Dev
==============================================================================
192.168.123.1               9   6    32     -0.004      0.045    -14ns   223ns
There were three network switches between the server and client in the test. The network was loaded with 20-second data transfers from the client to the server every minute.
The following graph shows the client’s tracking offset and the actual error of the system clock measured with a local reference clock with a sub-microsecond accuracy.
The clock was stable to about 2 microseconds. There was a constant error of about 5 microseconds, which was caused mainly by asymmetric errors in software timestamps used by the client. The server used hardware timestamps with insignificant errors when compared to the client. If the server used software timestamping and had the same hardware, OS, and configuration as the client, the asymmetry could cancel out, but the synchronisation would be less stable.
3. Client using local server and hardware timestamping
For best accuracy it is necessary to use a NIC which supports hardware timestamping. In this example the client and server both have the Intel I210 card. They also both run chrony version 4.2, which supports the experimental extension field F323 improving synchronisation stability. The client is configured to send 16 requests per second and filter 5 measurements at a time, making about 3 updates of the clock per second. The polling of the hardware clock matches the minimum polling interval of the NTP source.
server 192.168.123.1 minpoll -4 maxpoll -4 xleave extfield F323 filter 5
hwtimestamp * minpoll -4
driftfile /var/lib/chrony/drift
makestep 0.1 3
rtcsync
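The request and update rates follow directly from the poll exponent and the filter length. A short worked sketch of that arithmetic, where update_rate is a helper name introduced only for this example:

```python
def update_rate(poll: int, filter_len: int):
    """Return (requests per second, clock updates per second).

    Assumes one request per polling interval of 2**poll seconds and one
    clock update per filter_len filtered measurements.
    """
    requests_per_second = 2.0 ** -poll
    updates_per_second = requests_per_second / filter_len
    return requests_per_second, updates_per_second


requests, updates = update_rate(poll=-4, filter_len=5)
print(requests, updates, 1 / updates)
```

With minpoll -4 and filter 5 this gives 16 requests and 3.2 updates per second, i.e. one update roughly every 0.3 seconds.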
The Intel I210 has timestamping errors compensated in the Linux igb driver (it is not necessary to compensate them with the rxcomp and txcomp options in the hwtimestamp directive). For better stability, Energy-Efficient Ethernet (EEE) was disabled in the network and the CPU on both server and client was set to a constant frequency.
Reports from chronyc:
$ chronyc -n tracking
Reference ID    : C0A87B01 (192.168.123.1)
Stratum         : 2
Ref time (UTC)  : Wed Jan 19 14:12:20 2022
System time     : 0.000000010 seconds fast of NTP time
Last offset     : -0.000000003 seconds
RMS offset      : 0.000000010 seconds
Frequency       : 24.096 ppm fast
Residual freq   : +0.000 ppm
Skew            : 0.004 ppm
Root delay      : 0.000015813 seconds
Root dispersion : 0.000003070 seconds
Update interval : 0.3 seconds
Leap status     : Normal

$ chronyc -n sources
MS Name/IP address         Stratum Poll Reach LastRx Last sample
===============================================================================
^* 192.168.123.1                 1  -4   377     1    +33ns[  +30ns] +/-   11us

$ chronyc -n sourcestats
Name/IP Address            NP  NR  Span  Frequency  Freq Skew  Offset  Std Dev
==============================================================================
192.168.123.1              37  13    12     +0.000      0.004     +0ns    24ns
There were three network switches between the server and client in the test. The network was loaded with 20-second data transfers from the client to the server every minute. Network load typically has only a small impact on accuracy of hardware timestamping, but it can cause an NTP packet to be queued in a switch and cause a large error in the NTP measurement due to asymmetric delay. As long as this does not happen for too many measurements in a row, the client should be able to ignore the impacted measurements and keep the clock well synchronised. Some switches can be configured to prioritise NTP packets (by port number or DSCP) to limit the queueing delays.
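The effect of a queued packet can be illustrated with the standard NTP offset and delay calculation. The sketch below is not chronyd code; the delays (in nanoseconds) are invented for the example, and the offset is taken as server time minus client time.

```python
def ntp_offset_delay(t1, t2, t3, t4):
    """Standard NTP calculation from the four timestamps of one exchange.

    t1: client transmit, t2: server receive, t3: server transmit,
    t4: client receive.
    """
    offset = ((t2 - t1) + (t3 - t4)) / 2
    delay = (t4 - t1) - (t3 - t2)
    return offset, delay


def simulate_exchange(true_offset, forward_delay, return_delay, processing=0):
    """Build the four timestamps for given one-way delays and measure them."""
    t1 = 0
    t2 = t1 + forward_delay + true_offset   # read from the server clock
    t3 = t2 + processing                    # read from the server clock
    t4 = t3 - true_offset + return_delay    # read from the client clock
    return ntp_offset_delay(t1, t2, t3, t4)


# Synchronised clocks, 10 us in each direction: no measurement error.
print(simulate_exchange(0, 10_000, 10_000))    # (0.0, 20000)
# Request queued in a switch for an extra 100 us: half of it appears as offset.
print(simulate_exchange(0, 110_000, 10_000))   # (50000.0, 120000)
```

The measured delay grows by the full queueing time, which is what makes it possible to recognise and discard such measurements.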
The NTP measurements and the clock were stable to a few tens of nanoseconds. Measuring accuracy of the system clock at this level is difficult. The main problem is communication over the PCIe bus between the system clock (CPU) and the NIC, which can have an asymmetric latency causing errors in the readings of the hardware clock up to a few hundred nanoseconds.
The following graph shows the client’s tracking offset and an error of the clock measured with a PPS signal (shared with the server) connected to the NIC.
The asymmetry of about 70 nanoseconds is caused by the network switches. It is common for switches to have a different forwarding delay from port A to port B than from port B to port A and different asymmetries on different pairs of ports.
Other asymmetries in this test should cancel out due to the server and client using the same model of the NIC for timestamping of NTP packets and timestamping of the shared PPS signal (connected with cables of equal length). If the error due to PCIe latency was not larger than 100 nanoseconds, the system clock would be accurate to about 250 nanoseconds relative to the reference clock of the server.
4. Server using reference clock on serial port
One of the easier ways to make a stratum-1 NTP server is to connect a GPS receiver to a serial port of the computer. The receiver needs to provide a pulse per second (PPS) signal to enable accuracy at the microsecond level. It is usually connected to the DCD pin of the port. The gpsd daemon can combine the serial data with PPS and provide a SHM or SOCK reference clock for chronyd.
The following example uses the SOCK refclock:
refclock SOCK /var/run/chrony.ttyS0.sock
makestep 0.1 3
allow
rtcsync
driftfile /var/lib/chrony/drift
leapsectz right/UTC
gpsd needs to be started after chronyd in order to connect to the socket and it needs to be started with the -n option to not wait for clients to connect before polling the receiver. For example:
# gpsd -n /dev/ttyS0
Reports from chronyc:
$ chronyc -n tracking
Reference ID    : 47505300 (GPS)
Stratum         : 1
Ref time (UTC)  : Mon Jan 24 12:42:11 2022
System time     : 0.000000043 seconds fast of NTP time
Last offset     : +0.000000046 seconds
RMS offset      : 0.000000489 seconds
Frequency       : 2.331 ppm fast
Residual freq   : +0.000 ppm
Skew            : 0.010 ppm
Root delay      : 0.000000001 seconds
Root dispersion : 0.000007050 seconds
Update interval : 4.0 seconds
Leap status     : Normal

$ chronyc -n sources
MS Name/IP address         Stratum Poll Reach LastRx Last sample
===============================================================================
#* GPS                           0   2   377     6   +262ns[ +307ns] +/- 1246ns

$ chronyc -n sourcestats
Name/IP Address            NP  NR  Span  Frequency  Freq Skew  Offset  Std Dev
==============================================================================
GPS                        18  11    69     +0.000      0.010     +1ns   209ns
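For a reference clock, the Reference ID shown by the tracking report is the refid name stored as up to four ASCII characters. A small sketch of decoding it, with refid_to_name being an illustrative helper name:

```python
def refid_to_name(refid_hex: str) -> str:
    """Decode a reference ID (hex string) as zero-padded ASCII text."""
    raw = bytes.fromhex(refid_hex)
    return raw.rstrip(b"\x00").decode("ascii")


print(refid_to_name("47505300"))  # GPS
```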
The following graph shows the tracking offset and the error of the system clock measured with a more accurate reference clock (PPS signal connected to a hardware clock on the NIC).
The clock was off by about 20 microseconds most of the time. Most of this error is caused by hardware and software delays in timestamping of the interrupt triggered by the serial port. The main issue is stability of the delay. There are periods where it is significantly shorter, which causes the offset to jump by about 12 microseconds. The CPU was set to a constant frequency in this test. The jumps were probably caused by changes in the CPU load or changes in timing of some processes, which prevented it from entering a power-saving state before the interrupts and avoided the delay of waking up.
Disabling power-saving states (e.g. with the Linux kernel idle=poll option) would make the delay more stable, but it would increase the power consumption.
The server did not use hardware timestamping, which means a similar issue with interrupts impacted its software timestamps. The delay is sensitive to CPU load and also network load, as NICs implement interrupt coalescing in order to reduce the interrupt rate. The following graph shows an example of errors in software receive and transmit timestamps.
On some NICs the coalescing can be limited or disabled with the ethtool -C command (on Linux) to improve the timestamping stability.
5. Server using reference clock on NIC
The best way to make a highly accurate stratum-1 NTP server is to connect the PPS signal to a software defined pin (SDP) on the NIC which is receiving requests and sending responses to NTP clients. This allows the PPS signal to be timestamped in hardware, avoiding the PCIe and interrupt delays, with the same clock as is timestamping NTP packets, which cancels out any asymmetry between the system clock and hardware clock in the server’s timestamps of NTP packets.
In this example the server has the Intel I210 card, which has a 6-pin header on the board exposing two SDPs (3.3V level) with the following layout:
+-------------+
| GND  | SDP0 |
+-------------+
| GND  | SDP1 |
+-------------+
|  ?   |  ?   |
+-------------+
A 16Hz PPS signal from a u-blox NEO-6M GPS receiver is connected to SDP0. The receiver is connected also to a USB port for the serial data to be processed by gpsd to provide the SHM 0 refclock needed for PPS locking. The timing stability of the received messages limits the maximum rate of PPS. At 16 Hz, the SHM 0 refclock needs to be accurate to 25 milliseconds in order for the PHC refclock to correctly and reliably lock to it.
The following command (executed when gpsd is running) configures the receiver to make 16 pulses per second with 50% duty cycle and compensate a 20ns antenna cable delay:
$ ubxtool -p CFG-TP5,0,20,0,1,16,0,2147483648,0,111
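The numbers in the command can be related to the description with a little arithmetic. The sketch below assumes the argument order follows the u-blox 6 CFG-TP5 message layout and that bit 4 of the flags (isLength) selects between a pulse length and a duty-cycle ratio expressed in units of 2^-32; both the field names and the decode_tp5 helper are assumptions to be checked against the receiver documentation.

```python
# Assumed field order of the CFG-TP5 arguments (u-blox 6 layout).
FIELDS = (
    "tpIdx", "antCableDelay", "rfGroupDelay", "freqPeriod", "freqPeriodLock",
    "pulseLenRatio", "pulseLenRatioLock", "userConfigDelay", "flags",
)
IS_LENGTH = 0x10  # assumed flag bit: value is a pulse length, not a ratio


def decode_tp5(args: str) -> dict:
    """Split the comma-separated arguments and derive the locked duty cycle."""
    cfg = dict(zip(FIELDS, (int(v) for v in args.split(","))))
    if cfg["flags"] & IS_LENGTH:
        cfg["duty_cycle_locked"] = None  # length mode, no ratio to report
    else:
        cfg["duty_cycle_locked"] = cfg["pulseLenRatioLock"] / 2**32
    return cfg


cfg = decode_tp5("0,20,0,1,16,0,2147483648,0,111")
print(cfg["freqPeriodLock"], "Hz when locked,",
      cfg["antCableDelay"], "ns cable delay,",
      cfg["duty_cycle_locked"], "duty cycle")
```

Under these assumptions the value 2147483648 is 2^31, i.e. half of 2^32, which corresponds to the 50% duty cycle.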
To improve stability of reading of the hardware clock, the CPU is set to a constant frequency with disabled boosting:
# cpupower frequency-set -g userspace -d 3600000 -u 3600000
# echo 0 > /sys/devices/system/cpu/cpufreq/boost
The server has the following configuration:
refclock PHC /dev/ptp0:extpps:pin=0 dpoll -4 poll -2 rate 16 width 0.03125 refid GPS lock NMEA maxlockage 32
refclock SHM 0 refid NMEA noselect offset 0.120 poll 6 delay 0.010
hwtimestamp * minpoll -4
makestep 0.1 3
allow
rtcsync
driftfile /var/lib/chrony/drift
leapsectz right/UTC
The extpps option enables external PPS timestamping on the PHC. The pin=0 setting selects the SDP0 pin. The dpoll option configures the driver to poll 16 times per second and with the poll option it provides a median measurement 4 times per second. The rate option specifies the 16Hz PPS rate. The width option is needed to filter falling edges in the PPS signal as the hardware clock timestamps both edges. It specifies 50% of the 16Hz PPS interval, matching the receiver PPS configuration. The maxlockage option is needed to enable locking of the PPS to the SHM refclock providing only one sample per second.
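The values of the dpoll, poll, rate and width options are related by simple arithmetic. A worked sketch, where refclock_timing is a helper introduced only for this illustration:

```python
def refclock_timing(dpoll: int, poll: int, rate: int, duty_cycle: float = 0.5):
    """Derive the rates and pulse width implied by the refclock options."""
    return {
        "driver_polls_per_second": 2.0 ** -dpoll,
        "measurements_per_second": 2.0 ** -poll,
        "pulse_interval": 1.0 / rate,          # seconds between rising edges
        "width": duty_cycle / rate,            # seconds the pulse stays high
    }


timing = refclock_timing(dpoll=-4, poll=-2, rate=16)
for name, value in timing.items():
    print(f"{name}: {value}")
```

For dpoll -4, poll -2 and rate 16 this gives 16 polls and 4 measurements per second, a pulse interval of 0.0625 s and a width of 0.03125 s.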
The offset option of the SHM 0 refclock compensates for the delay of messages received on the USB port. It needs to be measured carefully, e.g. against a known good NTP server. A wrong offset could cause the server to be off by an integer multiple of 62.5 milliseconds (1/16s).
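Why the error comes in steps of 62.5 milliseconds can be seen in a simplified model of the locking, which assumes that the pulse nearest to the coarse time from the SHM refclock is the one selected. This is an illustration of the idea only, not the algorithm implemented in chronyd.

```python
PULSE_INTERVAL_US = 62_500  # 1/16 s, the interval of the 16 Hz PPS signal


def locked_pulse_error_us(coarse_error_us: int) -> int:
    """Error of the PPS-derived time for a given error of the coarse time.

    Simplified model: the nearest pulse is chosen, so the result is off by a
    whole number of pulse intervals.
    """
    pulses_off = round(coarse_error_us / PULSE_INTERVAL_US)
    return pulses_off * PULSE_INTERVAL_US


for error_ms in (10, 25, 40, -100):
    result_ms = locked_pulse_error_us(error_ms * 1000) / 1000
    print(f"coarse error {error_ms} ms -> server error {result_ms} ms")
```

In this model a coarse error below half of the pulse interval has no effect, while a larger one shifts the server by one or more whole intervals.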
The hardware timestamping errors are already compensated in the kernel igb driver for the I210.
Reports from chronyc:
$ chronyc -n tracking
Reference ID    : 47505300 (GPS)
Stratum         : 1
Ref time (UTC)  : Mon Jan 24 15:38:25 2022
System time     : 0.000000008 seconds slow of NTP time
Last offset     : +0.000000000 seconds
RMS offset      : 0.000000004 seconds
Frequency       : 0.696 ppm slow
Residual freq   : -0.000 ppm
Skew            : 0.015 ppm
Root delay      : 0.000000001 seconds
Root dispersion : 0.000002471 seconds
Update interval : 0.3 seconds
Leap status     : Normal

$ chronyc -n sources
MS Name/IP address         Stratum Poll Reach LastRx Last sample
===============================================================================
#* GPS                           0  -2   377     1     -1ns[   -1ns] +/- 1308ns
#? NMEA                          0   6   377    46  -8397us[-8392us] +/- 5176us

$ chronyc -n sourcestats
Name/IP Address            NP  NR  Span  Frequency  Freq Skew  Offset  Std Dev
==============================================================================
GPS                         9   5     2     -0.000      0.014     -0ns     5ns
NMEA                        8   5   446     -0.027      9.862  -8848us   627us
The following graph shows the tracking offset:
It shows that chronyd can track the reference clock to about 20 nanoseconds. A better reference clock would be needed to measure the accuracy and stability. In this case they are probably limited by the GPS receiver - it is a cheap non-timing-grade model without a stabilised oscillator.