Network Time Sync and DataTurbine

1. Introduction

When using the DataTurbine to stream data from different computers to clients across the internet, we need the clocks on all the computers to have the same time. While this sounds like a simple problem, it is unexpectedly difficult and has been a research topic for many years. For example, the plot from ntp.org shows clock drift of 50 milliseconds per hour on a commodity PC.

When using the PC clock for DAQ timestamps, 50 milliseconds is 50 samples at 1kHz.
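The same arithmetic applies to other drift rates and sample rates. The following is a minimal sketch in Python; the function name is hypothetical, and it assumes the drift accumulates linearly between corrections, which is a simplification.

```python
def samples_of_skew(drift_ms_per_hour: float, hours: float, sample_rate_hz: float) -> float:
    """Return how many sample periods of clock error accumulate.

    Assumes linear drift between corrections (a simplification).
    """
    drift_seconds = drift_ms_per_hour / 1000.0 * hours
    return drift_seconds * sample_rate_hz


# One hour of drift at 50 ms/hour, sampled at 1 kHz
print(samples_of_skew(50.0, 1.0, 1000.0))
```

With the values above this reproduces the 50-sample figure. A day without correction at the same rate would be 24 times worse.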

1.1. Why this matters to RBNB DataTurbine

Note that the sources, servers and sinks may all be on different machines. This means that when data is displayed, sources will appear skewed from each other, or may not display at all. Client PCs are less critical, since time is mostly handled before they see it, but it is recommended that these be corrected as well.

1.2. The basics: Time zones and network time

First off, you want to have your computer synchronize its clock with an internet time server. Assuming Windows, follow the instructions on this page to get weekly time correction. This Microsoft page has more information.

One problem seen on many occasions is that of time zones. This is particularly common on laptops that travel. Make sure that the system clock is set to the local time zone. This is a basic concept that is easy to overlook.

2. Solving it on UNIX

UNIX systems have standardized on NTP, the Network Time Protocol. Currently in version four, NTP is a sophisticated system that can handle multiple servers, all sorts of network problems, cryptographic authentication, multicast and so forth.

A good introduction to NTP can be found here.
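At the core of the protocol is a four-timestamp exchange from which the client estimates its clock offset and the round-trip delay. The sketch below shows that standard calculation in Python. The function name and the sample values are illustrative only, and the offset estimate assumes the network delay is the same in both directions.

```python
from typing import Tuple


def ntp_offset_and_delay(t0: float, t1: float, t2: float, t3: float) -> Tuple[float, float]:
    """Standard NTP offset and delay calculation, all times in seconds.

    t0: client clock when the request was sent
    t1: server clock when the request was received
    t2: server clock when the reply was sent
    t3: client clock when the reply was received
    """
    offset = ((t1 - t0) + (t2 - t3)) / 2.0  # server time minus client time
    delay = (t3 - t0) - (t2 - t1)           # round trip, excluding server processing
    return offset, delay


# Illustrative values: client 250 ms behind the server, 20 ms round trip
offset, delay = ntp_offset_and_delay(100.000, 100.260, 100.261, 100.021)
print(offset, delay)
```

A positive offset means the client clock is behind the server. Asymmetric network paths show up directly as offset error, which is one reason a server on the local network usually gives better results.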

Fortunately, the basic configuration will usually suffice. For Linux, install the appropriate NTP client package and put

server pool.ntp.org

into /etc/ntp.conf.

This will use the public NTP pool for synchronization. This is usually good enough. Check with local sysadmins, as most organizations run local NTP servers that may be used instead. For multiple time sources, use more than one entry:

server pool.ntp.org
server pool.ntp.org

Each one is randomly selected, so each entry will actually be a different server. This will usually get clocks synchronized to better than 100 milliseconds. If better accuracy is desired, the local NTP servers are usually better. If possible, sync to a ‘stratum 1’ machine. NTP has 16 strata, where 1 is definitive, e.g. time.nist.gov, and 16 is unsynchronized.

3. Solving it on Windows

Windows also uses NTP for clock synchronization. Unfortunately, it uses SNTP (Simple NTP) instead of the full protocol. (According to this page, Windows 2003 does use NTP, the rest use SNTP. Microsoft confirms this.) SNTP has a number of limitations that make it problematic:

  1. NTP adjusts the clock on the fly, so that the time changes gradually. SNTP does single corrections and simply resets the clock. For software, this means that the system clock time may suddenly change.
  2. The native time synchronization was designed for the Kerberos ‘loose sync’ specification, which only requires that PC clocks be within 20 seconds of each other. (Reference: this MS (Word format) document on their time implementation).
  3. By default, SNTP is only run once every eight days. It runs daily on an Active Directory domain.
  4. SNTP does not account for local clock inaccuracy, so between syncs the clock will drift uncorrected.
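The difference between stepping and slewing (item 1) can be made concrete with a small simulation. This is a sketch under stated assumptions: the 500 parts-per-million slew rate is the usual maximum for the reference NTP daemon, the clock is assumed to start fast (positive offset), and all function names are hypothetical.

```python
from typing import List

MAX_SLEW_PPM = 500.0  # assumption: typical ntpd maximum slew rate


def seconds_to_slew(offset_s: float, slew_ppm: float = MAX_SLEW_PPM) -> float:
    """Wall-clock time needed to absorb an offset by slewing at slew_ppm."""
    return abs(offset_s) / (slew_ppm * 1e-6)


def simulate_readings(mode: str, offset_s: float, n: int = 5, dt_s: float = 0.01) -> List[float]:
    """Clock readings taken every dt_s of true time from a clock that
    starts offset_s fast. Correction begins at the third reading.

    mode "step": the error is removed at once, as SNTP does.
    mode "slew": the error is removed gradually, as NTP does.
    """
    error = offset_s
    readings = []
    for i in range(n):
        if i >= 2:
            if mode == "step":
                error = 0.0
            elif mode == "slew":
                # Remove at most MAX_SLEW_PPM of each elapsed interval.
                error -= min(error, dt_s * MAX_SLEW_PPM * 1e-6)
        readings.append(i * dt_s + error)
    return readings


def is_monotonic(readings: List[float]) -> bool:
    """True if no reading is earlier than the one before it."""
    return all(b >= a for a, b in zip(readings, readings[1:]))


stepped = simulate_readings("step", 0.05)
slewed = simulate_readings("slew", 0.05)
print(is_monotonic(stepped), is_monotonic(slewed))
print(seconds_to_slew(0.05))
```

In the stepped case the third reading is earlier than the second, so timestamps taken from the system clock run backwards. In the slewed case the readings stay in order, at the cost of taking on the order of 100 seconds to absorb a 50 millisecond error.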

There are several ways to get better synchronization on Windows:

  1. Run SNTP more often. This can be done by a free Java program, the NetTime (open source Pascal) program or by editing the registry to use NTP.
  2. Run a full NTP client. Although it started as a Unix program, NTP has been ported to Windows. Meinberg has this page of clients, including installers and GUIs. This page also has links to a port of NTP to Win32.

Of those solutions, NEESit strongly recommends using a full NTP client. SNTP is too disruptive to endorse. NEESit have used and tested the Meinberg-linked clients. However, for client machines, SNTP is all that is usually required.

4. NTP Debugging and Monitoring

NTP can be difficult to diagnose when problems occur. Generally, it is extremely reliable, but flaky motherboards, failed cooling fans, bad router configuration and such can cause problems. A couple of places to start are the NTP query program and this NTP debugging guide.

5. Higher-Precision Synchronization

To use NTP to correlate DAQ data on multiple PCs, tighter synchronization is needed, preferably sub-millisecond. This is usually not possible unless the NTP server is on the same LAN as the client machines. There are other cases where more accurate synchronization is required (banks are a good example), but we’re focused on multiple-source data.

An accurate local clock will still be uncorrelated with the DAQ, particularly if the DAQ uses an internal clock (as is typical) or is distributed (e.g. SCRAMNet). This is a problem for real-time and accelerated-time sites such as shake tables, centrifuges and tsunami labs. If sampling at more than a few Hertz, the time spent querying the system time will introduce unacceptable overhead. In the case of fast sampling, a few partial remedies exist:

  1. You could use an external clock synchronized to the PC, though that will not be a solution for most existing systems.
  2. Query the system time at the start and end of a sampling run, and compute individual timestamps using the known delta-t.
  3. Hardware-timed DAQ with DMA buffers can be used with technique #2 – use a second thread/loop to compute timestamps on data before streaming it. For example, LabVIEW provides timestamps on DAQ data that can be used.
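Technique #2 can be sketched as follows. This is a minimal Python illustration rather than a definitive implementation: the function names are hypothetical, and it assumes the end-of-run clock reading coincides with the last sample, so that a run of n samples spans n - 1 intervals.

```python
from typing import List


def timestamps_from_start(start_s: float, n_samples: int, dt_s: float) -> List[float]:
    """Timestamp each sample from one clock reading and the known delta-t."""
    return [start_s + i * dt_s for i in range(n_samples)]


def measured_interval(start_s: float, end_s: float, n_samples: int) -> float:
    """Effective sample interval implied by the start and end clock readings.

    Assumes end_s was read at the last sample (n_samples - 1 intervals).
    """
    if n_samples < 2:
        raise ValueError("need at least two samples")
    return (end_s - start_s) / (n_samples - 1)


# Five samples at a nominal 1 kHz, starting at system time 1000.0 s
stamps = timestamps_from_start(1000.0, 5, 0.001)
print(stamps)

# Compare the nominal delta-t with what the system clock saw
print(measured_interval(1000.0, 1000.004, 5))
```

Only two system-time queries are made per run, so the per-sample overhead disappears. Comparing the measured interval with the nominal delta-t gives a rough indication of how far the DAQ clock and the PC clock disagree.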

Digital I/O is also problematic, as transactions may not be triggered at all, and are therefore difficult to timestamp with any precision.

Video synchronization is discussed below in section six.

Most importantly, understand that synchronized PC clocks can still produce skewed measurements. As with other sources of error (limited ADC resolution, analog noise, etc), time is part of the error budget, to be minimized as time and money permit. Different labs will have different priorities, e.g. jitter, skew, drift, stability, skewness, kurtosis, etc. Timing tolerances should be part of the lab and experiment documentation, so that other researchers can understand the nature and limitations of your data.

5.1. Types of clocks

There are three sources of higher-precision time available: GPS, CDMA and atomic clocks. GPS-based clocks are generally less expensive. However, they (or their antenna) have to be able to ‘see’ enough satellites for a GPS fix to obtain the time. CDMA clocks are a more recent development, and take their clock from the CDMA (usually Verizon) cell network. Free-running atomic clocks are expensive, accurate and independent of external references. Various vendors for all three are listed below.

There are also solutions based on the WWVB radio signal from Ft Collins, CO. These are generally less expensive, but have to be within about 2000 miles of Colorado to function. Radio reception is better at night, and the signal may not be accessible without an antenna. Best-case accuracy is about 100 microseconds. Inexpensive receivers are usually accurate to around 250 milliseconds. You can also get inexpensive wristwatches and clocks that use WWVB to set the time 1-5 times per day. Combined with a quartz crystal, this makes an excellent personal timekeeper.

There are many variations on these themes – LORAN-C, quartz/cesium/rubidium standards, hydrogen masers, etc.

5.2. From clock to computer

Once a clock source has been obtained, a method is required to get the time signal from the clock onto the computer. There are four common methods for connecting to the clock:

  1. Serial port. This is the worst option, as the serial overhead degrades the accuracy. (Serial over USB is not much better.)
  2. Pulse-per-second (PPS) TTL-level output. Precise, but low frequency. When used with a fast counter, microsecond-level accuracy may be obtained.
  3. Ethernet (10/100/1000). Some (more expensive) boxes have their own OS and NTP server, and communicate entirely over TCP/IP.
  4. IRIG-B. This is a digital interface between the clock and the computer. Requires a digital I/O card to read. Extremely accurate, on the order of 1-10 microseconds.

Many clocks have more than one connection method. Depending on needs and budget, one may obtain one clock per PC, or use one clock as an NTP server for the LAN.

5.3. From computer to the LAN

Once the computer has accurate time, an NTP server needs to run on it to share the clock with the LAN. NTP clients on the other machines, configured as discussed above, can then synchronize to it. UNIX is the better choice for this purpose, but Windows will also work. UNIX servers can use kernel precision timekeeping to get sub-microsecond synchronization.

New IEEE 1588 LAN standard

NTP is usually good for half-millisecond or so accuracy on a LAN. There are many applications that want higher precision and, if possible, less complexity. IEEE 1588, also known as Precision Time Protocol (PTP), originally designed by Agilent for test equipment, fills this niche. The protocol claims better than 100 ns accuracy on a LAN! It’s not yet clear how this will compete with or complement NTP, but support appears to be increasingly available.

5.4. Vendors

6. Video synchronization

As of June 2005, video feeds using DataTurbine software have very poor synchronization. The individual Axis units have NTP onboard, but video frames are timestamped when they are captured by the Java-based AxisSource. Similarly, high-res frames from JCamera are timestamped after transfer over USB. While the software can be improved, it illustrates the difficulty of accurately timestamping video. As with DAQ, high-end cameras may have TTL clock inputs for external triggering, and these must be correlated with wall-clock time as discussed in section five.

Even if you correlate the video clock to the wall clock, most sensors are scanned, meaning that time elapses while the sensor is captured. The sample time is really a range of time.
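One way to record this is to store a time range per frame rather than a single instant. The sketch below is illustrative only: the function names are hypothetical, and it assumes the sensor is read out linearly from the first row to the last over a known readout time, which may not hold for a given camera.

```python
from typing import Tuple


def frame_time_range(frame_start_s: float, readout_s: float) -> Tuple[float, float]:
    """The interval during which the frame was being captured."""
    return frame_start_s, frame_start_s + readout_s


def row_capture_time(frame_start_s: float, readout_s: float, row: int, n_rows: int) -> float:
    """Capture time of one sensor row, assuming a linear top-to-bottom scan."""
    if not 0 <= row < n_rows:
        raise ValueError("row out of range")
    return frame_start_s + readout_s * row / n_rows


# A 480-row frame starting at t = 10.0 s with a 30 ms readout
print(frame_time_range(10.0, 0.03))
print(row_capture_time(10.0, 0.03, 240, 480))
```

For this example the middle row is captured roughly 15 milliseconds after the first, which is already larger than the skew a well-synchronized LAN clock would introduce.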

As with DAQ data, it is very important that your experiments’ documentation discuss synchronization: how you addressed it, what tolerances exist, and how to interpret the data.
