Discussion:
pool server death and unexpected resurrection
Roger
2024-11-20 10:34:50 UTC
ntp-4.2.8p18 using the pool and coasting along at poll 11.

These entries appeared in protostats yesterday (2024-11-19)
(there are no other entries between them):

07:43:13 178.238.156.140 0013 83 unreachable
11:14:34 178.238.156.140 0014 84 reachable

I had assumed that ntpd would mobilize a few servers and choose
one to replace the unreachable server. Why assume this? If the
server had been removed from the pool then sending packets
forever would be wrong. However, there were no new mobilization
attempts; the server came back with the same association number.

In this instance it was an "internet malfunction"; see the graphs
at the link below.

https://www.ntppool.org/a/markcpowell

Was my expectation wrong?

Did Dave Hart's ntp-dev-3792-msm-v2 contain such code which
didn't yet get into the released code?
--
Roger
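For scale, the gap between the two protostats entries in the post above can be worked out directly. This is a minimal sketch, assuming both timestamps fall on the same day (2024-11-19, as stated) and that the association was polling at poll 11, i.e. 2**11 = 2048 seconds; the function and constant names are illustrative only.

```python
from datetime import datetime

POLL_EXPONENT = 11  # "coasting along at poll 11", per the post


def outage_seconds(unreachable: str, reachable: str) -> int:
    """Seconds between two same-day HH:MM:SS protostats timestamps."""
    fmt = "%H:%M:%S"
    delta = datetime.strptime(reachable, fmt) - datetime.strptime(unreachable, fmt)
    return int(delta.total_seconds())


gap = outage_seconds("07:43:13", "11:14:34")
poll_interval = 2 ** POLL_EXPONENT  # 2048 s
print(gap, round(gap / poll_interval, 2))  # 12681 6.19
```

So roughly three and a half hours passed between the "unreachable" and "reachable" events, about six poll intervals at poll 11.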
"Marco Davids (SIDN)" via questions Mailing List
2024-11-20 15:48:00 UTC
Hi,
Post by Roger
I had assumed that ntpd would mobilize a few servers and choose
one to replace the unreachable server.
How did you configure the NTP pool in your ntp.conf?

With the 'server'-directive perhaps?

I wouldn't recommend that.

The 'pool'-directive is the right way to go, but unfortunately the
documentation of the pool was never updated for this:

https://community.ntppool.org/t/fyi-removing-server-from-the-pool/2424
https://www.ntppool.org/en/use.html
https://community.ntppool.org/t/should-the-how-to-use-pools-page-be-updated-with-an-option-that-uses-pool-instead-of-server/2518

--
Marco Davids
Research Engineer

SIDN | Meander 501 | 6825 MD | Postbus 5022 | 6802 EA | ARNHEM
T +31 (0)26 352 55 00 | www.sidnlabs.nl | Twitter: @marcodavids
https://mastodon.social/@marcodavids | Matrix: @marco:sidnlabs.nl
Nostr: 11ed01ff277d94705c2931867b8d900d8bacce6f27aaf7440ce98bb50e02fb34
Roger
2024-11-20 17:11:46 UTC
On Wed, 20 Nov 2024 15:48:00 -0000 (UTC), "\"Marco Davids
Post by "Marco Davids (SIDN)" via questions Mailing List
Hi,
Post by Roger
I had assumed that ntpd would mobilize a few servers and choose
one to replace the unreachable server.
How did you configure the NTP pool in your ntp.conf?
With the 'server'-directive perhaps?
No, I am using "pool 0.pool.ntp.org poll 11" (and 1. 2. 3. as
well). This is why I thought the non-responding server would
be replaced. If I had used "server 178.238.156.140" then I would
expect ntpd to keep trying to get an answer.
--
Roger
b***@systematicsw.ab.ca
2024-11-20 19:53:00 UTC
Post by Roger
On Wed, 20 Nov 2024 15:48:00 -0000 (UTC), "\"Marco Davids
Post by "Marco Davids (SIDN)" via questions Mailing List
Post by Roger
I had assumed that ntpd would mobilize a few servers and choose
one to replace the unreachable server.
How did you configure the NTP pool in your ntp.conf?
With the 'server'-directive perhaps?
No, I am using "pool 0.pool.ntp.org poll 11" (and 1. 2. 3. as
well). This is why I thought the non-responding server would
be replaced. If I had used "server 178.238.156.140" then I would
expect ntpd to keep trying to get an answer.
Maybe add "iburst preempt" options and drop "poll 11" or perhaps change to
"maxpoll 11" or higher, unless you have very good reasons to require a longer
interval than the default maximum, instead of adaptive polling based on the error.
--
Take care. Thanks, Brian Inglis Calgary, Alberta, Canada

La perfection est atteinte Perfection is achieved
non pas lorsqu'il n'y a plus rien à ajouter not when there is no more to add
mais lorsqu'il n'y a plus rien à retirer but when there is no more to cut
-- Antoine de Saint-Exupéry
Roger
2024-11-20 21:32:08 UTC
On Wed, 20 Nov 2024 19:53:00 -0000 (UTC),
Post by b***@systematicsw.ab.ca
Post by Roger
On Wed, 20 Nov 2024 15:48:00 -0000 (UTC), "\"Marco Davids
Post by "Marco Davids (SIDN)" via questions Mailing List
Post by Roger
I had assumed that ntpd would mobilize a few servers and choose
one to replace the unreachable server.
How did you configure the NTP pool in your ntp.conf?
With the 'server'-directive perhaps?
No, I am using "pool 0.pool.ntp.org poll 11" (and 1. 2. 3. as
well). This is why I thought the non-responding server would
be replaced. If I had used "server 178.238.156.140" then I would
expect ntpd to keep trying to get an answer.
Maybe add "iburst preempt" options and drop "poll 11" or perhaps change to
"maxpoll 11" or higher, unless you have very good reasons to require a longer
interval than the default maximum, instead of adaptive polling based on the error.
Well, the documentation (confopt) tells me that the pool command
"mobilizes a preemptable pool client mode association for the
DNS name specified." Why would adding "preempt" change anything?

Although I have "pool ... poll 11" the poll does shorten
sometimes, going down to poll 6 if necessary. It seems to be
when the temperature (whether ambient or due to processor load)
changes too quickly.

My question is why would a preemptable server, acquired using
"pool ...", continue to be polled after it has stopped
responding, i.e., the reach has gone to 0? Is it a
misunderstanding on my part or is there a bug in the code?
--
Roger
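For readers unfamiliar with the term, "reach" is the 8-bit reachability shift register that ntpq prints in octal: it is shifted left by one at each poll, and the low bit is set when a valid reply arrives. The following is a simplified sketch of that bookkeeping, not ntpd's actual code (ntpd shifts at transmit time and sets the bit on receipt; here both steps are folded into one hypothetical helper).

```python
def update_reach(reach: int, got_reply: bool) -> int:
    """Shift the 8-bit reach register by one poll and record the outcome."""
    reach = (reach << 1) & 0xFF   # the oldest poll result falls off the top
    if got_reply:
        reach |= 1                # the newest poll result goes in the low bit
    return reach


reach = 0o377                     # eight consecutive replies
for _ in range(8):                # eight consecutive missed polls
    reach = update_reach(reach, got_reply=False)
print(oct(reach))                 # 0o0
```

A reach of 0 therefore means that none of the last eight polls was answered.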
Harlan Stenn via questions Mailing List
2024-11-22 11:13:00 UTC
Post by Roger
On Fri, 22 Nov 2024 05:58:00 -0000 (UTC), "Dave Hart via
Post by Dave Hart via questions Mailing List
As an aside, using "preempt" on a non-pool non-manycastclient association
(basically, configured via "server" or "peer") seems quixotic to me, as it
allows the association to be removed but nothing is done to replace it. I
have a difficult time imagining where that might be useful.
Something in the ntp.conf man page I can't get my head around is
why one would have "pool ... prefer". If one were using only one
pool line it would, presumably, result in all servers being
preferred.
Note that the BUGS section of the ntp.conf man page says:

The syntax checking is not picky; some combinations of
ridiculous and even hilarious options and modes may not be
detected.

In the above case, I'd recommend looking at the code and perhaps seeing
if the description of the "prefer" options could be improved.
--
Harlan Stenn <***@nwtime.org>
http://networktimefoundation.org - be a member!
Roger
2024-11-22 12:36:49 UTC
On Fri, 22 Nov 2024 11:13:00 -0000 (UTC), "Harlan Stenn via
Post by Harlan Stenn via questions Mailing List
Post by Roger
On Fri, 22 Nov 2024 05:58:00 -0000 (UTC), "Dave Hart via
Post by Dave Hart via questions Mailing List
As an aside, using "preempt" on a non-pool non-manycastclient association
(basically, configured via "server" or "peer") seems quixotic to me, as it
allows the association to be removed but nothing is done to replace it. I
have a difficult time imagining where that might be useful.
Something in the ntp.conf man page I can't get my head around is
why one would have "pool ... prefer". If one were using only one
pool line it would, presumably, result in all servers being
preferred.
The syntax checking is not picky; some combinations of
ridiculous and even hilarious options and modes may not be
detected.
I'd forgotten that. I've reached the age where I sometimes feel
as though I've forgotten more than I ever knew.
Post by Harlan Stenn via questions Mailing List
In the above case, I'd recommend looking at the code and perhaps seeing
if the description of the "prefer" options could be improved.
That might be beyond my capabilities. The pool options given in
the ntp.conf man page are:

pool address [burst] [iburst] [version version] [prefer]
[minpoll minpoll] [maxpoll maxpoll] [xmtnonce]

Removing "[prefer]" would be a good idea.

The explanation of "prefer" seems okay; the words "this host
will be chosen" imply that only one server should be marked
thus.
--
Roger
Dave Hart
2025-01-01 11:23:00 UTC
On Fri, Nov 22, 2024 at 7:52 PM Brian Inglis <
Post by Dave Hart via questions Mailing List
As an aside, using "preempt" on a non-pool non-manycastclient association
(basically, configured via "server" or "peer") seems quixotic to me, as it
allows the association to be removed but nothing is done to replace it. I
have a difficult time imagining where that might be useful.
[...]
Post by Brian Inglis
It can be useful when you have an adequate number of (some local) backup servers
or (former) pools, but some (local) have an annoying habit of going unreachable,
but not being noticed, and support not being responsive to hints for weeks, e.g.
...
server ...
...
server ntp2.cpsc.ucalgary.ca iburst preempt # U Calgary T2N AB CA
server ntp1.yycix.ca iburst preempt # YYCIX, Calgary T2P AB CA
server ntp2.switch.ca iburst preempt # TELUS, Edmonton T6H AB CA
...
server ...
...
tos minsane 3 minclock 5 maxclock 7
I'm not sure I follow what you're trying to do. I'm still stuck on the
idea that you can't count on any of the preempt-marked servers being around
to meet your minsane and minclock requirements as they can be removed
automatically if they're unreachable for a bit but then will never be
restored. So you're down to your "server ..." lines, and those don't
inform me much.
Post by Dave Hart via questions Mailing List
If you're wondering why I mentioned "manycastclient", it shares much of the
implementation with "pool". They use different approaches to finding
servers, but the rest of the code is common. Both are intended to be
automatic server discovery schemes that discard, or preempt, servers which
haven't been useful for 10 poll intervals so that another server can be
solicited to replace it.
Post by Brian Inglis
I noticed those comments.
Post by Brian Inglis
$ grep 'pool.*preempt' ~/src/time/ntp/ntp-4.2.8p18/ntpd/complete.conf.in
pool 2.ubuntu.pool.ntp.org. iburst preempt
Post by Dave Hart via questions Mailing List
complete.conf.in is part of the "make check" tests and is not intended to
suggest useful configurations. Rather it's used both to ensure every
keyword in the configuration file parser is covered, and to ensure a
configuration can successfully round-trip through ntpd's reading and
applying the configuration and exporting the configuration via the
--saveconfigquit command-line option added specifically for that developer
test to catch changes which break that functionality. It's no coincidence it
is ordered exactly the same as the output of ntpq's saveconfig command,
which requires authentication and that a directory for such saved
configuration files has been specified in ntp.conf with "saveconfigdir".
Post by Brian Inglis
Implies that pool will round trip with iburst preempt?
complete.conf.in is part of build-time testing of the configuration file
reading and writing code. When I said round-trip, I mean that as part of
the "make" ntpd just after compiling is invoked to read complete.conf and
then write it out using --saveconfigquit and make sure there's no change
except comments. It has nothing to do with recommended configuration of
ntpd.

Post by Brian Inglis
I go by the ntp.conf.def files and date-time stamps, as the html and other
docs should be Autogen-erated from these masters?
It's a mess. Traditionally the documentation came from two sources, first
the static html documentation in the html source directory, and secondly
the man pages, html equivalents, and option processing code that comes from
Autogen .def files. That documentation was both installed as part of "make
install" as well as posted to the udel.edu NTP pages of Dr. Mills. After
he was no longer maintaining those pages, Harlan Stenn initiated a change
to maintain all the documentation on the web only, derived using Hugo. The
documentation in the source releases was left to bit-rot, essentially, as
most work went into maintaining the shiny new object. I've never been
particularly productive maintaining documentation to match code changes,
and I have no interest in doing all that work twice to get changes done in
both web-only and source-only divergent forms. There's been noise about
somehow resolving this forking of the documentation and the need to
maintain it in two places, but no progress I'm aware of.

Cheers,
Dave Hart

Harlan Stenn via questions Mailing List
2024-11-22 07:23:00 UTC
Post by Roger
On Thu, 21 Nov 2024 17:03:00 -0000 (UTC), "Brian Inglis"
HUGE snip
Post by Brian Inglis
Or a doc bug?
Thank you. That's interesting. I was looking at confopt.html
contained within the ntp-4.2.8p18 source tree. I see that its
file date is 2020-03-03 whereas the man page has a file date of
2024-05-25. I shall now add preempt to my pool lines.
The dates can be misleading.

The website is now generated via Hugo, so (as best I understand) Dru
takes the man pages, converts them to Hugo, and that's what's on the
website.

So if the date of a page on the website is more recent than the date on
the man page, that doesn't mean that the content is newer, it just means
it was (at least) formatted more recently.

The intent and expectation is that if a change is made to the Hugo
version of the docs, that change is applied "upstream" as well.

The goal is to get the master documentation formatted for Hugo, and at
that point we'll be generating all of the documentation output targets
from the same (single) source documents, and the dates should then all
match.
--
Harlan Stenn <***@nwtime.org>
http://networktimefoundation.org - be a member!
Roger
2024-11-22 08:51:41 UTC
On Fri, 22 Nov 2024 07:23:00 -0000 (UTC), "Harlan Stenn via
Post by Harlan Stenn via questions Mailing List
Post by Roger
On Thu, 21 Nov 2024 17:03:00 -0000 (UTC), "Brian Inglis"
HUGE snip
Post by Brian Inglis
Or a doc bug?
Thank you. That's interesting. I was looking at confopt.html
contained within the ntp-4.2.8p18 source tree. I see that its
file date is 2020-03-03 whereas the man page has a file date of
2024-05-25. I shall now add preempt to my pool lines.
The dates can be misleading.
The website is now generated via Hugo, so (as best I understand) Dru
takes the man pages, converts them to Hugo, and that's what's on the
website.
So if the date of a page on the website is more recent than the date on
the man page, that doesn't mean that the content is newer, it just means
it was (at least) formatted more recently.
The intent and expectation is that if a change is made to the Hugo
version of the docs, that change is applied "upstream" as well.
The goal is to get the master documentation formatted for Hugo, and at
that point we'll be generating all of the documentation output targets
from the same (single) source documents, and the dates should then all
match.
Thank you for the explanation. Unfortunately that hasn't reached
the 4.2.8p18 source tar and so "pool" is preemptable or
persistent according to where one looks. I'm sure I've read
somewhere that writing documentation is everyone's least
favourite pastime.
--
Roger
Brian Inglis
2024-11-21 17:03:00 UTC
Post by Roger
Post by b***@systematicsw.ab.ca
Post by Roger
On Wed, 20 Nov 2024 15:48:00 -0000 (UTC), "\"Marco Davids
Post by "Marco Davids (SIDN)" via questions Mailing List
Post by Roger
I had assumed that ntpd would mobilize a few servers and choose
one to replace the unreachable server.
How did you configure the NTP pool in your ntp.conf?
With the 'server'-directive perhaps?
No, I am using "pool 0.pool.ntp.org poll 11" (and 1. 2. 3. as
well). This is why I thought the non-responding server would
be replaced. If I had used "server 178.238.156.140" then I would
expect ntpd to keep trying to get an answer.
Maybe add "iburst preempt" options and drop "poll 11" or perhaps change to
"maxpoll 11" or higher, unless you have very good reasons to require a longer
interval than the default maximum, instead of adaptive polling based on the error.
Well, the documentation (confopt) tells me that the pool command
"mobilizes a preemptable pool client mode association for the
DNS name specified." Why would adding "preempt" change anything?
It *may* be required and can never hurt:

$ grep 'pool.*preempt' ~/src/time/ntp/ntp-4.2.8p18/ntpd/complete.conf.in
pool 2.ubuntu.pool.ntp.org. iburst preempt

$ man 5 ntp.conf
...
Configuration Commands
...
*pool* For type s addresses, this command mobilizes a persistent
client mode association with a number of remote servers. In
this mode the local clock can synchronized to the remote server,
but the remote server can never be synchronized to the local
clock.
...
Options:
...
*preempt* Says the association can be preempted.
...
This manual page was AutoGen‐erated from the ntp.conf option definitions.

4.2.8p18 25 May 2024 ntp.conf(5man)

although the older:

https://www.ntp.org/documentation/4.2.8-series/confopt/#server-commands

says:

"Server Commands and Options
Last update: March 23, 2023 21:05 UTC (6ad51a76f)
...
Server Commands
...
pool
For type s addresses (only) this command mobilizes a preemptable pool client
mode association for the DNS name specified. "
...
Server Command Options
...
preempt
Specifies the association as preemptable rather than the default persistent.
This option is ignored with the broadcast command and is most useful with the
manycastclient and pool commands."
Post by Roger
Although I have "pool ... poll 11" the poll does shorten
sometimes, going down to poll 6 if necessary. It seems to be
when the temperature (whether ambient or due to processor load)
changes too quickly.
My question is why would a preemptable server, acquired using
"pool ...", continue to be polled after it has stopped
responding, i.e., the reach has gone to 0? Is it a
misunderstanding on my part or is there a bug in the code?
Or a doc bug?
--
Take care. Thanks, Brian Inglis Calgary, Alberta, Canada

La perfection est atteinte Perfection is achieved
non pas lorsqu'il n'y a plus rien à ajouter not when there is no more to add
mais lorsqu'il n'y a plus rien à retirer but when there is no more to cut
-- Antoine de Saint-Exupéry
Roger
2024-11-21 20:28:10 UTC
On Thu, 21 Nov 2024 17:03:00 -0000 (UTC), "Brian Inglis"
<***@SystematicSW.ab.ca> wrote:

HUGE snip
Post by Brian Inglis
Or a doc bug?
Thank you. That's interesting. I was looking at confopt.html
contained within the ntp-4.2.8p18 source tree. I see that its
file date is 2020-03-03 whereas the man page has a file date of
2024-05-25. I shall now add preempt to my pool lines.
--
Roger
Dave Hart
2024-11-22 05:13:05 UTC
Post by Roger
On Wed, 20 Nov 2024 15:48:00 -0000 (UTC), "\"Marco Davids
Post by "Marco Davids (SIDN)" via questions Mailing List
How did you configure the NTP pool in your ntp.conf?
With the 'server'-directive perhaps?
No, I am using "pool 0.pool.ntp.org poll 11" (and 1. 2. 3. as
well). [...]
Are you sure you didn't mean "maxpoll 11"? My reading of the code suggests
the line you provided would be rejected as a syntax error by ntpd.

Cheers,
Dave Hart
Roger
2024-11-22 08:51:13 UTC
On Fri, 22 Nov 2024 05:13:05 -0000 (UTC), "Dave Hart"
Post by Dave Hart
Post by Roger
On Wed, 20 Nov 2024 15:48:00 -0000 (UTC), "\"Marco Davids
Post by "Marco Davids (SIDN)" via questions Mailing List
How did you configure the NTP pool in your ntp.conf?
With the 'server'-directive perhaps?
No, I am using "pool 0.pool.ntp.org poll 11" (and 1. 2. 3. as
well). [...]
Are you sure you didn't mean "maxpoll 11"? My reading of the code suggests
the line you provided would be rejected as a syntax error by ntpd.
You are correct. ntp.conf does have "maxpoll 11". I was
concentrating on not muddling my "pool"s and "poll"s and
missed out the "max".
--
Roger
Dave Hart via questions Mailing List
2024-11-22 05:58:00 UTC
On Thu, Nov 21, 2024 at 4:56 PM Brian Inglis <
Post by b***@systematicsw.ab.ca
Maybe add "iburst preempt" options and drop "poll 11" or perhaps change to
"maxpoll 11" or higher, unless you have very good reasons to require a longer
interval than the default maximum, instead of adaptive polling based on the error.
Post by Roger
Well, the documentation (confopt) tells me that the pool command
"mobilizes a preemptable pool client mode association for the
DNS name specified." Why would adding "preempt" change anything?
In fact it won't change anything. The only options propagated from the
"pool" directive in ntp.conf (and thereby set on the prototype pool
association listed with refid POOL in the peers billboard) to the resulting
pool server associations are "iburst" and "noselect". See POOL_FLAG_PMASK
in source code file ntp_proto.c.

The preemptible option is forced on for pool servers, so they are
preemptible with or without that option. However, that option doesn't do
much in 4.2.8 as the code intended to preempt useless servers has an
off-by-one error that's corrected in my test 3792 release, so preemption
only happens in the unusual case where there are more than 2 times as many
pool or manycast client associations as "tos maxclock" which defaults to
10. Arguably this could be fixed in the stable 4.2.8 branch but it would
be a substantial change in behavior without any configuration change that
might break existing setups that depend on the off-by-one error.

As an aside, using "preempt" on a non-pool non-manycastclient association
(basically, configured via "server" or "peer") seems quixotic to me, as it
allows the association to be removed but nothing is done to replace it. I
have a difficult time imagining where that might be useful. It may have
been useful in the pre-2009 implementation of "pool" which I'm having a
hard time remembering because I thought it was primitive and needed
improvement, as it did all its work at startup and never changed the
servers selected once up and running. I re-implemented it to the current
iteration, but didn't catch that the preemption was suffering the
aforementioned off-by-one error, or it wasn't back then.

If you're wondering why I mentioned "manycastclient", it shares much of the
implementation with "pool". They use different approaches to finding
servers, but the rest of the code is common. Both are intended to be
automatic server discovery schemes that discard, or preempt, servers which
haven't been useful for 10 poll intervals so that another server can be
solicited to replace it.
Post by b***@systematicsw.ab.ca
$ grep 'pool.*preempt' ~/src/time/ntp/ntp-4.2.8p18/ntpd/complete.conf.in
pool 2.ubuntu.pool.ntp.org. iburst preempt
complete.conf.in is part of the "make check" tests and is not intended to
suggest useful configurations. Rather it's used both to ensure every
keyword in the configuration file parser is covered, and to ensure a
configuration can successfully round-trip through ntpd's reading and
applying the configuration and exporting the configuration via the
--saveconfigquit command-line option added specifically for that developer
test to catch changes which break that functionality. It's no coincidence it
is ordered exactly the same as the output of ntpq's saveconfig command,
which requires authentication and that a directory for such saved
configuration files has been specified in ntp.conf with "saveconfigdir".
Post by b***@systematicsw.ab.ca
$ man 5 ntp.conf
...
Configuration Commands
...
*pool* For type s addresses, this command mobilizes a persistent
client mode association with a number of remote servers. In
this mode the local clock can synchronized to the remote server,
but the remote server can never be synchronized to the local
clock.
...
...
*preempt* Says the association can be preempted.
...
This manual page was AutoGen‐erated from the ntp.conf option definitions.
4.2.8p18 25 May 2024 ntp.conf(5man)
https://www.ntp.org/documentation/4.2.8-series/confopt/#server-commands
"Server Commands and Options
Last update: March 23, 2023 21:05 UTC (6ad51a76f)
...
Server Commands
...
pool
For type s addresses (only) this command mobilizes a preemptable pool client
mode association for the DNS name specified. "
...
Server Command Options
...
preempt
Specifies the association as preemptable rather than the default persistent.
This option is ignored with the broadcast command and is most useful with the
manycastclient and pool commands."
Despite the timestamps you quoted, the web version is likely newer.
Autogen is run against the documentation source files with every release,
so that timestamp reflects the release date, not the last update of the
documentation source files (.html in this case).

Since the overhaul of the www.ntp.org website a few years back, that
documentation sadly is maintained in two places, and there's no process to
ensure they stay in sync. The web version is considered the more
authoritative source, and is maintained in .md (Markdown) published only
via the converted HTML on the website. It started as a copy of the
documentation from the source tarballs' /html directory, but after
conversion to Markdown and subsequent improvements, those changes have
generally not been made to the HTML version distributed with the source.
I'm partly to blame because I find writing documentation tedious enough
without having to update it in two places, and I've been kept quite busy
with coding work and haven't wanted to take the time to correct
documentation that no longer reflects the reality of the code. In theory
one day I will have time to dedicate to that, but I welcome anyone who
enjoys documentation work or at least really wants accurate NTP
documentation to please volunteer to help out.
Post by Roger
My question is why would a preemptable server, acquired using
"pool ...", continue to be polled after it has stopped
responding, i.e., the reach has gone to 0? Is it a
misunderstanding on my part or is there a bug in the code?
Post by b***@systematicsw.ab.ca
Or a doc bug?
A doc bug and an off-by-one bug in the preemption logic.

Cheers,
Dave Hart
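The 4.2.8 behaviour described in the post above can be restated as a one-line condition. This is only a paraphrase of that description, not the logic in ntp_proto.c; the function name and the example association counts are hypothetical.

```python
DEFAULT_MAXCLOCK = 10  # "tos maxclock" default, per the post


def preemption_can_happen(pool_assocs: int, maxclock: int = DEFAULT_MAXCLOCK) -> bool:
    """4.2.8 as described: only with more than 2 * maxclock pool/manycast associations."""
    return pool_assocs > 2 * maxclock


for n in (8, 20, 21):
    print(n, preemption_can_happen(n))
# 8 False
# 20 False
# 21 True
```

Under that reading, a client with a handful of pool associations and the default maxclock never reaches the threshold, which would be consistent with an unreachable pool server keeping its association.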
Roger
2024-11-22 08:52:04 UTC
On Fri, 22 Nov 2024 05:58:00 -0000 (UTC), "Dave Hart via
Post by Dave Hart via questions Mailing List
As an aside, using "preempt" on a non-pool non-manycastclient association
(basically, configured via "server" or "peer") seems quixotic to me, as it
allows the association to be removed but nothing is done to replace it. I
have a difficult time imagining where that might be useful.
Something in the ntp.conf man page I can't get my head around is
why one would have "pool ... prefer". If one were using only one
pool line it would, presumably, result in all servers being
preferred.
--
Roger
James Browning
2024-11-22 14:33:05 UTC
...
Post by Dave Hart via questions Mailing List
The preemptible option is forced on for pool servers, so they are
preemptible with or without that option. However, that option doesn't do
much in 4.2.8 as the code intended to preempt useless servers has an
off-by-one error that's corrected in my test 3792 release, so preemption
only happens in the unusual case where there are more than 2 times as many
pool or manycast client associations as "tos maxclock" which defaults to
10. Arguably this could be fixed in the stable 4.2.8 branch but it would
be a substantial change in behavior without any configuration change that
might break existing setups that depend on the off-by-one error.
I suppose the possibility of preempt causing ntp to kick out the pool entry
itself is unthinkable.

Post by Dave Hart via questions Mailing List
As an aside, using "preempt" on a non-pool non-manycastclient association
(basically, configured via "server" or "peer") seems quixotic to me, as
it allows the association to be removed but nothing is done to replace it.
I have a difficult time imagining where that might be useful.
...

I have a couple of toys that set preempt for a development branch of that
hostile fork we don't mention here. One queries _ntp._udp.local to add
MDNS-enabled servers. The other digs against a public domain for SRV
records to add NTS-KE servers. I thought they were the best ways to solve
those particular tasks at the time.
Brian Inglis
2024-11-22 19:58:05 UTC
Post by b***@systematicsw.ab.ca
Maybe add "iburst preempt" options and drop "poll 11" or perhaps change to
"maxpoll 11" or higher, unless you have very good reasons to require a longer
interval than the default maximum, instead of adaptive polling based on the error.
Post by Roger
Well, the documentation (confopt) tells me that the pool command
"mobilizes a preemptable pool client mode association for the
DNS name specified." Why would adding "preempt" change anything?
Post by Dave Hart via questions Mailing List
In fact it won't change anything. The only options propagated from the "pool"
directive in ntp.conf (and thereby set on the prototype pool association listed
with refid POOL in the peers billboard) to the resulting pool server associations
are "iburst" and "noselect". See POOL_FLAG_PMASK in source code file ntp_proto.c.
I saw that and for mcast clients in protos, and it is documented.

But it looks as if the option is set on when flagged in ntp_peer.c.
Post by Dave Hart via questions Mailing List
The preemptible option is forced on for pool servers, so they are preemptible
with or without that option.  However, that option doesn't do much in 4.2.8 as
the code intended to preempt useless servers has an off-by-one error that's
corrected in my test 3792 release, so preemption only happens in the unusual
case where there are more than 2 times as many pool or manycast client
associations as "tos maxclock" which defaults to 10.  Arguably this could be
fixed in the stable 4.2.8 branch but it would be a substantial change in
behavior without any configuration change that might break existing setups that
depend on the off-by-one error.
I am not seeing anywhere else that preempt is set on for any peers?
Post by Dave Hart via questions Mailing List
As an aside, using "preempt" on a non-pool non-manycastclient association
(basically, configured via "server" or "peer") seems quixotic to me, as it
allows the association to be removed but nothing is done to replace it.  I have
a difficult time imagining where that might be useful.  It may have been useful
in the pre-2009 implementation of "pool" which I'm having a hard time
remembering because I thought it was primitive and needed improvement, as it did
all its work at startup and never changed the servers selected once up and
running.  I re-implemented it to the current iteration, but didn't catch that
the preemption was suffering the aforementioned off-by-one error, or it wasn't
back then.
It can be useful when you have an adequate number of (some local) backup servers
or (former) pools, but some (local) have an annoying habit of going unreachable,
but not being noticed, and support not being responsive to hints for weeks, e.g.

...
server ...
...
server ntp2.cpsc.ucalgary.ca iburst preempt # U Calgary T2N AB CA
server ntp1.yycix.ca iburst preempt # YYCIX, Calgary T2P AB CA
server ntp2.switch.ca iburst preempt # TELUS, Edmonton T6H AB CA
...
server ...
...
tos minsane 3 minclock 5 maxclock 7
Post by b***@systematicsw.ab.ca
If you're wondering why I mentioned "manycastclient", it shares much of the
implementation with "pool".  They use different approaches to finding servers,
but the rest of the code is common.  Both are intended to be automatic server
discovery schemes that discard, or preempt, servers which haven't been useful
for 10 poll intervals so that another server can be solicited to replace it.
I noticed those comments.
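To make the "10 poll intervals" rule concrete, here is a minimal sketch of how reachability and the preemption condition relate. It assumes the usual 8-bit reach shift register that ntpq displays in octal, and it simplifies "haven't been useful" to "has not answered". The class, its attribute names and the address are hypothetical and do not come from ntpd.

```python
REACH_BITS_MASK = 0xFF   # ntpd keeps an 8-bit reachability shift register
PREEMPT_AFTER = 10       # "10 poll intervals", per the quoted post


class Association:
    """Toy model of one preemptible association (not ntpd's struct peer)."""

    def __init__(self, address, preemptible=True):
        self.address = address
        self.preemptible = preemptible
        self.reach = 0        # shift register of recent poll results
        self.unanswered = 0   # consecutive polls with no reply

    def poll(self, got_reply):
        # Shift in one bit per poll: 1 for a reply, 0 for silence.
        self.reach = (self.reach << 1) & REACH_BITS_MASK
        if got_reply:
            self.reach |= 1
            self.unanswered = 0
        else:
            self.unanswered += 1

    @property
    def reachable(self):
        return self.reach != 0

    def should_preempt(self):
        # Intended rule as described in the thread: drop the server after
        # 10 silent poll intervals so a replacement can be solicited.
        return self.preemptible and self.unanswered >= PREEMPT_AFTER


if __name__ == "__main__":
    peer = Association("192.0.2.1")  # documentation address, hypothetical
    peer.poll(True)
    for _ in range(10):
        peer.poll(False)
        print(oct(peer.reach), peer.reachable, peer.should_preempt())
```

In this model the server is reported unreachable after 8 silent polls (reach drops to 0) and only becomes a preemption candidate 2 polls later. At poll 11 (2048 s) that is a wait of several hours either way.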
Post by b***@systematicsw.ab.ca
$ grep 'pool.*preempt' ~/src/time/ntp/ntp-4.2.8p18/ntpd/complete.conf.in
pool 2.ubuntu.pool.ntp.org. iburst preempt
complete.conf.in is part of the "make check" tests and is not intended to
suggest useful configurations.  Rather it's used both to ensure every keyword in
the configuration file parser is covered, and to ensure a configuration can
successfully round-trip through ntpd's reading and applying the configuration
and exporting the configuration via the --saveconfigquit command-line option
added specifically for that developer test to catch changes which break that
functionality.  It's no coincidence it is ordered exactly the same as the output
of ntpq's saveconfig command, which requires authentication and that a directory
for such saved configuration files has been specified in ntp.conf with
"saveconfigdir".
Implies that pool will round trip with iburst preempt?
Post by b***@systematicsw.ab.ca
$ man 5 ntp.conf
...
Configuration Commands
...
*pool*  For type s addresses, this command mobilizes a persistent
        client mode association with a number of remote servers. In
        this mode the local clock can synchronized to the remote server,
        but the remote server can never be synchronized to the local
        clock.
...
...
*preempt*       Says the association can be preempted.
...
This manual page was AutoGen‐erated from the ntp.conf option definitions.
4.2.8p18        25 May 2024     ntp.conf(5man)
https://www.ntp.org/documentation/4.2.8-series/confopt/#server-commands
"Server Commands and Options
Last update: March 23, 2023 21:05 UTC (6ad51a76f)
...
Server Commands
...
pool
For type s addresses (only) this command mobilizes a preemptable pool client
mode association for the DNS name specified. "
...
Server Command Options
...
preempt
Specifies the association as preemptable rather than the default persistent.
This option is ignored with the broadcast command and is most useful with the
manycastclient and pool commands."
Despite the timestamps you quoted, the web version is likely newer.  Autogen is
run against the documentation source files with every release, so that timestamp
reflects the release date, not the last update of the documentation source files
(.html in this case).
I go by the ntp.conf.def files and date-time stamps, as the html and other docs
should be Autogen-erated from these masters?
Post by b***@systematicsw.ab.ca
Since the overhaul of the www.ntp.org website a few years
back, that documentation sadly is maintained in two places, and there's no
process to ensure they stay in sync.  The web version is considered the more
authoritative source, and is maintained in .md (Markdown) published only via the
converted HTML on the website.  It started as a copy of the documentation from
the source tarballs' /html directory, but after conversion to Markdown and
subsequent improvements, those changes have generally not been made to the HTML
version distributed with the source.  I'm partly to blame because I find writing
documentation tedious enough without having to update it in two places, and I've
been kept quite busy with coding work and haven't wanted to take the time to
correct documentation that no longer reflects the reality of the code.  In
theory one day I will have time to dedicate to that, but I welcome anyone who
enjoys documentation work or at least really wants accurate NTP documentation to
please volunteer to help out.
FYI mandoc 1.14.6 (2021) will generate markdown from mdoc or man formats!

I see the sources are in https://git.nwtime.org/websites/ntpwww and require a Go
package Hugo to generate.
Surprised you don't use the option to convert the ancient GIFs to webp and save
space and time, especially on mobiles.
Post by b***@systematicsw.ab.ca
Post by Roger
My question is why would a preemptable server, acquired using
"pool ...", continue to be polled after it has stopped
responding, i.e., the reach has gone to 0? Is it a
misunderstanding on my part or is there a bug in the code?
Or a doc bug?
A doc bug and an off-by-one bug in the preemption logic.
--
Take care. Thanks, Brian Inglis Calgary, Alberta, Canada

La perfection est atteinte Perfection is achieved
non pas lorsqu'il n'y a plus rien à ajouter not when there is no more to add
mais lorsqu'il n'y a plus rien à retirer but when there is no more to cut
-- Antoine de Saint-Exupéry
Dave Hart
2024-11-22 05:08:00 UTC
Post by Roger
I had assumed that ntpd would mobilize a few servers and choose
one to replace the unreachable server. Why assume this? If the
server had been removed from the pool then sending packets
forever would be wrong. However, there were no new mobilization
attempts, the server came back with the same association number.
In this instance it was an "internet malfunction", see graphs on
link below.
https://www.ntppool.org/a/markcpowell
Was my expectation wrong?
Sadly, yes.
Post by Roger
Did Dave Hart's ntp-dev-3792-msm-v2 contain such code which
didn't yet get into the released code?
Yes, that code has not been released yet, and as it's based on the source
code from about two years ago, it's going to be a bit painful to merge into
the current code. The 3792 test release code contains logic to gradually
refine the pool servers by removing one at a time when certain conditions
are met, and not responding for 10 poll intervals is one of those
conditions. It may also contain code I worked on around the same time to
re-resolve the DNS name of non-pool servers when they haven't responded in
a while, to allow NTP server operators to change the DNS and (eventually,
once the code is widespread) see clients move over to the new IP
address(es). This would also help with folks who have followed the
instructions at https://www.ntppool.org/en/use.html which suggest using
"server" not "pool" with the *.pool.ntp.org DNS names, as they would also
then move from no-longer-responsive pool servers to a different one.
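The re-resolution idea can be sketched roughly as follows. This only illustrates the behaviour described above and is not code from the 3792 test release: the function names, the threshold parameter and the injectable resolver are all assumptions made for the example.

```python
import socket


def resolve(hostname, port=123, getaddrinfo=socket.getaddrinfo):
    """Return the distinct addresses DNS currently lists, in answer order."""
    seen = []
    for info in getaddrinfo(hostname, port, type=socket.SOCK_DGRAM):
        address = info[4][0]  # sockaddr is the 5th field; host comes first
        if address not in seen:
            seen.append(address)
    return seen


def next_address(hostname, current, unanswered, threshold,
                 getaddrinfo=socket.getaddrinfo):
    """Pick the address to poll next for a "server" style association.

    Hypothetical policy: keep the current address while the server answers;
    once it has missed `threshold` polls in a row, look the name up again
    and move to any address other than the silent one.
    """
    if unanswered < threshold:
        return current
    try:
        candidates = resolve(hostname, getaddrinfo=getaddrinfo)
    except OSError:
        return current  # lookup failed; keep trying the old address
    for address in candidates:
        if address != current:
            return address
    return current  # DNS still lists only the address we already have
```

The resolver is passed in as a parameter so that the policy can be exercised without network access. A real implementation would also have to decide how often to repeat the lookup.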

I haven't attempted to integrate either one because I consider the changes
too disruptive for a stable point release, and since I wrote that code,
there have been only stable point releases. I have been begging NTP
release management to restart ntp-dev testing releases for over two years
now, so I could integrate the code and it could get more widespread testing
of these changes before it hopefully makes it into a stable release in a
reasonable timeframe.

For the past two-plus years I've been told that ntp-dev would soon be revived
from its 2019 dormancy, so I'm a bit jaded about the prospects, despite hearing
it really is very close to happening now. In
the meantime, I've been kept more than busy fixing bugs and adding less
disruptive improvements to the ntp-stable releases.

I'll have another response shortly to a different post in this thread.
Cheers,
Dave Hart
Roger
2024-11-22 08:51:55 UTC
On Fri, 22 Nov 2024 05:08:00 -0000 (UTC), "Dave Hart"
Post by Dave Hart
Post by Roger
I had assumed that ntpd would mobilize a few servers and choose
one to replace the unreachable server. Why assume this? If the
server had been removed from the pool then sending packets
forever would be wrong. However, there were no new mobilization
attempts, the server came back with the same association number.
In this instance it was an "internet malfunction", see graphs on
link below.
https://www.ntppool.org/a/markcpowell
Was my expectation wrong?
Sadly, yes.
Post by Roger
Did Dave Hart's ntp-dev-3792-msm-v2 contain such code which
didn't yet get into the released code?
Yes, that code has not been released yet, and as it's based on the source
code from about two years ago, it's going to be a bit painful to merge into
the current code. The 3792 test release code contains logic to gradually
refine the pool servers by removing one at a time when certain conditions
are met, and not responding for 10 poll intervals is one of those
conditions.
Thank you. Hopefully, you'll be successful in getting it
integrated and accepted sooner rather than later.
--
Roger