Aborting ntpd when unable to control the clock
Dave Hart
2024-11-12 19:13:05 UTC
It seems obvious to me that ntpd should log an error and terminate when it
is unable to adjust the system clock. To my surprise,
https://bugs.ntp.org/1433 pointed out that when a Linux ntpd binary built
to use capabilities is run on a kernel built without capabilities support,
ntpd blithely runs without complaint while effectively doing nothing. For
this specific problem, you could blame the user and say they need to use
an ntpd built with --without-linux-caps, but there's a more general issue of
ntpd not reporting, let alone aborting on, a failure to control the clock.

To explain the context a bit, I came across bug 1433 and saw that
in 2019 the decade-old bug was fixed by having ntpd test whether
capabilities work before dropping root (they're needed to crank the clock
when not running as root on Linux). When capabilities do not work, ntpd
then ignored the request to drop root and run as a user, typically
"ntp". This meant it silently kept running as root, making any privilege
elevation or remote code execution bug more useful to an attacker despite
the user's explicit configuration, and that's unacceptable to me. My
intention is to change the behavior to error out when controlling the clock
fails (via step or slew). If you think that's a bad idea, please speak up
and explain your reasoning.
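
A minimal sketch of the startup decision described above, written in Python
for brevity rather than ntpd's C. It is not ntpd's code: the function names,
the returned strings, and the choice to treat a missing CapEff line as
"capabilities unusable" are all assumptions made for illustration. The only
outside facts relied on are that Linux reports the effective capability mask
as a hex CapEff field in /proc/<pid>/status and that CAP_SYS_TIME is bit 25.

```python
import re

CAP_SYS_TIME = 25  # bit number of CAP_SYS_TIME in <linux/capability.h>


def has_cap_sys_time(status_text):
    """True if the CapEff mask in /proc/<pid>/status text includes CAP_SYS_TIME."""
    match = re.search(r"^CapEff:\s*([0-9a-fA-F]+)\s*$", status_text, re.MULTILINE)
    if match is None:
        # Assumption: a missing CapEff line is treated as "capabilities unusable".
        return False
    return bool(int(match.group(1), 16) & (1 << CAP_SYS_TIME))


def startup_action(drop_root_requested, caps_usable):
    """Hypothetical policy: honour the drop-root request or refuse to start."""
    if not drop_root_requested:
        return "keep-root"
    if caps_usable:
        return "drop-root"
    # Do not silently stay root against the user's explicit configuration.
    raise SystemExit("cannot keep clock control after dropping root; exiting")
```

On a Linux host the probe could be fed with
has_cap_sys_time(open("/proc/self/status").read()). The point of the policy
function is that the third case exits instead of falling back to running as
root.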

Cheers,
Dave Hart
Terje Mathisen
2024-11-13 12:13:05 UTC
Post by Dave Hart
It seems obvious to me that ntpd should log an error and terminate
when it is unable to adjust the system clock.  To my surprise,
https://bugs.ntp.org/1433 pointed out that when a Linux ntpd binary
built to use capabilities is run on a kernel build without capability
capability, ntpd blithely runs without complaint while effectively
doing nothing.  For this specific problem, you could blame the user
and say they need to use ntpd built --without-linux-caps, but there's
a more general issue of ntpd not reporting let alone aborting on a
failure to control the clock.
To explain the context a bit, I came across bug 1433 somehow and saw
that in 2019 the decade-old bug was fixed by having ntpd test for
whether capabilities work before dropping root (they're needed to
crank the clock when not running as root on Linux).  When capabilities
do not work, ntpd was then ignoring the request to drop root and run
as a user, typically "ntp".  This meant it was silently opening up an
opportunity for more useful privilege elevation or remote code
execution despite the user's explicit configuration, and that's
unacceptable to me.  My intention is to change the behavior to error
out when controlling the clock fails (via step or slew).  If you think
that's a bad idea, please speak up and explain your reasoning.
Cheers,
Dave Hart
I agree, that seems like The Right Thing to do.

Terje
PS. I'm going to retire soon, so my intention is to get back into NTP
Hackers work at that point!
--
- <***@tmsw.no>
"almost all programming can be viewed as an exercise in caching"
Dave Hart
2024-11-14 02:38:00 UTC
Post by Majdi S. Abbas
Post by Dave Hart
It seems obvious to me that ntpd should log an error and terminate when it
is unable to adjust the system clock. To my surprise,
https://bugs.ntp.org/1433 pointed out that when a Linux ntpd binary built
to use capabilities is run on a kernel build without capability capability,
ntpd blithely runs without complaint while effectively doing nothing. For
this specific problem, you could blame the user and say they need to use
ntpd built --without-linux-caps, but there's a more general issue of ntpd
not reporting let alone aborting on a failure to control the clock.
Note that widely used operating systems, like Apple's OS X, run
ntpd as a monitoring service that explicitly does not/cannot discipline
the clock.
I've also heard of people explicitly running ntpd to monitor and
log statistics, without wanting it to discipline the clock.
Perhaps the cleanest way to do this is add a flag to run the
daemon without attempting to discipline the clock?
I believe that flag is already there, "disable ntp". I haven't used it
though.

Cheers,
Dave Hart
Majdi S. Abbas
2024-11-14 02:38:00 UTC
Post by Dave Hart
It seems obvious to me that ntpd should log an error and terminate when it
is unable to adjust the system clock. To my surprise,
https://bugs.ntp.org/1433 pointed out that when a Linux ntpd binary built
to use capabilities is run on a kernel build without capability capability,
ntpd blithely runs without complaint while effectively doing nothing. For
this specific problem, you could blame the user and say they need to use
ntpd built --without-linux-caps, but there's a more general issue of ntpd
not reporting let alone aborting on a failure to control the clock.
Note that widely used operating systems, like Apple's OS X, run
ntpd as a monitoring service that explicitly does not/cannot discipline
the clock.

I've also heard of people explicitly running ntpd to monitor and
log statistics, without wanting it to discipline the clock.

Perhaps the cleanest way to do this is add a flag to run the
daemon without attempting to discipline the clock?

--msa
Harlan Stenn via questions Mailing List
2024-11-14 07:13:00 UTC
Post by Dave Hart
Post by Majdi S. Abbas
Post by Dave Hart
It seems obvious to me that ntpd should log an error and terminate when it
is unable to adjust the system clock. To my surprise,
https://bugs.ntp.org/1433 pointed out that when a Linux ntpd binary built
to use capabilities is run on a kernel build without capability capability,
ntpd blithely runs without complaint while effectively doing nothing. For
this specific problem, you could blame the user and say they need to use
ntpd built --without-linux-caps, but there's a more general issue of ntpd
not reporting let alone aborting on a failure to control the clock.
Note that widely used operating systems, like Apple's OS X, run
ntpd as a monitoring service that explicitly does not/cannot discipline
the clock.
I've also heard of people explicitly running ntpd to monitor and
log statistics, without wanting it to discipline the clock.
Perhaps the cleanest way to do this is add a flag to run the
daemon without attempting to discipline the clock?
I believe that flag is already there, "disable ntp".  I haven't used it
though.
To be clear, deciding when ntpd should abort if it cannot discipline the
clock should be done at the "right" place in the code - not too early,
and not too late.
Post by Dave Hart
Cheers,
Dave Hart
--
Harlan Stenn <***@nwtime.org>
https://www.nwtime.org/ - be a member!
Dave Hart
2024-11-14 08:53:05 UTC
Post by Harlan Stenn via questions Mailing List
To be clear, deciding when ntpd should abort if it cannot discipline the
clock should be done at the "right" place in the code - not too early,
and not too late.
Well, Goldilocks, it seems obvious to me -- when an attempt to modify the
system time/clock rate fails.
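
As an illustration of aborting at the point where the adjustment itself
fails, here is a rough Python sketch. It is not ntpd code: apply_correction
and step_clock are hypothetical names, and the adjust callable stands in for
whatever step or slew call the platform provides.

```python
import logging
import sys
import time

log = logging.getLogger("ntpd-sketch")


def step_clock(offset_seconds):
    """Example adjust callable: step the realtime clock (Unix only).

    time.clock_settime raises OSError (PermissionError) without privilege.
    """
    time.clock_settime(time.CLOCK_REALTIME, time.time() + offset_seconds)


def apply_correction(adjust, offset_seconds):
    """Call adjust(offset_seconds); log an error and exit if the OS refuses."""
    try:
        adjust(offset_seconds)
    except OSError as exc:
        log.error("cannot adjust system clock (%s); exiting", exc)
        sys.exit(1)
```

Here the check sits at the moment the step or slew is attempted, not at
startup, so a monitoring-only daemon that never attempts an adjustment
would never trip it.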

Cheers,
Dave Hart
Brian Utterback via questions Mailing List
2024-11-15 14:03:05 UTC
Solaris also has a monitor mode for NTP. When that mode is invoked, we
explicitly drop the privilege needed to adjust the clock. Another
situation is if ntpd is run in a Solaris zone (similar to a Linux
container). By default these zones don't have the privilege needed to
set the clock. In a perfect world, ntpd would exit abnormally with an
error message when run in a zone without the privilege to set the clock,
and when run in monitor mode, the ntp.conf file would always contain
"disable ntp". However, I can't control what the admin puts into the
ntp.conf file. Further, it can be convenient to be able to switch
between the two modes without having to change the ntp.conf file.

What we actually do is report the error the first time it occurs and, if it
persists, stop trying to set the clock. When in monitor mode, we set an
environment variable to suppress that first report.

That being said, I am sure I can work around whatever you put into the
code, as I am doing now. Indeed, perhaps a better approach would be to
set "disable ntp" when ntpd sees the environment variable.
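
The report-once behaviour described above could be sketched roughly as
follows, in Python and under stated assumptions. This is not the Solaris
implementation: the class, the environment variable name NTPD_MONITOR_ONLY,
and the rule that "persists" means a second consecutive failure are all
hypothetical.

```python
import os

MONITOR_ENV = "NTPD_MONITOR_ONLY"  # hypothetical variable name


class ClockDiscipline:
    """Report the first clock-setting failure, then stop trying if it persists."""

    def __init__(self, adjust, report, environ=None):
        env = os.environ if environ is None else environ
        self.adjust = adjust              # callable that steps or slews the clock
        self.report = report              # callable that logs one message
        self.quiet = MONITOR_ENV in env   # monitor mode suppresses the report
        self.failures = 0
        self.disabled = False

    def correct(self, offset_seconds):
        """Try one correction; return True only if the clock was adjusted."""
        if self.disabled:
            return False
        try:
            self.adjust(offset_seconds)
        except OSError as exc:
            self.failures += 1
            if self.failures == 1:
                if not self.quiet:
                    self.report(f"cannot set clock: {exc}")
            else:
                # Assumption: a second consecutive failure counts as persisting.
                self.disabled = True
            return False
        self.failures = 0
        return True
```

The alternative floated in the post, treating the environment variable as if
"disable ntp" had been configured, would amount to setting disabled to True
in the constructor whenever the variable is present.
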
Post by Harlan Stenn via questions Mailing List
To be clear, deciding when ntpd should abort if it cannot discipline the
clock should be done at the "right" place in the code - not too early,
and not too late.
Well, Goldilocks, it seems obvious to me -- when an attempt to modify the
system time/clock rate fails.
Cheers,
Dave Hart
--
I haven't lost my mind, it is backed up in the cloud somewhere.

Oracle <http://www.oracle.com>
Brian Utterback, Principal Software Engineer
Phone: +16038973049, Mobile: +16035577683
https://oracle.zoom.us/s/2728168892
One Oracle Drive, Nashua, NH 03062