Biscuit Ninja
2017-04-21 16:33:07 UTC
We run an APC Smart-UPS with a network management card at the office.
PCNS is installed on the Windows clients and APCUPSD on the Linux boxen.
We've had an infuriating issue for a while now where the Linux boxen
occasionally bleat about losing connectivity to the UPS. A colleague
upgraded the UPS firmware yesterday in an effort to resolve the issue,
and of course that has backfired and made the problem much worse.
Here's an excerpt from the apcupsd.events log:
2017-04-21 10:53:05 +0100 Communications with UPS lost.
2017-04-21 10:53:08 +0100 Communications with UPS restored.
2017-04-21 11:10:30 +0100 Communications with UPS lost.
2017-04-21 11:10:39 +0100 Communications with UPS restored.
2017-04-21 12:26:12 +0100 Communications with UPS lost.
2017-04-21 12:26:25 +0100 Communications with UPS restored.
2017-04-21 12:55:25 +0100 Communications with UPS lost.
2017-04-21 12:55:36 +0100 Communications with UPS restored.
2017-04-21 13:11:10 +0100 Communications with UPS lost.
2017-04-21 13:11:27 +0100 Communications with UPS restored.
2017-04-21 16:11:21 +0100 Communications with UPS lost.
2017-04-21 16:11:25 +0100 Communications with UPS restored.
We are running APCUPSD 3.14.10 under Ubuntu 12.04 and 3.14.12 under
Debian Jessie; the behaviour is the same on both distributions.
I've done some digging into APCUPSD with the pcnet driver. It relies on
receiving UDP broadcast MASTATUS packets from the UPS at ~25 second
intervals. After every 3rd MASTATUS packet, the UPS waits ~19 seconds
and emits an MACONFIG packet, which APCUPSD silently ignores. ~25
seconds later the cycle starts again with the broadcast of another
MASTATUS packet.
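A minimal sketch of how the cycle above could be watched independently
of APCUPSD. It assumes the broadcasts arrive on UDP port 3052 (adjust to
whatever port your pcnet DEVICE line uses) and that the type name
appears as plain text somewhere in the payload; neither is verified
here. The names packet_type, GapTracker and listen are made up for
illustration.

```python
import socket
import time

PCNET_PORT = 3052  # assumption: change to match your own configuration


def packet_type(payload):
    """Classify a payload by substring match (assumes plain-text type names)."""
    for name in (b"MASTATUS", b"MACONFIG"):
        if name in payload:
            return name.decode("ascii")
    return "OTHER"


class GapTracker:
    """Remembers when the last MASTATUS packet was seen."""

    def __init__(self):
        self.last_status = None

    def observe(self, kind, now):
        """Return seconds since the previous MASTATUS, or None if not applicable."""
        if kind != "MASTATUS":
            return None
        gap = None if self.last_status is None else now - self.last_status
        self.last_status = now
        return gap


def listen(port=PCNET_PORT):
    """Print one line per UDP packet, with the MASTATUS-to-MASTATUS gap."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    tracker = GapTracker()
    while True:
        payload, (addr, _) = sock.recvfrom(4096)
        kind = packet_type(payload)
        gap = tracker.observe(kind, time.monotonic())
        stamp = time.strftime("%Y-%m-%d %H:%M:%S")
        note = "" if gap is None else f" gap={gap:.0f}s"
        print(f"{stamp} {addr} {kind}{note}")
```

Calling listen() on a box where APCUPSD is stopped (so the port is
free) should log each packet as it arrives; any gap above 60 seconds
would line up with a commfail event.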
Occasionally (five times this morning, just once this afternoon) the UPS
emits two consecutive MACONFIG packets. That means APCUPSD goes ~69
seconds (19 + 25 + 25) without seeing an MASTATUS packet. The longest it
will wait for an MASTATUS packet is 60 seconds (25 * 2 plus "a bit"),
hence the regularly reported commfail events.
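The arithmetic can be sketched as a rough timing model. The intervals
are the approximate figures quoted above; the assumption that a second
MACONFIG follows the first by another ~25 seconds is inferred from the
~69 second total, not measured separately. The function name is made up
for illustration.

```python
# Rough timing model of the broadcast cycle (all values approximate).
MASTATUS_INTERVAL = 25  # seconds between ordinary broadcasts
MACONFIG_DELAY = 19     # seconds from the 3rd MASTATUS to the MACONFIG
COMMLOST_TIMEOUT = 60   # longest APCUPSD waits for an MASTATUS


def mastatus_gap(maconfig_count):
    """Seconds between two MASTATUS packets with N MACONFIG packets between."""
    if maconfig_count == 0:
        return MASTATUS_INTERVAL
    # The first MACONFIG arrives after ~19 s; every later packet
    # (further MACONFIGs and the next MASTATUS) is assumed to follow
    # ~25 s after the one before it.
    return MACONFIG_DELAY + maconfig_count * MASTATUS_INTERVAL


for n in range(3):
    gap = mastatus_gap(n)
    verdict = "commfail" if gap > COMMLOST_TIMEOUT else "ok"
    print(f"{n} MACONFIG packet(s): {gap} s gap -> {verdict}")
```

With one MACONFIG the gap is 44 seconds, comfortably inside the
timeout; with two it is 69 seconds, which overshoots it by ~9 seconds.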
Has anyone else experienced this? Any idea why the UPS generates
MACONFIG packets, and what could cause it to generate two of them
consecutively?
I've reached the limit of my knowledge now and I figured it might be
useful to get some broader insight into the problem.
Thanks