Paul Godard
2017-03-30 00:39:41 UTC
Hello,
I have an APC Back-UPS 650VA Battery Backup device (BN650M1-CA) connected
to my physical pfSense box (2.3.3_1 - FreeBSD 10.3 p17). When running
apcupsd in debug mode on the command line (-b -d 300), I can see that all
attempts to get a HID report fail. So all calls to usb_control_msg
attempting to gather a report will fail but calls using usb_control_msg
requesting other things may work (such as getting the UPS's HID interface's
report descriptor - this work). Also, oddly enough, USB interrupts events
work just fine; apcupsd is able to populate some information (including the
UPS status, estimated runtime left, battery charge percentage, etc.) and is
thus able to trigger alerts when a power outage occurs.
Back to the HID report issue, I enabled kernel tracing and was able to
isolate the failure to ioctl calls (that usb_control_msg utilizes); the HID
report calls return -1 values and sets the "errno" value to 5 (Input/Output
Error). You can see the trace here:
32271 apcupsd CALL write(0x1,0x80201d000,0x4b)
32271 apcupsd GIO fd 1 wrote 75 bytes
"63.611 apcupsd: HidUps.cpp:438 get_report: id=0x0c, kind=2,
length=4 pos=0
"
32271 apcupsd RET write 75/0x4b
32271 apcupsd CALL ioctl(0x8,0xc018556f,0x7fffffffe760)
32271 apcupsd RET ioctl -1 errno 5 Input/output error
32271 apcupsd CALL write(0x1,0x80201d000,0x62)
32271 apcupsd GIO fd 1 wrote 98 bytes
"63.620 apcupsd: HidUps.cpp:446 Error getting report: (-1) Operation
not permitted [Unknown error]
"
I've tried connecting that UPS to another box running a pfSense VM with
USB-passthru (identical version to my physical box) - this is a different
physical computer - and all works fine. So this leads me to believe it's
some sort of issue with the physical USB controller on my physical pfSense
box? If that's the case, I don't believe this is something that should (or
can) be fixed in apcupsd. However, apcupsd is still somewhat usable thanks
to the USB interrupts events but has another logical bug caused by the
unexpected HID reports failures...
So the actual bug that I want to report... When a power outage event is
triggered, apcupsd reports it correctly, however, apcupsd also instantly
reports an all clear ("Power is back. UPS running on mains.") when the
outage is in-fact still occurring. When power is legitimately restored,
apcupsd doesn't report anything different (since it already "recovered"
immediately after the failure was initially detected). I tracked down the
bug to the UsbUpsDriver::read_volatile_data() method (which is called
periodically, and/or after an "important" event - which is determined in
UsbUpsDriver::usb_report_event) in the src/drivers/usb/usb.c file. In this
method, right before you loop through updated CI values by requesting HID
reports (and by eventually calling usb_update_value), you reset the
"Status" field bits to 0x00 (thus getting rid of the "power outage" state)
anticipating that the updated CI values will correctly repopulate the
"Status" field. In my particular case, apcupsd is unable to fetch any HID
reports, therefore, after you clear the "Status" field, no CI values will
ever set it back to a failed "power outage" state. This is why immediately
after detecting an outage, apcupsd thinks power was restored.
The fix I'm proposing (which worked in my particular case) is to keep a
copy of the "Status" field's value before clearing it. Then, in the CI
update loop, I keep track of whether a single HID report came back
successfully (in my particular case, none ever come back so the code now
knows this). Right after the loop, if I detected that no HID reports were
successfully queried, I reset the "Status" field back to its original value
(before it gets cleared before the loop) thus keeping my failed "power
outage" status. The moment power returns to the UPS, apcupsd gets an
updated CI_STATUS through a USB interrupt event and correctly reports the
event and everyone is happy.
Anyone else experienced something similar to this? What are your thoughts
on this? I created a simple patch to fix the bug of the immediate power
restore issue that can be found here: https://pastebin.com/AnJ1VYTn
Thanks,
Paul
I have an APC Back-UPS 650VA Battery Backup device (BN650M1-CA) connected
to my physical pfSense box (2.3.3_1 - FreeBSD 10.3 p17). When running
apcupsd in debug mode on the command line (-b -d 300), I can see that all
attempts to get a HID report fail. So all calls to usb_control_msg
attempting to gather a report will fail but calls using usb_control_msg
requesting other things may work (such as getting the UPS's HID interface's
report descriptor - this work). Also, oddly enough, USB interrupts events
work just fine; apcupsd is able to populate some information (including the
UPS status, estimated runtime left, battery charge percentage, etc.) and is
thus able to trigger alerts when a power outage occurs.
Back to the HID report issue, I enabled kernel tracing and was able to
isolate the failure to ioctl calls (that usb_control_msg utilizes); the HID
report calls return -1 values and sets the "errno" value to 5 (Input/Output
Error). You can see the trace here:
32271 apcupsd CALL write(0x1,0x80201d000,0x4b)
32271 apcupsd GIO fd 1 wrote 75 bytes
"63.611 apcupsd: HidUps.cpp:438 get_report: id=0x0c, kind=2,
length=4 pos=0
"
32271 apcupsd RET write 75/0x4b
32271 apcupsd CALL ioctl(0x8,0xc018556f,0x7fffffffe760)
32271 apcupsd RET ioctl -1 errno 5 Input/output error
32271 apcupsd CALL write(0x1,0x80201d000,0x62)
32271 apcupsd GIO fd 1 wrote 98 bytes
"63.620 apcupsd: HidUps.cpp:446 Error getting report: (-1) Operation
not permitted [Unknown error]
"
I've tried connecting that UPS to another box running a pfSense VM with
USB-passthru (identical version to my physical box) - this is a different
physical computer - and all works fine. So this leads me to believe it's
some sort of issue with the physical USB controller on my physical pfSense
box? If that's the case, I don't believe this is something that should (or
can) be fixed in apcupsd. However, apcupsd is still somewhat usable thanks
to the USB interrupts events but has another logical bug caused by the
unexpected HID reports failures...
So the actual bug that I want to report... When a power outage event is
triggered, apcupsd reports it correctly, however, apcupsd also instantly
reports an all clear ("Power is back. UPS running on mains.") when the
outage is in-fact still occurring. When power is legitimately restored,
apcupsd doesn't report anything different (since it already "recovered"
immediately after the failure was initially detected). I tracked down the
bug to the UsbUpsDriver::read_volatile_data() method (which is called
periodically, and/or after an "important" event - which is determined in
UsbUpsDriver::usb_report_event) in the src/drivers/usb/usb.c file. In this
method, right before you loop through updated CI values by requesting HID
reports (and by eventually calling usb_update_value), you reset the
"Status" field bits to 0x00 (thus getting rid of the "power outage" state)
anticipating that the updated CI values will correctly repopulate the
"Status" field. In my particular case, apcupsd is unable to fetch any HID
reports, therefore, after you clear the "Status" field, no CI values will
ever set it back to a failed "power outage" state. This is why immediately
after detecting an outage, apcupsd thinks power was restored.
The fix I'm proposing (which worked in my particular case) is to keep a
copy of the "Status" field's value before clearing it. Then, in the CI
update loop, I keep track of whether a single HID report came back
successfully (in my particular case, none ever come back so the code now
knows this). Right after the loop, if I detected that no HID reports were
successfully queried, I reset the "Status" field back to its original value
(before it gets cleared before the loop) thus keeping my failed "power
outage" status. The moment power returns to the UPS, apcupsd gets an
updated CI_STATUS through a USB interrupt event and correctly reports the
event and everyone is happy.
Anyone else experienced something similar to this? What are your thoughts
on this? I created a simple patch to fix the bug of the immediate power
restore issue that can be found here: https://pastebin.com/AnJ1VYTn
Thanks,
Paul