2022-04-25 - verify AbstractSystem::_cmdResult thread safety bug
Software testing queue
Servers and software versions used:
While investigating interrupted offsets, I ran a script that sent Ice commands to the AdSec Arbitrator and the WFS Arbitrator from two processes, and it crashed the AdSec Arbitrator. After some debugging and code inspection, I suspect the cause is a thread-safety problem with the command result in IdlSystem: the crash occurs when wfs.saveOpticalLoopData (which calls adsec.savedata) and adsec.PauseAo are sent from two separate threads with no pause in between. IdlSystem has a mutex for executing the command, but no mutex for the command result, which is re-initialized as soon as a command begins. If that is the cause, a second command arriving from another thread could reset the result while the first command is still using it.
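The suspected pattern can be illustrated with a small Python model. This is only a sketch of the hypothesis, not the real C++ code: `IdlSystemModel`, `guard_result`, `at_mutex`, and `overlap` are hypothetical names invented for the illustration, and the "fixed" variant is one possible way to protect the result, not necessarily the fix that will go into the pull request.

```python
import threading


class IdlSystemModel:
    """Hypothetical stand-in for IdlSystem (not the real class)."""

    def __init__(self, guard_result):
        self._exec_mutex = threading.Lock()  # serializes command execution
        self._cmd_result = None              # shared command result
        self._guard_result = guard_result    # True = result is reset under the mutex
        self.at_mutex = threading.Event()    # demo hook only

    def run(self, name, while_executing=None):
        if not self._guard_result:
            # Suspected bug: the result is re-initialized before the mutex is held.
            self._cmd_result = None
        self.at_mutex.set()                  # caller is about to take the mutex
        with self._exec_mutex:
            if self._guard_result:
                self._cmd_result = None      # reset happens inside the lock
            self._cmd_result = name + ": ok"
            if while_executing is not None:
                while_executing()            # lets the demo inject a second caller
            return self._cmd_result


def overlap(guard_result):
    """Issue a second command while the first one is still executing."""
    system = IdlSystemModel(guard_result)
    second = threading.Thread(target=system.run, args=("adsec.PauseAo",))

    def start_second():
        system.at_mutex.clear()
        second.start()
        system.at_mutex.wait()               # second caller has reached the mutex

    first_result = system.run("wfs.saveOpticalLoopData", start_second)
    second.join()
    return first_result


print(overlap(guard_result=False))  # None: the first command's result was clobbered
print(overlap(guard_result=True))   # wfs.saveOpticalLoopData: ok
```

In Python the clobbered result merely shows up as `None`; in C++ an unsynchronized write to a shared object that another thread is reading is undefined behavior, which would be consistent with a crash rather than just a wrong value.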
To verify that this could be a thread-safety issue, I can try adding a 2 s pause between wfs.saveOpticalLoopData and adsec.PauseAo and check that the AdSec Arbitrator no longer crashes.
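A minimal sketch of how such a comparison could be scripted is given here. It is not the script that was actually run: `run_iterations` and `stub_sender` are hypothetical names, and the stub senders only record the command names, where the real script would presumably call the Ice proxies of the two arbitrators.

```python
import threading
import time


def run_iterations(send_wfs, send_adsec, pause_s, iterations=3):
    """Send the two commands from separate threads, pause_s seconds apart."""
    for _ in range(iterations):
        wfs = threading.Thread(target=send_wfs, args=("saveOpticalLoopData",))
        adsec = threading.Thread(target=send_adsec, args=("PauseAo",))
        wfs.start()
        time.sleep(pause_s)  # 0.0 = back-to-back case, 2.0 = spaced case
        adsec.start()
        wfs.join()
        adsec.join()


# Stub senders so the sketch runs without the arbitrators (assumption:
# the real script would invoke the Ice proxies here instead).
sent = []
sent_lock = threading.Lock()


def stub_sender(target):
    def send(command):
        with sent_lock:
            sent.append(target + "." + command)
    return send


run_iterations(stub_sender("wfs"), stub_sender("adsec"), pause_s=0.0)
```

Running the same loop once with `pause_s=0.0` and once with `pause_s=2.0` gives the two cases to compare.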
- Everything worked as expected. Without the sleep between wfs.saveOpticalLoopData and adsec.PauseAo, the AdSecArbitrator crashed as before. With the sleep, the AdSecArbitrator did not crash after 3 iterations.
- I'll make a pull request for the fix that we can test another time.
- 25 Apr 2022