2022-03-28 - Save optical loop data while executing pause and resume commands

Software: BM
AO: JCG

Software testing queue

Software versions used: stable
Software version after test: stable

Problem (IT #8556)

LBTI and SOUL-LUCI have both experienced instances where there are >85% of frames marked as safe skip during OffsetXY commands using the new gain=0 pause/resume scheme. When this happens, the AO Arbitrator reports that the OffsetXY command logs that it has been interrupted and the star doesn't end up in the right place on ResumeAo. BM and JCG looked through the logs and verified that the bayside stages are moving correctly. The OffsetXY command does not listen to the _interrupt flag from the AO Arbitrator, so the problem is likely caused from something other than OffsetXY.

Test

One possibility is that when the gains are set to zero on pause, they are not all set immediately in a single frame, such that the very next frame processed by the AdSec is with all gains equal to zero. If this happens, when an offset begins after pause, the telescope may begin to move before the gains are fully set to zero, causing bad frames to be processed by the AdSec and producing skip frames. To test this, we can close the loop with normal operating parameters and run sxadsec:~/scripts/tests/bm.20220328.interrupted-offsets/save_pause_resume.sh that performs the following loop:
  1. Begin saving optical loop data for 5000 frames,
  2. issue a pause command to the AdSec arbitrator
  3. wait 1s
  4. issue a resume command to the AdSec arbitrator, and
  5. wait 2s
  6. Wait for optical loop data to finish saving
  7. wait 2s

What happened

  1. [BM] The script caused a segmentation fault in the AdSecArbitrator after Pause. It is very unlikely this has anything to do with the script or the custom AOIce.py in the test script directory. I saved a core dump and an example from adsecarb.log in ~/scripts/tests/bm.20220328.save-optical-data. Will look into this more.
    adsecarb.L.202203280000.log|AdSecArbitrator_35887|INF| 360|2022-03-28 22:13:01.025168| MAIN > Received cmd: RunAo(0,900,860) 
    adsecarb.L.202203280000.log|AdSecArbitrator_35887|INF| 361|2022-03-28 22:13:01.025223| COMMANDHANDLER > FSM (status: AOSet) has received command 2014 (Run Ao)
    adsecarb.L.202203280000.log|AdSecArbitrator_35887|INF| 362|2022-03-28 22:13:01.025285| SERIALIZATION > RunAoImpl(): decimation = 0
    adsecarb.L.202203280000.log|AdSecArbitrator_35887|INF| 363|2022-03-28 22:13:01.025298| MASTDIAGN > Executing setDecimation, client: masterdiagnostic.L dec: 0
    adsecarb.L.202203280000.log|AdSecArbitrator_35887|INF| 364|2022-03-28 22:13:01.025311| MASTDIAGN > Sequence number: 800224
    adsecarb.L.202203280000.log|AdSecArbitrator_35887|INF| 365|2022-03-28 22:13:01.025406| MASTDIAGN > Command returns success .
    adsecarb.L.202203280000.log|AdSecArbitrator_35887|INF| 366|2022-03-28 22:13:02.180848| COMMANDHANDLER > Command Run Ao (code 2014) successfully completed
    adsecarb.L.202203280000.log|AdSecArbitrator_35887|INF| 367|2022-03-28 22:13:02.181242| MAIN > Received cmd: LoadGain(VEC[672],0)
    adsecarb.L.202203280000.log|AdSecArbitrator_35887|INF| 368|2022-03-28 22:13:02.181535| COMMANDHANDLER > FSM (status: AORunning) has received command 2018 (SetGain)
    adsecarb.L.202203280000.log|AdSecArbitrator_35887|INF| 369|2022-03-28 22:13:02.210573| COMMANDHANDLER > Command SetGain (code 2018) successfully completed
    adsecarb.L.202203280000.log|AdSecArbitrator_35887|INF| 370|2022-03-28 22:13:02.215681| MAIN > Received cmd: Offloading(1)
    adsecarb.L.202203280000.log|AdSecArbitrator_35887|INF| 371|2022-03-28 22:13:02.215747| COMMANDHANDLER > FSM (status: AORunning) has received command 2029 (TTOffload)
    adsecarb.L.202203280000.log|AdSecArbitrator_35887|INF| 372|2022-03-28 22:13:02.312748| COMMANDHANDLER > Command TTOffload (code 2029) successfully completed
    adsecarb.L.202203280000.log|AdSecArbitrator_35887|INF| 373|2022-03-28 22:15:22.606542| MAIN > Received cmd: PauseAo()
    adsecarb.L.202203280000.log|AdSecArbitrator_35887|INF| 374|2022-03-28 22:15:22.606623| COMMANDHANDLER > FSM (status: AORunning) has received command 2027 (Pause Ao)
    adsecarb.L.202203280000.log|AdSecArbitrator_35887|WAR| 375|2022-03-28 22:15:22.607120| IDL > Tip-tilt gain is set to zero. Offload disabled until next setGain command
    adsecarb.L.202203280000.log|AdSecArbitrator_35887|INF| 376|2022-03-28 22:15:22.683997| COMMANDHANDLER > Command Pause Ao (code 2027) successfully completed
    adsecarb.L.202203280000.log|AdSecArbitrator_35887|INF| 377|2022-03-28 22:15:22.691792| COMMANDHANDLER > FSM (status: AOPause) has received command 2024 (Save status)
    adsecarb.L.202203280000.log|AdSecArbitrator_35887|ERR| 378|2022-03-28 22:15:22.691834| COMMANDHANDLER > adsecarb: Fsm is discarding command 2024 (Save status)...
    adsecarb.L.202203280000.log|AdSecArbitrator_35887|INF| 379|2022-03-28 22:15:24.788455| MAIN > Received cmd: ResumeAo()
    adsecarb.L.202203280000.log|AdSecArbitrator_35887|INF| 380|2022-03-28 22:15:24.788530| COMMANDHANDLER > FSM (status: AOPause) has received command 2028 (ResumeAo Ao)
    adsecarb.L.202203280000.log|AdSecArbitrator_35887|INF| 381|2022-03-28 22:15:24.796331| COMMANDHANDLER > Command ResumeAo Ao (code 2028) successfully completed
    adsecarb.L.202203280000.log|AdSecArbitrator_35887|INF| 382|2022-03-28 22:15:38.810650| MAIN > Received cmd: PauseAo()
    adsecarb.L.202203280000.log|AdSecArbitrator_35887|INF| 383|2022-03-28 22:15:38.810743| COMMANDHANDLER > FSM (status: AORunning) has received command 2027 (Pause Ao)
    adsecarb.L.202203280000.log|AdSecArbitrator_35887|WAR| 384|2022-03-28 22:15:38.810999| IDL > Tip-tilt gain is set to zero. Offload disabled until next setGain command
    adsecarb.L.202203280000.log|AdSecArbitrator_35887|INF| 385|2022-03-28 22:15:38.822704| COMMANDHANDLER > Command Pause Ao (code 2027) successfully completed
    adsecarb.L.202203280000.log|AdSecArbitrator_35887|INF| 386|2022-03-28 22:15:38.833982| COMMANDHANDLER > FSM (status: AOPause) has received command 2024 (Save status)
    adsecarb.L.202203280000.log|AdSecArbitrator_35887|ERR| 387|2022-03-28 22:15:38.834025| COMMANDHANDLER > adsecarb: Fsm is discarding command 2024 (Save status)...
    adsecarb.L.202203280000.log|AdSecArbitrator_35887|INF| 388|2022-03-28 22:15:40.922347| MAIN > Received cmd: ResumeAo()
    adsecarb.L.202203280000.log|AdSecArbitrator_35887|INF| 389|2022-03-28 22:15:40.922418| COMMANDHANDLER > FSM (status: AOPause) has received command 2028 (ResumeAo Ao)
    adsecarb.L.202203280000.log|AdSecArbitrator_35887|INF| 390|2022-03-28 22:15:40.929453| COMMANDHANDLER > Command ResumeAo Ao (code 2028) successfully completed
    adsecarb.L.202203280000.log|AdSecArbitrator_35887|INF| 391|2022-03-28 22:15:54.202665| COMMANDHANDLER > FSM (status: AORunning) has received command 2024 (Save status)
    adsecarb.L.202203280000.log|AdSecArbitrator_35887|INF| 392|2022-03-28 22:15:54.205059| MAIN > Received cmd: PauseAo()
    adsecarb.L.202203280000.log|AdSecArbitrator_35887|INF| 393|2022-03-28 22:15:54.205136| COMMANDHANDLER > FSM (status: AORunning) has received command 2027 (Pause Ao)
    adsecarb.L.202203280000.log|AdSecArbitrator_35887|ERR| 394|2022-03-28 22:15:54.205150| COMMANDHANDLER > adsecarb: Fsm is discarding command 2027 (Pause Ao)...
    adsecarb.L.202203280000.log|AdSecArbitrator_35887|WAR| 395|2022-03-28 22:15:54.205229| MAIN > PauseAo: WARNING - adsecarb:Illegal command for state AORunning (Pause Ao)

    AdSecArbitrator crashed, segmention fault. Core dump saved as core.35887.aoice-errors
    Backtrace:
     #0 0x000000000063d15d in Arcetri::Arbitrator::AbstractFsm::processCommand (this=0x11822e0) at /home/aoeng/soul/latest/source/Arbitrator/framework/AbstractFsm.cpp:69
    #1 0x0000000000591ccc in Arcetri::AdSec_Arbitrator::AdSecFsm::processCommand (this=0x11822e0) at AdSecFsm.cpp:94
    #2 0x000000000059558f in AdSecFsmCore::__processEvent (this=0x1182320, e=AdSecFsmCore::SaveStatus) at AdSecFsmCore.cpp:144
    #3 0x0000000000595dd1 in AdSecFsmCore::A (this=0x1182320, e=AdSecFsmCore::SaveStatus) at AdSecFsmCore.cpp:330
    #4 0x0000000000591a5e in Arcetri::AdSec_Arbitrator::AdSecFsm::trigger (this=0x11822e0, code=2024) at AdSecFsm.cpp:65
    #5 0x000000000063c968 in Arcetri::Arbitrator::AbstractFsm::execute (this=0x11822e0) at /home/aoeng/soul/latest/source/Arbitrator/framework/AbstractFsm.cpp:24
    #6 0x000000000063a972 in Arcetri::Arbitrator::CommandHandler::doExecute (this=0x11822e0, cmd=0x7f4968000930) at /home/aoeng/soul/latest/source/Arbitrator/framework/CommandHandler.cpp:60
    #7 0x0000000000633c30 in Arcetri::Arbitrator::AbstractArbitrator::processCommand (this=0x11811e0, cmd=0x7f4968001830)
    at /home/aoeng/soul/latest/source/Arbitrator/framework/AbstractArbitrator.cpp:731
    #8 0x00000000006301b2 in Arcetri::Arbitrator::AbstractArbitrator::executeCommand (myself=0x11811e0, cmd=0x7f4968001830)
    at /home/aoeng/soul/latest/source/Arbitrator/framework/AbstractArbitrator.cpp:231
    #9 0x0000000000630037 in Arcetri::Arbitrator::AbstractArbitrator::handleCommand (commandMsgBuf=0x7f4978000f10, argp=0x11811e0, hndlrQueueSize=0)
    at /home/aoeng/soul/latest/source/Arbitrator/framework/AbstractArbitrator.cpp:206
    #10 0x000000000062a3ff in hManager (arg=0x9e6518) at thrdlib.c:990
    #11 0x0000003682007aa1 in start_thread (arg=0x7f49a08b8700) at pthread_create.c:301
    #12 0x0000003681ce893d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115
  2. Juan Carlos performed the sequence manually several times. There is an interesting sinusoidal pattern in commands for each actuator for about 100 frames after the pause is executed. More to come ...
-- BrandonMechtley - 28 Mar 2022
Topic revision: r1 - 28 Mar 2022, BrandonMechtley
This site is powered by FoswikiCopyright © by the contributing authors. All material on this collaboration platform is the property of the contributing authors.
Ideas, requests, problems regarding Foswiki? Send feedback