MODS Troubleshooting

MODS1 and MODS2

Instrument configuration times out

Sometimes the script stops with an error message that INSTCONFIG timed out.

The "instconfig" command configures the instrument, moving a number of mechanisms in parallel, and, in addition, implicitly runs the "imcslock" command. If any of these mechanism moves times out, either because the component failed to go to the correct position, or because feedback on its position was not received, or if the IMCS IR laser fails to achieve lock within the prescribed time (120 seconds), then the "instconfig" command can time out.

The corrective action if INSTCONFIG times out is:
  • do nothing in the script window yet (don't answer the abort/retry/ignore prompt yet)
  • on the MODS GUI, press the UPDATE button -or- type the command "refresh instconfig" in the MODS Dashboard command window. This clears any INSTCONFIG-in-progress flags.
  • in the script window, type R to retry

If this does not work, please call the support astronomer.

Script stops with an error and a prompt to abort/retry/ignore

A script will stop with an error for several reasons: if the "instconfig" command fails; if there is a communication glitch and no feedback is received to report the success or failure of a command; or if there is a syntax error. In the script running window, you should see the action on which it failed and the prompt to abort/retry/ignore. If the action is something that involves no mechanisms, e.g. setting the PARTNER or other archive information, then it points to a communication glitch or syntax error.

When you see that the script has stopped with an error, first follow the instructions above for an INSTCONFIG timeout, namely:
  • do nothing on the script window yet (don't answer the abort/retry/ignore)
  • on the MODS GUI, press the UPDATE button or type "refresh instconfig" in the MODS Dashboard command window.
  • on the script, type R to retry
If that does not work, check the script for syntax errors. If none are found and the error appears when moving a mechanism, try to move this same mechanism outside of the script, either by issuing the command in the MODS Dashboard command window (e.g. "red filter r_sdss") or by using the buttons on the GUI. If, after these steps, the problem still remains, please call the support astronomer.

Red or Blue images not appearing in /newdata

  • On the MODS Dashboard GUI, look at the index numbers for the "last image" and "next image". If these differ by more than 1, the data transfer queue has stalled: images are not being transferred properly between the CCD computer (M1.RC or M1.BC) and mods1data.
  • In the MODS Dashboard GUI Command Window, type "red fitsflush" or "blue fitsflush", depending on which channel is affected. This will start the data transfer, and it should take a few seconds for each image to appear on /newdata.
    • You can run the "fitsflush" command during an exposure, but avoid sending it during a readout. If there are many short exposures in that channel and you don't want to wait until the script is finished, you can pause an exposure in that channel, run the "fitsflush" command for that channel, and then resume after all of the images have been transferred.
  • Images should start to appear in /newdata and, if modsDisp is running, to be listed and displayed by modsDisp.
    • Note that after all of the images have been transferred, the image that is reported by the "last image" counter and displayed by modsDisp may not be the last image transferred, but, as you can confirm from the output of modsDisp or by listing the contents of /newdata, all of the images should have been transferred.
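The stall check in the first bullet can be sketched as follows. This is a minimal illustration in Python, not part of the MODS software: the function names are hypothetical, and the index values are assumed to be read by hand from the Dashboard.

```python
def transfer_stalled(last_image: int, next_image: int) -> bool:
    """Return True if the 'last image' and 'next image' indices differ by more than 1."""
    return abs(next_image - last_image) > 1


def flush_command(channel: str) -> str:
    """Build the Dashboard command that restarts the transfer for one channel."""
    channel = channel.lower()
    if channel not in ("red", "blue"):
        raise ValueError("channel must be 'red' or 'blue'")
    return f"{channel} fitsflush"


if __name__ == "__main__":
    # Example values only: last image 12, next image 17 on the red channel.
    if transfer_stalled(12, 17):
        print("Type in the Dashboard command window:", flush_command("red"))
```

The sketch only builds the command text; it is still typed into the Dashboard command window by the observer, and the advice above about not sending it during a readout still applies.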

Exposure and readout counters stop updating but status window shows the exposure countdown timers working.

The most likely explanation is that there was a communication glitch, and one of the messages from the CCD controller noting that the exposure had started was dropped. To recover:
  • Wait until the current exposure is done and ensure it is saved.
  • Refresh the GUI: click the UPDATE button at the lower left of the MODS Dashboard GUI.
    • This clears all exposure and configuration state flags, then asks the instrument to report its current state.
  • Type "red reset" or "blue reset" in MODS Dashboard Command window, depending on which channel has the problem.
  • Abort the script with Cntl-C.

During an exposure, the countdown timer hangs or does not work.

If, during the exposure, the countdown timer is not working, this is a symptom of a communication glitch, where the CCD controller probably failed to tell the GUI that it had started the exposure. The recovery is to:
  • abort the exposure, or wait until it finishes;
  • Refresh the GUI (click UPDATE); and
  • type 'red reset' and/or 'blue reset' in the MODS Dashboard GUI Command window, depending on which channel has the problem.

Cntl-c "Y" fails to abort the script.

To address problems with script abort (cntl-c, "Y" from the script running xterm) failing, a new patch version of the GUI (1.25.8) was installed on 6 Oct 2013. After this, testing showed that a script abort would effectively abort the script, regardless of the stage at which the command was issued: during preparation for the next exposure, during the exposure itself, or during readout. As before, Cntl-c "Y" also will effectively stop the instrument configuration and IMCS lock, and will turn off any calibration lamps that may be on.

When testing the response to cntl-c, "Y" issued at different stages, the following behaviors were observed:
  • during exposure preparation (e.g. while GUI reports "erasing...") --- stops the exposure preparation in its tracks and after about 10 seconds, the exposure control in the GUI returns to the way it looked before the script was run.
  • during an exposure --- stops the exposure and no new file is written. The GUI does report "reading out...", "writing...", "cleaning up..." but, as expected, the image is not read out or written.
  • during readout --- finishes readout and, on the GUI, the LastFile index number is incremented, but the readout bar sticks at 100% read out and the NextFile index number is not incremented. However, upon running the same or a new script, once the exposure controls were reached, the readout bar went back to 0% and a new exposure was started. When that exposure was read out, written and done, NextFile was incremented and the subsequent exposure started. When that one started, LastFile became NextFile+1 and NextFile became the new LastFile+1, all as expected.

The exposure completes but the GUI appears to get "stuck" and no further exposure is started.

Occasionally the blue or red channel will get stuck: the exposure just taken has been completed yet no new exposure is started. You may see the message "Exposure done, cleaning up" in the MODS Dashboard Status box (but subsequent status messages can overwrite that), or you may notice that the data taking is stopped on one channel, while on the other, it continues. The root cause of this problem is a communications drop-out; the dashboard's exposure controller never receives the "exposure done" message from the IC.

If this happens, you can recover and continue on with the script by following the series of steps:
  • First, make absolutely sure that the last exposure from the "stuck" channel really is done: ensure that the image is on /newdata. The stuck channel must be idle for this step.
  • Next, if the exposure has been written to /newdata:
    • In the MODS Dashboard Command entry box, type
      • red expdone if red is stuck, or
      • blue expdone if blue is stuck.
    • Wait for it to execute the regular end-of-exposure process. If there are pending exposures in the sequence (e.g. if blue sticks after image 3 of 5 was completed), the remaining images will be executed as normal. The "expdone" directive breaks into the exposure control state machine and triggers the exposure completion branch. All of the normal "exposure done" messages will be issued to advance the exposure counters, reset status bars, and advance the execution of a multi-exposure script.

Please make sure to use this "expdone" command only to clear this particular condition, as use of "expdone" outside of a stuck exposure context will have unpredictable behavior, if it does anything at all. And never use the "reset" command (see above) for this condition, unless it is clear that the "expdone" has failed to unstick the delinquent channel and you are now abandoning an attempt to keep the observation going and resorting to a full, by-hand, script abort.
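The precondition for "expdone" (the image must already be on /newdata) can be expressed as a small guard. This is a hedged sketch only: the function name is hypothetical, the file name in the example is a placeholder, and the sketch cannot verify that the stuck channel is idle, which still has to be checked on the GUI.

```python
import os


def expdone_if_saved(channel: str, image_name: str, data_dir: str = "/newdata"):
    """Return the '<channel> expdone' command only if the image is in data_dir.

    Returns None when the file is missing, meaning: do not send expdone yet.
    """
    channel = channel.lower()
    if channel not in ("red", "blue"):
        raise ValueError("channel must be 'red' or 'blue'")
    if not os.path.isfile(os.path.join(data_dir, image_name)):
        return None
    return f"{channel} expdone"


if __name__ == "__main__":
    # The file name below is a placeholder, not a real image.
    print(expdone_if_saved("blue", "mods1b.20130906.0003.fits"))
```

Returning None rather than raising keeps the "not yet" case distinct from a mistyped channel name.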

Exposure PAUSED but GUI still reports shutter to be OPEN (fixed in September 2013)

This is a known bug: Clicking PAUSE on the GUI or entering the PAUSE command will, as expected, CLOSE the shutter, but the IC program does not return a "shutter closed" flag, so the shutter state as reported on the GUI does not change. Observers should ignore the shutter state reported on the GUI when the exposure is paused. This problem has been fixed in the new GUI released September 2013.

MODS commands (modsDisp, modsAlign) not defined

Check that the users are running under tcsh. At least LBTB comes up with bash terminals by default, but a "tcsh" at the command line makes all the necessary connections.

AGw failing to acquire/guide star when it should be bright enough

First check that the AGw is working with the default (clear) filter. We had a script change this to B_Bessel. If problems persist, the guide or WFS camera controller may be in a state where it is necessary to shut down GCS, cycle the power to the guider or WFS controller, and then restart GCS. However, consider that a last resort. Ask the OSA or support astronomer for assistance in cycling the power to the guider or WFS controllers.

IMCS lock timing out (check laser power)

The laser power set point should be 1.0 mW, but occasionally it is found at 0.3 mW. To check the set point and actual power:
  • either go to the Utilities on the UI and look for the MODS1 Lamp/Laser Box IR Laser status at the lower left. There are three buttons: Enable, Reset, Update. Click Update. This should show Power 1.1 mW and Set 1.0 mW for normal operations, after modsWake has been run. If it shows that the Power is 0.4 mW and Set is 0.3 mW, change the set point from 0.3 to 1.0 mW and click Update.
  • or, from the shell execute the command modsCmd irlaser
    • with no arguments, this will return the status.
    • following are the relevant keywords to change the status:
      • IRLASER=ON/OFF is the AC power state to the laser control box, it does not turn on the laser proper!
      • IRPSET = IR power set-point (what is requested) in mW
      • IRPOUT = measured IR laser power in mW
      • IRBEAM = ENABLED/DISABLED indicates if the laser output is enabled (IRLASER=ON is required first).
    • use the command modsCmd irlaser power 1.0 to change IRPSET to 1.0mW.
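The set-point check can be sketched in Python as below. It assumes the status reply carries whitespace-separated KEY=VALUE tokens with plain numeric values, in the style of the "DONE: TEDPOWER TEDPower=ON" reply quoted elsewhere on this page; the exact reply format of modsCmd irlaser is not documented here, so the sample string and the function names are hypothetical.

```python
def parse_keywords(reply: str) -> dict:
    """Collect KEY=VALUE tokens from a reply line into a dict (keys upper-cased)."""
    pairs = {}
    for token in reply.split():
        if "=" in token:
            key, _, value = token.partition("=")
            pairs[key.upper()] = value
    return pairs


def laser_setpoint_ok(reply: str, wanted_mw: float = 1.0, tol: float = 0.05) -> bool:
    """True if IRPSET in the reply is within tol of the wanted set point (mW)."""
    keywords = parse_keywords(reply)
    if "IRPSET" not in keywords:
        raise KeyError("IRPSET not found in reply")
    return abs(float(keywords["IRPSET"]) - wanted_mw) <= tol


if __name__ == "__main__":
    # Hypothetical reply text, made up for illustration; the real format may differ.
    sample = "DONE: IRLASER IRLASER=ON IRBEAM=ENABLED IRPSET=0.3 IRPOUT=0.4"
    if not laser_setpoint_ok(sample):
        print("Set point is low; run: modsCmd irlaser power 1.0")
```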

Occasionally, none of the above commands will power on the laser. In this case, try the following set of low-level commands (Note that since the arrival of MODS2, isisCmds require an argument --mods1 or --mods2 when issued from any computer other than mods1 or mods2.)
  • isisCmd --mods1 m1.ie irlaser -> query the MODS1 status.
  • isisCmd --mods1 m1.ie irlaser off -> turn the laser off. Before turning off the laser, ensure that IRBEAM = DISABLED (irlaser disable).
  • --- wait 10 seconds so as not to rush things.
  • isisCmd --mods1 m1.ie irlaser on -> turn the laser on.
  • --- count to 3 slowly, again, not rushing things.
  • isisCmd --mods1 m1.ie irlaser enable -> enable the beam.
  • --- wait 10 seconds --- this is because of the built-in safety interval between "enable" and the beam coming on.
  • isisCmd --mods1 m1.ie irlaser -> query the status again. Now the power should read 1.1 mW out (IRPOUT) for 1.0 mW requested (IRPSET).
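The low-level sequence above, with its waits, can be written down as a table of steps. The following is a sketch under stated assumptions, not an official tool: the function and table names are hypothetical, the host name m2.ie for MODS2 is assumed by analogy with m1.ie, and the default is a dry run that only lists the commands without executing anything.

```python
import subprocess
import time

# (isisCmd arguments after the host, seconds to wait afterwards)
LASER_CYCLE = [
    ("irlaser", 0),           # query the status
    ("irlaser disable", 0),   # make sure the beam is disabled before power-off
    ("irlaser off", 10),      # power off, then wait 10 seconds
    ("irlaser on", 3),        # power on, then count to 3
    ("irlaser enable", 10),   # enable the beam, allow the safety interval
    ("irlaser", 0),           # query the status again
]


def run_laser_cycle(mods: int = 1, dry_run: bool = True, sleep=time.sleep) -> list:
    """Issue (or, if dry_run, only list) the laser power-cycle commands in order."""
    issued = []
    for action, wait_s in LASER_CYCLE:
        argv = ["isisCmd", f"--mods{mods}", f"m{mods}.ie"] + action.split()
        issued.append(" ".join(argv))
        if not dry_run:
            subprocess.run(argv, check=True)  # stop at the first failing command
            sleep(wait_s)
    return issued


if __name__ == "__main__":
    for line in run_laser_cycle(mods=1):
        print(line)
```

Keeping the waits next to the commands makes it harder to skip the pauses when the procedure is done in a hurry.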

Unique filenames (like 020101M8.08q.fits)

Occasionally the image readout will not have the expected filename mods1x.UTdate.00NN.fits. This happens when the expected name will clash with the name of a file already in the raw data directory, but also on other occasions.
  • If the unique name is not triggered by a clash, it seems that the file with the expected name is also written. You can verify this by using the gethead command. From any obs machine, gethead 020101M8.08q.fits filename should output the value of the filename keyword.
  • If the unique name is triggered by a clash, then the files will need to be renamed to something which the archiving system will recognize and manually copied to the data directory on mods1data.
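To spot files with unexpected names in a directory listing, a pattern match on the expected form can help. The pattern below is an assumption about what "mods1x.UTdate.00NN.fits" expands to (x as b or r, an 8-digit UT date, a 4-digit counter); adjust it if the real convention differs. The function names are hypothetical.

```python
import re

# Assumed layout: mods<1|2><b|r>.<8-digit UT date>.<4-digit counter>.fits
EXPECTED = re.compile(r"^mods[12][br]\.\d{8}\.\d{4}\.fits$")


def has_expected_name(filename: str) -> bool:
    """True if the file name matches the assumed standard MODS pattern."""
    return EXPECTED.match(filename) is not None


def unexpected_names(filenames):
    """Return the names that do not match the assumed standard pattern."""
    return [name for name in filenames if not has_expected_name(name)]


if __name__ == "__main__":
    # The first name is a made-up example of the standard form.
    listing = ["mods1b.20130906.0012.fits", "020101M8.08q.fits"]
    print(unexpected_names(listing))
```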

Dewar pressure(s) reading zero

If the pressure in one of the dewars is reading zero, the ionization gauge ("ig*") may have had a glitch or may be having some other problem. A zero reading may show up in the ISp's daily instrument monitoring, on the housekeeping page of the MODS GUI, or in the output of the command:

modsTemps 1 red or modsTemps 1 blue

Recovery procedure (Note that since the arrival of MODS2, isisCmds require an argument --mods1 or --mods2 when issued from any computer other than mods1 or mods2.):
  • From obsN or mods1/mods1data...
  • Send command: isisCmd --mods1 m1.bc igpower off
  • Count to 5
  • Send command: isisCmd --mods1 m1.bc igpower on
  • Wait 30+ seconds (updates on a slow internal cycle).
  • Verify
  • Report results to Rick, instruments list

modsWake hangs on sending commands to the HEBs

The modsWake script hung on the command blue tedpower on, with the usual options to abort, retry or ignore. Eventually, I tried ignore, but then the script hung on the next command, blue ledpower off. All of the MODS services were running and nothing seemed awry. Finally, I stopped the ICs and all MODS services, and restarted these, following the Instrument Control Software Startup instructions in the white notebook and within the MODSStartUp page. After that, modsWake completed successfully.

These 6 commands in modsWake (3 to the blue HEB and 3 to the red HEB) do the following, using blue as an example:

  • blue tedpower on -> make sure the blue HEB internal thermal cooler is on
  • blue ledpower off -> make sure the blue HEB "lab" LED display is turned off
  • blue igpower on -> make sure the blue HEB vacuum ionization gauge is on

In this instance, either the blue IC may not have been responding, or the command completed, but the "DONE:" response was not sent, one of the more common IC-related communication faults.

This is very rare: I saw it once in the 2 years since MODS1 has been at the LBT.

MODS data not appearing in correct Repository subdirectory

Occasionally, MODS data do not appear in the correct UTDATE subdirectory of Repository, although the data appear in newdata (IT #4218). If this happens, check the headers for the date-obs keyword. There is a memory glitch in the Instrument Control (IC) computer which leads to two date-obs entries in the FITS header. One of these is stale and does not have the current date, and, unfortunately, that is the one used by the archiving system.
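One way to check a file for the duplicate entry is to count the DATE-OBS cards in its primary header. The sketch below uses only the Python standard library and relies on the standard FITS layout (80-character cards in 2880-byte blocks, header terminated by an END card); the function names are hypothetical, and this is an illustration rather than an observatory tool.

```python
import sys


def count_keyword(header_bytes: bytes, keyword: str) -> int:
    """Count the 80-character cards in a FITS header whose keyword matches."""
    wanted = keyword.upper().ljust(8)[:8].encode("ascii")
    count = 0
    for start in range(0, len(header_bytes) - 79, 80):
        name = header_bytes[start:start + 8]
        if name == b"END     ":
            break
        if name == wanted:
            count += 1
    return count


def read_primary_header(path: str) -> bytes:
    """Read 2880-byte blocks from the start of a FITS file up to the END card."""
    blocks = []
    with open(path, "rb") as handle:
        while True:
            block = handle.read(2880)
            if len(block) < 2880:
                break
            blocks.append(block)
            if any(block[i:i + 8] == b"END     " for i in range(0, 2880, 80)):
                break
    return b"".join(blocks)


if __name__ == "__main__":
    # Usage: python this_script.py file1.fits file2.fits ...
    for path in sys.argv[1:]:
        n = count_keyword(read_primary_header(path), "DATE-OBS")
        print(path, "DATE-OBS cards:", n)
```

A count of 2 or more for a file would be consistent with the stale-header problem described above.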

To clear the header cache, it is necessary to restart the IC programs on both the M1.BC and M1.RC computers; this will clear the FITS header cache and ensure only one date-obs entry is in the header.

The ICs should be restarted, as a precautionary measure, each afternoon before MODS will be used.
  • Open the MODS KVM in Computer Room B, use power button to turn the monitor on if it is off
  • Scroll-Lock Scroll-Lock to get the main KVM page
  • Select M1.RC and click connect
  • Type quit (nothing happens for ~2 seconds, then you get a C:\MODS> prompt at upper left)
  • Type IC. Screen will clear, wait for the status table at lower left to be populated
  • Repeat for M1.BC
  • Note that in Sept 2013, a new IC was loaded while making a workaround for the red sequencer problems, and this IC prompts for the channel. Enter "B".

Mask selection error

There have been several instances where a MODS script fails on the slitmask command, ending with the MODS1 Dashboard GUI showing no mask and both IN and OUT orange, and a message in the script-running window about the mask being out of position.

Errors in the log were:

2012-11-14T10:02:57.162316 M1.IE>MC1 STATUS: MSELECT Selecting mask 16
2012-11-14T10:02:57.165169 MC1>ACQ STATUS: MSELECT Selecting mask 16
2012-11-14T10:03:19.457404 M1.IE>MC1 ERROR: MSELECT MSELECT=0 Move Fault, position at end of move 0 but requested position 16.000000
2012-11-14T10:03:19.462097 MC1>ACQ ERROR: MSELECT MSELECT=0 Move Fault, position at end of move 0 but requested position 16.000000

This problem has so far occurred only for low elevations (elevation < 42 deg) and certain rotator angles and for masks loaded into positions 13-16 of the cassette.

The solution is being worked on (see IT #4098). In the meantime:
  • If possible, load user masks in the cassette starting in position 24 (i.e., bottom up loading), and avoid putting masks in positions 13-16.

  • If fully loading the cassette is unavoidable, and a mask in position 13-16 will be required at low elevation, then select the mask manually before presetting. The problem is in the mask selection, not in the insertion/retraction, so once the mask has been selected, the acquisition script should proceed without problems (question: should we comment out the slitmask command from the acquisition script?)
    • To select the mask manually: on the MODS GUI, click on the name of the slitmask that is currently in place to open the drop-down menu. Drag the cursor to the ID of the mask desired to select it. The buttons will turn orange while the mask is being selected and return to black-on-grey when the move is complete. It does not matter whether the mask is inserted (In) or in the stow position (Out), but it would save time to insert the mask, since that is the first position commanded by the acquisition script.
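The rule of thumb for when to select a mask by hand can be written as a simple check. This sketch is deliberately conservative: the fault was also tied to certain rotator angles, which are not specified here, so the check flags every low-elevation target for cassette positions 13-16. The names are hypothetical.

```python
RISKY_POSITIONS = range(13, 17)   # cassette positions 13-16
LOW_ELEVATION_DEG = 42.0          # faults seen so far only below this elevation


def select_mask_manually(position: int, elevation_deg: float) -> bool:
    """True if the mask should be selected by hand before presetting."""
    return position in RISKY_POSITIONS and elevation_deg < LOW_ELEVATION_DEG


if __name__ == "__main__":
    # Example values only: mask in position 16, target at 35 degrees elevation.
    if select_mask_manually(16, 35.0):
        print("Select the mask manually on the MODS GUI before presetting.")
```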

HEB temperatures slowly rising

If the HEB temperatures are slowly rising, first confirm that the glycol flow and temperatures are ok. If so, then the thermoelectric device (TED) may be off.

How to check TED Status and power cycle if necessary:

The thermoelectric device (TED) power status can be checked via isisCmd in a terminal. If the terminal is on an obsN machine, it is necessary to specify which MODS the commands are being issued for.

If accessing through mods1data or mods2data:
  1. On the MODS KVM in CRB, select the MODS data machine associated with the MODS of interest, either mods1data or mods2data.
  2. Open a terminal or use one of the existing terminals on the desktop. A terminal can be opened by right-clicking in the background. This will bring up a menu. From this menu select "terminal".
  3. To check TED power status, in this terminal, type:
    >isisCmd m1.bc tedpower
    where m1.bc can be replaced with m1.rc, m2.bc, or m2.rc, depending on which CCD electronics you are noting the temperature rise on. Make sure that if you are on mods2data you use the associated m2.bc or m2.rc, and similarly for mods1data. This isisCmd will return: "DONE: TEDPOWER TEDPower=ON" if the thermoelectric device is on; otherwise it will need to be turned on or power cycled. Proceed to step 4 if that is the case.
  4. To power cycle the TED, type the following in the terminal:
    >isisCmd m1.bc tedpower off
    >isisCmd m1.bc tedpower on
    where m1.bc can be replaced with m1.rc, m2.bc, or m2.rc, depending on which CCD electronics you are noting the temperature rise on. Then once again check the status of the TED by following step 3 above.
If accessing through a terminal on an obsN machine:
  1. On an obsN machine, open a terminal or use one of the existing terminals on the desktop. A terminal can be opened by right-clicking in the background. This will bring up a menu. From this menu select "terminal".
  2. To check TED power status, in this terminal, type:
    >isisCmd --mods1 m1.bc tedpower
    where mods1 can be replaced by mods2 if the issue is on that MODS instead, and m1.bc can be replaced with m1.rc, m2.bc, or m2.rc, depending on which CCD electronics you are noting the temperature rise on. Make sure that if you are looking at mods2 you use the associated m2.bc or m2.rc, and similarly for mods1. This isisCmd will return: "DONE: TEDPOWER TEDPower=ON" if the thermoelectric device is on; otherwise it will need to be turned on or power cycled. Proceed to step 3 if that is the case.
  3. To power cycle the TED, type the following in the terminal:
    >isisCmd --mods1 m1.bc tedpower off
    >isisCmd --mods1 m1.bc tedpower on
    where mods1 can be replaced by mods2 if the issue is on that MODS instead, and m1.bc can be replaced with m1.rc, m2.bc, or m2.rc, depending on which CCD electronics you are noting the temperature rise on. Then once again check the status of the TED by following step 2 above.
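The difference between the two cases (with or without the --modsN argument) and the reply check can be summarized in a short Python sketch. The helper names are hypothetical; the sketch only builds argument lists and inspects reply text, and does not itself run isisCmd.

```python
def isis_argv(host: str, command: str, on_obs_machine: bool) -> list:
    """Build the isisCmd argument list; add --modsN only on obsN machines."""
    host = host.lower()
    if host not in ("m1.bc", "m1.rc", "m2.bc", "m2.rc"):
        raise ValueError(f"unknown IC host: {host}")
    argv = ["isisCmd"]
    if on_obs_machine:
        argv.append(f"--mods{host[1]}")   # m1.* -> --mods1, m2.* -> --mods2
    return argv + [host] + command.split()


def ted_is_on(reply: str) -> bool:
    """True if the reply reports TEDPower=ON, e.g. 'DONE: TEDPOWER TEDPower=ON'."""
    return any(token.upper() == "TEDPOWER=ON" for token in reply.split())


if __name__ == "__main__":
    print(" ".join(isis_argv("m1.bc", "tedpower", on_obs_machine=False)))
    print(" ".join(isis_argv("m2.rc", "tedpower off", on_obs_machine=True)))
    print(ted_is_on("DONE: TEDPOWER TEDPower=ON"))
```

Deriving the --modsN flag from the host name avoids the mismatch of sending, for example, --mods1 with m2.rc.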

Why do we see this issue?

Each time the HEBs are power cycled, this will shut off the ion gauge (igpower) and the thermoelectric device (tedpower), and can leave the LED in an unknown state (ledpower). Restarting the ICs will take care of these 3 items. Running the observing script modsWake.pro will also address these 3 issues. If the TED is off, you may also want to check the status of the ion gauge. It is essentially the same procedure as listed above for the TED, but replace tedpower with igpower in the isisCmd.

-- OlgaKuhn - 24 Nov 2011
Topic revision: r46 - 04 Apr 2016, JenniferPower