Troubleshooting Procedures for MODS team and LBTO staff
These procedures are documented for use by MODS team and LBTO staff. They were separated from the Troubleshooting page that is available on the Partner Observing wiki and available to all observers.
NB: The isis commands can be issued from the lbto account on the obsN computer, or from the mods1data or mods2data machines. If you run these from modsNdata, you do not have to enter the MODS id on the command line (e.g., the
-m 1 or the
Power cycling the guider and WFS camera controllers
The ability to power cycle the guider and WFS camera controllers used to be on the MODS GUI, in the Utilities page. Since the beginning of 2013B, a new version of the MODS GUI has been in use, and it no longer has the buttons to turn off and on these controllers. These functions must now be done from the command line.
Logged in as lbto to an obsN workstation,
- to determine the status of the MODS1 guider or WFS camera controllers, type
isisCmd --mods1 m1.ie util agc or similarly,
isisCmd m1.ie util agc -m 1
isisCmd --mods1 m1.ie util wfs or similarly,
isisCmd m1.ie util wfs -m 1
- if it is on, it will respond as in the following example for the agc:
lbto@obs3:2 % isisCmd --mods1 m1.ie util agc
DONE: UTIL AGC=ON AGC_BRK=OK
- when either of these is off, the breaker state cannot be determined and is reported as UNKNOWN.
lbto@obs3:6 % isisCmd -m 1 m1.ie util agc
DONE: UTIL AGC=OFF AGC_BRK=UNKNOWN
- to turn on the guider or WFS camera controller on MODS1, type
isisCmd -m 1 m1.ie util agc on or
isisCmd -m 1 m1.ie util wfs on
- and to turn off the guider or WFS camera controller on MODS1, type
isisCmd -m 1 m1.ie util agc off or
isisCmd -m 1 m1.ie util wfs off
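When scripting around these commands, it can help to pull the power state and breaker state out of the reply. The following is a minimal sketch, assuming replies always look like the examples above (e.g. "DONE: UTIL AGC=ON AGC_BRK=OK"). The helper name parse_util_reply is hypothetical and is not part of the MODS software.

```shell
#!/usr/bin/env bash
# Sketch only: parse_util_reply is a hypothetical helper.
# Assumes a reply of the form "DONE: UTIL <DEV>=<STATE> <DEV>_BRK=<BREAKER>".

# parse_util_reply REPLY DEVICE  ->  prints "<STATE> <BREAKER>"
parse_util_reply() {
    local reply="$1" device="$2"    # device: AGC, WFS, HEB_R, ...
    local state breaker
    # Value following " <DEVICE>=" (the power state)
    state=$(printf '%s\n' "$reply" |
        sed -n "s/.*[[:space:]]${device}=\([A-Z]*\).*/\1/p")
    # Value following " <DEVICE>_BRK=" (the breaker state)
    breaker=$(printf '%s\n' "$reply" |
        sed -n "s/.*[[:space:]]${device}_BRK=\([A-Z]*\).*/\1/p")
    # Print "?" for anything that could not be found in the reply
    printf '%s %s\n' "${state:-?}" "${breaker:-?}"
}
```

For example, parse_util_reply "$(isisCmd -m 1 m1.ie util agc)" AGC would print the two states on one line, so a script can test for OFF or for a breaker state other than OK.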
Power cycling the HEB
Here is the procedure for power-cycling an HEB. MODS1 red is used in this example. Replace 1 with 2 and red with blue as indicated.
Simplest Solution (you are on-site and can get to the KVMoIP console):
Use the KVMoIP console to connect to mods1data:
isisCmd m1.ie util heb_r off
... count to 30 ...
isisCmd m1.ie util heb_r on
And then using the KVMoIP console, get to the M1.RC keyboard/monitor mode, quit from the IC program (type "quit"), then restart it (type "ic" at the DOS prompt).
If all goes well, this does the full restart and you are done.
If you are off-site and/or cannot reach the KVM:
1) Power cycle the HEB
Logged in as lbto@obsN:
isisCmd -m 1 m1.ie util heb_r off
... count to 30 ...
isisCmd -m 1 m1.ie util heb_r on
Then re-initialize the controller with the IC program still running:
isisCmd -m 1 m1.rc seqinit
This should work the first time, but about half the time it does not, and you see the message,
*** ERROR: SEQINIT Can't Initialize UARTs. Head Electronics UARTs are disabled. Try SEQINIT. EXPSTATUS=DONE
It may take 2-10 repeats before it "catches". We don't know why...
Once SEQINIT succeeds, send these commands to make sure the HEB is set up properly:
isisCmd -m 1 m1.rc tedpower on
isisCmd -m 1 m1.rc igpower on
isisCmd -m 1 m1.rc ledpower off
then you should be able to type
isisCmd -m 1 m1.rc estatus
and see real status info.
These are more-or-less the steps (less seqinit) that the ccdInit.pro script executes, so you could as easily run that script with execMODS instead of doing the other bits after SEQINIT.
Similarly from the GUI console, you could issue the commands as
or from modsCmd if the GUI is running
modsCmd red seqinit
(This is mods2Cmd for MODS2.)
As you can see, if you can just stop/restart the IC program after HEB power cycling, life is usually better.
On rare occasions, stopping and restarting the IC program does not initialize the sequencer correctly. In that case, repeat the SEQINIT command until it succeeds, then issue the three power commands.
The three power commands are needed because of the power-up defaults. The internal thermo-electric cooler (the "TED") defaults to off, and with the TED off, the inside of the box gets hotter than it would otherwise (the TED is part of the internal box cooling system). The IGPOWER command turns on the vacuum ionization gauge; if the gauge is off, you will get bogus pressure readings on the dewar vacuum vessel. LED power is off by default, but the LEDPOWER command makes sure of it. The LEDs are diagnostic readouts on the HEB proper and you do not want them shining inside the instrument; this is a very remote possibility, so the command is a sanity check.
Swapping in a spare IC
There should be hot spares of the 3 types of MODS computers in the MODS rack in CRB: (1) the mods instrument server (modsN machine); (2) the mods data server (modsNdata); and (3) a MODS Instrument Control Computer (IC).
Instructions for swapping in a spare IC, in this case for M1.BC, as documented in IT 7238, are below.
- Power down the MODS 1 computers
- From the back of the rack, disconnect all the cords, cables, etc. from the back of the Spare IC. From left to right, it probably goes: power cord, keyboard, mouse, serial cable, VGA cable, SCSI cable, fiber optic cable. Pay extra attention that the fiber optic cable is hanging clear of moving parts and that it does not get pulled, twisted, kinked, pinched or otherwise abused during the entire swap. Stop and double check if you aren’t sure the fibers are safe.
- From the front of the rack, pull the spare computer straight out on its rails until it clicks into place.
- Find the release toggle on the rail on each side. Push the tab on the left one up, and the tab on the right one down while pulling the computer straight out and free of the rails. The computer weighs 50 pounds, so be sure you are ready to support that weight when it comes free. Get assistance if you aren’t sure.
- Put the spare computer somewhere out of the way.
- Remove the MODS 1 Blue IC from the shipping box.
- Line up the rails on the IC with the rails sticking out of the rack and push the computer in.
- If the computer hangs up partway into the rack, pull it back out until the locking tabs click and then push back in.
- Reconnect the cords and cables to the MODS 1 Blue IC.
- Power up the MODS 1 computers and restart the software.
- Through the Raritan, you should be able to connect to M1.BC to verify that the IC program is running, and to start it, if it is not.
- If you see "No video signal", then check that the video cable from the spare IC has been plugged into the correct VGA connector.
- From the console of the mods data machine (mods1data in this case), restart the appropriate caliban (cb_blue in this case).
- Put the Spare IC in the shipping box, seal up the box, put it downstairs in the lobby area and let Bonnie or Kara and Jerry Mason (OSU) know that it is ready to ship back to OSU.
MODS Raritan (KVM)
The MODS Raritan (KVM) is in the MODS rack in CRB. The picture below shows the KVM console pulled out and ready to use.
The Raritan can be used directly in CRB and, when off-site, it can be accessed through the Raritan Multi-Platform Client, which is installed on the 32-bit mountain obsN (obs2, obs3, obs4) and remote room rm507 machines (rm507-1, rm507-2, rm507-3). Look for it on the Desktop. (Note: As of 22-Nov-2019, the Raritan only runs on the remote room workstation, rm507-2.)
In Computer Room B: Pull out and unfold the LCD monitor and keyboard and turn it on. Two taps on the Scroll Lock key will bring up the KVM port selection menu. If the Raritan KVMoIP switch has been power cycled, you will have to log in. Username is mods and the password is the current support password.
Use the touch pad at the lower right of the keyboard to select the machine you want to monitor, left click, and then left click the "Connect" button that appears on the screen.
When using the Client: The username is admin, and ask an ISA for the password. That should bring you to the KVM port selection menu. From there you can left-click to select and right-click to connect to a port.
When you are done, please remember to disconnect from any port which you have used, as only one connection to a port is allowed.
Re-establishing the sequencer connection
After the HEB is power cycled, or after the fibers from the HE to the Instrument Control Computer have been swapped, the HE-to-sequencer connection for that channel will need to be re-established. To ascertain whether or not the sequencer connection is down, you can look for a red outline around the Housekeeping icon on the left side of the mods GUI and the word "OFF" in red letters to the right of the SEQ in question. If the GUI is not running, then you can check the status by issuing the command:
isisCmd --mods1 m1.bc status
which will return a long line with "+" signs in front of the words SEQ and HE (+SEQ, +HE) if the sequencer connection is good, but "-" signs (-SEQ -HE) if the connection is down.
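The check described above can be scripted. This is a sketch under stated assumptions: that +SEQ, +HE, -SEQ and -HE appear in the status line as separate whitespace-delimited words, which the text implies but does not show in full. The function names has_token and seq_link_state are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: has_token and seq_link_state are hypothetical helpers.
# Assumes the flags are whitespace-separated words in the status line.

# has_token LINE WORD -> succeeds if WORD appears as a whole word in LINE
has_token() {
    case " $1 " in
        *" $2 "*) return 0 ;;
        *)        return 1 ;;
    esac
}

# seq_link_state LINE -> prints "up" only if both +SEQ and +HE are present
seq_link_state() {
    if has_token "$1" "+SEQ" && has_token "$1" "+HE"; then
        echo up
    else
        echo down
    fi
}
```

A possible use is seq_link_state "$(isisCmd --mods1 m1.bc status)", which prints "down" whenever either flag is missing, so an unexpected reply is treated as a connection that needs attention.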
There are two ways to re-establish the HE to sequencer connection:
Restarting the IC program:
- Using the Raritan, connect to the channel whose sequencer is down. If it is MODS1 Blue, for example, connect to the console of M1.BC.
- Most likely there will not be a prompt. Type "quit" and you should see a DOS prompt.
- Type "ic" at the prompt. Wait about 1-2 minutes and you should see the DEWAR PRESSURE and TEMPERATURES reporting.
Doing a "seqinit":
- From a terminal logged in as lbto on an obsN computer, type:
isisCmd --mods1 m1.bc seqinit
- if successful, it will return
DONE: SEQINIT UARTINIT Done.
- if not, try again. Sometimes it takes a few tries.
- Replace the 1 with a 2 for MODS2 instead of MODS1 and the "b" with an "r" for the red instead of the blue channel.
After re-establishing the sequencer connection, remember to reset the power states of the vacuum ionization gauge, the thermo-electric device and the LED. modsNWake does this, but it is best just to issue the following 3 commands after the seqinit, in case you will not be using MODS right away.
In summary, the seqinit and the 3 power commands:
lbto@obs4:2 % isisCmd --mods1 m1.bc seqinit
DONE: SEQINIT UARTINIT Done.
lbto@obs4:3 % isisCmd --mods1 m1.bc tedpower on
DONE: TEDPOWER TEDPower=ON
lbto@obs4:4 % isisCmd --mods1 m1.bc igpower on
DONE: IGPOWER IGPower=ON
lbto@obs4:5 % isisCmd --mods1 m1.bc ledpower off
DONE: LEDPOWER LEDPower=OFF
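Because seqinit often needs several attempts, the sequence above lends itself to a small retry wrapper. The sketch below is not an official MODS tool. It assumes that a successful reply begins with "DONE:" (as in the examples), that retrying is harmless, and that the script runs under bash as lbto on an obsN machine. The function names, the ISIS_CMD override variable and the default retry count and pause are all hypothetical choices.

```shell
#!/usr/bin/env bash
# Sketch only. ISIS_CMD can be overridden (e.g. for a dry run with a stub).
ISIS_CMD="${ISIS_CMD:-isisCmd}"

# seqinit_retry MODS HOST [MAX_TRIES] [PAUSE_SECONDS]
#   e.g. seqinit_retry 1 m1.bc
seqinit_retry() {
    local mods="$1" host="$2" max="${3:-10}" pause="${4:-5}"
    local try reply
    for (( try = 1; try <= max; try++ )); do
        reply=$("$ISIS_CMD" "--mods${mods}" "$host" seqinit 2>&1)
        echo "try ${try}: ${reply}"
        case "$reply" in
            DONE:*) return 0 ;;      # sequencer initialized
        esac
        sleep "$pause"               # brief pause before the next attempt
    done
    return 1                         # never succeeded within max tries
}

# restore_heb_power_states MODS HOST
#   Sends the three power commands listed above, in the same order.
restore_heb_power_states() {
    local mods="$1" host="$2"
    "$ISIS_CMD" "--mods${mods}" "$host" tedpower on
    "$ISIS_CMD" "--mods${mods}" "$host" igpower on
    "$ISIS_CMD" "--mods${mods}" "$host" ledpower off
}
```

Typical use would be: seqinit_retry 1 m1.bc && restore_heb_power_states 1 m1.bc. The wrapper decides success from the reply text and not from the exit status, since the exit status of isisCmd on error is not documented here.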
Cleaning mods1data and mods2data
MODS1 and MODS2 data are stored under
on mods1data and mods2data, respectively. These fill up and every month, approximately, a set of
files will need to be deleted after it has been confirmed that they have been ingested into the archive. Meanwhile, it is good practice to organize the
recently taken files into UT date subdirectories under
There are two scripts in
to organize files and to check the numbers of red and blue files,
. There is a third
to report the number of files in a listing obtained from the archive.
Following is the procedure:
- To organize files in mods1data and mods2data (hereafter modsNdata where N=1 or N=2)
ls (just to see what dates are covered in the directory, in the example below, let's say files from 20191030 are included)
- This creates a directory called 20191030 under the current one (which is /lhome/data) and moves all of the files with names including 20191030 to that directory. Note that if some of these files were not taken on 20191030, they will still be moved to the 20191030 subdirectory.
- To check the numbers of blue and red files in a UT-date subdirectory of
./lwc.sh 20191030 will report the numbers of blue and red files in 20191030.
- To check whether the files have been ingested in the archive:
- Do the following on your laptop or on one of the obsN or rm507-N computers, not on modsNdata.
- Get the list
- Login to the archive interface, archive.lbto.org.
- Click "Single Instrument Search" and MODS, and then enter the two dates (YYYY-MM-DD) in the area for general searches.
- Click Search
- Click Download and select URL list (CSV list and VOTABLE list also work).
- Download the result and give it a memorable name (Sep2Oct2019.txt, for example)
- Report the number of mods1b, mods1r, mods2b or mods2r files in that list.
./gwc.sh N filename YYYYMMDD, where N = 1 for MODS1 and 2 for MODS2, filename is the name of the list (e.g. Sep2Oct2019.txt), and YYYYMMDD is the UT date for which you are searching (e.g. 20191030).
- The query for MODS1 files taken (or with filename containing) 20191017, and the reply, will look like:
olga% ./gwc.sh 1 SepOct.txt 20191017
1 SepOct.txt 20191017
There are 50 mods1b.20191017 files in SepOct.txt
There are 74 mods1r.20191017 files in SepOct.txt
- Delete files from modsNdata that are also in the archive.
- Back on modsNdata, in /lhome/data:
- rm -rf YYYYMMDD to delete the entire subdirectory.
- What if the numbers of files in modsNdata and the archive do not match?
- Determine which files are missing from the archive.
- on modsNdata,
- copy these files from /lhome/data/YYYYMMDD to /lhome/data/NotInArchive/
- and copy these files from /lhome/data/YYYYMMDD to /archive/data
- /archive/data is the mount point for /newdata
- Are these files in the archive now?
- If yes, then you're done.
- If no, try to find out why.
- the first course is to run "fitsverify" and see if any errors are reported.
- ssh lbto@obsN
- cd /newdata
- fitsverify mods1r.20191030.0001.fits
- the 2nd course of action is TBD. Usually at this point, the archive group may need to be involved.
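When the counts do not match, the step "determine which files are missing from the archive" can be done by comparing file names. The sketch below assumes that the downloaded URL list contains each archived file's name (as the gwc.sh output suggests) and that local files are named like mods1r.20191030.0001.fits. The function name missing_from_archive is hypothetical, and since the list lives on a different machine than modsNdata, one of the two inputs has to be copied over first.

```shell
#!/usr/bin/env bash
# Sketch only: missing_from_archive is a hypothetical helper.
# Assumes the archive list mentions each ingested file by name.

# missing_from_archive DIR LIST
#   Prints the name of every mods*.fits file in DIR that does not
#   appear anywhere in LIST.
missing_from_archive() {
    local dir="$1" list="$2" f name
    for f in "$dir"/mods*.fits; do
        [ -e "$f" ] || continue          # directory had no matching files
        name=$(basename "$f")
        # -F: match the name literally, -q: only the exit status matters
        grep -qF -- "$name" "$list" || echo "$name"
    done
}
```

For example, missing_from_archive /lhome/data/20191030 Sep2Oct2019.txt prints nothing when every file has been ingested; any names it does print are the candidates to copy to NotInArchive.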
Mechanism timeout and red instead of grey background on MODS GUI
The text below describes the symptoms and troubleshooting steps to diagnose and resolve a mechanism timeout error.
- A command to move a mechanism ends with a TIMEOUT error and, on the GUI, the box that usually displays the mechanism position is red.
- e.g. the "instconfig red grating" command on MODS1 ended with the error: WARNING: **FAULT** - RGRATING RGRATING=TIMEOUT cannot read from 192.168.139.101:8006
- Reboot the associated comtrol device. This is really only indicated if you suspect that the modsNdata computer crashed hard.
- Check the power status of the associated MicroLYNX controller.
- Identify the MicroLYNX controller that is involved, using this table of MicroLYNX controller assignments.
- Issue the command
isisCmd --modsN mN.ie ieb c mlc # status
where c is either b or r depending on the affected channel and # is the number of the MicroLYNX determined in step 1. (The --modsN option is only needed when running this command from an obsN machine.)
- Ask someone to go up to MODS and check the light next to the affected MicroLYNX. Is it green or red?
- If it is red, try to power cycle the controller:
isisCmd --modsN mN.ie ieb c mlc # off
isisCmd --modsN mN.ie ieb c mlc # on
- If it still is red, then the MicroLYNX controller may need to be replaced. The following steps require coordination with Rick Pogge:
- reset the MicroLYNX controller remotely.
- swap cables to use a spare MicroLYNX controller. The code for the mechanism will need to be uploaded to the spare controller.
- replace the MicroLYNX controller. The code for the mechanism will need to be uploaded to the replacement controller.
Dewar pressure(s) reading zero
If the pressure in one of the dewars reads zero or N/A, the ionization gauge (IG) may have glitched or may have some other problem. The symptom may appear in the ISp's daily instrument monitoring, on the housekeeping page of the MODS GUI, or in the output of the command modsTemps 1 red or modsTemps 1 blue.
Recovery procedure, using mods1 blue for the example.
Remember that from the lbto account on obsN, use the --modsN option, but don't use this option if logged in as mods on modsN or modsNdata.
- Send command:
isisCmd --mods1 m1.bc igpower off
- Count to 5
- Send command:
isisCmd --mods1 m1.bc igpower on
- Wait 30+ seconds (updates on a slow internal cycle).
- For mods2, replace --mods1 with --mods2, and m1.bc with m2.bc
HEB temperatures slowly rising
The head electronics boxes (HEBs) have thermoelectric devices on their power supply boards to protect the system against operating at too high a temperature. These devices (TEDs) will cut off the HEB power when the temperature exceeds 45 C. The devices also aid cooling. If instrument cooling has been off and the boxes have reached 45 C, it will take about 1.5 hours for them to cool and reach equilibrium after cooling has been restored. If interested, the MODS1 Red HEB cooldown curve is among the attachments at the bottom of this page.
Upon restoring power to the HEB, the IC process should be restarted. Doing this ought to restore power to the TED as well as the vacuum ionization gauge (IG; see above), but one should check. To check the power state of the TED and IG, issue the following commands (these are for MODS 1 Blue channel, but replace the 1 with a 2 for MODS2 and the "b" with an "r" for Red):
To check the TED power status:
isisCmd --mods1 m1.bc tedpower (logged in as user lbto on an obsN machine)
- if the thermoelectric device is on, this will return:
"DONE: TEDPOWER TEDPower=ON"
To check the IG power status:
isisCmd --mods1 m1.bc igpower
To power cycle the TED, type the following commands:
isisCmd --mods1 m1.bc tedpower off
isisCmd --mods1 m1.bc tedpower on
To power cycle the IG, type:
isisCmd --mods1 m1.bc igpower off
isisCmd --mods1 m1.bc igpower on
Resetting Comtrol Devices on Port Timeouts
Occasionally, we see problems with the serial port interfaces timing out. (This used to occur somewhat frequently when the UPS systems were not robust.) The Comtrol devices can be rebooted via
by bringing up the IP address and clicking the "reboot".
Rebooting the comtrol closes the TCP sockets and their associated serial ports, then does a warm restart (not a power cycle) of the port server proper. It is only indicated in very rare circumstances where the IE program halts because of an abrupt computer crash (e.g., someone pulls the plug on the computer rack) and there are “dangling” sockets. The reboot clears the ports so that, once the instrument host computer and then the IE program have been restarted, the IE program can connect to the TCP sockets on the comtrol unit in question.
Note that rebooting the comtrol is not necessary after recovering from a mods1data freeze-up, like those we have been seeing lately (IT 7285).
After rebooting the comtrol ports, it is necessary to restart the IE service:
mods1 start ie
mods2 start ie
For example, in Oct-2014, Olga saw the following complaints from the
Poking MODS1 Mechanisms...
Common focal plane:
*** ERROR: HATCH HATCH=TIMEOUT cannot write to 192.168.139.101:8001
DONE: CALIB CALIB=OUT
DONE: AGWX AGWXS=89.841
DONE: AGWY AGWYS=33
DONE: AGWFOC AGWFS=-2.000
DONE: AGWFILT AGWFILT=1 AGWFNAME='Clear'
*** ERROR: MSELECT MINSERT=TIMEOUT cannot write to 192.168.139.102:8012
*** ERROR: MINSERT MINSERT=TIMEOUT cannot write to 192.168.139.102:8012
*** ERROR: DICHROIC DICHROIC=TIMEOUT cannot write to 192.168.139.101:8002
DONE: BCOLTTFA BCOLTTFA=16853.3
DONE: BCOLTTFB BCOLTTFB=17997.3
DONE: BCOLTTFC BCOLTTFC=20052.3
DONE: BGRATING BGRATING=2 GRATNAME='G400L'
*** ERROR: BGRTILT1 BGRTILT1=TIMEOUT cannot write to 192.168.139.112:8010
DONE: BCAMFOC BCAMFOC=3329
DONE: BFILTER BFILTER=7 FILTNAME='ND1.5'
DONE: RCOLTTFA RCOLTTFA=15846.5
*** ERROR: RCOLTTFB RCOLTTFB=TIMEOUT cannot write to 192.168.139.101:8004
*** ERROR: RCOLTTFC RCOLTTFC=TIMEOUT cannot write to 192.168.139.101:8005
*** ERROR: RGRATING RGRATING=TIMEOUT cannot write to 192.168.139.101:8006
*** ERROR: RGRTILT1 RGRTILT1=TIMEOUT cannot write to 192.168.139.101:8007
*** ERROR: RCAMFOC RCAMFOC=TIMEOUT cannot write to 192.168.139.102:8010
*** ERROR: RFILTER RFILTER=TIMEOUT cannot write to 192.168.139.102:8009
From any browser on a machine that is connected to the mountain network, open the URLs to the IP addresses and click
the Reboot button at the bottom of the page:
|| MODS1 Red IEB comtrol 1
|| MODS1 Red IEB comtrol 2
|| MODS1 Red IMCS comtrol
|| MODS1 Blue IEB comtrol 1
|| MODS1 Blue IEB comtrol 2
|| MODS1 Blue IMCS comtrol
|| MODS2 Red IEB comtrol 1
|| MODS2 Red IEB comtrol 2
|| MODS2 Red IMCS comtrol
|| MODS2 Blue IEB comtrol 1
|| MODS2 Blue IEB comtrol 2
|| MODS2 Blue IMCS comtrol
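To see which comtrol units are involved before rebooting anything, the TIMEOUT lines in output like the example above can be tallied by IP address. This is a small sketch that assumes the error lines keep the form "... =TIMEOUT cannot write to IP:PORT" (or "cannot read from"). The function name timeout_summary is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: timeout_summary is a hypothetical helper.
# Reads command output on stdin and prints "<ip> <count>" per comtrol
# address, most frequent first.
timeout_summary() {
    grep 'TIMEOUT cannot' |                          # keep only timeout errors
        grep -oE '[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+' |  # extract the IP address
        sort | uniq -c | sort -rn |                  # count per address
        awk '{ print $2, $1 }'                       # print "ip count"
}
```

Fed the Oct-2014 output shown above, this reports 6 timeouts for 192.168.139.101, 4 for 192.168.139.102 and 1 for 192.168.139.112, which points to the units worth checking first.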
when fitsflush does not recover everything...
As soon as the observer notices that the data transfer has stalled (index(last_file) < index(next_file)-1), they need to stop taking data. The first recourse is to run fitsflush:
- Pause the current exposure (the script itself cannot be paused)
- In the Command Window, type "blue fitsflush" or "red fitsflush", depending on which channel is stuck.
- Wait about a minute while the files are transferred. Sometimes all of the files transfer and the index number of the last file is still not that of the next file minus 1; this depends on the transfer disk on which it was stored. To confirm that all of the files have been transferred, check /newdata or the log that modsDisp prints to the screen as it displays each new file.
If fitsflush does not transfer all of the files, then you may need to go a bit deeper, following these instructions, which have been written for the case where the transfer has stalled on the blue channel of MODS2:
- Pause the script (stop taking new data!)
- Login to mods2data as user mods
- Stop the Caliban with stuck data (mods2 stop cb_blue)
- Start the Caliban agent with the stuck data (mods2 start cb_blue) – this opens the CB/Blue console on your desktop
- In the CB window, scrape 50 images from DISK1 and DISK2 by typing the following commands at the CB/Blue console prompt. This will take some time. When it runs out of images to transfer, it reports, in red text, that there is no END card in the header.
recover 50 disk1
recover 50 disk2
- Quit out of cb_blue (type “quit”) then restart the Caliban
- Take a single test image (short bias) on the blue to confirm restart
- resume observing
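The stall condition quoted at the top of this section, index(last_file) < index(next_file)-1, is simple arithmetic on the two image numbers shown on the MODS dashboard. A minimal sketch, assuming the indices are the zero-padded four-digit counters seen in file names such as mods1r.20191030.0001.fits (the function name transfer_stalled is hypothetical):

```shell
#!/usr/bin/env bash
# Sketch only: transfer_stalled is a hypothetical helper.

# transfer_stalled LAST NEXT
#   Succeeds (exit status 0) when index(last) < index(next) - 1,
#   i.e. when at least one image has not yet arrived.
transfer_stalled() {
    local last="$1" next="$2"
    # 10# forces base 10, so zero-padded values like 0008 are not
    # misread as (invalid) octal numbers by bash arithmetic.
    (( 10#$last < 10#$next - 1 ))
}
```

For example, with the last image at 0008 and the next at 0010, the function succeeds, meaning image 0009 is still in transit or stuck; with 0009 and 0010 it fails, meaning the transfer is up to date.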
modsDisp not displaying latest images which are on /newdata
Note that, if the files are not on /newdata, and the difference between the index number for the next image and the last image, as displayed on the MODS dashboard, is greater than one, then this would be a problem of data transfer which "blue/red fitsflush" should resolve (see Red or Blue images not appearing in /newdata).
This discussion refers to the situation where the files are in /newdata but are not being displayed by modsDisp. modsDisp looks for the files MODSxx.new in /newdata which should contain the names of the latest MODSxx images.
Check the permissions on the *.new files in /newdata. This can be done from any machine or account that can read /newdata:
ls -lag /newdata/*.new
the permissions should be "-rw-rw-rw-", e.g.:
lbto@obs4:17 % ls -lag /newdata/*.new
-rw-rw-rw-. 1 nobody 26 Oct 28 18:05 /newdata/MODS1B.new
-rw-rw-rw-. 1 nobody 26 Oct 28 18:04 /newdata/MODS1R.new
-rw-rw-rw-. 1 nobody 26 Sep 5 15:16 /newdata/MODS2B.new
-rw-rw-rw-. 1 nobody 26 Sep 5 15:15 /newdata/MODS2R.new
- If all 4 files exist and have the correct permissions, nothing more needs to be done.
- If all 4 files exist but one or more have "-rw-r--r--" permissions set, then the files are readonly to the MODSn data archiver, and cannot be overwritten for subsequent images.
- Login to the archive machine as root (Stephen Hooper, and Riccardo Smareglia and Cristina Knapic in Trieste, can do this).
chmod 0666 /path/to/newdata/MODS*.new
This sets the permissions for all four files. It is harmless to reset permission on a file, so no need to be selective.
- Never delete MODS*.new files with -rw-r--r-- permissions. While the MODSn data archiver will be able to write these files the first time, the default readonly permissions mean that all subsequent overwrites by the MODSn data archiver will be blocked.
- If one or more of the MODSxx.new files are missing completely, then the following corrective is required:
Example: MODS1R.new and MODS1B.new are missing from /newdata
Login to the archive machine as root
chmod 0666 MODS1*.new
the "touch" commands create zero-length placeholder files, chmod sets
Without the "touch" steps, you have to wait until (in this example) MODS1 writes files of the missing type, and then have someone with root privileges on standby to change the file mode bits before the next images are written, which is not really practical.
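The three cases above (all present and writable, wrong permissions, missing) can be checked in one pass from any account that can read /newdata. This is a sketch and not an official tool. It assumes GNU stat (the -c option, as on Linux), and the function name check_new_files is hypothetical. It only reports; the corrective chmod and touch steps still have to be done as root on the archive machine.

```shell
#!/usr/bin/env bash
# Sketch only: check_new_files is a hypothetical helper.
# Requires GNU stat (Linux); BSD stat uses different options.

# check_new_files [DIR]   (DIR defaults to /newdata)
#   Reports each MODSxx.new file as OK, BADMODE or MISSING and
#   returns non-zero if anything needs attention.
check_new_files() {
    local dir="${1:-/newdata}" rc=0 name path mode
    for name in MODS1B MODS1R MODS2B MODS2R; do
        path="$dir/$name.new"
        if [ ! -e "$path" ]; then
            echo "MISSING  $path"
            rc=1
        else
            mode=$(stat -c '%a' "$path")   # octal mode, e.g. 666 or 644
            if [ "$mode" = 666 ]; then
                echo "OK       $path"
            else
                echo "BADMODE  $path ($mode)"
                rc=1
            fi
        fi
    done
    return $rc
}
```

Running check_new_files with no argument inspects /newdata; a BADMODE line with 644 corresponds to the "-rw-r--r--" case described above.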
Current Sensing Relay (CSR) failure
If the HEB cannot be turned on remotely and shows a breaker fault:
lbto@obs2:41 % isisCmd --mods1 m1.ie util heb_r on
DONE: UTIL HEB_R=ON HEB_R_BRK=FAULT
- First try just to turn off and then back on the HEB.
lbto@obs2:43 % isisCmd --mods1 m1.ie util heb_b off
DONE: UTIL HEB_B=OFF HEB_B_BRK=UNKNOWN
lbto@obs2:44 % isisCmd --mods1 m1.ie util heb_b on
DONE: UTIL HEB_B=ON HEB_B_BRK=FAULT
- If that doesn't work, check that the HE power toggle switches on the sides of each IUB are in the AUTO/REMOTE position. They need to be in this position for remote control. These are 3-position switches (REMOTE, ON, OFF) located on the IUBs next to the fiber connections for the HE. See the picture (on_off_remote_switch_m2r.png.jpg) of this switch for MODS2 red.
- If they are, and the error persists, cycle the HE breakers in the IUBs (see breakers_m2r.jpg). When they do trip, it's hard to tell that they are not ON. So cycle them OFF and then back ON again.
- If that does not help, then go to the HE power toggle switch on the side of the IUB and switch from AUTO to ON, just as a test to see if the HEs can be powered up through the circuit breaker. If they come on then, we need to troubleshoot the start logic (CSRs).
- If, with the switch in the ON position to bypass remote operation, the HE can be turned on, one can proceed in this way for the rest of the night, but the CSR should be replaced at the earliest opportunity. The procedure for replacing a failed CSR is 603s104b.
- on/off/remote switch in the MODS2 IUB:
- HEB breakers in the MODS2 IUB:
2013 September (IT #4813 and IT #4814):
The sieve mask images (sieveSnap) that are taken each afternoon should be checked for any readout anomalies. The first images after summer shutdown (20130902) showed obvious horizontal smearing of the spots within +/- 300 pixels from the center of the image, X=1544 for the 3K x 3K region of interest (ROI) (see attached image, mods1r.2013_0706.0003_v_0901.0006.jpg, which compares pre- and post-summer shutdown images, zoomed about the region where the horizontal smearing starts). Upon further investigation, this smearing occurred only for the 3K x 3K and 4K x 3K regions of interest; the sieve spots on the 8K x 3K and 1K x 1K ROI images were round and sharp (see the attached image, mods1r_readout.png).
This problem required assistance from the MODS team, specifically Rick Pogge and Jerry Mason.
Troubleshooting proceeded as follows:
- stopped all mods services and restarted these, following the procedures for Starting Up the Instrument Control Software.
- obtained sieve Mask images using different regions of interest. Noted for what ROIs the horizontal smearing appeared.
- power cycled the red HEB (see Power cycling the HEB)
- rebooted the red IC (quit and start the IC program on the M1.RC computer).
- swapped the fiber connectors at the DOS computers in CRB to see whether the problem arises from the host computer (i.e. the sequencer) or the HEB.
- in this case, found that it arises from the host computer.
- swapped the sequencer card from the M1.RC computer currently in use to the spare IC.
- installed a new sequencer card into the original M1.RC computer.
- On 11 September 2013, a new sequencer card was installed in the original M1.RC computer. The spare IC remains in the rack in CRB, ready to be used if and when necessary.
- On 23 September 2013, there was a problem with quadrant 1 showing only bias levels, but no response to light. James Riedl reseated boards in the red HEB. Since that time, no further problems have been seen with quadrant 1, but we saw a different manifestation of a problem with the red sequencer - this time, the sieve spots were not horizontally smeared, but instead there were horizontal bands covering all columns and a range of rows about the central one.
- The two MODS2 "flight" sequencer boards were shipped from OSU, and, on 30 September 2013, one of these was installed in M1.RC. About 20 images were taken with both channels and the images look good.
MODS log files
mods1 (instrument control)
| Environmental sensor data - temps and pressures
| details for file transfers
| Project, channel filename, object info.
IS runtime log