MAT Troubleshooting (As of October 20, 2015 the MAT is no longer on the telescope)

May 2013 - K. Summers - updates for current configuration
19 March 2006 - M.D. De La Pena - Initial notes

There are two components to the MAT software: MATGUI and MAT Server. The GUI is in C++ and runs on Linux; the server is written in C and runs on the "Single Board Computer" (SBC) which is a mini-Linux system. The SBC is physically contained within the MAT box mounted on the C-ring of the telescope.

While using the MAT, it is advisable to monitor all of the software components.

MONITORING THE MAT SERVER

The machine is mat.mountain.lbto.org, and the login used is root. If you are having problems with the MAT server software or the MAT hardware, the cause could be one of the following:

  1. some needed processes are missing
  2. there is no disk space
  3. the hardware needs to be power cycled

PROCESSES

When the MAT Server software is running properly, you should see at least the following processes:
 $ ps -aux
     29         root       1996   S   /home/mat/matserv
     32         root        584   S   focuser /proc/bus/usb/001/006 1 768
     33         root        584   S   focuser /proc/bus/usb/001/008 0 768
     34         root        584   S   filter /proc/bus/usb/001/005 1 512
     35         root        584   S   filter /proc/bus/usb/001/007 0 512
     36         root       1936   S   camera /proc/bus/usb/001/004 1 256
     37         root       1916   S   camera /proc/bus/usb/001/009 0 256

The matserv process is the main process; it invokes the other six processes. There should be two focuser, filter, and camera processes -- one for each camera. In the following line, the 1 denotes the camera being supported. The wide-field camera is 0; the narrow-field camera is 1.
focuser /proc/bus/usb/001/006 1 768 
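
The process check described above can be sketched as a small shell helper. This is only an illustration: the function name mat_check_procs is hypothetical, and it assumes that awk is available on the machine where it is run and that the process listing looks like the one shown above. The expected counts (one matserv, two each of focuser, filter, and camera) are taken from the text.

```shell
# Hypothetical helper: reads a process listing on stdin and reports any
# MAT server process type that does not appear the expected number of times.
# Prints nothing and returns 0 when all seven processes are present.
mat_check_procs() {
    awk '
        /\/home\/mat\/matserv/       { seen["matserv"]++ }
        /focuser \/proc\/bus\/usb\// { seen["focuser"]++ }
        /filter \/proc\/bus\/usb\//  { seen["filter"]++ }
        /camera \/proc\/bus\/usb\//  { seen["camera"]++ }
        END {
            # Expected counts: one main server, one of each helper per camera.
            want["matserv"] = 1; want["focuser"] = 2
            want["filter"]  = 2; want["camera"]  = 2
            n = split("matserv focuser filter camera", order, " ")
            bad = 0
            for (i = 1; i <= n; i++) {
                name = order[i]
                if (seen[name] + 0 != want[name]) {
                    printf "%s: expected %d, found %d\n", name, want[name], seen[name] + 0
                    bad = 1
                }
            }
            exit bad
        }
    '
}
```

Typical use would be
 $ ps -aux | mat_check_procs
which stays silent when everything is running and otherwise prints one line per process type with the wrong count.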

The devices internal to the MAT hardware communicate with one another via USB devices. If the matserv is not running for some reason, you will need to restart it. The matserv will start all the focuser, filter, and camera processes too. Before you invoke a new version of the matserv, you need to kill the currently running focuser, filter, and camera processes. This can be done with

   $ killall -9 focuser
   $ killall -9 filter
   $ killall -9 camera
   $ /home/mat/matserv > /tmp/matserv.log 2>&1 & 

Note that there is a matserver executable dated April-2006 in the /root directory (the default home for the root user) that is not used.
We are running the Mar-2006 matserv version from /home/mat .

Monitoring with ps -aux, you should see the various processes being invoked.

Stay logged into the MAT to see the messages written to STDOUT or STDERR, which are displayed on your screen. An alternative way to monitor the MAT Server is to type
 $ tail -f /tmp/matserv.log

This will show you the tail end (~10 lines) of the /tmp/matserv.log file, with new output appended as the file grows. To inspect the messages after the fact, open /tmp/matserv.log with an editor or some other tool; the file is world readable.
Note that the log file is refreshed every time the machine is rebooted. Based upon past performance of the MAT, this file could have a rather short lifetime, so get what you need from it when you can.

DISK SPACE

It is possible to run out of disk space on the SBC. This will cause errors (all commands return NAK) on the MATGUI side of the software. To see the amount of disk space available, type
$ df -h

If the filesystem is full (100%), then you will have to delete some files to get the system to function properly. Since the disk space problem first appeared, Dan has done a lot of cleanup on the disk. If files must be deleted for any reason, you MUST consult with Dan, or you run the risk of deleting something important.
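
The check for a full filesystem can be sketched as a filter over the df output. This is a rough illustration: the function name full_filesystems and the optional threshold argument are hypothetical, and it assumes awk is available and that df prints a use percentage column followed by the mount point, as common df implementations do.

```shell
# Hypothetical helper: reads df output on stdin and prints the mount point
# and use percentage of every filesystem at or above a threshold.
# $1 is the threshold in percent (default 100, i.e. completely full).
full_filesystems() {
    awk -v limit="${1:-100}" '
        NR > 1 {
            # Find the "NN%" column wherever it sits in the line.
            for (i = 1; i <= NF; i++) {
                if ($i ~ /^[0-9]+%$/) {
                    pct = $i
                    sub(/%/, "", pct)
                    if (pct + 0 >= limit + 0) print $NF, $i
                }
            }
        }
    '
}
```

For example,
 $ df -h | full_filesystems 95
would list any filesystem that is at least 95% full, which gives some warning before commands start returning NAK.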

POWER CYCLE

If one of the USB devices resets itself for some reason, or the MAT hardware is simply unresponsive, the MAT server will probably need to be power cycled.

Until there is a working "iboot" type of device connected to the MAT hardware, the way to power cycle the hardware is to pull the plug. The matserv program is launched on reboot via the /etc/rc.sysinit file.

USB DEVICES

Sometimes, even after a power-cycle, one or more of the camera, filter wheel, or focuser devices still doesn't connect to the server. If the server doesn't come up and complains
Server aborting for lack of resources.

it could mean that it cannot find one of the devices. Have someone reseat all the USB and RJ45 connections and power-cycle again.

See notes in IT4383.

MONITORING THE MATGUI

Please note that only one version of the MATGUI should be run at a time!

The MATGUI generates messages which are sent to the MATGUI message box and/or the SYSLOG for monitoring and diagnostic purposes. For most functions, you can track the progress of a command by watching the MATGUI.log file located in the directory where you launched the GUI. This file location is configurable via the GUI, from the Options tab.

NOTES

Check that the NTP daemon is running on the MAT. If the NTP daemon is not running, there will be a time error in the DATE keyword stored in the FITS header, as well as an error in the time of the machine. This error translates to an error in the hour angle. To see if the NTP daemon is running, type
 $ ps -aux | grep ntp 

and you should see a process like:
 root       1708   S   /sbin/ntpd 

If you do not see an NTP daemon process running, contact Kellee or Dan for assistance. You can also check the time on the machine by typing date on the command line.
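
One drawback of grepping for ntp is that the grep command itself can show up in the listing. A minimal sketch of a stricter test follows; the function name ntpd_running is hypothetical, and it assumes a grep that supports the -E and -q options.

```shell
# Hypothetical helper: reads a process listing on stdin and succeeds only
# if a line contains "ntpd" as a command name (for example /sbin/ntpd).
# A line such as "grep ntp" does not match.
ntpd_running() {
    grep -Eq '(^|[/ ])ntpd( |$)'
}
```

It could be used as
 $ ps -aux | ntpd_running || echo "NTP daemon is not running"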

There has been trouble with icing on the window inside the MAT. You can try to mitigate this situation by changing the temperature of the appropriate camera on the Temperature tab of the MATGUI.

EXAMPLE

Here is one example of errors generated by the MAT Server code. This information was printed to STDOUT/STDERR on the SBC when trying to take an exposure with the wide-field camera. Lines 2-6 show that the command to take a 100 msec exposure with the wide-field camera, with the focus set to zero and the filter in position zero, was sent from the MATGUI to the MAT Server code. Line 7 indicates there was an error in obtaining the exposure on the SBC. These log messages show a problem on the MAT SBC and not in the MATGUI software. The SYSLOG messages for the MATGUI are included below for reference. Note that all the commands sent to the MAT Server from the MATGUI software returned NAK (the error indicator); these appear in the MATGUI SYSLOG as "Result is NAK" lines.
how do you see the NAK? I don't get the note (KS)

MAT SBC STDOUT/STDERR
  1 # client accept worked! ip=10.30.0.243
  2 Focuser homed
  3 focus returned was: 0
  4 Exposure set to 100 msec.
  5 Resetting Filter wheel. . .Filter position set to 0
  6 exposing widefield camera
  7 expose returns -2
  8 client dies with bad header.
  9 error - abort.
 10 Network error detected - restarting server
 11 Internal error detected - restarting camera server. Killer was -2

MATGUI SYSLOG
 Mar 16 15:57:42 lbtmu01 LBT: Instantiate the Command Line
 Mar 16 15:57:42 lbtmu01 LBT: line ='0$1$100$0$14$pass_all$1234$0$0$0$B2000'
 Mar 16 15:57:42 lbtmu01 LBT: Command line OK!
 Mar 16 15:57:42 lbtmu01 LBT: TO cl->execute
 Mar 16 15:57:42 lbtmu01 LBT: Initializing the MAT
 Mar 16 15:57:42 lbtmu01 LBT: Set Focuser to Home
 Mar 16 15:57:42 lbtmu01 LBT: Socket FD: 24
 Mar 16 15:57:42 lbtmu01 LBT: Length sent: 33554432 #of chars sent: 4
 Mar 16 15:57:42 lbtmu01 LBT: Chars sent: H0 #of chars sent: 2
 Mar 16 15:57:42 lbtmu01 LBT: XACTION: Focus to Home
 Mar 16 15:57:42 lbtmu01 LBT: ********** Result is NAK **********
 Mar 16 15:57:42 lbtmu01 LBT: Sent Focuser to Home - DONE
 Mar 16 15:57:42 lbtmu01 LBT: Socket FD: 24
 Mar 16 15:57:42 lbtmu01 LBT: Length sent: 33554432 #of chars sent: 4
 Mar 16 15:57:42 lbtmu01 LBT: Chars sent: J0 #of chars sent: 2
 Mar 16 15:57:42 lbtmu01 LBT: XACTION: Get Focuser Position
 Mar 16 15:57:42 lbtmu01 LBT: ********** Result is NAK **********
 Mar 16 15:57:42 lbtmu01 LBT: Current focus: 0 Requested focus: 0
 Mar 16 15:57:42 lbtmu01 LBT: Socket FD: 24
 Mar 16 15:57:42 lbtmu01 LBT: Length sent: 100663296 #of chars sent: 4
 Mar 16 15:57:42 lbtmu01 LBT: Chars sent: E0 100 #of chars sent: 6
 Mar 16 15:57:43 lbtmu01 LBT: XACTION: Set Camera Exposure
 Mar 16 15:57:43 lbtmu01 LBT: ********** Result is NAK **********
 Mar 16 15:57:43 lbtmu01 LBT: Set the Exposure Time - DONE
 Mar 16 15:57:43 lbtmu01 LBT: Socket FD: 24
 Mar 16 15:57:43 lbtmu01 LBT: Length sent: 67108864 #of chars sent: 4
 Mar 16 15:57:43 lbtmu01 LBT: Chars sent: L0 0 #of chars sent: 4
 Mar 16 15:57:45 lbtmu01 LBT: XACTION: Move Filter to Position
 Mar 16 15:57:45 lbtmu01 LBT: ********** Result is NAK **********
 Mar 16 15:57:45 lbtmu01 LBT: Set the Filter Wheel - DONE
 Mar 16 15:57:52 lbtmu01 LBT: Get a Light image
 Mar 16 15:57:52 lbtmu01 LBT: XACTION: BEFORE Get Light Image
 Mar 16 15:57:52 lbtmu01 LBT: Socket FD: 24
 Mar 16 15:57:52 lbtmu01 LBT: Length sent: 33554432 #of chars sent: 4
 Mar 16 15:57:52 lbtmu01 LBT: Chars sent: P0 #of chars sent: 2
 Mar 16 15:57:56 lbtmu01 LBT: XACTION: AFTER Get Light Image
 Mar 16 15:57:56 lbtmu01 LBT: Error in getting Light image: #ERR#6
 Mar 16 15:57:56 lbtmu01 LBT: Got a Light image - DONE
 Mar 16 15:57:56 lbtmu01 LBT: NULL Image!
 Mar 16 15:57:56 lbtmu01 LBT: FROM cl->execute FAILURE 
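
To pick the NAK results out of a long SYSLOG, the XACTION and "Result is NAK" lines shown above can be paired up. The following is a sketch only: the function name nak_transactions is hypothetical, and it assumes awk is available and that each NAK line follows the XACTION line it belongs to, as in the excerpt above.

```shell
# Hypothetical helper: reads MATGUI SYSLOG text on stdin and prints the
# name of each transaction that was followed by a "Result is NAK" line.
nak_transactions() {
    awk '
        /XACTION: / {
            # Remember the most recent transaction name.
            xact = $0
            sub(/.*XACTION: /, "", xact)
        }
        /Result is NAK/ {
            print (xact == "" ? "(unknown)" : xact)
            xact = ""
        }
    '
}
```

It would be run on the host where the MATGUI logs, for instance
 $ nak_transactions < /var/log/messages
where the path /var/log/messages is an assumption; use whichever file the SYSLOG is written to on that machine.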

On startup, the devices are printed:
dev: /proc/bus/usb/001/005 3 768
dev: /proc/bus/usb/001/007 2 768
dev: /proc/bus/usb/001/004 3 512
dev: /proc/bus/usb/001/006 2 512 

These are the values of the devlist: name (/proc/bus/usb/001/005), bus (3), and type (768).

There should be 6 devices listed: 2 focusers, 2 filters, and 2 cameras.
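
The device count can be checked mechanically from the startup output. In this sketch the function name devlist_summary is hypothetical, awk is assumed to be available, and the mapping of type values to devices (768 focuser, 512 filter, 256 camera) is inferred from the process listing in the PROCESSES section rather than from the matserv source.

```shell
# Hypothetical helper: reads matserv startup output on stdin, counts the
# "dev:" lines by type, and returns 0 only if 2 focusers, 2 filters, and
# 2 cameras were listed.
devlist_summary() {
    awk '
        $1 == "dev:" {
            total++
            # Type codes inferred from the process listing (an assumption).
            if ($4 == 768)      focuser++
            else if ($4 == 512) filter++
            else if ($4 == 256) camera++
        }
        END {
            printf "devices=%d focusers=%d filters=%d cameras=%d\n", total, focuser, filter, camera
            ok = (focuser == 2 && filter == 2 && camera == 2)
            exit (ok ? 0 : 1)
        }
    '
}
```

For example,
 $ devlist_summary < /tmp/matserv.log
For a listing with only four dev: lines, such as the one shown above, it reports no cameras and returns a non-zero status.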

ERROR MESSAGES

Program   Message                                  Meaning
matserv   Server aborting for lack of resources    server didn't find 2 focusers, 2 filters, and 2 cameras
          Server aborting for lack of resources    no camera or no filter or no focuser
          No IP server address specified
          camera connect failed. returned 0x%x
Topic revision: r12 - 29 Oct 2015, ChrisBiddick