TMS Meeting Minutes, July 16, 2020

Attendees: Andrew, Breann, Heejoo, John, Matthieu, Trenton, Yang (Zoom meeting)

Meeting discussion summary:

  • We agreed to call every meeting simply "TMS meeting" and to drop the terms "general TMS meeting" and "software TMS meeting," since we end up discussing all TMS-related topics each time. There is no point in dividing the meetings into general and software-specific ones.
  • Heejoo discussed some data collected during our TMS TCP test this week.
  • Yang gave a summary of the TMS TCP tests done with Heejoo this week (Monday and Wednesday, July 13 and 15). The main problem is that we cannot mix channels 11 and 15 with any other channels in the TCP command "measurelengthwithrawdata," which is the command needed to retrieve raw uncorrected measurement values for each channel. As soon as we mix channel 11 or 15 with any other channel, we receive all-zero results from the TCP server, and there is no way to recover from that state unless we restart everything on the control PC. We also encountered cases where the system returned "ERROR" for every result. We contacted the eTalon engineers. After some email exchanges, they suggested that the problem was caused by channels 11 and 15 being above the 9500 mm threshold, and that we change some configuration files on the TMS PC. See below for the detailed email Yang sent to the eTalon engineers.
  • We discussed briefly some aspects of the Ciddor correction program that Trenton had developed (and later ported to Python 3 by Yang). We were mainly concerned about the accuracy.
  • We discussed Andrew's initial draft spreadsheet for planning and milestones. Heejoo put the spreadsheet on Google Drive; access it from here. We may later convert it to a Google Sheet if that works well.
  • We then discussed plans for a TMS test that night, since it seemed we might have a chance. (We did get a chance on the night of July 16 for a couple of hours of TMS testing with LBC on the telescope. However, the TCP server problem persisted and seemed to be worse. See below for details.)

Action items:

  • Plan the short-notice closed-dome test for the night of July 16 (that is, 2.5 hours after the meeting). We did get the closed-dome test, roughly from 7:45pm to 11:30pm. Andrew, Breann, Heejoo, John, and Yang participated via the TMS Zoom channel. Yang conducted some software tests until about 9:40pm (with problems), and then Andrew, Heejoo, and John continued with other tests and data collection.
  • Yang will do some study and comparison of the Ciddor correction programs (with the online NIST calculator, eTalon's own correction, and with the data in the original paper).

Troubleshooting emails

Here are some of the troubleshooting emails we exchanged with eTalon engineers, reproduced here for archival purposes.

Exchange after Heejoo and Yang's test on July 13 and 15:

Our emails sent to eTalon engineers:

First of all, we made sure the lasers were ready. Heejoo has remote login to the control PC and can see the laser status on the control software. All laser status lights were green before we proceeded. I also sent the "laserready" query over the TCP before we started, and it came back with "laserready_1" indicating lasers were ready.
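
The readiness check described above amounts to comparing the server's reply with the string "laserready_1". A minimal Python sketch follows. The function name is ours, and treating every other reply as "not ready" is an assumption, since only the ready reply is recorded in these notes.

```python
def lasers_ready(reply: str) -> bool:
    """Return True only for the reply we observed when the lasers were ready."""
    # Assumption: surrounding whitespace or newlines in the reply carry no meaning.
    return reply.strip() == "laserready_1"
```

A caller would pass in whatever text the TCP server returned for the "laserready" query and proceed with measurements only when this returns True.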

We are currently using nine channels (4, 8, 10, 11, 13, 14, 15, 17, 23). The first thing we did was repeatedly test the single-channel measurement; that is, we repeatedly sent the command "measurelengthwithrawdata" on a single channel over TCP. E.g.:

'> measurelengthwithrawdata,4,9137
'> #receive the results back
'> measurelengthwithrawdata,11,9528
'> #receive the results back
'> ......

We did this for all of our nine channels repeatedly many times. All these tests succeeded and came back with valid results.
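
For reference, the command strings shown above can be assembled as in the following minimal Python sketch. The function name is ours. We also assume that the second number of each pair is the channel's nominal length in mm, which is inferred from the 9500 mm threshold discussion rather than stated by the vendor. Sending the string over TCP is left out because the connection details are not recorded in these notes.

```python
def build_measure_command(targets):
    """Build a 'measurelengthwithrawdata' command string.

    targets: sequence of (channel, nominal_length_mm) pairs, e.g. [(4, 9137)].
    The length unit (mm) is our assumption.
    """
    if not targets:
        raise ValueError("at least one (channel, length) pair is required")
    fields = ["measurelengthwithrawdata"]
    for channel, length_mm in targets:
        fields.extend([str(channel), str(length_mm)])
    # Fields are comma-separated, matching the commands quoted above.
    return ",".join(fields)


print(build_measure_command([(4, 9137)]))
print(build_measure_command([(4, 9137), (11, 9528)]))
```

The first call reproduces the single-channel command for channel 4; the second builds a two-channel request.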

What we found was that for multiple-channel queries with the "measurelengthwithrawdata" command, as soon as we include channel 11 or 15, regardless of which other channels are included, the measurement fails to report back valid results. We receive results with all zero values. For example,

'> measurelengthwithrawdata,4,9137,11,9528
'> measurelengthwithrawdata,4,9137,15,9542,8,9034

and so on. As long as the number of channels in the command is >= 2, and one of the channels is either 11 or 15, we are not able to receive valid results. We did plenty of such tests and the behavior was consistently as described.

What's more, after such commands (i.e., multi-channel commands involving channel 11 or 15), we are no longer able to command any valid measurements. All measurements, regardless of whether they include channels 11 and 15, report back all zero values. It seems that at that point we have no way to recover through remote operations. Our only choice was to restart everything from the control PC and wait for the system to become ready again.

We also discovered another problem through our test. Having discovered the above problem, we decided to exclude channels 11 and 15 from our tests and just focus on the remaining seven channels. We wrote a small program that can generate all possible combinations from those seven channels, and we ran the 'measurelengthwithrawdata' command through each of these channel combinations multiple times. We just wanted to see if they would all succeed, and if the system was able to handle repeated measurement tests for a long time.
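
The combination generator mentioned above can be sketched as follows. This is a minimal Python illustration, not the program we actually ran: the names are ours, and we assume that "all possible combinations" means every non-empty subset of the seven channels, single channels included.

```python
from itertools import combinations

# The seven channels left after excluding 11 and 15 from our nine.
CHANNELS = [4, 8, 10, 13, 14, 17, 23]


def all_channel_combinations(channels):
    """Yield every non-empty combination of channels, smallest groups first."""
    for size in range(1, len(channels) + 1):
        yield from combinations(channels, size)


combos = list(all_channel_combinations(CHANNELS))
four_channel = [c for c in combos if len(c) == 4]
print(len(combos))        # 127 combinations in total (2**7 - 1)
print(len(four_channel))  # 35 of them use exactly four channels
```

Each tuple can then be turned into one 'measurelengthwithrawdata' command and repeated as many times as the test requires.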

Most of the time, this test went well and all measurements reported back valid results. However, we had one instance where, during the tests of 4-channel combinations, it suddenly stopped working and reported back "ERROR" as the result. After this point, all subsequent measurements reported back "ERROR" results. We also checked the console window on the control PC, and it displayed the following messages:

Warning: Error occurred while executing callback:
Error using LEDController/SetStatus (line 66)
Message: the port is closed.
Source: System
HelpLink:
UpdateLEDStatus: Message: The port is closed.
......
...... (repeats the "HelpLink:" and "UpdateLEDStatus:..." lines)

After this had occurred, we again could not recover from the TCP session (i.e., could not perform any valid measurements). We restarted the system on the control PC.

eTalon engineer reply:
Channels 11 and 15 have lengths over 9.5 m, which is the criterion for selecting between reference fibers 2 and 3.

You can find this info. in the CometDAQParameters.xml file at C:\ProgramData\Etalon\MultiLine\2.0, where you will see the following:

[Figure: eTalon configuration]

Our explanation:
  • When using single channel 11 or 15, the software selects ref. 3; works well
  • When using single or multiple channels other than 11 and 15, all of them shorter than 9.5 m, the software selects ref. 2; works well
  • When combining 11 or 15 with others, the software uses the longest length to determine the ref. fiber, i.e. the length of channel 11 or 15 (over 9.5 m); in this case ref. 3 is selected and all zeros are returned.

It seems ref. 3 is OK for 11 and 15, but problematic for the rest. This may happen because the number of points per fringe is set differently for different ref. fibers. You receive all zeros rather than an error message because the measurements are successful but the analysis fails. That is why you also found that the lasers were OK.

Solution:

An easy solution is to change ref3Length from 9.5 to 10 in the xml file. This will force all channels (I assume all your channels are < 10 m) to use ref. 2 in the measurement. Please close the software completely, change this value and save, then restart the software. This should work.
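
The selection rule and the proposed change described in this reply can be modeled in a few lines. This is a minimal Python sketch of the rule as the engineer explained it, not the vendor's code. The function and constant names are ours, the strict ">" comparison at exactly 9.5 m is an assumption, and the lengths are the values that appear in our own commands for channels 4, 8, 11, and 15 (taken to be in mm).

```python
# Lengths in mm, taken from the commands quoted in our email.
LENGTHS_MM = {4: 9137, 8: 9034, 11: 9528, 15: 9542}


def select_reference_fiber(channels, ref3_length_mm=9500):
    """Model of the described rule: the longest requested channel decides.

    Returns 3 if the longest length exceeds ref3_length_mm, otherwise 2.
    The xml file stores ref3Length in metres (9.5); mm is used here to
    match the command arguments.
    """
    longest = max(LENGTHS_MM[ch] for ch in channels)
    return 3 if longest > ref3_length_mm else 2


print(select_reference_fiber([11]))            # 3: single long channel
print(select_reference_fiber([4, 8]))          # 2: all channels under 9.5 m
print(select_reference_fiber([4, 11]))         # 3: mixed request, longest wins
print(select_reference_fiber([4, 11], 10000))  # 2: with ref3Length raised to 10 m
```

In this model, raising the threshold to 10 m maps every combination of these four channels to ref. 2.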

A useful tip: when closing the software, please select “keep laser on” in the initializing window. This will not shut down the lasers and will save you much time waiting for the lasers to stabilize.

Exchange after our test on July 16:

Our emails sent to eTalon engineers:
We experienced the following problems tonight:

1. We made the configuration file change per your instructions and restarted the software. However, the problem persisted. When we mix channels 11 and 15 with any other channels (in the command "measurelengthwithrawdata"), we receive back all zero values. What happened afterwards seemed to be random. Sometimes we were able to continue receiving valid results when channels 11 and 15 were not involved, e.g.,

'> measurelengthwithrawdata(4,11) # short notation for measure channels 4 and 11
'> # receiving all zero values,
'> measurelengthwithrawdata(4,13) # next measuring channels 4 and 13
'> # works and receiving valid results

However, sometimes we were not able to receive any valid results after a command involving channel 11 or 15 had returned all zero values. For example:

'> measurelengthwithrawdata(8,15) # measure channels 8 and 15
'> # receive all zero values,
'> measurelengthwithrawdata(8,17) # measure channels 8 and 17 the next
'> # receive all zero values.
'> # keep receiving all zero values afterwards for any measurements.

2. Another problem was that once we started to receive all zero values, we started to experience problems with the GUI on the TMS control PC (which we have access to via TeamViewer). The GUI on the control PC became unresponsive, e.g., clicking a button that should pop up a window did nothing.

We had no choice but to force quit the software. Afterwards, when we tried to restart it, it showed the error message: "DAQ error, incorrect number of devices found, please check xml file."

The solution we found was to close the software again, normally this time (since it had just restarted, we were able to close it gracefully), and then open it again. After that it seemed to be fine.

3. After we restarted the software and received the "DAQ error" mentioned above, we also had errors when performing a single standard measurement from the software GUI on the control PC, as well as a deformation analysis in the same software. Each of these tests produced a pop-up error message: "Subscripted assignment dimension mismatch."

4. During our TCP server 'measurelengthwithrawdata' tests, we again experienced the problem we reported last time, but with increased frequency. At some point, the 'measurelengthwithrawdata' command returned "ERROR" (the exact channels that trigger this seemed to be random), and all subsequent measurements reported back "ERROR" as well. The console window on the control PC displayed the following messages:

=====
Warning: Error occurred while executing callback:
Error using LEDController/SetStatus (line 66)
Message: the port is closed.
Source: System
HelpLink:
UpdateLEDStatus: Message: The port is closed.
......
...... (repeats the "HelpLink:" and "UpdateLEDStatus:..." lines)
=====

..... to be written ......
Topic revision: r1 - 17 Jul 2020, YangZhang