
SW&IT Meeting Minutes 2018-07-09

Attendees: Andrew, David, Doug, Igor, Kellee, Matthieu, Petr, Stephen, Xianyu, Yang


News from management

  • Fall safety meeting

Previous Action Items

  • Git, GitHub - Tuesday 1pm-3pm N550 (not N505!), Zoom available

Open topics

Bring any topics you want for discussion at the meeting here. Use <topic>[<name>] formatting.

This week in github

Team reports


Last week
  • ARGOS swing arm work
    • Python script scrapes the swing arm website using Beautiful Soup
    • query status and send retract/deploy commands
    • python 2.6.6
  • Ingested handover documents into CAN
  • Played with ARGOS and TAN Docker containers
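
As an illustration of the status-scraping idea in the swing arm item above, here is a minimal sketch. It is not the actual script: the page layout (a two-column HTML table), the arm names, and the status strings are all assumptions, and the standard-library html.parser stands in for Beautiful Soup so the example runs without extra packages. It is also written for Python 3, not the Python 2.6.6 mentioned above.

```python
from html.parser import HTMLParser


class StatusTableParser(HTMLParser):
    """Collect the text of every <td> cell, grouped by table row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # cells of the row currently being parsed
        self._cell = None  # text fragments of the cell currently being parsed

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:  # header rows contain only <th>, so they stay empty
                self.rows.append(self._row)
            self._row = None


def parse_status(html):
    """Return {item name: status} from a two-column status table."""
    parser = StatusTableParser()
    parser.feed(html)
    parser.close()
    return {row[0]: row[1] for row in parser.rows if len(row) >= 2}


# Hypothetical page content; the real swing arm page may look quite different.
SAMPLE_PAGE = """
<html><body><table>
<tr><th>Item</th><th>Value</th></tr>
<tr><td>SX arm</td><td>RETRACTED</td></tr>
<tr><td>DX arm</td><td>DEPLOYED</td></tr>
</table></body></html>
"""

status = parse_status(SAMPLE_PAGE)
print(status)
```

Sending retract/deploy commands is left out on purpose, since the command interface is not described in these notes.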

This week
  • Vacation Thursday and Friday
  • Request daytime testing for ARGOS swing arm
  • Use Docker X applications and SSH forwarding


Last week
  • Babysitting Ketiv through Vault upgrade
    • Not as hands-off as the contract would suggest
    • I had to secure the Vault license from Autodesk during the upgrade
    • Is now functional
  • Vault client software
    • Versions prior to 2016 no longer work
    • I have found one user who was using 2015. He's been upgraded
    • Client 2018 is available to our users from here. Simply point it to for licensing and vault storage and it should function.
  • Tracking managers down re: IT email lists
This week:
  • Track more managers down
  • Mountain Synology disk repair
  • Backups


Last Week:
  • Analysis of MODS, LUCI, and LUCI-ARGOS pointing data from April 1 forward (since IT 7084). Resulting new models and suggestions sent to Yang.
  • Backtesting of suggested and current production models against MODS and LUCI pointing telemetry. Sent results and suggestions to Yang.
  • Analysis of SX and DX collimation telemetry for modeling. Sent results and new model recommendations to Petr (Yang in cc for pointing model implications).
This Week:
  • Attempt to study 1 year's collimation telemetry to see if changing elevation models are actually indications of a flaw in temperature modeling due to sample frequency.
  • I anticipate some HR/admin related work this week.
  • Vacation next 2 weeks.


Mostly dealing with LUCI issues


Short description: Looking at a myriad of LUCI software issues this last week.

Long description:

It started with what looked like issue #6805, a LUCI filter issue, but on inspection the error did not seem to be a hardware problem; it was in the internal LUCI software. Basically, the command was never sent to the filter. Later, other issues became more apparent, such as processes suddenly stopping. It was not the same process every time, except for the Time Server, which seems prone to shutting down.

Running out of ideas, I looked into the database, the one thing that all the LUCI software depends on, since it uses hibernation throughout. I noticed the database disk partition was filling up, with only 13 GB free. Most of the space on the partition was used by database transaction log files. I purged them for good measure and freed about 70 GB.

All this weird behavior culminated in a shutdown of the LUCI machine at the end of Thursday. The machine took a very long time to come up, which I found scary and which made me think about failing over to lucix, even if only as a test; I think that day will come. Anyhow, the machine came back up on its own. We restarted the LUCI software and everything looked fairly OK. Dave T. was happy for a while.

Just before midnight on Friday, Dave contacted me about the LUCI RTD not finding the data. LUCI creates this internal mount every time a new day starts, at around 5 PM local time, which is when the machine was down, so it made sense that the software had simply missed the window to create the mounts. I put them back manually; Dave T. tested them and seemed happy. Later that night Dave T. sent an email in frustration: the LUCI software was completely misbehaving. Alexander looked into it and found that the LUCI config.xml file had been corrupted, funnily enough around the time Dave reported the mount issue.

Take away points:
  • I need to understand how hibernation really works, as it has not only a database component but also an XML export component. The LUCI configuration XML file is constantly overwritten by the system, so there is a dependency on the OS file system. I don't think LUCI needs to change the configuration file at all, so can we just stop writing to it? Or maybe separate the static components from the variable ones?
    • One option could be to put config.xml on a tmpfs partition, with no contention from anything else on the machine.
  • Also look more into the database and how it is affected when these issues happen. We need statistics that tell us whether the database is an issue at all. It doesn't look like it, because you never see a mysql process with high CPU, and database access is always fast. Nevertheless, are there hot queries that might create problems from time to time?
  • I created a script that matches java processes with mysql processes and gives a summary of what's going on. The script is called and lives in luci.luci ~lucifer/lcsp directory
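
The script itself is not reproduced in these notes. Purely as an illustration, one way such a match could work is to join the Host column of MySQL's SHOW PROCESSLIST output (which has the form host:port for TCP clients) against a table of local TCP ports and the processes that own them. The function name, the input shapes, and all sample values below are assumptions, not the script's real behavior.

```python
def match_clients(processlist, sockets):
    """Attach the owning OS process to each MySQL connection.

    processlist: dicts holding the Id, Host, Command and Time columns
                 of SHOW PROCESSLIST.
    sockets:     dict mapping a client-side TCP port to (pid, process name),
                 for example collected from a socket listing tool.
    """
    summary = []
    for row in processlist:
        # Host is "name:port" for TCP clients, plain "name" for socket clients.
        _, _, port = row["Host"].rpartition(":")
        owner = sockets.get(int(port)) if port.isdigit() else None
        pid, name = owner if owner else (None, "unknown")
        summary.append({
            "mysql_id": row["Id"],
            "pid": pid,
            "process": name,
            "command": row["Command"],
            "seconds": row["Time"],
        })
    return summary


# Hypothetical sample data, not taken from the LUCI machine.
processlist = [
    {"Id": 11, "Host": "localhost:40312", "Command": "Sleep", "Time": 12},
    {"Id": 12, "Host": "localhost:40318", "Command": "Query", "Time": 95},
    {"Id": 13, "Host": "localhost", "Command": "Sleep", "Time": 3},
]
sockets = {40312: (2201, "java"), 40318: (2240, "java")}

for entry in match_clients(processlist, sockets):
    print(entry)
```

A summary of this kind would make it easier to see which java process holds a long-running query when the odd behavior shows up.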


Migrated OT/queue SVN to GitHub. It took several attempts, as it is a lengthy process. On Petr's recommendation I used which uses git svn clone <svn_url>. The example they give uses the argument --no-metadata, which strips the SVN revision out of the history upon ingestion. I found that undesirable, as there are already emails and issuetracks that refer to past SVN revisions, and those references need to be preserved. Thanks to Petr for all the help!
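
To make the difference between the two git svn clone variants concrete, here is a small sketch that only builds the command line and does not run it. The helper name, the URL, and the target directory are hypothetical. When --no-metadata is left out, git svn appends a git-svn-id line containing the SVN revision to each commit message, which is what keeps old revision references traceable.

```python
def git_svn_clone_command(svn_url, target_dir, keep_metadata=True, authors_file=None):
    """Build the argument list for a git svn clone invocation."""
    cmd = ["git", "svn", "clone"]
    if not keep_metadata:
        # Drops the git-svn-id lines, so SVN revision numbers are lost.
        cmd.append("--no-metadata")
    if authors_file:
        # Optional mapping of SVN user names to git author identities.
        cmd.append("--authors-file=" + authors_file)
    cmd += [svn_url, target_dir]
    return cmd


# Hypothetical repository URL and target directory.
url = "https://svn.example.org/repo"
keep = git_svn_clone_command(url, "queue-git")
strip = git_svn_clone_command(url, "queue-git", keep_metadata=False)

print(" ".join(keep))
print(" ".join(strip))
```

The resulting list can be handed to subprocess.run(cmd, check=True) once the options look right.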


Last week:
  • End-of-month tasks for AllSky data, telemetry.
  • Updated telemetry wiki page for leap seconds info.
  • Looked into LBC guiding drift issues from 2015 to make sure there were no obvious LBC rotator issues.
  • Found a hard-coded port number in the OAC HK upgrade code and fixed it in a few places. It's in SVN and built as 5.0 on the mountain and in Tucson.
  • Coordination of OVMS racktangle kernel update with Petr - still need to pick a date.
  • Created a new local TCS git environment (for master and 2018A) and committed the GCS updates for OAC HK upgrade on a branch. Will test again and then request the pull.
  • Created a separate branch for the other GCS updates I had been making - IT4686 and committed my changes locally. I will probably check these again and then just be done with GCS.
  • Reviewed Petr's computeToRaw methods added to WFSingThread in GCS.
  • Half-day holiday.

This week:
  • New build and test of checked in GCS updates so I can request the pull.
  • Vacation on Thursday.



  • some work on GCS WFS, looks good
  • returned to IT3xxx (Time to live) and almost nailed the issue - last problem seems to be in handling rotators
  • updated Git documentation, preparation for Git discussion
  • changed GitHub permission for TCS to allow us to create branches


  • Github hooks verified delivered.
  • Work on DNS server using knot and unbound.
  • Obs3 VNC issues.
  • Laura's laptop broken, called in, repaired.
  • SOUL network switch RFQ. Purchase of SOUL switch for lab.
  • shipment of SOUL machine.


Last week:
  • Reactions on several power backplane document versions from MG;
  • Ordered wires and ICD3 programmer for the PBP test;
  • Implemented the PBP state machine first version;
  • Wrote the PBP analog parser in C++;
  • PBP LBTO internal discussion.
This week:
  • Test the power backplane with the ICD3 programmer;
  • Read the memory map with Matlab;
  • LBTI SOUL network and Jumbo frame


  • Continue to study the AO supervisor code and the IT issue #7261 (memory problem with WfsArbitrator)
  • Working with Doug to transition and learn the pointing model construction and improvement.
  • Working on pointing model related wiki pages.

Topic revision: 10 Dec 2018, UnknownUser