usage: h5csv [OPTIONS] -d dataset files

LBTO OPTIONS
  -h, --help           Print a usage message and exit
  -V, --version        Print version number and exit
  -o F, --output=F     Output raw data into file F
  -c 4,5,6             Print the specified columns (by number) of the dataset
                       (note that time_stamp (col 1) is included by default).
                       This option only works for flat structures (weather,
                       seeing, etc.) and cannot be used at the same time as -n
  -n "name1, name2"    Print the specified columns using the names of the dataset
                       (note that time_stamp is included by default).
                       This option cannot be used at the same time as -c
  -s, --strtime        Convert MJD time microsecs to UTC time string
  -u, --unixtime       Convert MJD time microsecs to Unix UTC time millisecs
  -e, --enumnumber     Print enum strings as integers (for graphing)
  -f, --fields         Print only field names and column numbers.
                       This option must be used on its own
  -A, --onlyattr       Print only the attributes: field names/numbers,
                       descriptions, units.
                       This option must be used on its own and must provide a filename
  -d P, --dataset=P    Use the specified dataset.
                       This argument is ALWAYS required and must be the LAST option

--------------- Examples ---------------

1) Dump the operations dataset from DDS HDF5 file 201601190000.dds.operations.h5 to a CSV file named operations.csv:

    h5csv -o operations.csv -d operations_01 /lbt/telemetry_data/tcs/dds/2016/01/19/201601190000.dds.operations.h5

   OR

    h5csv -d operations_01 /lbt/telemetry_data/tcs/dds/2016/01/19/201601190000.dds.operations.h5 > operations.csv

2) Select columns 3,4 from dataset seeing_01 and convert the timestamp from MJD microseconds to Unix time milliseconds in 201506290541.dimm.seeing.h5:

    h5csv -c 3,4 --unixtime -d seeing_01 /data/201506290541.dimm.seeing.h5 > seeing.csv

3) Select column 5 from dataset seeing_01 in 201506290541.dimm.seeing.h5, output to seeing.csv:

    h5csv -c 5 -o seeing.csv -d seeing_01 /data/201506290541.dimm.seeing.h5

4) Select columns named sxflux and sxfwhm from dataset guiding_01 in 201601191109.gcsl.guiding.h5:

    h5csv -n "sxflux, sxfwhm" -d guiding_01 /lbt/telemetry_data/tcs/gcsl/2016/01/19/201601191109.gcsl.guiding.h5

5) Dump the attributes from dataset offload_ttf_command_01 in 201601041558.aosl.offload_ttf_command.h5:

    h5csv -A columns.txt -d offload_ttf_command_01 /lbt/telemetry_data/tcs/aosl/2016/01/04/201601041558.aosl.offload_ttf_command.h5
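The -u conversion is simple arithmetic: the timestamps are microseconds since the MJD epoch (1858-11-17), and MJD day 40587 is 1970-01-01, so Unix milliseconds = timestamp/1000 - 40587*86400000. A minimal standalone sketch of that arithmetic (illustrative only, not the h5csv source; the helper name is made up):

    #include <stdio.h>
    #include <stdint.h>

    /* Convert a telemetry timestamp (microseconds since the MJD epoch,
     * 1858-11-17) to Unix UTC milliseconds, as --unixtime does. */
    static int64_t mjd_usec_to_unix_msec(int64_t mjd_usec)
    {
        const int64_t MJD_UNIX_EPOCH_MS = 40587LL * 86400000LL;  /* 1970-01-01 */
        return mjd_usec / 1000 - MJD_UNIX_EPOCH_MS;
    }

    int main(void)
    {
        /* 4959878400042512 is the first hardpoints_01 timestamp in the
         * testing examples below; -u prints it as 1453161600042. */
        printf("%lld\n", (long long)mjd_usec_to_unix_msec(4959878400042512LL));
        return 0;
    }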
The initial version was based on hdf5-1.8.10: http://www.hdfgroup.org/ftp/HDF5/releases/hdf5-1.8.10/src/hdf5-1.8.10.tar.gz
The telemetry collection library was built with 1.9.131.
As of Summer 2015, updated to hdf5-1.8.15-patch1.
In Feb 2016, upgraded to hdf5-1.8.16.
In Aug 2016, upgraded to hdf5-1.10.0 (see the change log below).
Built on shell64 as user ksummers in the directory /home/ksummers/telemetry/h5csv-64bit:
1. Download and untar http://www.hdfgroup.org/ftp/HDF5/current/src/hdf5-1.10.0.tar.bz2 into hdf5-1.10.0
2. Check out the updates from svn:

    cd hdf5-1.10.0/tools
    cp -pR h5dump h5csv
    cd ..
    svn export https://svn.lbto.org/repos/tools/trunk/h5csv

This should check out the following files into an h5csv directory in hdf5-1.10.0:

    configure
    mvFiles
    tools/Makefile.am
    tools/Makefile.in
    tools/h5csv/Makefile.am
    tools/h5csv/Makefile.in
    tools/h5csv/h5dump.c
    tools/h5csv/h5dump_ddl.c
    tools/h5csv/h5dump_ddl.h
    tools/h5csv/h5dump_xml.c
    tools/lib/h5tools.c
    tools/lib/h5tools.h
    tools/lib/h5tools_dump.c
    tools/lib/h5tools_dump.h
    tools/lib/h5tools_str.c
    tools/lib/h5tools_str.h
    tools/lib/Makefile.in

3. Move the files to the appropriate directories (on top of the installed hdf5 files) with the script mvFiles, then configure and make:
    ./h5csv/mvFiles      (complains that it cannot delete the directory, but that's OK)
    ./configure --prefix=<your-path-here>/hdf5-1.10.0 --enable-shared=no \
        --enable-static-exec --enable-threadsafe --with-pthread --disable-hl

If you want a development version, configure with debug by adding:

    --enable-build-mode=debug

Then build:

    gmake            (takes several minutes and there are lots of warnings)
    gmake install    (puts the executable into the prefix/bin directory)

This builds a lot more than it needs to, but only builds the one executable in tools. The executable h5csv ends up in the tools/h5csv directory and in the top-level bin directory.
4. Copy the executable to /web/modules/hdf5-1.10.0/bin (Tucson) for use by the telemetry visualization application.
5. If necessary, repeat the steps on a 32-bit host in Tucson (I use rm580f-1), using the lbtscm account in /home/lbtscm/lbt32/astronomy/hdf5-1.10.0, with the configure command:

    ./configure --prefix=/home/lbtscm/lbt32/astronomy/hdf5-1.10.0 \
        --enable-shared=no --with-zlib=/home/lbtscm/lbt32/lbto_runtime/zlib-1.2.7 \
        --enable-static-exec --enable-threadsafe --with-pthread --disable-hl

6. Copy the executable to /lbt/astronomy/stow/h5csv/bin (Tucson and the mountain).
1.10.0 (Feb 2017)
Added a required filename parameter to the -A option. It used to write to a fixed file called columns.txt, but the tool will now be called multiple times at once in the same directory, so each invocation has to have a unique filename. This filename is passed in from the telemetry visualization tools.
Note: there is some weird behavior with this parameter. When used with the -o option (send output to this filename, as telemToCSV does), it does not put the column names in the -o file, only in the file specified with -A.

1.10.0 (Aug 2016)
No functionality mods - just updated to HDF5 1.10.0.
1.8.16 (Apr 2016)
Minor mods to allow the "Units" attributes to be dumped and parsed; only affects the -A option:
- changes in h5tools_str.c
- h5tools_dump_attribute doesn't do anything unless the attribute name is "Units" (h5tools_dump.c)
- the call to h5tools_dump_attribute changed to use rawdatastream instead of rawoutstream (h5dump_ddl.c)
- columns.txt set as the output file via h5tools_set_data_output_file if -A is used (h5dump.c)
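For reference, a standalone sketch of reading a "Units" string attribute with the plain HDF5 C API -- this is not the h5csv code path (which goes through h5tools_dump_attribute as noted above); it just illustrates what -A extracts, and it assumes the attribute is a fixed-length string:

    #include <hdf5.h>
    #include <stdio.h>
    #include <stdlib.h>

    /* Print the "Units" attribute of a dataset, if present. */
    static void print_units(hid_t dset)
    {
        if (H5Aexists(dset, "Units") <= 0)
            return;                              /* no Units attribute */
        hid_t  attr  = H5Aopen(dset, "Units", H5P_DEFAULT);
        hid_t  ftype = H5Aget_type(attr);
        size_t size  = H5Tget_size(ftype);       /* on-disk string length */
        char  *buf   = calloc(size + 1, 1);
        hid_t  mtype = H5Tcopy(H5T_C_S1);
        H5Tset_size(mtype, size + 1);            /* leave room for the NUL */
        if (H5Aread(attr, mtype, buf) >= 0)
            printf("Units: %s\n", buf);
        free(buf);
        H5Tclose(mtype);
        H5Tclose(ftype);
        H5Aclose(attr);
    }

    int main(int argc, char *argv[])
    {
        if (argc != 3) {
            fprintf(stderr, "usage: %s file dataset\n", argv[0]);
            return 1;
        }
        hid_t file = H5Fopen(argv[1], H5F_ACC_RDONLY, H5P_DEFAULT);
        hid_t dset = H5Dopen2(file, argv[2], H5P_DEFAULT);
        print_units(dset);
        H5Dclose(dset);
        H5Fclose(file);
        return 0;
    }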
1.8.16 (Feb 2016)
Column selection using numbered columns didn't work correctly for nested structures. Also, column numbers can change as streams are modified by the applications that are writing them. This version allows column selection by column name, using -> as the nesting syntax. For instance:

    h5csv -u -n "hp3->absenc,hp4->loadcell,hp5->command" -d hardpoints_01 /lbt/telemetry_data/tcs/pmcr/2016/01/19/201601190000.pmcr.hardpoints.h5
    timestamp_utc, hp3->absenc, hp4->loadcell, hp5->command
    1453161600042,-2.5085144,-0.692096949,0.233640179
    1453161600293,-2.5085144,-0.167046547,0.233640179
    1453161600544,-2.5085144,-0.225385427,0.233640179
    ...

    h5csv -n "temperature,pressure" -d weather_01 /lbt/telemetry_data/tcs/pcs/2016/01/19/201601190001.pcs.weather.h5
    time_stamp, temperature, pressure
    4959878483250974,-0.809967041,690.554993
    4959878543306738,-0.834991455,690.545044
    4959878603363321,-0.834991455,690.560059
    4959878663420852,-0.860015869,690.560059
    ...
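The nesting syntax is just a parent field name and a member name joined by ->. A tiny standalone sketch of the kind of parsing this involves (the function name is made up for illustration; the real work happens in the handle_datasets/columnInList/parentInList changes described below):

    #include <stdio.h>
    #include <string.h>

    /* Split one -n column spec on "->" so the dumper knows which compound
     * member ("parent") to descend into and which field to emit. */
    static void split_column_spec(const char *spec)
    {
        const char *sep = strstr(spec, "->");
        if (sep)
            printf("parent=%.*s  child=%s\n", (int)(sep - spec), spec, sep + 2);
        else
            printf("flat column=%s\n", spec);    /* top-level field */
    }

    int main(void)
    {
        split_column_spec("hp3->absenc");        /* nested */
        split_column_spec("temperature");        /* flat   */
        return 0;
    }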
Code mods:

lib/h5tools.h/.c
- added a named field list of structs with parent/name strings to the ctx structure
- new parentInList method so that we know to traverse into a data type to look for the columns we want
- changed the columnInList method to use column numbers or names

lib/h5tools_dump.c
- append the new cmpd_fieldsep (a colon) to the end of a column name when dumping types so that we can parse out the column number
- changed the traversal to also check parentInList so that we go into compound types when individual columns are requested
- more checking required on some of the output to make sure we don't print a "parent" name or separator when we want the "parent->child" field
- dump_attribute method modified for our use
- changed datasetblockbegin to a CR so we get a CR after the stream name when dumping attributes only

lib/h5tools_str.h/.c
- h5tools_print_char modified to replace commas with blanks in case we have commas in descriptions (which we won't after the next TCS build, but we do now)
- h5tools_str_sprint: added a parent argument for recursive calls -- for compound data types this function is used just like print_datatype, so, similar to the mods in print_datatype, it has to check parentInList to make sure it traverses through compound types and doesn't print "parent" or a separator when we want the "parent->child" field
- if doing field lists, don't use the quote character

h5csv/h5dump.c
- new arguments implemented: -A for attributes only (used to build the telemetry map) and -n for columns by name

h5csv/h5dump_ddl.h/.c
- handle_datasets and dump_datasets modified to take a list of column names
- added parsing of the columns by name
lib/h5tools.h
- added parent, column info, and file number to the ctx structure

lib/h5tools.c
- added the columnInList function (see the sketch after this list)
- render_element changed to basically ignore ncols so it doesn't put a line break in the middle of our list

lib/h5tools_dump.h
- added parent to h5tools_print_datatype

lib/h5tools_dump.c
- changed fmt_double and fmt_float to match our formats
- changed cmpd_sep, cmpd_pre, cmpd_suf, cmpd_end, datasetend, datatypeend, fileblockend, datasetblockend, datablockbegin, datablockend, and strucblockend to get rid of CRs and {}
- changed dset_format, datasetbegin, and databegin to not have titles
- changed elmt_suf1 to a newline so that multiple files have newlines after each row of data
- added cmpd_nestsep for the separator between compound data type names and their fields (using ->)
- h5tools_dump_simple_data: added a need_prefix FALSE before the for loop over elements
- print_datatype: lots of changes -- added a column check so we can get the header for only the columns we want; changed ncols (but did that work??); also deleted the names of the datatypes printed; tweaking for CRs and delimiters
- h5tools_dump_datatype: don't print the datatype if it's not the first H5 file we're working on; CR tweaking
- h5tools_dump_data: changed string_dataformat.idx_format so the indexes for each row are NOT printed; CR changes; change the name of the first field in the first compound data type to timestamp_utc if time conversion is requested

lib/h5tools_str.c
- h5tools_str_sprint: modified the float format, added a column check, don't use line_indent between values or cmpd_name
- new method h5tools_str_sprintUTC created, called if time conversion is requested

h5csv/h5dump.c
- the main program for h5csv; modified extensively to delete most of the options and to call handle_datasets directly instead of going through the handle functions that were set up

h5csv/h5dump_ddl.h
- dump_dataset and handle_datasets modified to take column-list and file-number params

h5csv/h5dump_ddl.c
- dump_datatype sets line_ncols to 1024 -- do we need that?
- don't include dataset begin/name/blockbegin
- parse the column-list command-line parameter into an array of column numbers
- call h5tools_dump_datatype with rawdatastream instead of rawoutstream
- don't call h5tools_dump_dataspace
- don't iterate over attributes (attr_iteration)
- send datasetend to rawdatastream instead of rawoutstream
- CRs
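For the flavor of the column filtering described above, a minimal sketch of a columnInList-style check (illustrative only; the real columnInList in lib/h5tools.c also matches by name, per the notes above):

    #include <stdio.h>

    /* Return nonzero if column number `col` was requested with -c. */
    static int column_in_list(const int *cols, int ncols, int col)
    {
        for (int i = 0; i < ncols; i++)
            if (cols[i] == col)
                return 1;
        return 0;
    }

    int main(void)
    {
        int wanted[] = { 3, 4 };                      /* e.g. from "h5csv -c 3,4" */
        printf("%d\n", column_in_list(wanted, 2, 3)); /* prints 1 */
        printf("%d\n", column_in_list(wanted, 2, 5)); /* prints 0 */
        return 0;
    }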
To test, diff the csv files generated. See the script: /home/ksummers/telemetry/csv-testing/Apr2016/jmhCSVTesting.sh

Dump a whole (non-nested) stream:
    h5csv -d weather_01 /lbt/telemetry_data/tcs/pcs/2016/01/19/201601190001.pcs.weather.h5 | head
    time_stamp, tai_offset, temperature, pressure, humidity, stationid
    4959878483250974,36000000,-0.809967041,690.554993,29.8850002,1
    4959878543306738,36000000,-0.834991455,690.545044,30.0350018,1
    4959878603363321,36000000,-0.834991455,690.560059,29.8250027,1
    ...

Multiple columns in a non-nested stream:
h5csv -n "temperature,pressure" -d weather_01 /lbt/telemetry_data/tcs/pcs/2016/01/19/201601190001.pcs.weather.h5 | head time_stamp, temperature, pressure 4959878483250974,-0.809967041,690.554993 4959878543306738,-0.834991455,690.545044 ...Check nested columns:
    h5csv -u -d y_01 /lbt/telemetry_data/tcs/oss/dyb/2016/01/19/201601190000.oss.dyb.y.h5 | more
    timestamp_utc, tai_offset,errors->is_flow_error_active,errors->is_flowmeter_alarm_active,errors->is_flow_out_of_range,errors->is_general_alarm_active,errors->is_latch_interlock_faulted,errors->is_overflow_error_active,errors->is_pump_drive_fault_active,errors->is_pump_error_active,errors->is_pump_not_off,errors->is_pump_not_on,errors->is_pump_overtemp_active,errors->is_servo_error_active,errors->is_swing_arm_error_active,errors->is_swing_arm_state_illegal,errors->is_tank_level_error_active,errors->is_tank_level_f_out_of_range,errors->is_tank_level_r_out_of_range,errors->is_valve_close_timeout_active,errors->is_valve_close_torque_active,errors->is_valve_error_active,errors->is_valve_illegal_limits_active,errors->is_valve_open_timeout_active,errors->is_valve_open_torque_active,errors->is_valve_overtemp_active,warnings->is_tank_level_f_high,warnings->is_tank_level_r_high,warnings->is_tank_warning_active,warnings->is_temp_out_of_range,warnings->is_trim_value_out_of_range,warnings->is_trim_warning_active,inprocess->in_process_pump_powering_on,inprocess->in_process_pumping,inprocess->in_process_servoing,inprocess->in_process_valve_opening,inprocess->is_axis_enabled,inprocess->is_pump_off,inprocess->is_pump_on,inprocess->is_pump_on_fwd,inprocess->is_pump_on_rev,inprocess->is_tank_overflowing,inprocess->is_valve_closed,inprocess->is_valve_open, tank_level_r, tank_level_f, accum_inbalance, flow, pump_rate, temp, trim, left_sa_moment_0, left_sa_moment_1, left_sa_moment_2, left_sa_moment_3, right_sa_moment_0, right_sa_moment_1, right_sa_moment_2, right_sa_moment_3, plusmomrem, minusmomrem
    1453161608419,36000000,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,9.9503479,254.136826,0,-0.00340986252,1,1.45510435,-1503,7261,8596.09961,4699,898,6945,9253,4688,903,0,0
    ...

Check column selection with nested names:
h5csv -n "hp3->absenc,hp4->loadcell,hp5->command" -d hardpoints_01 /lbt/telemetry_data/tcs/pmcr/2016/01/19/201601190000.pmcr.hardpoints.h5 | head time_stamp,hp3->absenc,hp4->loadcell,hp5->command 4959878400042512,-2.5085144,-0.692096949,0.233640179 4959878400293671,-2.5085144,-0.167046547,0.233640179 ...
Run wc on the files. ??? See the env files for the whole month of June. It doesn't look like a problem when you're just getting a few columns, but the wc is the same there too. Is it just a visualization problem with the csv file?
To get the field names and column numbers, use the --fields option on its own (don't use other options). Pipe the output through tr to get a newline-separated file, and then you can also replace the colon with a comma (with tr or sed) to get what we need for Doug's file.

    /data/hdf5-1.8.15-patch1/tools/h5csv/h5csv --fields -d secondarymirror_collimation_01 /lbt/telemetry_data/tcs/psfl/2015/10/01/201510010200.psfl.secondarymirror_collimation.h5 | tr , '\n'

gives:

    time_stamp:1
    tai_offset:2
    angle:3
    limit:4
    lookup:5
    lookuptemp:6
    tabletemp:7
    temperature:8
    tempok:9
    activeoptics->x_value:10
    activeoptics->x_gain:11
    activeoptics->y_value:12
    activeoptics->y_gain:13
    activeoptics->z_value:14
    activeoptics->z_gain:15
    activeoptics->rx_value:16
    activeoptics->rx_gain:17
    activeoptics->ry_value:18
    activeoptics->ry_gain:19
    activeoptics->rz_value:20
    .....
    shelloffload->rz_value:110
    shelloffload->rz_gain:111
    tempoffset->x_value:112
    tempoffset->y_value:113
    tempoffset->z_value:114
    tempoffset->rx_value:115
    tempoffset->ry_value:116
    tempoffset->rz_value:117
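The same munging as a self-contained C filter, in case the pipeline ever needs to live in a tool (illustrative only; the tr pipeline above is what we actually use):

    #include <stdio.h>

    /* Turn the --fields output ("name:colnum" tokens separated by commas)
     * into one "name,colnum" pair per line. */
    int main(void)
    {
        int c;
        while ((c = getchar()) != EOF) {
            if (c == ',')
                putchar('\n');   /* field separator -> newline */
            else if (c == ':')
                putchar(',');    /* name:num -> name,num */
            else
                putchar(c);
        }
        return 0;
    }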