Python Module

This section was auto-generated by Sphinx autodoc from the docstrings in the Python source files. The docstrings from the .py files are rendered below.

hobo_qaqc Module

class hobo_qaqc.HOBOdata[source]

Load and process data from HOBO loggers produced by the ONSET company.

Handles csv files exported from the HoboWare program. The native format for HOBO loggers is a .hobo file. This proprietary binary file is not handled here and must be converted to a csv.

This class syncs timesteps, checks time zones and units, and converts where needed.

export_to_GCE_csv(csvname)[source]

Export the HOBO data to a GCE friendly csv file

Parameters:csvname – str. Filepath to output csv file
format_QAQC_data(units='SI', tz=-8, tstep='5min')[source]

Reformat the data using basic QAQC for SI or US units and time zone consistency regardless of daylight saving time.

Parameters:
  • units – str. keyword argument. The desired system of units. Default is ‘SI’.
  • tz – flt. keyword argument. The desired time zone as an offset from Greenwich Mean Time. Default is -8 (PST)
  • tstep – keyword argument. Interval to round time stamps to. Default ‘5min’.

Note

tstep is input to the function HOBOdata.format_sync_timestep(). Valid types are listed there.

format_intensity(col='Intensity', unit='Lux')[source]

Format light intensity records in desired units

Parameters:
  • col – keyword argument. str. Name of column containing light intensity data. Defaults to ‘Intensity’.
  • unit – keyword argument. str defining desired units. Default is ‘Lux’ (SI)
format_sync_timestep(n_min='5min')[source]

Sync timestamps to a defined measurement interval. Timestamps are increased to the next defined interval.

Parameters:n_min – str. keyword argument. Interval to round time stamps to. Default ‘5min’.

Note

This uses the function ceil to round up to the next interval. The interval provided must match a known type and contain both a number and a letter such as ‘1D’ to round up to the next whole day.

See documentation for valid types [1]

Warning

This will change the index and timestamp of every record.

[1]: https://pandas.pydata.org/pandas-docs/stable/timeseries.html#offset-aliases
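As a small illustration of the rounding described above, the sketch below applies the pandas `ceil` method to a few timestamps. The sample timestamps are invented for illustration, and the snippet is not taken from the module source; it only shows how rounding up to a '5min' interval behaves.

```python
import pandas as pd

# Hypothetical logger timestamps; only the first is already on a 5-minute mark.
stamps = pd.DatetimeIndex([
    "2014-11-17 11:10:00",
    "2014-11-17 11:17:30",
    "2014-11-17 11:20:01",
])

# ceil() moves each timestamp up to the next multiple of the interval.
# Values already on the interval are left unchanged.
synced = stamps.ceil("5min")

for before, after in zip(stamps, synced):
    print(before, "->", after)
```

Note that a reading taken one second past an interval boundary is pushed a full interval forward, which is why the warning about changed timestamps matters.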
format_temp(col='Temp', unit='C')[source]

Format temperature records to desired units

Parameters:
  • col – keyword argument. str. Name of column containing temperature data. Defaults to ‘Temp’
  • unit – keyword argument. str defining desired unit. Default is ‘C’
format_timezone(tz=-8)[source]

Check that timezone is correct, and if not, adjust the time zone.

Parameters:tz – a timezone as number of hours offset from Greenwich Mean Time
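
To make the idea of adjusting a fixed GMT offset concrete, here is a minimal sketch using only the standard library. The function name `shift_fixed_offset` and its arguments are hypothetical; the module's own implementation may work differently (for example, on a whole DataFrame index).

```python
from datetime import datetime, timedelta, timezone


def shift_fixed_offset(naive_local, recorded_tz, target_tz):
    """Re-express a naive timestamp recorded at one GMT offset in another.

    recorded_tz and target_tz are hours offset from GMT (e.g. -7, -8).
    This is an illustrative helper, not part of hobo_qaqc.
    """
    recorded = timezone(timedelta(hours=recorded_tz))
    target = timezone(timedelta(hours=target_tz))
    aware = naive_local.replace(tzinfo=recorded)
    # Convert to the target offset, then drop tzinfo to get a naive value back.
    return aware.astimezone(target).replace(tzinfo=None)


# A reading logged at 11:10 in GMT-7 corresponds to 10:10 in GMT-8.
print(shift_fixed_offset(datetime(2014, 11, 17, 11, 10), -7, -8))
```

Because the offsets are fixed, the result does not depend on daylight saving rules.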
get_csv_GMT_offset(header, lineno=-1)[source]

Get timezone as an offset from Greenwich Mean Time from the header file

Parameters:
  • lineno – keyword argument. index of header array. Function operates on specified index. Default -1
  • header – array of header lines where each line is a single string.
Returns:

string of timezone offset from GMT

Example:

String for PST ‘-08:00’
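
One way such an offset string could be pulled out of a header line is with a regular expression. The sketch below is an assumption about the approach, not the module's code; `parse_gmt_offset` is a hypothetical name and the sample header line is modeled on the example shown under get_header_nlines().

```python
import re


def parse_gmt_offset(header_line):
    """Return the offset string such as '-07:00', or None if none is found.

    Hypothetical helper for illustration only.
    """
    match = re.search(r"GMT\s*([+-]\d{2}:\d{2})", header_line)
    return match.group(1) if match else None


# Assumed header layout, with double quotes around each column name.
line = '"#","Date Time, GMT-07:00","Temp, °C"'
print(parse_gmt_offset(line))
```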
get_csv_col(header, lineno=-1)[source]

Extract column names from csv format

Parameters:
  • header – array of header lines where each line is a single string.
  • lineno – keyword argument. index of header array. Function operates on specified index. Default -1
Returns:

array of column names.

get_csv_intensity_unit(header, lineno=-1)[source]

Get unit for sunlight intensity

Parameters:
  • header – array of header lines where each line is a single string
  • lineno – keyword argument. index of header array. Function operates on specified index. Default -1
Returns:

str defining units for sunlight intensity

get_csv_sn(header, lineno=-1)[source]
Parameters:
  • header – array of header lines where each line is a single string.
  • lineno – keyword argument. index of header array. Function operates on specified index. Default -1
Returns:

str containing serial number

get_csv_temp_unit(header, lineno=-1)[source]

Get unit for temperature records

Parameters:
  • header – array of header lines where each line is a single string.
  • lineno – keyword argument. index of header array. Function operates on specified index. Default -1
Returns:

str with single letter defining units for temperature.

get_header_nlines(file_name)[source]

Estimate how many header lines exist in a file

Parameters:file_name
Returns:int that is index of last header line

Warning

This is a simplistic filter that searches for the first row where there are no quotes and returns line_num - 1 on a 1 based index.

Complex files with quotes around data fields, or no quotes in header lines will not be caught.

Example:

'Plot Title: RS12'
'#','Date Time, GMT-07:00','Temp, °C','Intensity, lum/ft²','Coupler Attached','Stopped','End Of File'
1,11/17/2014 11:10:00 AM,3.472,16.0,,,

returns 2
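
The filter described in the warning can be sketched as follows. This is a rough reconstruction from the description, not the module's code: `count_header_lines` is a hypothetical name, it takes a list of lines rather than a file name, and it treats either quote character as a header marker because the quoting style of the exported file is an assumption here.

```python
def count_header_lines(lines):
    """Return the number of leading lines that contain a quote character.

    Scans for the first line without quotes and returns its 1-based
    line number minus one. Illustrative helper only.
    """
    for line_num, line in enumerate(lines, start=1):
        if '"' not in line and "'" not in line:
            return line_num - 1
    # Every line contained a quote, so all of them look like header lines.
    return len(lines)


sample = [
    '"Plot Title: RS12"',
    '"#","Date Time, GMT-07:00","Temp, °C","Intensity, lum/ft²"',
    "1,11/17/2014 11:10:00 AM,3.472,16.0,,,",
]
print(count_header_lines(sample))
```

The same weakness noted in the warning applies: a data row with a quoted field would be counted as a header line.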

get_timestamp_col(col)[source]

Time stamps can be exported by HOBO into either 1 or 2 columns

Parameters:col – an array of column names
Returns:list of index locations
Returns:list of column name(s) that make the timestamp
intensity_lumft2_to_lux(intensity)[source]

Convert light intensity records from lumen ft⁻² (lum/ft²) into Lux

Parameters:intensity – an intensity value or list of intensity values in lumen ft⁻²
Returns:an intensity or list of intensity values in Lux
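
For reference, the conversion follows from the definition of lux as lumens per square metre: one lumen per square foot is 1 / 0.3048² ≈ 10.764 lux. The sketch below is an illustrative stand-in with a hypothetical function name, not the module's implementation.

```python
# 1 ft = 0.3048 m exactly, so 1 lum/ft² = 1 / 0.3048**2 lux.
LUX_PER_LUMFT2 = 1.0 / 0.3048 ** 2


def lumft2_to_lux(intensity):
    """Convert a value or a list of values from lum/ft² to lux."""
    if isinstance(intensity, (list, tuple)):
        return [value * LUX_PER_LUMFT2 for value in intensity]
    return intensity * LUX_PER_LUMFT2


print(round(lumft2_to_lux(16.0), 2))
print([round(v, 2) for v in lumft2_to_lux([0.0, 1.0])])
```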
is_intensity_lux()[source]

Read units definition from header and return True if units are Lux

Returns:Boolean. True if light intensity is recorded in Lux
is_temp_celsius()[source]

Read units definition from header and return True if units are Celsius

Returns:Boolean. True if temperature is recorded in Celsius.
is_timezone_correct(tz)[source]

Check the timezone in which data was recorded against the expected timezone

Parameters:tz – a timezone as number of hours offset from Greenwich Mean Time
Returns:Boolean
load_csv_data(fname)[source]

Load csv file output by HOBO pendants into a Pandas DataFrame.

Parameters:fname – str. Filepath of csv data file
read_csv_header(file_name)[source]

Read the header lines from the beginning of a file. Reads n_lines, and stores them as headers object.

Parameters:file_name – str. File path of file to be read.
reformat_HOBO_csv(infname, outfname=None, units='SI', tz=-8, tstep='5min')[source]

Imports a csv file output by HoboWare software and checks for:

  • units
  • timezone
  • time sync (09:07 vs 09:05)

File is converted to specified settings and exported to a GCE friendly format.

Parameters:
  • infname – str. Filename to read
  • outfname – str. Filename to output. Defaults to same as infname
  • units – str. System of units desired. Defaults to SI
  • tz – int or flt. Timezone as offset from GMT
  • tstep – str. Time interval to sync to. Default is ‘5min’. See HOBOdata.format_sync_timestep() or [2] for valid formats.
[2]: https://pandas.pydata.org/pandas-docs/stable/timeseries.html#offset-aliases
set_data_GMT_offset(hr_offset)[source]

Define time zone of DataFrame timestamps in offset from UTC/GMT

Parameters:hr_offset – floating point of time zone in hours difference from Greenwich Mean Time
temp_F_to_C(temp)[source]

Convert temperature records from Fahrenheit to Celsius

Parameters:temp – a temperature value or list of temperature values in degrees Fahrenheit.
Returns:a temperature value or list of temperature values in degrees celsius
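
The conversion itself is the standard formula C = (F − 32) × 5/9. A minimal sketch that accepts either a single value or a list, as the parameter description suggests, might look like this; the function name is hypothetical and the module's own code may differ.

```python
def fahrenheit_to_celsius(temp):
    """Convert a value or a list of values from Fahrenheit to Celsius."""
    if isinstance(temp, (list, tuple)):
        return [(value - 32) * 5 / 9 for value in temp]
    return (temp - 32) * 5 / 9


print(fahrenheit_to_celsius(212))
print(fahrenheit_to_celsius([32, 212]))
```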

file_manager Module

This module performs QAQC methods in a batch. Methods were developed to process csv files created by HOBO sensors at meteorological sites on the HJ Andrews experimental forest. It also performs other file storage and management functions. For a specified directory, it processes all files and creates a directory of new, processed csv files.

QAQC methods are imported from hobo_qaqc.HOBOdata.reformat_HOBO_csv().

When the module is called, FileHandling.manage() is executed.

This module is designed to minimize read/write times by copying all files locally, performing all processes, and then transferring files to final directories. This is ideal with external or network drives, but if all directories are local, it will create a final directory which duplicates file names from the source directory.

class file_manager.FileHandling[source]

Processes all files in assigned directory for timezone, units, and timestep sync, and converts values where necessary. Contains methods for archiving using .zip, wiping directories after processing, and adding to existing directory structure: ./<FileArchive>/<Project>/<Site>.

Warning

Executes ./MET_hobo/file_path.config as Python file and saves variables to class object.

copy_to_final_dir(file_list, subdir, loc)[source]

Call OS specific system command to copy from temporary working directory to final storage. Selects files by site using wildcard selection.

Example: RS12*
Parameters:
  • file_list – List of str to select files from. Example: [‘RS12’,’RS04’] copies files ‘RS12*’ and ‘RS04*’
  • subdir – str. Destination subdirectory within final storage directory. Files are moved to here.
  • loc – str. Directory where files are currently located.
Returns:

List of strings of each filename copied to the final directory
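
The wildcard selection by site can be illustrated with the standard library. Note that the documented method calls an OS-specific system command; the sketch below swaps in `glob` and `shutil` instead so that it runs on any platform. `copy_by_prefix` and the file names are hypothetical.

```python
import glob
import os
import shutil
import tempfile


def copy_by_prefix(prefixes, src_dir, dest_dir):
    """Copy every file in src_dir matching '<prefix>*' into dest_dir.

    Returns the list of file names copied. Illustrative helper only.
    """
    os.makedirs(dest_dir, exist_ok=True)
    copied = []
    for prefix in prefixes:
        for path in sorted(glob.glob(os.path.join(src_dir, prefix + "*"))):
            if os.path.isfile(path):
                shutil.copy2(path, dest_dir)
                copied.append(os.path.basename(path))
    return copied


# Demonstration in a throwaway directory with made-up file names.
with tempfile.TemporaryDirectory() as root:
    src = os.path.join(root, "_processed")
    os.makedirs(src)
    for name in ("RS12_a.csv", "RS04_a.csv", "PRIM_a.csv"):
        open(os.path.join(src, name), "w").close()
    copied = copy_by_prefix(["RS12", "RS04"], src, os.path.join(root, "final"))

print(copied)
```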

copy_to_wdir()[source]

Copies source files to local working directory using OS specific DOS, bash, or shell command. Results are output to log file.

del_files_frm_srcdir()[source]

Wipe all files from the src_dir, defined in file_path.config as dir_source_files. All files and sub-folders in this directory will be wiped.

If source directory and final directory are the same, this process will abort.

Warning

This uses destructive methods which will erase any and all contents of the target directory and any sub-directories within.

shutil.rmtree()

Returns:List of strings of each filename wiped from the source directory
del_temp_folders()[source]

Wipes temporary processing folders in the working directory. The convention maintained by this module is that all temp folders have the “_” prefix.

If any files are still in _processed, and have not been copied to a final storage directory, deletion of this directory will be aborted.

Warning

This uses destructive methods which will erase any and all contents of the target directory and any sub-directories within.

shutil.rmtree()
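
The cleanup rule described above can be sketched as follows. This is one interpretation of the text, stated as an assumption: folders whose names start with "_" are removed, except that a _processed folder which still holds files is left in place. `remove_temp_folders` is a hypothetical name, and the demonstration runs inside a throwaway directory.

```python
import os
import shutil
import tempfile


def remove_temp_folders(work_dir):
    """Remove '_'-prefixed folders in work_dir and return their names.

    A non-empty '_processed' folder is skipped (assumed behaviour).
    """
    removed = []
    for name in sorted(os.listdir(work_dir)):
        path = os.path.join(work_dir, name)
        if not (name.startswith("_") and os.path.isdir(path)):
            continue
        if name == "_processed" and os.listdir(path):
            continue  # files not yet copied to final storage
        shutil.rmtree(path)  # destructive: removes the folder and its contents
        removed.append(name)
    return removed


with tempfile.TemporaryDirectory() as work_dir:
    for folder in ("_data", "_processed", "keep"):
        os.makedirs(os.path.join(work_dir, folder))
    open(os.path.join(work_dir, "_data", "RS12_a.csv"), "w").close()
    removed = remove_temp_folders(work_dir)
    remaining = sorted(os.listdir(work_dir))

print(removed, remaining)
```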

index_files()[source]

Identify files in source directory. Create list of .hobo, .csv, .log files, and any other file type encountered.

Identify site as any prefix to the left of “_” in filename and generate a list of unique sites.
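
A minimal sketch of this kind of indexing, assuming file names such as 'RS12_20141117.csv', is shown below. The function name, the return structure, and the sample names are hypothetical and only illustrate grouping by extension and deriving the site from the text before the first "_".

```python
import os
from collections import defaultdict


def index_by_extension(filenames):
    """Group file names by extension and collect unique site prefixes."""
    by_ext = defaultdict(list)
    sites = set()
    for name in filenames:
        ext = os.path.splitext(name)[1].lower()
        by_ext[ext].append(name)
        if "_" in name:
            # Site is whatever precedes the first underscore.
            sites.add(name.split("_")[0])
    return dict(by_ext), sorted(sites)


names = [
    "RS12_20141117.csv",
    "RS12_20141117.hobo",
    "RS04_20141117.csv",
    "copy.log",
]
by_ext, sites = index_by_extension(names)
print(by_ext)
print(sites)
```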

manage(time_step=None, units='SI', tz=-8)[source]

Execute file management.

  1. Copy files to working directory (./_data).
  2. Create list of .csv, .hobo, and .log files in working directory.
  3. Attempt to perform QAQC on all .csv files and transfer to ./_processed.
  4. Create a .zip file for all .hobo files from each site. Disabled per bitbucket issue #10.
  5. Copy all files with .csv, .log, and unknown extension to final storage.
  6. Delete temporary folders in working directory.
  7. Wipe original source directory. This directory contains files where QAQC was not performed. Disabled per bitbucket issue #10.
  8. Write log file.

Three keyword arguments let the user alter the format_QAQC_data() settings. units and tz (time zone) default to SI units and PST (GMT-8). To change these values, manage() must be called directly, either from the terminal or from Python. time_step is defined in the config file and only needs to be passed here if the user wants to override the config file at the command line.

qaqc_csv(time_step=None, units='SI', tz=-8)[source]

Attempt to QAQC all csv files for timezone, timestep sync, and units.

For each file in the list of .csv files generated by index_files(), calls hobo_qaqc.HOBOdata.reformat_HOBO_csv().

Returns:list. strings of filenames processed with \n at end.
Returns:int. number of csv files
Returns:int. number of files processed
set_log_header()[source]

Create header for log file. Assigns first items to list self.logs.

write_log()[source]

Write log to file. <final storage directory>//logs//hobo_qaqc_<date>.log.

Log is a list of strings until this function is called.

zip_hobo_files()[source]

Collect all files with .hobo extension and write to a zip file in the temp directory _processed.

Naming convention is <site>_<today’s date>.zip, where site is any filename prefix to the left of “_”.

Operates on the list of .hobo files generated by index_files().

Returns:List of strings of each filename and its zipped filename with a \n at the end
Returns:int. Count of hobo files
Returns:int. Count of zipped files
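
The per-site zip naming can be illustrated with the standard `zipfile` module. This is a sketch under stated assumptions, not the module's code: `zip_by_site` is a hypothetical name, the ISO date format in the archive name is assumed, and the return value is a simple mapping rather than the list of strings documented above.

```python
import os
import tempfile
import zipfile
from datetime import date


def zip_by_site(hobo_paths, out_dir, today=None):
    """Add each .hobo file to '<site>_<date>.zip' in out_dir.

    Returns a dict mapping each file name to its archive name.
    """
    stamp = (today or date.today()).isoformat()  # assumed date format
    zipped = {}
    for path in hobo_paths:
        name = os.path.basename(path)
        site = name.split("_")[0]
        zip_name = f"{site}_{stamp}.zip"
        # Mode "a" creates the archive on first use and appends afterwards.
        with zipfile.ZipFile(os.path.join(out_dir, zip_name), "a") as archive:
            archive.write(path, arcname=name)
        zipped[name] = zip_name
    return zipped


with tempfile.TemporaryDirectory() as root:
    paths = []
    for name in ("RS12_a.hobo", "RS12_b.hobo", "RS04_a.hobo"):
        path = os.path.join(root, name)
        with open(path, "wb") as handle:
            handle.write(b"placeholder logger data")
        paths.append(path)
    out_dir = os.path.join(root, "_processed")
    os.makedirs(out_dir)
    zipped = zip_by_site(paths, out_dir, today=date(2014, 11, 17))
    with zipfile.ZipFile(os.path.join(out_dir, "RS12_2014-11-17.zip")) as archive:
        rs12_members = sorted(archive.namelist())

print(zipped)
print(rs12_members)
```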