d000a8348f
Added bottom-level instrument metadata, such as units and descriptions.
Florez Ospina Juan Felipe 2024-05-24 09:50:25 +02:00
8d4f4e68c7
Removed YAML file output from the data integration file. Creation of this file is being outsourced to the data store repo.
Florez Ospina Juan Felipe 2024-05-24 09:32:30 +02:00
88de88c316
Removed creation of the YAML file after data integration, since this can cause misalignment with the data store. I think the YAML snapshot of an HDF5 file should therefore be outsourced there.
Florez Ospina Juan Felipe 2024-05-24 09:30:24 +02:00
1537633b1a
Made a few optimizations to code and documentation. Expressions relying on list comprehensions were simplified to generator expressions, e.g., any([keyword in filename for keyword in select_file_keywords]) became any(keyword in filename for keyword in select_file_keywords).
Florez Ospina Juan Felipe 2024-05-24 09:06:07 +02:00
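A minimal sketch of the simplification this commit describes: passing a generator expression to any() avoids building a temporary list and lets any() stop at the first match. The helper name matches_any_keyword and the sample values are illustrative assumptions; only filename and select_file_keywords come from the commit message.

```python
def matches_any_keyword(filename, select_file_keywords):
    """Return True if any keyword occurs in filename (hypothetical helper)."""
    # Generator expression: no intermediate list is built, and any()
    # short-circuits as soon as one keyword matches.
    return any(keyword in filename for keyword in select_file_keywords)


if __name__ == "__main__":
    keywords = ["2024", "Lopap"]  # illustrative values, not from the repo
    print(matches_any_keyword("Lopap_run1.dat", keywords))
    print(matches_any_keyword("notes.txt", keywords))
```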
d574ac382d
Replaced attribute table_header in the Lopap configuration file with a shorter version that is consistent across more files. Some of the headers might change.
Florez Ospina Juan Felipe 2024-05-24 08:55:36 +02:00
63b683e4aa
Optimized and included df to np structured array conversion.
- Replaced loop plus append with list comprehension.
- Replaced row-wise pd df column concatenation with the df.agg() method, which uses column-wise concatenation.
Florez Ospina Juan Felipe 2024-05-23 22:20:19 +02:00
bd458c6cd0
Optimized and included df to np structured array conversion.
- Replaced loop plus append with list comprehension.
- Replaced row-wise pd df column concatenation with the df.agg() method, which uses column-wise concatenation.
Florez Ospina Juan Felipe 2024-05-23 22:18:37 +02:00
7367da84b9
Simplified code by updating HDF5 attributes using .update() dict method (inherited from dict type).
Florez Ospina Juan Felipe 2024-05-22 20:11:54 +02:00
1729cd40fa
Added feature to interpret links to descriptions in the YAML instrument configuration file and add them as attributes at the dataset level.
Florez Ospina Juan Felipe 2024-05-09 19:17:08 +02:00
1429c56916
Added links to descriptions and units of table variables (columns). These can be used as attributes of datasets built from tabular data.
Florez Ospina Juan Felipe 2024-05-09 19:15:20 +02:00
f49120102d
Included timestamp specification, a list of the column names that contain datetime information.
Florez Ospina Juan Felipe 2024-04-30 14:51:58 +02:00
553c3fe946
Incorporated feature to merge date and time data, which may originally be in separate columns in text source files. This is specified in the text source specification YAML file.
Florez Ospina Juan Felipe 2024-04-30 14:50:33 +02:00
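One way to picture the merge this commit describes, as a rough sketch in plain Python rather than the repository's actual implementation: the values of the listed columns are joined and parsed into a single datetime. The function name merge_date_time, the column names Date and Time, and the format string are all assumptions made for illustration.

```python
from datetime import datetime


def merge_date_time(row, timestamp_columns, fmt="%Y-%m-%d %H:%M:%S"):
    """Join the listed columns of one row and parse them as a datetime.

    row is a dict mapping column name to string value; timestamp_columns
    is the ordered list of columns holding the date and time parts
    (hypothetical names, standing in for the YAML specification).
    """
    joined = " ".join(row[column] for column in timestamp_columns)
    return datetime.strptime(joined, fmt)


if __name__ == "__main__":
    row = {"Date": "2024-04-30", "Time": "14:50:33", "value": "1.2"}
    print(merge_date_time(row, ["Date", "Time"]))
```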
4fd3bb1957
Updated readme file with instructions on how to set compound attributes and delete them.
Florez Ospina Juan Felipe 2024-04-26 14:27:01 +02:00
493be88f49
Removed unnecessary pygit dependency and the associated function that relied on it.
Florez Ospina Juan Felipe 2024-04-26 13:15:33 +02:00
14ae29bf3c
Corrected a parsing problem in the HDF5-to-YAML attribute conversion. Single-element arrays are now represented as scalars rather than as single-element lists.
Florez Ospina Juan Felipe 2024-04-26 12:54:41 +02:00
be02ad01ed
Removed problematic lines that depended on config_file.py, a dependency that will soon be removed.
Florez Ospina Juan Felipe 2024-04-24 17:14:13 +02:00
c64cad6779
Removed this workflow because it is redundant. Replaced it with active creation of a review branch in GitLab.
Florez Ospina Juan Felipe 2024-04-24 17:05:25 +02:00
074d2e3954
Removed config_file output file naming; the user now provides the desired output filename instead. Also added an input argument for root-level metadata.
Florez Ospina Juan Felipe 2024-04-18 19:14:06 +02:00
a1c88fdb5a
Added lines to flatten (shorten) original directory paths in the resulting hdf5 file.
Florez Ospina Juan Felipe 2024-04-17 15:20:26 +02:00
8005b60579
Included a boolean input argument hdf5_upload to deactivate hdf5 upload for testing.
Florez Ospina Juan Felipe 2024-04-07 17:09:01 +02:00
edd1bbf5be
Added an import and treemap-to-PNG statements, but for some reason they did not work and took forever to run. So I left the lines in but commented them out for now.
Florez Ospina Juan Felipe 2024-04-07 16:55:37 +02:00
5e70d9158b
Deleted function third_complete_metadata_review() because forth_complete_metadata_review() is the same. Also renamed parts of the function names for clarity: "complete" became "submit" and "submit" became "save", since submission is usually the last step of a review process.
Florez Ospina Juan Felipe 2024-04-05 17:10:34 +02:00
d68dc98070
Implemented safeguards that allow committing only untracked metadata review files.
Florez Ospina Juan Felipe 2024-04-04 14:20:13 +02:00
719e9d6672
Repurposed config_file.py. It now only provides functions to select the file_readers based on group id and to produce a created_at timestamp.
Florez Ospina Juan Felipe 2024-04-03 13:55:54 +02:00
f9b31c06fd
Reimplemented file filtering: file extension constraints are applied first, then file keyword constraints.
Florez Ospina Juan Felipe 2024-04-03 13:49:16 +02:00
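The two-stage filtering described in this commit could look roughly like the sketch below. The function name filter_files, its parameters, and the choice to treat an empty keyword list as "no keyword constraint" are assumptions for illustration, not the repository's actual behavior.

```python
import os


def filter_files(filenames, allowed_extensions, select_file_keywords):
    """Apply extension constraints first, then keyword constraints."""
    # Stage 1: keep only files whose extension is allowed (case-insensitive).
    by_extension = [
        name for name in filenames
        if os.path.splitext(name)[1].lower() in allowed_extensions
    ]
    # Assumption: an empty keyword list means no keyword constraint.
    if not select_file_keywords:
        return by_extension
    # Stage 2: of those, keep files containing at least one keyword.
    return [
        name for name in by_extension
        if any(keyword in name for keyword in select_file_keywords)
    ]


if __name__ == "__main__":
    names = ["Lopap_2024.dat", "Lopap_2024.png", "other.txt"]
    print(filter_files(names, {".dat", ".txt"}, ["Lopap"]))
```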
9cde013be0
Set node values to the number of children of each group. When nodes are datasets, their value is 1.
Florez Ospina Juan Felipe 2024-04-02 18:48:50 +02:00
9071120e50
Refactored code to read .dat and .txt files in binary mode (rb) first and then decode the lines with the prespecified encoding. This gives more control over the decoding process and makes possible encoding errors easier to spot.
Florez Ospina Juan Felipe 2024-04-02 18:35:04 +02:00
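A minimal sketch of the read-then-decode approach this commit describes, assuming a hypothetical helper named read_lines: the file is opened in binary mode and each line is decoded explicitly, so a decoding failure can be reported with the offending line number. The error-handling policy shown here (raising ValueError) is an illustrative choice.

```python
def read_lines(path, encoding="utf-8"):
    """Read a text file in binary mode and decode it line by line."""
    lines = []
    with open(path, "rb") as file:
        for line_number, raw in enumerate(file, start=1):
            try:
                # Decode explicitly so errors are tied to a specific line.
                lines.append(raw.decode(encoding).rstrip("\r\n"))
            except UnicodeDecodeError as err:
                raise ValueError(
                    f"{path}: cannot decode line {line_number} as {encoding}"
                ) from err
    return lines
```

Calling read_lines on a .dat file with the encoding from the instrument configuration would either return the decoded lines or point at the first line that does not match that encoding.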
39cae66936
Implemented two important changes. 1) The output filename is no longer passed as input; it is computed automatically from an input config_param dict. 2) Input filenames in the file system path are now filtered during an initial walk through the directory tree, so that the stored path filenames can be used later to prune the directory tree.
Florez Ospina Juan Felipe 2024-04-02 17:33:58 +02:00