473 Commits

Author SHA1 Message Date
69f3857936 Implemented functions for data extraction from hdf5 files. 2024-05-31 12:39:10 +02:00
e6de1ff55d Incorporated a Jupyter notebook with a simple example metadata annotation workflow. 2024-05-30 12:24:12 +02:00
4de7834a91 Updated readme file 2024-05-30 12:21:17 +02:00
76bffc6afe Updated notebook documentation and included an example metadata annotation notebook. 2024-05-30 12:20:34 +02:00
a0318681be Removed HTML file that is no longer useful. 2024-05-30 12:18:28 +02:00
922bb3ca64 Updated YAML config file parsing logic to account for changes in config file description. 2024-05-30 12:16:54 +02:00
7f423ccc6f Decomposed experiment_data into experiment_startdate and experiment_enddate. 2024-05-30 12:15:49 +02:00
3a9aede909 Made third_update_hdf5_file_with_review more modular by separating data update and git operations, resulting in new functions that can be reused in less restrictive metadata annotation contexts. 2024-05-29 15:26:48 +02:00
ef7c6c9efb Implemented a git operations module for automated git ops, based on subprocess. 2024-05-29 15:17:09 +02:00
146981379f Updated readme file. 2024-05-29 11:24:46 +02:00
71f284f709 Updated readme file 2024-05-29 11:23:33 +02:00
4ffd790059 Updated project name in configuration file 2024-05-28 15:06:25 +02:00
dad5e082f1 Changed ordering of data integration config files so that they align with our experimental campaign hierarchy. 2024-05-28 14:43:32 +02:00
a86fc97605 Refactored due to updates in the file reader function. 2024-05-28 14:41:34 +02:00
3de6abce50 Added a feature to activate or deactivate data copying before reading the input file. This avoids redundant copying when we are already working on file copies. 2024-05-28 14:40:14 +02:00
fd1c6461bb Updated some of the rename_as metadata for all instruments so that it is more machine readable and can perhaps be used as an alternative to the original name in future releases. 2024-05-28 14:37:43 +02:00
804ea52583 Modified function to return list of paths when config_file.yaml integration mode = experimental step. 2024-05-28 11:29:32 +02:00
f6a46168ec Improved parsing from HDF5 attr dict to yaml compatible dict. Now we can parse HDF5 compound attributes (structured np arrays). 2024-05-28 11:27:44 +02:00
41c7660be3 Enhanced data transfer progress visualization and logging 2024-05-28 08:59:29 +02:00
08d58557df Fixed bug that did not allow analythical_methods composite keywords (e.g., ICAD/HONO) to be matched in instrument configurations. 2024-05-28 08:57:57 +02:00
3270ce5ed7 Implemented reader file compatibility check. 2024-05-27 18:22:16 +02:00
2911416431 Improved modularity of hdf5_file creation by creating a function that copies the input directory file and applies directory, file, and extension constraints before the regular directory to hdf5 transfer. See def copy_directory_with_contraints(input_dir_path, output_dir_path, select_dir_keywords, select_file_keywords, allowed_file_extensions): 2024-05-27 18:15:08 +02:00
24a2d5d37e Refactored list to array conversion using metadata_rewiew_lib 2024-05-26 15:04:07 +02:00
77afbbbf8f Added function to convert list of strings into a np.array of bytes. This is useful to create list-valued attributes in HDF5. 2024-05-26 14:56:36 +02:00
88572b44b1 Fixed buggy statement. import datetime ... followed by datetime.now() was fixed as datetime.datetime.now(). 2024-05-26 12:26:54 +02:00
37071945f5 Removed hdf5 file creation redundancy by creating a helper function create_HDF5_file(date_str,select_file_keywords), which handles variations in date_str and keywords. 2024-05-26 12:24:15 +02:00
4dc09339b5 Replaced lambda function with regular function and f-string for better readability and debugging 2024-05-26 11:39:40 +02:00
b7f9bfe149 Replaced print statement with logging and raise exception for better error handling and management 2024-05-26 11:34:20 +02:00
ac37235072 Added function setup_logging to configure logger to record logs in specified output directory. 2024-05-26 11:19:54 +02:00
8f1a82c00d Updated env file 2024-05-24 15:55:49 +02:00
c7051bfe69 Updated readme, and updated reader to ignore ASCII character errors 2024-05-24 15:55:15 +02:00
9329f39deb Deleted output no longer returned in data integration pipeline 2024-05-24 14:55:08 +02:00
b5ed1cb826 Updated readme file 2024-05-24 11:56:30 +02:00
005e855e48 Updated configuration file organization and workflow description. 2024-05-24 11:15:05 +02:00
784cb1eb62 Commented out openia python module. 2024-05-24 10:54:15 +02:00
d000a8348f Added bottom level instrument metadata descriptions such as units and description. 2024-05-24 09:50:25 +02:00
8d4f4e68c7 Removed yaml file output from data integration file. The creation of this file is being outsourced to the data store repo. 2024-05-24 09:32:30 +02:00
88de88c316 Removed creation of yaml file subsequent to data integration. This can cause misalignment with data store. I think the yaml snapshot of a hdf5 file should therefore be outsourced there. 2024-05-24 09:30:24 +02:00
1537633b1a Made a few optimizations to code and documentation. Expressions relying on list comprehensions were simplified with generator expressions, e.g., any([keyword in filename for keyword in select_file_keywords]) was simplified to any(keyword in filename for keyword in select_file_keywords). 2024-05-24 09:06:07 +02:00
d574ac382d Replaced attribute table_header in Lopap configuration file with a shorter version which is consistent across more files. Some of the headers might change. 2024-05-24 08:55:36 +02:00
63b683e4aa Optimized and included df to np structured array conversion. Replaced loop plus append with list comprehension. Replaced pd df column concatenation based on row-wise concatenation with df.aggr() method that uses column-wise concatenation. 2024-05-23 22:20:19 +02:00
bd458c6cd0 Optimized and included df to np structured array conversion. Replaced loop plus append with list comprehension. Replaced pd df column concatenation based on row-wise concatenation with df.aggr() method that uses column-wise concatenation. 2024-05-23 22:18:37 +02:00
a45fb4476b Replaced commented lines by accurate comments 2024-05-22 20:15:17 +02:00
7367da84b9 Simplified code by updating HDF5 attributes using .update() dict method (inherited from dict type). 2024-05-22 20:11:54 +02:00
1729cd40fa Added feature to interpret links to descriptions in the yaml instrument configuration file and add them at the dataset level as attributes. 2024-05-09 19:17:08 +02:00
1429c56916 Added links to descriptions and units of table variables or columns. These can be used as attributes of datasets from tabular data 2024-05-09 19:15:20 +02:00
f49120102d Included timestamp specification, which indicates column names in a list that contain datetime information. 2024-04-30 14:51:58 +02:00
553c3fe946 Incorporated feature to merge date and time data which may originally be in separate columns in text source files. This is specified in the text source specification yaml file 2024-04-30 14:50:33 +02:00
f3c2777bb0 Performed edits to README.md 2024-04-26 14:33:41 +02:00
4fd3bb1957 Updated readme file with instructions on how to set compound attributes and delete them. 2024-04-26 14:27:01 +02:00
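
Two of the commits listed above (88572b44b1 and 1537633b1a) describe small Python idioms. The following is a minimal sketch of both, not code taken from the repository; the values assigned to filename and select_file_keywords are hypothetical examples chosen for illustration.

```python
import datetime

# 88572b44b1: after `import datetime`, the name `datetime` is the module,
# so the class method must be reached as datetime.datetime.now().
# Calling datetime.now() on the module would raise AttributeError.
timestamp = datetime.datetime.now()

# 1537633b1a: any() accepts a generator expression directly, so no
# intermediate list has to be built before the check.
# Both values below are hypothetical, for illustration only.
filename = "2024-05-31_ICAD_HONO.txt"
select_file_keywords = ["ICAD", "Lopap"]
is_selected = any(keyword in filename for keyword in select_file_keywords)
```

With the generator form, any() can stop at the first matching keyword without evaluating the rest.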