Commit Graph

224 Commits

Author SHA1 Message Date
8fa587ef19 Removed html file no longer useful. 2024-05-30 12:18:28 +02:00
894936f107 Updated YAML config file parsing logic to account for changes in config file description. 2024-05-30 12:16:54 +02:00
3e21ecde7b Decomposed experiment_data into experiment_startdate and experiment_enddate. 2024-05-30 12:15:49 +02:00
b2e807788f Made function third_update_hdf5_file_with_review more modular by separating data update and git operations, resulting in new functions that can be reused in less restrictive metadata annotation contexts. 2024-05-29 15:26:48 +02:00
1987d1610f Implemented a git operations module for automated git ops, based on subprocess. 2024-05-29 15:17:09 +02:00
c1e5bc9ddd Updated readme file. 2024-05-29 11:24:46 +02:00
8226f616dd Updated readme file 2024-05-29 11:23:33 +02:00
fb1c627104 Updated project name in configuration file 2024-05-28 15:06:25 +02:00
4fb5ed58b1 Changed ordering of data integration config files so that they align with our experimental campaign hierarchy. 2024-05-28 14:43:32 +02:00
82754e26b0 Refactored due to updates in the file reader function. 2024-05-28 14:41:34 +02:00
0f505df45c Added a feature to activate or deactivate data copying before reading the input file. This avoids redundant copying when we are already working on file copies. 2024-05-28 14:40:14 +02:00
e0d84d7822 Updated some of the rename_as metadata for all instruments so that it is more machine readable and can perhaps be used as an alternative to the original name in future releases. 2024-05-28 14:37:43 +02:00
54e4301e93 Modified function to return list of paths when config_file.yaml integration mode = experimental step. 2024-05-28 11:29:32 +02:00
dfd14fd029 Improved parsing from HDF5 attr dict to yaml compatible dict. Now we can parse HDF5 compound attributes (structured np arrays). 2024-05-28 11:27:44 +02:00
2fe2ac2efa Enhanced data transfer progress visualization and logging 2024-05-28 08:59:29 +02:00
eb89b59702 Fixed bug that did not allow analythical_methods composite keywords (e.g., ICAD/HONO) to be matched in instrument configurations. 2024-05-28 08:57:57 +02:00
7bfd895eb5 Implemented reader file compatibility check. 2024-05-27 18:22:16 +02:00
33fec9bd59 Improved modularity of hdf5_file creation by creating a function that copies the input directory and applies directory, file, and extension constraints before the regular directory to hdf5 transfer. See def copy_directory_with_contraints(input_dir_path, output_dir_path, select_dir_keywords, select_file_keywords, allowed_file_extensions). 2024-05-27 18:15:08 +02:00
3ea8d1ee40 Refactored list to array conversion using metadata_rewiew_lib 2024-05-26 15:04:07 +02:00
4859d6d2e4 Added function to convert list of strings into a np.array of bytes. This is useful to create list-valued attributes in HDF5. 2024-05-26 14:56:36 +02:00
1e10aad835 Fixed buggy statement. import datetime ... followed by datetime.now() was fixed as datetime.datetime.now(). 2024-05-26 12:26:54 +02:00
e55086b0ad Removed hdf5 file creation redundancy by creating a helper function create_HDF5_file(date_str,select_file_keywords), which handles variations in date_str and keywords. 2024-05-26 12:24:15 +02:00
b4fba4b40c Replaced lambda function with regular function and fstring for better readability and debugging 2024-05-26 11:39:40 +02:00
1012e17905 Replaced print statement with logging and a raised exception for better error handling and management 2024-05-26 11:34:20 +02:00
9ad77da9f8 Added function setup_logging to configure logger to record logs in specified output directory. 2024-05-26 11:19:54 +02:00
e0f1b6b1ff updated env file 2024-05-24 15:55:49 +02:00
34fb1be71f Updated readme and reader to ignore ASCII character errors 2024-05-24 15:55:15 +02:00
7633816c23 Deleted output no longer returned in data integration pipeline 2024-05-24 14:55:08 +02:00
55d3a2c92b Updated readme file 2024-05-24 11:56:30 +02:00
8315f8991b Updated configuration file organization and workflow description. 2024-05-24 11:15:05 +02:00
8cff0d6f74 Commented out openia python module. 2024-05-24 10:54:15 +02:00
e278cde961 Added bottom level instrument metadata descriptions such as units and description. 2024-05-24 09:50:25 +02:00
1c39986503 Removed yaml file output from data integration file. The creation of this file is being outsourced to the data store repo. 2024-05-24 09:32:30 +02:00
292708e745 Removed creation of yaml file subsequent to data integration. This can cause misalignment with data store. I think the yaml snapshot of a hdf5 file should therefore be outsourced there. 2024-05-24 09:30:24 +02:00
c4f12eaa84 Made a few optimizations to code and documentation. Expressions relying on list comprehensions were simplified with generator expressions, e.g., any([keyword in filename for keyword in select_file_keywords]) was simplified to any(keyword in filename for keyword in select_file_keywords). 2024-05-24 09:06:07 +02:00
9c311342d8 Replaced attribute table_header in Lopap configuration file with a shorter version which is consistent across more files. Some of the headers might change. 2024-05-24 08:55:36 +02:00
67a52ab00a Optimized and included df to np structured array conversion: replaced loop plus append with list comprehension; replaced pd df column concatenation based on row-wise concatenation with the df.agg() method, which uses column-wise concatenation. 2024-05-23 22:20:19 +02:00
993db5d783 Optimized and included df to np structured array conversion: replaced loop plus append with list comprehension; replaced pd df column concatenation based on row-wise concatenation with the df.agg() method, which uses column-wise concatenation. 2024-05-23 22:18:37 +02:00
e4b9487575 Replaced commented lines by accurate comments 2024-05-22 20:15:17 +02:00
7c1c0bf33c Simplified code by updating HDF5 attributes using .update() dict method (inherited from dict type). 2024-05-22 20:11:54 +02:00
83de18989f Added feature to interpret links to descriptions in the yaml instrument configuration file and add them at the dataset level as attributes. 2024-05-09 19:17:08 +02:00
e8a13dba20 Added links to descriptions and units of table variables or columns. These can be used as attributes of datasets from tabular data. 2024-05-09 19:15:20 +02:00
6d1b7545e5 Included timestamp specification, which indicates column names in a list that contain datetime information. 2024-04-30 14:51:58 +02:00
67765d53f0 Incorporated feature to merge date and time data which may originally be in separate columns in text source files. This is specified in the text source specification yaml file. 2024-04-30 14:50:33 +02:00
fee72bbda6 Performed edits to README.md 2024-04-26 14:33:41 +02:00
faef7db666 Updated readme file with instructions on how to set compound attributes and delete them. 2024-04-26 14:27:01 +02:00
7441d63cd3 Removed unnecessary pygit dependency and associated function that relied on it. 2024-04-26 13:15:33 +02:00
b344a4045f Cleared out jupyter notebook. 2024-04-26 13:09:41 +02:00
9552bfead2 Included new delete attribute and restart review features. 2024-04-26 13:08:27 +02:00
94d717f9db Corrected parsing problem from hdf5 to yaml attribute. Single element arrays are now represented as a scalar as opposed to a list with a single element. 2024-04-26 12:54:41 +02:00