Commit Graph

201 Commits

Author SHA1 Message Date
florez_j 69b73c26b0 Corrected import statements due to dependency name changes 2024-10-17 16:52:42 +02:00
florez_j 7c60193aa6 Renamed module: src/hdf5_lib.py -> src/hdf5_writer.py 2024-10-17 10:53:51 +02:00
florez_j 44073e3816 Replaced read_dataset_from_hdf5file(hdf5_file_path, dataset_path) with HDF5DataOpsManager.extract_dataset_as_dataframe(self,dataset_name) 2024-10-17 10:46:19 +02:00
florez_j 2a9d69c757 Robustified metadata and dataset extraction methods by requiring explicit load of file obj before their use. Renamed a few functions and fixed types in print statements. 2024-10-10 11:28:23 +02:00
florez_j 6be3b31247 Renamed open_file() --> load_file_obj() and close_file() --> unload_file_obj() to focus more on the management operations on the files than actual file handling operations. 2024-10-10 10:47:44 +02:00
florez_j 568f747a69 Robustified metadata revision methods with error detection conditions and try-except statements. Metadata revision methods now do not have the capability of opening the file. 2024-10-10 10:39:10 +02:00
florez_j fe96134383 Fixed bug in HDF5DataOpsManager.append_dataset() and added 'creation_date' metadata attribute when instrument (groups) are created. 2024-10-09 16:06:44 +02:00
florez_j c321a17943 Fixed bug causing input_path normalization operation to damage Windows network drive paths. Basically, os.path.normpath(path_to_input_directory).strip(os.sep) replaced by os.path.normpath(path_to_input_directory).rstrip(os.sep) 2024-10-07 16:16:12 +02:00
florez_j 89e9dd9ab1 Fixed bugs in update_file() method and create_hdf5_file_from_filesystem_path() 2024-10-03 09:32:25 +02:00
florez_j 9b5d777a5b Added .update_file() method, which enables complementary data structure updates to existing file with same name as append_dir's head. 2024-10-02 14:38:35 +02:00
florez_j aad0a7c3fb Added file opening mode as input parameter. Now, mode can only take values in ['w','r+'] 2024-10-02 13:54:59 +02:00
florez_j 4420f81642 Removed construct_attributes_dict(attrs_obj) and replaced by {key: utils.to_serializable_dtype(val) for key, val in obj.attrs.items()} 2024-10-01 10:42:20 +02:00
florez_j 4d48e84e50 Made two helper functions private by adding the prefix __ 2024-10-01 09:31:41 +02:00
florez_j 8cd4b7d925 Deleted annotate_root_dir(filename,annotation_dict: dict), and outsourced functionality to HDF5DataOpsManager.append_metadata() or .update_metadata() at obj_name = '/' 2024-10-01 09:19:14 +02:00
florez_j 6f5d4adcee Implemented metadata append, rename, delete, and update operations on the hdf5 manager object and refactored metadata update script based on yaml file to use said operations. 2024-09-30 16:32:39 +02:00
florez_j 96dad0bfb1 Renamed to_yaml() as serialize_metadata() and introduce input parameter output_format, which allows yaml or json. 2024-09-26 16:23:09 +02:00
florez_j 85b0e5ab74 Performed a few function relocations and deletions from src/hdf5_lib.py into src/hdf5_ops.py and made a copy of previous version as src/hdf5_lib_part2.py 2024-09-26 15:13:31 +02:00
florez_j a92660049f Moved is_structured_array() and to_serializable_dtype() to utils, renamed a few functions and propagated changes to dependent modules. 2024-09-26 14:03:11 +02:00
florez_j a57e46d89c Renamed take_yml_snapshot_of_hdf5_file func as to_yaml func 2024-09-25 16:49:44 +02:00
florez_j 7b221599d8 Moved take_yml_snapshot_of_hdf5_file func and associated helper functions from hdf5_vis.py into hdf5_ops.py 2024-09-25 16:42:44 +02:00
florez_j 1e93a2c552 Moved take_yml_snapshot_of_hdf5_file func and associated helper functions from hdf5_vis.py into hdf5_ops.py 2024-09-25 16:40:16 +02:00
florez_j 10554fc41e Renamed make_dtype_yaml_compatible func as to_serializable_dtype func 2024-09-25 16:36:50 +02:00
florez_j 1e1499c28a Robustified definition of path_to_input_dir arg or parameter by ensuring it is always defined using forward slashes and is then normalized to the os specification. Improved dry run = True of copy directory func. 2024-09-25 15:12:19 +02:00
florez_j 9eeb9d6380 Moved src/metadata_review_lib.py -> pipelines/metadata_revision.py 2024-09-17 16:55:22 +02:00
florez_j 07401c895f Moved src/data_integration_lib.py -> pipelines/data_integration.py 2024-09-17 15:32:23 +02:00
florez_j 2dd033bcb3 Refactored code into functions to parse and validate yaml config file and to perform specified data integration task using a pipeline-like software structure. 2024-09-17 15:28:11 +02:00
florez_j d63f522588 Major update. Removed file filtering option and outputname input arg. The output name is now the same as the path_to_input_dir + .h5. By default, the hdf5 writer preserves second level subdirectories and the rest are flattened. Dir filtering is outsourced to copy_dir_with_constraints from utils. 2024-09-16 16:35:09 +02:00
florez_j 7a9f7a8c59 Renamed parameter 'input_file_system_path' to 'path_to_input_directory' for clarity. 2024-09-16 14:24:55 +02:00
florez_j cc0adfca62 Fixed import statement 2024-09-13 15:11:25 +02:00
florez_j 4974246522 move def get_parent_child_relationships(file: h5py.File) from ..._vis.py to ..._ops.py 2024-09-13 14:59:11 +02:00
florez_j b42482069c src/hdf5_data_extraction.py -> src/hdf5_ops.py 2024-09-13 14:55:12 +02:00
florez_j e8e2473ebe Added new method to retrieve metadata from h5file at a given obj path 2024-09-13 14:52:07 +02:00
florez_j 96a2e96b6a Fixed import statement after module's relocation 2024-08-23 16:23:57 +02:00
florez_j e4b04b4484 Modified to use filereader_registry.py. 2024-08-23 16:10:23 +02:00
florez_j d985115125 Integrated copy h5 file into group functionality, imported from g5505_file_reader 2024-08-23 15:47:04 +02:00
florez_j 18165eca1a Modified import statements to account for reader module's relocation. 2024-08-23 13:27:26 +02:00
florez_j a0f44a1f4b Moved src/g5505_file_reader.py -> instruments/readers/g5505_file_reader.py to increase modularity with respect to new instrument additions. 2024-08-23 10:11:29 +02:00
florez_j 1112a214e9 Moved src/g5505_utils.py to utils/g5505_utils.py 2024-08-23 07:27:39 +02:00
florez_j d7fc38abd9 Moved get_parent_relationships func into hdf5_vis.py and cleaned up unused import statements 2024-08-22 09:50:26 +02:00
florez_j 05d1133e32 Moved get_parent_child_relationships() function from hdf5_lib.py into hdf5_vis.py to avoid circular dependency between the lower level and higher level module. Thus also removed the src.hdf5_lib.py import statement. 2024-08-22 09:47:57 +02:00
florez_j d7c7808400 Implemented method for appending new attributes to a specific object. 2024-08-16 09:32:58 +02:00
florez_j bb250e9940 Implemented method to reformat a given column in a datatable holding datetime info into a desired datetime format. During data integration this will serve to normalize datetime formats across data tables 2024-08-16 08:08:28 +02:00
florez_j 062a688f47 Added method to reformat columns containing datetime byte strings into a desired datetime formatted object 2024-08-14 16:22:28 +02:00
florez_j c876e925a7 Modified code to point to new instrument folders location. Also, upgraded code to accept either a user specified location or the default location 2024-08-12 13:40:01 +02:00
florez_j 7f0e5384ea Moved instruments folder outside src/. 2024-08-12 10:09:21 +02:00
florez_j 18aba8d0d3 Implemented dataset append method in HDF5DatOpsAPI 2024-08-09 15:25:09 +02:00
florez_j 5fe7fc4b70 Developed a class to manage data operations on a given hdf5 file 2024-08-09 13:23:54 +02:00
florez_j 8f7f14ab68 Removed time stamp configuration attributes from ACSM_TOFWARE, because it can be messy for a configuration file. 2024-08-08 11:24:41 +02:00
florez_j 74db800e01 Updated file with new instrument configuration ACSM. 2024-08-07 16:38:52 +02:00
florez_j ae1e3bfc23 Moved ext_to_reader_dict to g5505_file_reader.py and replaced reader selection based on g5505_reader.select_file_reader(hdf5_file_path). 2024-08-07 16:30:36 +02:00
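
Note on commit c321a17943: the sketch below illustrates why replacing .strip(os.sep) with .rstrip(os.sep) matters for Windows network drive (UNC) paths. It is not the repository's code. The function names normalize_buggy and normalize_fixed and the example path are hypothetical, and ntpath is used in place of os.path so that the Windows behaviour can be reproduced on any platform.

```python
import ntpath  # Windows flavour of os.path, importable on every platform


def normalize_buggy(path_to_input_directory: str) -> str:
    # strip() removes separators from BOTH ends, so the leading "\\"
    # that marks a UNC network path is lost.
    return ntpath.normpath(path_to_input_directory).strip(ntpath.sep)


def normalize_fixed(path_to_input_directory: str) -> str:
    # rstrip() only removes trailing separators; the UNC prefix survives.
    return ntpath.normpath(path_to_input_directory).rstrip(ntpath.sep)


# Hypothetical network path with a trailing separator.
unc_path = r"\\server\share\data" + "\\"

print(normalize_buggy(unc_path))  # no longer a network path
print(normalize_fixed(unc_path))  # leading "\\" preserved
```

With the hypothetical path above, the first call drops the two leading backslashes and the result would be read as a relative path, while the second call returns the network path unchanged apart from the trailing separator.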