Commit Graph

348 Commits

Author SHA1 Message Date
2dbd255589 Created file reader for acsm tofware files, updated registry and updated yaml file with instrument specific terms and reader config params. 2024-10-07 16:18:14 +02:00
bd4ced00ba Fixed bug that caused the input_path normalization to damage Windows network drive paths. Specifically, os.path.normpath(path_to_input_directory).strip(os.sep) was replaced by os.path.normpath(path_to_input_directory).rstrip(os.sep) 2024-10-07 16:16:12 +02:00
c103268102 Fixed bugs in update_file() method and create_hdf5_file_from_filesystem_path() 2024-10-03 09:32:25 +02:00
2920be624a Added new instrument (flagging app) file reading capabilities. It includes two files: a flag_reader.py that converts flag.json files produced by the app into a standard intermediate representation, and a yaml file with instrument dependent description terms. Last, we modified the filereader_registry.py to find the new instrument file reader. 2024-10-03 09:07:06 +02:00
bac6f5d773 Added __init__.py inside instrument folders 2024-10-02 15:51:02 +02:00
d49f511dbd Added .update_file() method, which enables complementary data structure updates to existing file with same name as append_dir's head. 2024-10-02 14:38:35 +02:00
ea898ca3c5 Added file opening mode as input parameter. Now, mode can only take values in ['w','r+'] 2024-10-02 13:54:59 +02:00
4f0361c6c5 Removed construct_attributes_dict(attrs_obj) and replaced by {key: utils.to_serializable_dtype(val) for key, val in obj.attrs.items()} 2024-10-01 10:42:20 +02:00
1b0c666132 Made two helper functions private by adding the prefix __ 2024-10-01 09:31:41 +02:00
14a1d032b9 Deleted annotate_root_dir(filename,annotation_dict: dict), and outsourced functionality to HDF5DataOpsManager.append_metadata() or .update_metadata() at obj_name = '/' 2024-10-01 09:19:14 +02:00
96500063fb Implemented metadata append, rename, delete, and update operations on the hdf5 manager object and refactored metadata update script based on yaml file to use said operations. 2024-09-30 16:32:39 +02:00
db6fcd03da Refactored a few function calls due to renaming changes in utils module 2024-09-27 08:58:35 +02:00
c992662a1f Renamed to_yaml() as serialize_metadata() and introduce input parameter output_format, which allows yaml or json. 2024-09-26 16:23:09 +02:00
02a7c4d834 Performed a few function relocations and deletions from src/hdf5_lib.py into src/hdf5_ops.py and made a copy of previous version as src/hdf5_lib_part2.py 2024-09-26 15:13:31 +02:00
8f9e2fc594 Moved is_structured_array() and to_serializable_dtype() to utils, renamed a few functions and propagated changes to dependent modules. 2024-09-26 14:03:11 +02:00
7ab615019a Renamed take_yml_snapshot_of_hdf5_file func as to_yaml func 2024-09-25 16:49:44 +02:00
57d49a8db0 Moved take_yml_snapshot_of_hdf5_file func and associated helper functions from hdf5_vis.py into hdf5_ops.py 2024-09-25 16:42:44 +02:00
7304655ba5 Moved take_yml_snapshot_of_hdf5_file func and associated helper functions from hdf5_vis.py into hdf5_ops.py 2024-09-25 16:40:16 +02:00
32ba2a13cd Renamed make_dtype_yaml_compatible func as to_serializable_dtype func 2024-09-25 16:36:50 +02:00
3e143fb9c7 Abstracted reusable steps in integration_sources as dima_pipeline.py and added functionality to make a collection of hdf5 files, where each represents a single experiment of a campaign. 2024-09-25 15:23:23 +02:00
dd8fc1a906 Made the definition of the path_to_input_dir arg or parameter more robust by ensuring it is always defined using forward slashes and is then normalized to the os specification. Improved dry run = True of copy directory func. 2024-09-25 15:12:19 +02:00
90d43a46f8 Fixed instrument_dir estimation to be bottom up, ie, based on path to file. Otherwise, it does not work when dima used as submodule 2024-09-19 15:47:11 +02:00
0e354a0f14 Moved src/metadata_review_lib.py -> pipelines/metadata_revision.py 2024-09-17 16:55:22 +02:00
de859102ab Moved src/data_integration_lib.py -> pipelines/data_integration.py 2024-09-17 15:32:23 +02:00
59861c3aa8 Refactored code into functions to parse and validate yaml config file and to perform specified data integration task using a pipeline-like software structure. 2024-09-17 15:28:11 +02:00
7b3b453db1 Major update. Removed file filtering option and outputname input arg. The output name is now the same as the path_to_input_dir + .h5. By default, the hdf5 writer preserves second level subdirectories and the rest are flattened. Dir filtering is outsourced to copy_dir_with_constraints from utils. 2024-09-16 16:35:09 +02:00
eec38f61d7 Restructured a bit to include the default case of copying an input directory without any constraints. Also, added dry_run input argument that returns a path to files dict representation of output directory without making an actual copy. Useful when input directory is already safe to work with directly 2024-09-16 15:38:30 +02:00
6d91c043f8 Renamed parameter 'input_file_system_path' to 'path_to_input_directory' for clarity. 2024-09-16 14:24:55 +02:00
85b4909713 Fixed import statement 2024-09-13 15:11:25 +02:00
0f913e5002 move def get_parent_child_relationships(file: h5py.File) from ..._vis.py to ..._ops.py 2024-09-13 14:59:11 +02:00
4813359a4f src/hdf5_data_extraction.py -> src/hdf5_ops.py 2024-09-13 14:55:12 +02:00
4525c1ba04 Added new method to retrieve metadata from h5file at a given obj path 2024-09-13 14:52:07 +02:00
3e1a46ebc7 Fixed import statement after module's relocation 2024-08-23 16:23:57 +02:00
926dc9208a Modified to use filereader_registry.py. 2024-08-23 16:10:23 +02:00
1e0da55abc Removed and split into instruments/readers/filereader_registry.py, instruments/readers/g5505_text_reader.py and instruments/readers/xps_ibw_reader.py 2024-08-23 16:09:04 +02:00
0a58e86bcb Split instruments/readers/g5505_file_reader.py into a fileregistry.py and independent file readers. This is to improve instrument modularity and additions 2024-08-23 16:06:44 +02:00
cfae414b0e Renamed to better reflect the functionality of the file 2024-08-23 15:50:14 +02:00
b499ef2845 Integrated copy h5 file into group functionality, imported from g5505_file_reader 2024-08-23 15:47:04 +02:00
7ad4e686a7 Moved copy_file_in_group() into hdf5_lib.py because it does not really play the same role as the file readers 2024-08-23 15:45:32 +02:00
33ad9acdd4 Moved all yaml files with dictionary terms for each instrument to dictionaries folder 2024-08-23 14:32:23 +02:00
f20e02d62f Added ACSM_TOFWARE metadata descriptions 2024-08-23 14:23:32 +02:00
17dd1f1864 Modified import statements to account for reader module's relocation. 2024-08-23 13:27:26 +02:00
a33e2b681f Fixed a few import dependencies after relocating this file. 2024-08-23 10:57:13 +02:00
e76ed79f1e Moved src/g5505_file_reader.py -> instruments/readers/g5505_file_reader.py to increase modularity with respect to new instrument additions. 2024-08-23 10:11:29 +02:00
b22b0e94e4 Moved src/g5505_utils.py to utils/g5505_utils.py 2024-08-23 07:27:39 +02:00
9d917226af Moved get_parent_relationships func into hdf5_vis.py and cleaned up unused import statements 2024-08-22 09:50:26 +02:00
da6cca1632 Moved get_parent_child_relationships() func from hdf5_lib.py into hdf5_vis.py to avoid circular dependency between the lower level and higher level module. Thus also removed the src.hdf5_lib.py import statement. 2024-08-22 09:47:57 +02:00
d6dce9a392 Implemented method for appending new attributes to a specific object. 2024-08-16 09:32:58 +02:00
ea13f2b71b Implemented method to reformat a given column in a datatable holding datetime info into a desired datetime format. During data integration this will serve to normalize datetime formats across data tables 2024-08-16 08:08:28 +02:00
3fc96a89d2 Added method to reformat columns containing datetime byte strings into a desired datetime formatted object 2024-08-14 16:22:28 +02:00
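
Note on commit bd4ced00ba: the sketch below illustrates why swapping .strip(os.sep) for .rstrip(os.sep) matters for Windows network drive paths. It is a minimal illustration, not the repository's code. It uses ntpath (the Windows flavour of os.path) so that it behaves the same on any operating system, and the function names and the example UNC path are hypothetical.

```python
import ntpath  # Windows path rules; importable on any OS

SEP = ntpath.sep  # "\\", stands in for os.sep on Windows


def normalize_before_fix(path_to_input_directory: str) -> str:
    # strip() trims separators from BOTH ends, so the leading "\\"
    # that marks a network (UNC) path is removed as well.
    return ntpath.normpath(path_to_input_directory).strip(SEP)


def normalize_after_fix(path_to_input_directory: str) -> str:
    # rstrip() only trims trailing separators and keeps the UNC prefix.
    return ntpath.normpath(path_to_input_directory).rstrip(SEP)


# Hypothetical network drive path with a trailing separator.
unc_path = "\\\\server\\share\\campaign\\"

print(normalize_before_fix(unc_path))  # server\share\campaign
print(normalize_after_fix(unc_path))   # \\server\share\campaign
```

With the old call the result no longer points at the network share, since the leading double backslash is gone; with rstrip the prefix survives.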