473 Commits

Author SHA1 Message Date
6701bc06ad Added read_mtable_as_dataframe(filename) back so that jupyter notebook can use it to demonstrate some functionality 2024-11-23 16:31:29 +01:00
fd92bce802 Implemented a sanitize dataframe function to handle columns of object ('O') dtype, which may contain numbers or strings detected as string types. It is applied before converting the dataframe into a structured numpy array. 2024-11-23 16:28:49 +01:00
8ab2cb3bdb Moved to notebooks/ 2024-11-23 12:29:55 +01:00
33ad4b8509 Moved to notebooks/ 2024-11-23 12:16:13 +01:00
c30bdab41a Moved to notebooks/ to improve repo organization 2024-11-23 12:06:27 +01:00
3535fd0cc2 Moved data integration ipynb to notebooks folder to improve readability 2024-11-23 11:24:28 +01:00
e486b4659c Added .pkl extension in the list of admissible file extensions 2024-11-21 11:47:41 +01:00
d13e10e44f Modified logger setup to create monthly logs 2024-11-21 11:46:11 +01:00
4632554af1 Added a logs/ and envs/ folder to gitignore. 2024-11-21 11:44:38 +01:00
1be4b8493a Improved progress description stdout 2024-11-10 18:21:00 +01:00
ca2c98eebc Fixed command line interface bug 2024-11-10 18:19:59 +01:00
8d17bf267c Major code refactoring and simplifications to enhance modularity. Included a command line interface. 2024-11-01 09:52:41 +01:00
510683a50d Renamed the input argument yaml_review_file as review_yaml_file. 2024-11-01 09:51:12 +01:00
e2fec03d4a Included cli commands update and serialize to simplify running metadata revision pipeline. 2024-10-29 07:56:43 +01:00
3f7a089a28 Fixed bug: to_serializable_dtype() did not correctly identify the dtype of entries in arrays with object dtype 2024-10-28 18:49:22 +01:00
74633adf7f Removed unused import statements 2024-10-28 16:37:32 +01:00
cc96672245 Moved git related operations from pipelines/ to src/git_ops.py 2024-10-28 16:30:34 +01:00
15b0ff3cc4 Added function to validate review yaml file, and updated update_hdf5_with_review function 2024-10-28 16:20:28 +01:00
69b73c26b0 Corrected import statements due to dependency name changes 2024-10-17 16:52:42 +02:00
7c60193aa6 Renamed module: src/hdf5_lib.py -> src/hdf5_writer.py 2024-10-17 10:53:51 +02:00
44073e3816 Replaced read_dataset_from_hdf5file(hdf5_file_path, dataset_path) with HDF5DataOpsManager.extract_dataset_as_dataframe(self,dataset_name) 2024-10-17 10:46:19 +02:00
f1b2c64f66 Fixed bug when file reader is not available. File reader registry now returns a reader that maps input to None. 2024-10-14 16:03:03 +02:00
2a330fcf92 Added 'filename_format' attribute to YAML schema. It takes as value a string of comma separated keys from available attributes in YAML file. 2024-10-14 16:01:24 +02:00
1954542031 Fixed bug introduced in logger due to invalid date naming: replaced ':' with '-' 2024-10-10 14:29:36 +02:00
ea82af2cd5 Cleaned up import statements and commented out path append operations 2024-10-10 14:27:50 +02:00
7d94ce29dd Attempt to initialize dima/utils as a module 2024-10-10 11:53:27 +02:00
1c2588d85f Attempt to initialize dima as a module 2024-10-10 11:43:02 +02:00
2a9d69c757 Robustified metadata and dataset extraction methods by requiring explicit load of file obj before their use. Renamed a few functions and fixed types in print statements. 2024-10-10 11:28:23 +02:00
7653e982a4 Updated function dependencies to reflect changes made to hdf5_ops.py 2024-10-10 11:02:05 +02:00
6be3b31247 Renamed open_file() --> load_file_obj() and close_file() --> unload_file_obj() to focus more on the management operations on the files than on actual file handling operations. 2024-10-10 10:47:44 +02:00
568f747a69 Robustified metadata revision methods with error detection conditions and try-except statements. Metadata revision methods can no longer open the file themselves. 2024-10-10 10:39:10 +02:00
31c9db98ca Changed datetime format output of created_at() function as '%Y-%m-%d %H:%M:%S.%f' 2024-10-09 16:07:40 +02:00
fe96134383 Fixed bug in HDF5DataOpsManager.append_dataset() and added 'creation_date' metadata attribute when instrument (groups) are created. 2024-10-09 16:06:44 +02:00
7c683f96a1 Merge branch 'main' of https://gitlab.psi.ch/5505/dima 2024-10-07 16:19:10 +02:00
9a3bf77f37 Created file reader for acsm tofware files, updated registry and updated yaml file with instrument specific terms and reader config params. 2024-10-07 16:18:14 +02:00
c321a17943 Fixed bug that caused the input_path normalization operation to damage Windows network drive paths: os.path.normpath(path_to_input_directory).strip(os.sep) was replaced by os.path.normpath(path_to_input_directory).rstrip(os.sep) 2024-10-07 16:16:12 +02:00
dc7f156367 Updated README.md with guide for instrument dependent file reader extensions and updated TODO.md with pending tasks. 2024-10-03 11:31:51 +02:00
89e9dd9ab1 Fixed bugs in update_file() method and create_hdf5_file_from_filesystem_path() 2024-10-03 09:32:25 +02:00
098a79531c Added new instrument (flagging app) file reading capabilities. It includes two files: flag_reader.py, which converts flag.json files produced by the app into a standard intermediate representation, and a yaml file with instrument dependent description terms. Lastly, we modified filereader_registry.py to find the new instrument file reader. 2024-10-03 09:07:06 +02:00
01b39b4c02 Added __init__.py inside instrument folders 2024-10-02 15:51:02 +02:00
9b5d777a5b Added .update_file() method, which enables complementary data structure updates to existing file with same name as append_dir's head. 2024-10-02 14:38:35 +02:00
aad0a7c3fb Added file opening mode as input parameter. Now, mode can only take values in ['w','r+'] 2024-10-02 13:54:59 +02:00
4420f81642 Removed construct_attributes_dict(attrs_obj) and replaced by {key: utils.to_serializable_dtype(val) for key, val in obj.attrs.items()} 2024-10-01 10:42:20 +02:00
4d48e84e50 Made two helper functions private by adding the prefix __ 2024-10-01 09:31:41 +02:00
8cd4b7d925 Deleted annotate_root_dir(filename,annotation_dict: dict), and outsourced functionality to HDF5DataOpsManager.append_metadata() or .update_metadata() at obj_name = '/' 2024-10-01 09:19:14 +02:00
6f5d4adcee Implemented metadata append, rename, delete, and update operations on the hdf5 manager object and refactored metadata update script based on yaml file to use said operations. 2024-09-30 16:32:39 +02:00
afe31288a0 Refactored a few function calls due to renaming changes in utils module 2024-09-27 08:58:35 +02:00
96dad0bfb1 Renamed to_yaml() as serialize_metadata() and introduced input parameter output_format, which allows yaml or json. 2024-09-26 16:23:09 +02:00
85b0e5ab74 Performed a few function relocations and deletions from src/hdf5_lib.py into src/hdf5_ops.py and made a copy of previous version as src/hdf5_lib_part2.py 2024-09-26 15:13:31 +02:00
a92660049f Moved is_structured_array() and to_serializable_dtype() to utils, renamed a few functions and propagated changes to dependent modules. 2024-09-26 14:03:11 +02:00
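
Note on commit c321a17943: the strip() to rstrip() change can be illustrated with a minimal sketch. It assumes a UNC-style network path and uses Python's ntpath module so that Windows path rules apply on any operating system. The function names and the example path are hypothetical and are not taken from the repository.

```python
import ntpath  # Windows flavour of os.path, importable on every platform


def normalize_with_strip(path: str) -> str:
    # Behaviour before the fix: strip() removes separators from BOTH ends,
    # so the leading "\\" that marks a network (UNC) path is lost.
    return ntpath.normpath(path).strip(ntpath.sep)


def normalize_with_rstrip(path: str) -> str:
    # Behaviour after the fix: rstrip() removes trailing separators only,
    # so the UNC prefix survives.
    return ntpath.normpath(path).rstrip(ntpath.sep)


# Hypothetical network drive path with a trailing separator.
unc_path = r"\\server\share\data" + "\\"

print(normalize_with_strip(unc_path))   # server\share\data
print(normalize_with_rstrip(unc_path))  # \\server\share\data
```

With strip(), the result no longer refers to the network share, which matches the damage the commit message describes. With rstrip(), only the trailing separator is removed.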