Commit Graph

36 Commits

Author SHA1 Message Date
77386432f8 Merge branch 'main' of https://gitlab.psi.ch/5505/dima 2024-07-02 16:50:08 +02:00
177a5aa2a1 Updated documentation. 2024-07-02 16:49:48 +02:00
c074e45892 Renamed script_name to processing_file. 2024-07-01 16:17:25 +02:00
106795ae59 Added a few lines to detect the existence of the file and change the file mode from 'w' to 'a' based on that information. 2024-06-20 09:03:47 +02:00
498a51cbc6 Updated function to add project level metadata at the root group of the hdf5 file. 2024-06-19 18:31:11 +02:00
04558e7785 Added code to parse dict attributes. 2024-06-18 14:42:51 +02:00
a6868d985d Fixed bug regarding datetime to str column conversion in dataframe by using .map(str) (element-wise operation) as opposed to .apply(str) 2024-06-18 09:21:46 +02:00
b66dc11a62 Replaced applymap with .apply because the former is being deprecated 2024-06-17 13:47:54 +02:00
ed1641af55 Created function to save dataframes with annotations in hdf5 format 2024-06-17 13:36:05 +02:00
9ab9aa49c4 Abstracted a code snippet from def create_hdf5_file_from_filesystem_path(..) as transfer_file_dict_to_hdf5() so that it can be reusable. 2024-06-13 15:44:01 +02:00
1054367f12 Modified annotate_root_dir function. 2024-06-02 16:02:48 +02:00
a86fc97605 Refactored due to updates in the file reader function. 2024-05-28 14:41:34 +02:00
41c7660be3 Enhanced data transfer progress visualization and logging 2024-05-28 08:59:29 +02:00
2911416431 Improved modularity of hdf5_file creation by creating a function that copies the input directory and applies directory, file, and extension constraints before the regular directory-to-hdf5 transfer. See def copy_directory_with_contraints(input_dir_path, output_dir_path, select_dir_keywords, select_file_keywords, allowed_file_extensions): 2024-05-27 18:15:08 +02:00
88de88c316 Removed creation of yaml file subsequent to data integration. This can cause misalignment with data store. I think the yaml snapshot of a hdf5 file should therefore be outsourced there. 2024-05-24 09:30:24 +02:00
1537633b1a Made a few optimizations to code and documentation. Expressions relying on list comprehensions were simplified with generator expressions, e.g., any([keyword in filename for keyword in select_file_keywords]) was simplified to any(keyword in filename for keyword in select_file_keywords). 2024-05-24 09:06:07 +02:00
a45fb4476b Replaced commented lines by accurate comments 2024-05-22 20:15:17 +02:00
7367da84b9 Simplified code by updating HDF5 attributes using .update() dict method (inherited from dict type). 2024-05-22 20:11:54 +02:00
be02ad01ed Removed problematic lines, which depended on soon to be removed dependency config_file.py 2024-04-24 17:14:13 +02:00
ceb8a34ee0 Commented out unneeded python import statements 2024-04-23 13:23:13 +02:00
d3ec0bd473 Included additional directory path validation based on dir keywords 2024-04-23 11:05:20 +02:00
074d2e3954 Removed config_file output file naming and instead user now inputs desired output filename. Also added input argument to introduce root level metadata. 2024-04-18 19:14:06 +02:00
a1c88fdb5a Added lines to flatten (shorten) original directory paths in the resulting hdf5 file. 2024-04-17 15:20:26 +02:00
f9b31c06fd Reimplemented file filtering: file extension constraints are imposed first, then file keyword constraints. 2024-04-03 13:49:16 +02:00
9cde013be0 Modified node values as the number of children of each group. When nodes are datasets, their value is 1. 2024-04-02 18:48:50 +02:00
39cae66936 Implemented two important changes. 1) The filename of the output file is not passed as input; it is automatically computed based on an input config_param dict. 2) Input filenames in the file system path are now filtered on an initial walk through the directory tree. The stored path filenames are used later for pruning the directory tree. 2024-04-02 17:33:58 +02:00
a58bf4f019 Refactored import dependencies. 2024-03-26 13:57:19 +01:00
1bf1f60beb Added lines to treat string attributes as fixed-length strings, which are represented as bytes that need to be decoded with utf-8. There are a few advantages, and hdf5 readers provide more precise behavior than with variable-length strings 2024-03-22 17:28:47 +01:00
e389ffbefe Relocated def display_group_hierarchy_on_a_treemap(filename: str) to hdf5_vis.py 2024-03-21 16:27:54 +01:00
b886066133 Simplified code and corrected buggy if statement. Included input verification steps and OS path normalization. 2024-03-19 11:11:05 +01:00
79b7428b9f Cleaned up code by removing commented lines and so on. 2024-02-21 10:47:12 +01:00
1a4294e0c2 Modified to receive a unified dictionary structure and transform it into an equivalent structure of groups, datasets, and attributes. 2024-02-16 16:52:21 +01:00
e7bdee21da Refactored to interact with config_file.py, which sets available file readers 2024-02-15 15:59:42 +01:00
337a1947fe Reverted a few minor refactoring changes. 2024-02-15 10:10:10 +01:00
8ba9547895 Refactored hdf5_lib.py due to previous file move. 2024-02-15 09:59:36 +01:00
dfdfea2b71 Created src folder and transferred into it previously deleted python scripts. 2024-02-15 09:52:15 +01:00
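
Note on f9b31c06fd and 1537633b1a: the two-stage file filtering described in those commits (extension constraints first, then keyword constraints, with any() over a generator expression) might look roughly like the sketch below. The parameter names select_file_keywords and allowed_file_extensions come from the commit messages; the function name filter_filenames, the sample filenames, and the rule that an empty keyword list keeps every file are assumptions made for illustration, not the repository's actual code.

```python
import os


def filter_filenames(filenames, allowed_file_extensions, select_file_keywords):
    """Hypothetical helper: apply extension constraints, then keyword constraints."""
    # Stage 1: keep only files whose extension is in the allowed set.
    by_extension = [
        name for name in filenames
        if os.path.splitext(name)[1] in allowed_file_extensions
    ]
    # Assumption: an empty keyword list means "no keyword constraint".
    if not select_file_keywords:
        return by_extension
    # Stage 2: keep files containing at least one keyword.
    # any() consumes a generator expression, so no intermediate list is built
    # and evaluation stops at the first matching keyword.
    return [
        name for name in by_extension
        if any(keyword in name for keyword in select_file_keywords)
    ]


# Made-up filenames for illustration only.
files = ["smps_2024.txt", "smps_2024.png", "gas_2024.txt", "notes.md"]
selected = filter_filenames(files, {".txt", ".md"}, ["smps", "notes"])
print(selected)
```

Filtering by extension first means the keyword check only runs on files that could be read at all, which keeps the second pass small on large directory trees.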