Commit Graph

95 Commits

Author SHA1 Message Date
florez_j 9cde013be0 Modified node values as the number of children of each group. When nodes are datasets, their value is 1. 2024-04-02 18:48:50 +02:00
florez_j 9071120e50 Refactored code to read .dat and .txt files in binary mode ('rb') first; the prespecified encoding is then used to decode the lines. This gives more control over the decoding process and makes possible encoding errors easier to spot. 2024-04-02 18:35:04 +02:00
florez_j f351f102b7 Commented out a print statement. 2024-04-02 18:31:58 +02:00
florez_j 39cae66936 Implemented two important changes. 1) The filename of the output file is no longer passed as input; it is computed automatically from an input config_param dict. 2) Input filenames in the file system path are now filtered on an initial walk through the directory tree, so that the stored path filenames can be used later on for pruning the directory tree. 2024-04-02 17:33:58 +02:00
florez_j 9c70fd643f Refactored code in terms of subprocess for git functionality. 2024-03-28 19:38:12 +01:00
florez_j 942485ffc1 Modified code to select usecases based on integer number. 2024-03-28 18:24:44 +01:00
florez_j 2b568ff05a Implemented jupyter notebook to run data integration workflow. Tested all usecases defined in config. So far so good. 2024-03-28 18:22:40 +01:00
florez_j 6fb5253d21 Corrected a few bugs: deleted a useless buggy line and configured the text reader with latin-1 encoding for a few cases. 2024-03-28 18:20:57 +01:00
florez_j bbff419313 Removed strange bug when reading .TXT smps files. Specified latin-1 encoding and relaxed error detection to ignore. 2024-03-28 17:43:26 +01:00
florez_j 06429e6def Generalized workflow functions to consider reviewer attributes such as initials and type (e.g., data-owner and metadata-reviewer). 2024-03-28 16:11:01 +01:00
florez_j 37fd603943 Completed first version of metadata_review_lib.py. Still need to test and correct possible bugs. 2024-03-28 13:59:47 +01:00
florez_j f0af30f7e8 Deleted metadata_review_workflow.py and turned it into a jupyter notebook. 2024-03-28 13:10:42 +01:00
florez_j 438ac4d24d Included lines for setting up author and committer in pygit2 commit function 2024-03-27 14:24:33 +01:00
florez_j 54e30ef9ec Implemented git add and commit for second metadata review step, and created a function to check out branches. 2024-03-27 14:23:16 +01:00
florez_j 6aa98b71b3 Implemented git add and commit for second metadata review step, and created a function to check out branches. 2024-03-27 14:22:25 +01:00
florez_j 56010f58ad Submitted metadata review. 2024-03-27 13:55:39 +01:00
florez_j 819474f678 Initialized metadata review process. 2024-03-27 13:55:39 +01:00
florez_j 383c5377c1 Initialized metadata review process. 2024-03-27 13:51:17 +01:00
florez_j 270825e9dc Initialized metadata review process. 2024-03-27 11:35:53 +01:00
florez_j 06283e286f Initialized metadata review process. 2024-03-26 17:22:42 +01:00
florez_j 2aac145379 Removed buggy statement, which was expected to detect recently created review files 2024-03-26 16:34:38 +01:00
florez_j 1a89e1af66 Implemented script to run metadata review workflow 2024-03-26 16:25:44 +01:00
florez_j 302b7dbfa5 Implemented metadata review library 2024-03-26 16:21:02 +01:00
florez_j 1f2bb419fe Save commit 2024-03-26 16:20:04 +01:00
florez_j f37ba4705a Included .h5 files for now, but they should be enabled later on through Git LFS. 2024-03-26 16:18:48 +01:00
florez_j a727e38db4 Implemented hdf5_vis.py, an hdf5 visualization library to obtain treemap and yaml representations of hdf5 files. 2024-03-26 16:14:40 +01:00
florez_j a58bf4f019 Refactored import dependencies. 2024-03-26 13:57:19 +01:00
florez_j e934ae65d6 Relocated from src/ 2024-03-25 08:52:13 +01:00
florez_j 1b9963d44d Moved to input_files/ 2024-03-25 08:51:34 +01:00
florez_j 1bf1f60beb Added lines to treat string attributes as fixed-length strings, which are represented as bytes that need to be decoded with utf-8. There are a few advantages, and the hdf5 reader provides more precise behavior than with variable-length strings. 2024-03-22 17:28:47 +01:00
florez_j 13cb6395aa Restructured the way the table_preamble attribute is represented. Now it is a list of strings as opposed to a multiline string with special characters like \n. This is to avoid parsing problems in the yaml files. 2024-03-22 17:26:30 +01:00
florez_j fff935f551 Included optional argument in make_copy function and commented out a few lines that increase dataset storage complexity. 2024-03-21 17:16:14 +01:00
florez_j 4244e39232 Implemented hdf5_vis.py to gather functions that display or represent properties of an hdf5 file in a human-readable file format, such as yaml or html files that enable interactive visualizations in the browser. 2024-03-21 16:30:27 +01:00
florez_j e389ffbefe Relocated def display_group_hierarchy_on_a_treemap(filename: str) to hdf5_vis.py 2024-03-21 16:27:54 +01:00
florez_j 8004a891aa Included lines to work on copies of files, and removed .strip() to create the table preamble because it destroyed txt structure. 2024-03-19 14:55:49 +01:00
florez_j 63e7fb28d0 Removed the 'backup_' name from the copied file so that the original name is preserved in the hdf5 file. The file copy storage location is enough to distinguish it from the original file. 2024-03-19 14:07:37 +01:00
florez_j 2d8503ef7a Included make_file_copy function 2024-03-19 11:59:39 +01:00
florez_j 2271be6ecc Moved make_file_copy internal function to g5505_utils.py module because it can be reused across file readers. 2024-03-19 11:58:08 +01:00
florez_j 7fe254755f Replaced the attributes previously extracted from the table_preamble in .txt and .dat files with a single dataset attribute called table_preamble that contains the whole table preamble. 2024-03-19 11:40:35 +01:00
florez_j b886066133 Simplified code and corrected buggy if statement. Included input verification steps and OS path normalization. 2024-03-19 11:11:05 +01:00
florez_j afa89df143 Relocated scripts 2024-03-18 13:44:11 +01:00
florez_j 23d0923c93 Moved to src/ folder. 2024-03-18 13:42:30 +01:00
florez_j 07a532b20b Implemented demos to illustrate functionalities of the openbis_lib.py module. 2024-02-21 16:02:09 +01:00
florez_j bf2c28f843 This file was moved to the src folder. 2024-02-21 15:58:59 +01:00
florez_j 98682420fe Moved openbis_lib.py to src folder. 2024-02-21 15:58:21 +01:00
florez_j a031a33f4f Included output_files folder for the sake of organization. 2024-02-21 15:47:33 +01:00
florez_j 19f7c4a026 Updated gitignore with h5 file extension. 2024-02-21 15:46:17 +01:00
florez_j 79b7428b9f Cleaned up code by removing commented lines and so on. 2024-02-21 10:47:12 +01:00
florez_j 219435511b Changed variable names, rearranged pieces of code, and set up data checks. 2024-02-21 10:41:57 +01:00
florez_j 1a4294e0c2 Modified to receive a unified dictionary structure and transform it into an equivalent group, dataset, and attribute structure. 2024-02-16 16:52:21 +01:00