|
|
44d4a7b29b
|
.ipynb
|
2024-04-04 13:00:56 +02:00 |
|
|
|
dd1f1245e3
|
Refactored comment lines.
|
2024-04-04 12:58:17 +02:00 |
|
|
|
2d5fecfb34
|
Removed git checkout statements to avoid conflicting changes to .ipynb files.
|
2024-04-04 12:56:37 +02:00 |
|
|
|
96c68f7614
|
Added .ipynb files to gitignore
|
2024-04-04 11:18:07 +02:00 |
|
|
|
fa4fe691d0
|
Refactored a few git statements to use subprocess.run
|
2024-04-04 11:02:24 +02:00 |
|
|
|
0417ac6deb
|
Modified the path of the hdf5 file whose metadata is to be reviewed.
|
2024-04-04 09:31:19 +02:00 |
|
|
|
72e37ed277
|
Implemented jupyter notebooks for metadata review workflow execution.
|
2024-04-04 09:18:36 +02:00 |
|
|
|
719e9d6672
|
Repurposed config_file.py. It now only provides functions to select the file_readers based on group id and to produce a created_at timestamp.
|
2024-04-03 13:55:54 +02:00 |
|
|
|
5cd19979b6
|
Implemented first approach to data integration workflow
|
2024-04-03 13:51:21 +02:00 |
|
|
|
f9b31c06fd
|
Reimplemented file filtering: file extension constraints are imposed first, then file keyword constraints.
|
2024-04-03 13:49:16 +02:00 |
|
|
|
9cde013be0
|
Set node values to the number of children of each group. When nodes are datasets, their value is 1.
|
2024-04-02 18:48:50 +02:00 |
|
|
|
9071120e50
|
Refactored code to read .dat and .txt files in binary mode (rb) first and then decode the lines with the prespecified encoding. This gives more control over the decoding process and makes possible encoding errors easier to spot.
|
2024-04-02 18:35:04 +02:00 |
|
|
|
f351f102b7
|
Commented out a print statement.
|
2024-04-02 18:31:58 +02:00 |
|
|
|
39cae66936
|
Implemented two important changes. 1) The filename of the output file is no longer passed as input; it is computed automatically from an input config_param dict. 2) Input filenames in the file system path are now filtered on an initial walk through the directory tree, so that the stored path filenames can be used later on for pruning the directory tree.
|
2024-04-02 17:33:58 +02:00 |
|
|
|
9c70fd643f
|
Refactored git functionality to use subprocess.
|
2024-03-28 19:38:12 +01:00 |
|
|
|
942485ffc1
|
Modified code to select usecases based on an integer number.
|
2024-03-28 18:24:44 +01:00 |
|
|
|
2b568ff05a
|
Implemented jupyter notebook to run data integration workflow. Tested all usecases defined in config. So far so good.
|
2024-03-28 18:22:40 +01:00 |
|
|
|
6fb5253d21
|
Corrected a few bugs: deleted a useless buggy line and configured the text reader with latin-1 encoding for a few cases.
|
2024-03-28 18:20:57 +01:00 |
|
|
|
bbff419313
|
Removed a strange bug when reading .TXT smps files. Specified latin-1 encoding and relaxed error handling to ignore decoding errors.
|
2024-03-28 17:43:26 +01:00 |
|
|
|
06429e6def
|
Generalized workflow functions to consider reviewer attributes such as initials and type (e.g., data-owner or metadata-reviewer).
|
2024-03-28 16:11:01 +01:00 |
|
|
|
37fd603943
|
Completed first version of metadata_review_lib.py. Still need to test and correct possible bugs.
|
2024-03-28 13:59:47 +01:00 |
|
|
|
f0af30f7e8
|
Deleted metadata_review_workflow.py and turned it into a jupyter notebook.
|
2024-03-28 13:10:42 +01:00 |
|
|
|
438ac4d24d
|
Included lines for setting up author and committer in pygit2 commit function
|
2024-03-27 14:24:33 +01:00 |
|
|
|
54e30ef9ec
|
Implemented git add and commit for second metadata review step, and created a function to check out branches.
|
2024-03-27 14:23:16 +01:00 |
|
|
|
6aa98b71b3
|
Implemented git add and commit for second metadata review step, and created a function to check out branches.
|
2024-03-27 14:22:25 +01:00 |
|
|
|
56010f58ad
|
Submitted metadata review.
|
2024-03-27 13:55:39 +01:00 |
|
|
|
819474f678
|
Initialized metadata review process.
|
2024-03-27 13:55:39 +01:00 |
|
|
|
383c5377c1
|
Initialized metadata review process.
|
2024-03-27 13:51:17 +01:00 |
|
|
|
270825e9dc
|
Initialized metadata review process.
|
2024-03-27 11:35:53 +01:00 |
|
|
|
06283e286f
|
Initialized metadata review process.
|
2024-03-26 17:22:42 +01:00 |
|
|
|
2aac145379
|
Removed buggy statement, which was expected to detect recently created review files
|
2024-03-26 16:34:38 +01:00 |
|
|
|
1a89e1af66
|
Implemented script to run metadata review workflow
|
2024-03-26 16:25:44 +01:00 |
|
|
|
302b7dbfa5
|
Implemented metadata review library
|
2024-03-26 16:21:02 +01:00 |
|
|
|
1f2bb419fe
|
Save commit
|
2024-03-26 16:20:04 +01:00 |
|
|
|
f37ba4705a
|
Included .h5 files for now, but they should be enabled later on through git LFS.
|
2024-03-26 16:18:48 +01:00 |
|
|
|
a727e38db4
|
Implemented hdf5_vis.py, an hdf5 visualization library to obtain treemap and yaml representations of hdf5 files.
|
2024-03-26 16:14:40 +01:00 |
|
|
|
a58bf4f019
|
Refactored import dependencies.
|
2024-03-26 13:57:19 +01:00 |
|
|
|
e934ae65d6
|
Relocated from src/
|
2024-03-25 08:52:13 +01:00 |
|
|
|
1b9963d44d
|
Moved to input_files/
|
2024-03-25 08:51:34 +01:00 |
|
|
|
1bf1f60beb
|
Added lines to treat string attributes as fixed-length strings, which are represented as bytes that need to be decoded with utf-8. This has a few advantages, and the hdf5 reader provides more precise behavior than with variable-length strings.
|
2024-03-22 17:28:47 +01:00 |
|
|
|
13cb6395aa
|
Restructured the way the table_preamble attribute is represented. It is now a list of strings as opposed to a multiline string with special characters like \n. This is to avoid parsing problems in the yaml files.
|
2024-03-22 17:26:30 +01:00 |
|
|
|
fff935f551
|
Included optional argument in make_copy function and commented out a few lines that increase dataset storage complexity.
|
2024-03-21 17:16:14 +01:00 |
|
|
|
4244e39232
|
Implemented hdf5_vis.py to gather functions that display or represent properties of an hdf5 file in a human-readable file format, such as yaml or html files that enable interactive visualizations in the browser.
|
2024-03-21 16:30:27 +01:00 |
|
|
|
e389ffbefe
|
Relocated def display_group_hierarchy_on_a_treemap(filename: str) to hdf5_vis.py
|
2024-03-21 16:27:54 +01:00 |
|
|
|
8004a891aa
|
Included lines to work on copies of files, and removed .strip() to create the table preamble because it destroyed txt structure.
|
2024-03-19 14:55:49 +01:00 |
|
|
|
63e7fb28d0
|
Removed the 'backup_' name from the copied file so that the original name is preserved in the hdf5 file. The file copy storage location is enough to distinguish it from the original file.
|
2024-03-19 14:07:37 +01:00 |
|
|
|
2d8503ef7a
|
Included make_file_copy function
|
2024-03-19 11:59:39 +01:00 |
|
|
|
2271be6ecc
|
Moved make_file_copy internal function to g5505_utils.py module because it can be reused across file readers.
|
2024-03-19 11:58:08 +01:00 |
|
|
|
7fe254755f
|
Replaced the attributes previously extracted from the table_preamble in .txt and .dat files with a single dataset attribute called table_preamble that contains the whole table preamble.
|
2024-03-19 11:40:35 +01:00 |
|
|
|
b886066133
|
Simplified code and corrected buggy if statement. Included input verification steps and OS path normalization.
|
2024-03-19 11:11:05 +01:00 |
|