Commit Graph

142 Commits

Author SHA1 Message Date
florez_j ceb8a34ee0 Commented out unneeded python import statements 2024-04-23 13:23:13 +02:00
florez_j 129443d6d9 Updated python libraries and installation instructions. 2024-04-23 13:20:42 +02:00
florez_j 8876d5af4f Example data integration configuration files 2024-04-23 12:03:24 +02:00
florez_j f65abda6d1 jupyter notebook to run data integration workflow 2024-04-23 11:11:13 +02:00
florez_j a12cd80355 Implemented function that takes yaml config files specifying data integration output 2024-04-23 11:10:13 +02:00
florez_j b233dc094d yaml instrument configuration file for text data 2024-04-23 11:07:49 +02:00
florez_j d3ec0bd473 Included additional directory path validation based on dir keywords 2024-04-23 11:05:20 +02:00
florez_j 9d9e9dcfe5 Added lines to parse instrument reader properties from yaml file. 2024-04-23 11:02:10 +02:00
florez_j 074d2e3954 Removed config_file output file naming and instead user now inputs desired output filename. Also added input argument to introduce root level metadata. 2024-04-18 19:14:06 +02:00
florez_j 1ed37920c2 Replaced git commands in terms of subprocess.run 2024-04-17 15:26:45 +02:00
florez_j a1c88fdb5a Added lines to flatten (shorten) original directory paths in the resulting hdf5 file. 2024-04-17 15:20:26 +02:00
florez_j 8005b60579 Included a boolean input argument hdf5_upload to deactivate hdf5 upload for testing. 2024-04-07 17:09:01 +02:00
florez_j edd1bbf5be Added an import and treemap-to-png statements, but for some reason they did not work and took forever to run. So, I left the lines but commented them out for now. 2024-04-07 16:55:37 +02:00
florez_j 89e94a1b2b Renamed forth_submit_.. function to last_submit .. 2024-04-05 17:21:18 +02:00
florez_j 5e70d9158b Deleted function third_complete_metadata_review() because forth_complete_metadata_review() is the same. Also, modified a substring of their name from complete to submit and submit to save for clarity. Usually submission is the last step of a review process. 2024-04-05 17:10:34 +02:00
florez_j d68dc98070 Implemented some safeguards that enable only commits of untracked metadata review files 2024-04-04 14:20:13 +02:00
florez_j d6ee987859 Submitted metadata review. 2024-04-04 14:07:54 +02:00
florez_j 12b3accad9 Initialized metadata review. 2024-04-04 14:06:21 +02:00
florez_j e9578a775a Submitted metadata review. 2024-04-04 13:23:52 +02:00
florez_j 7a39c49931 Initialized metadata review. 2024-04-04 13:16:33 +02:00
florez_j 5839c0f466 Resolved incoming changes from main branch 2024-04-04 13:07:43 +02:00
florez_j 44d4a7b29b .ipybn 2024-04-04 13:00:56 +02:00
florez_j dd1f1245e3 Refactored comment lines. 2024-04-04 12:58:17 +02:00
florez_j 2d5fecfb34 Removed git checkout statements to avoid conflicting changes of .ipynb files. 2024-04-04 12:56:37 +02:00
florez_j 96c68f7614 Added .ipynb files to gitignore 2024-04-04 11:18:07 +02:00
florez_j fa4fe691d0 Refactored a few git statements in terms of subprocess.run 2024-04-04 11:02:24 +02:00
florez_j 48d3b8f492 Submitted previous deletes to clean working directory tree 2024-04-04 10:58:16 +02:00
florez_j 811a4a0615 Initialized metadata review process. 2024-04-04 09:51:39 +02:00
florez_j a2e6d823ce Submitted previous deletes to clean working directory tree 2024-04-04 09:40:24 +02:00
florez_j 0417ac6deb Modified hdf5 file path whose metadata is to be reviewed. 2024-04-04 09:31:19 +02:00
florez_j dee204010d Initialized metadata review process. 2024-04-04 09:23:55 +02:00
florez_j 72e37ed277 Implemented jupyter notebooks for metadata review workflow execution. 2024-04-04 09:18:36 +02:00
florez_j 719e9d6672 Repurposed the role of the config_file.py. Now it only provides functions to select the file_readers based on group id and produce a created_at timestamp. 2024-04-03 13:55:54 +02:00
florez_j 5cd19979b6 Implemented first approach to data integration workflow 2024-04-03 13:51:21 +02:00
florez_j f9b31c06fd Reimplemented file filtering: first file extension constraints are imposed and then file keyword constraints. 2024-04-03 13:49:16 +02:00
florez_j 9cde013be0 Modified node values as the number of children of each group. When nodes are datasets, their value is 1. 2024-04-02 18:48:50 +02:00
florez_j 9071120e50 Refactored code to read .dat and .txt files in binary mode (rb) first; the prespecified encoding is then used to decode the lines. This is to have more control over the decoding process and be able to better spot possible encoding errors. 2024-04-02 18:35:04 +02:00
florez_j f351f102b7 Commented out a print statement. 2024-04-02 18:31:58 +02:00
florez_j 39cae66936 Implemented two important changes. 1) The filename of the output file is not passed as input; it is automatically computed based on an input config_param dict. 2) Input filenames in the file system path are now filtered on an initial walk through the directory tree. This is to use the stored path filenames for pruning the directory tree later on. 2024-04-02 17:33:58 +02:00
florez_j 9c70fd643f Refactored code in terms of subprocess for git functionality. 2024-03-28 19:38:12 +01:00
florez_j 0fcdc4ad2e Refactored code in terms of subprocess for git functionality 2024-03-28 19:36:50 +01:00
florez_j accb271d83 Submitted metadata review. 2024-03-28 19:33:52 +01:00
florez_j 8991dfd6df Initialized metadata review process. 2024-03-28 19:32:30 +01:00
florez_j c552752468 Submitted metadata review. 2024-03-28 19:31:19 +01:00
florez_j 5d54ac99cd Submitted metadata review. 2024-03-28 19:27:32 +01:00
florez_j 1df8faedf6 Submitted metadata review. 2024-03-28 19:24:23 +01:00
florez_j 366e01fa4c Initialized metadata review process. 2024-03-28 19:11:46 +01:00
florez_j 2a7d1eeb89 Initialized metadata review process. 2024-03-28 19:11:46 +01:00
florez_j e08119e9b6 Initialized metadata review process. 2024-03-28 19:09:51 +01:00
florez_j c3048f3083 Submitted metadata review. 2024-03-28 18:28:58 +01:00