452 Commits

SHA1 Message Date
109be49f31 Merge branch 'feature/DB_for_FileReader_Repo' into 'main'
Restructuring of file reader system to process multi-instrument data folders.

See merge request 5505-public/dima!3
2025-02-25 10:48:59 +01:00
14b738818c Merge branch 'main' into 'feature/DB_for_FileReader_Repo'
# Conflicts:
#   instruments/filereader_registry.py
#   pipelines/data_integration.py
#   src/hdf5_writer.py
2025-02-25 10:41:02 +01:00
4f438f86fe Update import statements in pipelines/data_integration.py. from instruments.readers import ... -> from instruments import ... 2025-02-25 09:21:52 +01:00
68344964ac Implemented create_hdf5_from_filesystem_new() using the new instrument readers' command line interface and subprocesses. This lets collaborators extend file reading capabilities without changing file_registry.py; only additions to folders and registry.yaml are required. 2025-02-24 18:48:03 +01:00
e5fdc6fa31 Update all file readers with a command line interface so they can be run as subprocesses. Also added registry.yaml to decouple code from user-based instrument adaptations or extensions. 2025-02-24 17:27:12 +01:00
2cdd6925af Merge branch 'main' of https://gitlab.psi.ch/5505-public/dima 2025-02-22 18:02:45 +01:00
bc1d65d469 Fix import for filereader_registry.py after moving it from instruments/readers/ one level up. 2025-02-22 17:59:00 +01:00
85d4e39299 Moved filereader_registry.py outside readers folder. 2025-02-22 17:53:19 +01:00
02e926e003 Moved filereader_registry.py outside readers folder. 2025-02-22 17:51:56 +01:00
81be6b54c8 Wrap import statements in try/except to enable explicit import of submodules and avoid conflicts with the parent project. 2025-02-22 17:10:53 +01:00
df0aca97df Implement data_lineage_metadata.json detection and then use it to annotate associated file. 2025-02-10 15:56:34 +01:00
b8900cab67 Enable boolean type columns from pandas DataFrame to be suitably converted into numpy structured array 2025-02-10 15:52:17 +01:00
7906387271 Make file reader selection case insensitive by using ext.lower() and update config_text_reader.py to point to renamed dictionary. 2025-02-08 19:45:16 +01:00
cbf468f5ac Remove instruments/dictionaries/ICAD_NO2.yaml. Its dict terms are now in ICAD.yaml. 2025-02-08 19:23:37 +01:00
131704dcf2 Add dict terms from ICAD_NO2.yaml 2025-02-08 19:22:27 +01:00
33aabf45fa Combine dictionaries of ICAD_HONO.yaml and ICAD_NO2.yaml into ICAD.yaml 2025-02-08 19:21:17 +01:00
3e6f6bc46e Remove skip directory condition when directory keywords are empty. Here, all paths to files should be considered. 2025-02-07 16:37:01 +01:00
1a843ee2c6 Fix reader txt/csv default behavior. 2025-02-07 16:25:45 +01:00
46ca26a983 Enable instrumentFolder of form <instFolder>/<category>/ to be transferred without flattening 2025-02-07 16:24:21 +01:00
36780d1a63 Add try except block to trigger errors for invalid group names. 2025-02-06 16:07:45 +01:00
5943c60216 Add constraint to match only path/to/keyword1/keyword2/files containing a composite keyword keyword1/keyword2. 2025-02-06 15:34:38 +01:00
58386ca10b Add property to extracted dataset as dataframe. Now time column is of datetime type to facilitate downstream processing. 2025-02-04 17:23:32 +01:00
d89aebd861 Implement method in hdf5 manager to infer datetime column in dataset 2025-02-04 17:13:01 +01:00
e358d4ab64 Synch with remote repo 2025-02-03 10:31:48 +01:00
5e3f75d66b Fix typo in html text. 2025-01-27 13:53:59 +01:00
a3a1b8506c Update readme.md and set_up_env.sh 2025-01-27 13:29:29 +01:00
1b2184d8e1 Update unload operation to remove reference and fix logic error to dataset metadata extraction. 2025-01-24 10:28:43 +01:00
7ffcd90e7b Add validation step to yaml file validation to ensure list type and a minimum length for the 'instrument_datafolder' keyword. 2025-01-22 15:55:21 +01:00
d59e9d2c0b Fix typo in extension items: extensions need to include a dot (.json and .yaml). 2025-01-21 09:30:49 +01:00
f07dfc0a81 Add json and yaml extensions to admissible file extension lists. 2025-01-21 08:57:38 +01:00
de9c45c21f Updated to cleared jupyter notebooks 2025-01-14 14:46:43 +01:00
ba49b168c4 Added comments to explain configuration parameters and/or variables. 2025-01-14 14:25:53 +01:00
df4bd2b3ae Add directory tree structure description. 2024-12-04 17:20:35 +01:00
368e4ce6d8 Update .gitignore with output_files/ 2024-12-04 16:53:57 +01:00
4d87169732 Add .gitkeep and keep this folder empty. It is only to be used for local processing. 2024-12-04 16:52:50 +01:00
32c1bd0731 Update readme with getting started section 2024-12-04 16:24:14 +01:00
b13b4a4b57 Solved binary incompatibility issue of generated environment by conda installing h5py and numpy from conda-forge or default channels. 2024-12-04 16:15:42 +01:00
5fa28ca917 Updated bash script and yml env file to set up python interpreter. 2024-12-04 13:52:35 +01:00
d787ce6972 Update to readme.md 2024-12-03 13:55:45 +01:00
941bf0e784 Update readme with key features of the repo. 2024-12-03 13:50:53 +01:00
68cf2f8d3e Updated README.md with software architecture figure 2024-12-02 17:28:22 +01:00
99cc6faf11 Updated README.md with software architecture figure 2024-12-02 17:24:48 +01:00
39eec2679e Updated README 2024-12-02 17:22:52 +01:00
6319c36cfb Updated figure name. 2024-12-02 17:08:36 +01:00
c97ff1208e Updated ci runner pipeline for gitlab page 2024-12-02 16:31:49 +01:00
6899894ba1 Updated documentation and built doc website 2024-12-02 16:31:03 +01:00
fc139e0ae5 Relocated to visualization module 2024-12-02 15:39:41 +01:00
ef8cf9bb4e Add __init__.py 2024-12-02 15:36:03 +01:00
d79877cc9b Moved hdf5_lib.py to visualization folder 2024-12-02 15:34:44 +01:00
fa9edcb115 Removed no longer useful notebook 2024-12-02 15:32:57 +01:00