334335387e (main) Update to notebooks/demo_data_integration.ipynb. Step description now includes information about set up with network_mount env variable (Florez Ospina Juan Felipe, 2025-06-22 12:16:26 +02:00)
e851131269 Append new functions to utils/g5505_utils.py. These search for the .env file in the root directory (Florez Ospina Juan Felipe, 2025-06-22 12:13:14 +02:00)
f3ff32e049 Update to pipelines/data_integration.py. Added feature to use environment variable MOUNT_DRIVE defined in .env file. (Florez Ospina Juan Felipe, 2025-06-22 12:11:48 +02:00)
630189c5d7 Update input_files/campaignDescriptor3_NG.yaml input directory due to changes in source directory (Florez Ospina Juan Felipe, 2025-06-22 10:42:20 +02:00)
b610b4e337 Rename yaml files in input_files/ as campaign descriptors for consistency with idear project. (Florez Ospina Juan Felipe, 2025-06-20 10:24:53 +02:00)
b96c04fc01 Refactor instruments/readers/g5505_text_reader.py, some code abstracted as functions to improve readability. (Florez Ospina Juan Felipe, 2025-06-19 20:40:14 +02:00)
f555f7f199 Implement skipping in convert_attrdict_to_np_structured_array(attr_value: dict) when dictionary values are not scalar. This ensures compatible values are transferred while the rest are simply discarded. (Florez Ospina Juan Felipe, 2025-06-10 16:03:01 +02:00)
83cec97e83 Fix bug in instruments/readers/structured_file_reader.py. pd.to_dict returns a list of dicts, so we need to handle each item separately using a loop. (Florez Ospina Juan Felipe, 2025-06-07 19:14:53 +02:00)
f640205b12 Add new file reader instruments/readers/structured_file_reader.py, and update registry.py and yaml (Florez Ospina Juan Felipe, 2025-06-07 18:15:41 +02:00)
e80c19ef61 Update src/hdf5_writer.py to consider data lineage metadata in data ingestion process (Florez Ospina Juan Felipe, 2025-06-07 15:31:13 +02:00)
87462211a9 Update instruments/readers/nasa_ames_reader.py to handle dirty text entries. Dirty entries of time variables that cannot be properly processed are sent to nat (Florez Ospina Juan Felipe, 2025-05-27 10:12:20 +02:00)
e4b2a4cd5a Split header in three parts and detect variables and variable descriptions added to attribute dictionary (Florez Ospina Juan Felipe, 2025-05-21 09:19:16 +02:00)
ad4339a76b Added new filereader dictionary pair for nasames files. This is a first version that may change. (Florez Ospina Juan Felipe, 2025-05-14 13:50:08 +02:00)
32abd4cd56 Implemented hdf5_file_reader.py and updated register.yaml and hdf5_writer.py. This replaces previous function __copy_file_in_group(). (Florez Ospina Juan Felipe, 2025-02-25 12:25:15 +01:00)
5f9f09d288 Merge branch 'feature/DB_for_FileReader_Repo' into 'main' (florez_j, 2025-02-25 10:48:59 +01:00)
295b43a89a Merge branch 'main' into 'feature/DB_for_FileReader_Repo' (florez_j, 2025-02-25 10:41:02 +01:00)
064b8b3a62 Update import statements in pipelines/data_integration.py. from instruments.readers import ... -> from instruments import ... (Florez Ospina Juan Felipe, 2025-02-25 09:21:52 +01:00)
db4bb0ef03 Implemented create_hdf5_from_filesystem_new() using new instrument readers cml interface and subprocesses. This facilitates extension of file reading capabilities by collaborators without requiring changes to file_registry.py. Only additions in folders and registry.yaml. (Florez Ospina Juan Felipe, 2025-02-24 18:48:03 +01:00)
92a2560ed7 Update all file readers with command line interface so we can run them as a subprocess. Added also registry.yaml to decouple code from user-based instrument adaptations or extensions. (Florez Ospina Juan Felipe, 2025-02-24 17:27:12 +01:00)
1e67745fa4 Fix import for filereader_registry.py after moving it from intruments/readers/ one level above. (Florez Ospina Juan Felipe, 2025-02-22 17:59:00 +01:00)
821d314cb6 Change import statements with try except to enable explicit import of submodules from import to avoid conflicts with parent project. (Florez Ospina Juan Felipe, 2025-02-22 17:10:53 +01:00)
8ce6f588dc Implement data_lineage_metadata.json detection and then use it to annotate associated file. (Florez Ospina Juan Felipe, 2025-02-10 15:56:34 +01:00)
68a9928c39 Enable boolean type columns from pandas DataFrame to be suitably converted into numpy structured array (Florez Ospina Juan Felipe, 2025-02-10 15:52:17 +01:00)
c28286a626 Make file reader selection case insensitive by using ext.lower() and update config_text_reader.py to point to renamed dictionary. (Florez Ospina Juan Felipe, 2025-02-08 19:45:16 +01:00)
0b29e2ec68 remove instruments/dictionaries/ICAD_NO2.yaml. Its dict terms are now in ICAD.yaml. (Florez Ospina Juan Felipe, 2025-02-08 19:23:37 +01:00)
b58e205f9f Remove skip directory condition when directory keywords are empty. Here, all paths to files should be considered. (Florez Ospina Juan Felipe, 2025-02-07 16:37:01 +01:00)
0d26777732 Enable instrumentFolder of form <instFolder>/<category>/ to be transferred without flattening (Florez Ospina Juan Felipe, 2025-02-07 16:24:21 +01:00)
2f72177410 Add constraint to match only path/to/keyword1/keyword2/files containing a composite keyword keyword1/keyword2. (Juan Felipe Florez Ospina, 2025-02-06 15:34:38 +01:00)
5d0ab4603f Add property to extracted dataset as dataframe. Now time column is of datetime type to facilitate downstream processing. (Juan Felipe Florez Ospina, 2025-02-04 17:23:32 +01:00)
ef66d8f1c2 Update unload operation to remove reference and fix logic error to dataset metadata extraction. (Florez Ospina Juan Felipe, 2025-01-24 10:28:43 +01:00)
1ae607f73b Add validation step to yaml file validation to ensure list type and a minimum length for the 'instrument_datafolder' keyword. (Florez Ospina Juan Felipe, 2025-01-22 15:55:21 +01:00)
3e37854445 Solved binary incompatibility issue of generated environment by conda installing h5py and numpy from conda-forge or default channels. (Florez Ospina Juan Felipe, 2024-12-04 16:15:42 +01:00)
2e52109bee removed review folder. This is now supposed to be created for review of experimental campaign data objects metadata. (Florez Ospina Juan Felipe, 2024-12-02 14:32:34 +01:00)
11ca454b94 Removed because some of the functionalities have been outsourced to other modules, src/hdf5_ops.py and src/hdf5_writer.py (Florez Ospina Juan Felipe, 2024-11-26 11:55:06 +01:00)
1b2b319295 Attempt to dynamically resolve path to dima package, when executed from command line. (Florez Ospina Juan Felipe, 2024-11-24 17:37:38 +01:00)
1174ffc8b8 Commented out metadata info about group members for a given group. This is to simplify yaml or json representation of the metadata. (Florez Ospina Juan Felipe, 2024-11-24 15:57:54 +01:00)
3122c4482f Removed logs/ folder. This is usually created locally to trace certain file copying, transfer and conversion processes. (Florez Ospina Juan Felipe, 2024-11-24 10:51:05 +01:00)
c257ab6072 Rerun jupyternotebooks to check their functionality after relocating them to notebooks. OpenBis related python scripts still need to be tested. (Florez Ospina Juan Felipe, 2024-11-24 10:45:40 +01:00)
02ded9c11a Add hidden.py to the list. This file may contain sensitive information and is only to be accessed locally or from a secure location. (Florez Ospina Juan Felipe, 2024-11-24 10:41:32 +01:00)
b24d33ab15 Check whether h5 file being written exists. If so, we do not overwrite it because it may be undergoing refinement, changes or updates, for archiving, sharing, or publishing. (Florez Ospina Juan Felipe, 2024-11-24 10:38:13 +01:00)
ca314f971d Add utility functions add_project_path_to_sys_path() to set up path to DIMA's modules dynamically. (Florez Ospina Juan Felipe, 2024-11-24 10:08:19 +01:00)
a928c4ef4c Moved demos .py to notebooks. Note: Maybe turn them to jupyternotebooks for consistency (Florez Ospina Juan Felipe, 2024-11-24 07:49:50 +01:00)
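
Note on commits e851131269 and f3ff32e049: the log says only that new utilities search for a .env file in the root directory and that the pipeline uses the MOUNT_DRIVE variable defined there. The sketch below shows one way such a lookup could work. It is not the code in utils/g5505_utils.py. The function names, the upward directory walk, the simple KEY=VALUE parsing, and the rule that a real environment variable takes precedence over the file are all assumptions made for illustration.

```python
import os
from pathlib import Path
from typing import Optional


def find_env_file(start_dir: Path, filename: str = ".env") -> Optional[Path]:
    """Hypothetical helper: look in start_dir, then each parent, for filename."""
    current = start_dir.resolve()
    for folder in (current, *current.parents):
        candidate = folder / filename
        if candidate.is_file():
            return candidate
    return None


def parse_env_file(path: Path) -> dict:
    """Hypothetical helper: parse KEY=VALUE lines, skipping blanks and '#' comments."""
    values = {}
    for line in path.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def resolve_mount_drive(start_dir: Path, default: Optional[str] = None) -> Optional[str]:
    """Assumed precedence: process environment, then .env file, then default."""
    if "MOUNT_DRIVE" in os.environ:
        return os.environ["MOUNT_DRIVE"]
    env_file = find_env_file(start_dir)
    if env_file is not None:
        return parse_env_file(env_file).get("MOUNT_DRIVE", default)
    return default
```

A caller could pass the folder of the running script, for example resolve_mount_drive(Path(__file__).parent), and fall back to a path from the campaign descriptor when the result is None.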