Metadata Annotation Process
In this notebook, we will go through a simple metadata annotation process. This involves the following steps:
- Define an HDF5 file.
- Create a YAML representation of the HDF5 file.
- Edit and augment the YAML with metadata.
- Update the original file based on the edited YAML.
Import libraries and modules
- Execute (or run) the cell below.
In [ ]:
import os
from nbutils import add_project_path_to_sys_path

# Add project root to sys.path
add_project_path_to_sys_path()

try:
    import src.hdf5_ops as hdf5_ops
    import pipelines.metadata_revision as metadata_revision
    print("Imports successful!")
except ImportError as e:
    print(f"Import error: {e}")
Imports successful!
Step 1: Define an HDF5 file
- Set the string variable hdf5_file_path to the path of the HDF5 file of interest.
- Execute the cell.
In [ ]:
hdf5_file_path = "../output_files/collection_kinetic_flowtube_study_LuciaI_2022-01-31_2023-06-29/kinetic_flowtube_study_LuciaI_2023-06-29.h5"
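Before moving on, it can help to confirm that the path points at an existing file. The following is a minimal sketch using only the standard library. The helper name looks_like_hdf5_path and the accepted suffixes are assumptions for illustration. It checks the file name and existence only, not whether the contents are valid HDF5.

```python
import os
import tempfile
from pathlib import Path


def looks_like_hdf5_path(path_str):
    """Return True if the path is an existing file with an HDF5-style suffix.

    Hypothetical helper: it inspects only the name and existence of the file,
    not its contents.
    """
    path = Path(path_str)
    return path.is_file() and path.suffix.lower() in {".h5", ".hdf5"}


# A path that does not exist is reported as False rather than raising.
missing = looks_like_hdf5_path("no_such_dir/no_such_file.h5")

# Demo with an empty placeholder file (not a real HDF5 file).
with tempfile.TemporaryDirectory() as tmp_dir:
    placeholder = os.path.join(tmp_dir, "placeholder.h5")
    Path(placeholder).touch()
    present = looks_like_hdf5_path(placeholder)

print(missing, present)
```

In the notebook, the same check could be applied to hdf5_file_path before running Step 2.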
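Step 2: Create a YAML Representation of the File
We now convert the HDF5 file's structure and existing metadata into a YAML representation, which will be used to add and edit metadata attributes. Note that the cell below passes output_format='json', so the serialized file is written with a .json extension.
- Execute the cell.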
In [4]:
yaml_file_path = hdf5_ops.serialize_metadata(hdf5_file_path, output_format='json')

if os.path.exists(yaml_file_path):
    print(f'The YAML file representation {yaml_file_path} of the HDF5 file {hdf5_file_path} was created successfully.')
The YAML file representation output_files/collection_kinetic_flowtube_study_LuciaI_2022-01-31_2023-06-29/kinetic_flowtube_study_LuciaI_2023-06-29.json of the HDF5 file output_files/collection_kinetic_flowtube_study_LuciaI_2022-01-31_2023-06-29/kinetic_flowtube_study_LuciaI_2023-06-29.h5 was created successfully.
Step 3: Edit and Augment YAML with Metadata
We can now manually edit the YAML file to add metadata.
- (Optional) Automate your metadata annotation process by writing a program that takes the YAML file and returns the modified version of it.
- Execute the cell.
In [ ]:
def metadata_annotation_process(yaml_file_path):
    # Include metadata annotation logic, e.g., load yaml file and modify its content accordingly
    print(f'Ensure your edits to {yaml_file_path} have been properly incorporated and saved.')
    return yaml_file_path

yaml_file_path = metadata_annotation_process(yaml_file_path)
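The optional automation mentioned in this step could look something like the sketch below. It assumes the serialized file is JSON (as requested with output_format='json' in Step 2) and holds an object at its top level. The actual layout written by serialize_metadata is not shown in this notebook, so the merge step would likely need adapting. The helper name annotate_metadata_file, the attribute names, and the demo content are all hypothetical.

```python
import json
import os
import tempfile


def annotate_metadata_file(file_path, new_attributes):
    """Merge new_attributes into the top-level mapping of a JSON metadata file.

    Assumption: the file holds a JSON object (dict) at its top level. The real
    layout produced by serialize_metadata may differ.
    """
    with open(file_path, "r", encoding="utf-8") as f:
        content = json.load(f)
    if not isinstance(content, dict):
        raise ValueError("Expected a JSON object at the top level.")
    # Existing keys with the same name are overwritten by the new values.
    content.update(new_attributes)
    with open(file_path, "w", encoding="utf-8") as f:
        json.dump(content, f, indent=2)
    return file_path


# Demo on a throwaway file with made-up content (not real serialize_metadata output).
demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, "demo_metadata.json")
with open(demo_path, "w", encoding="utf-8") as f:
    json.dump({"project": "demo"}, f)

annotate_metadata_file(demo_path, {"contact": "hypothetical_user"})
with open(demo_path, "r", encoding="utf-8") as f:
    annotated = json.load(f)
print(annotated)
```

A function of this kind could be called from inside metadata_annotation_process, which would then return the same file path for use in Step 4.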
Step 4: Update the Original File Based on the Edited YAML
Lastly, we will update the original file with the metadata from the YAML file.
- Execute the cell.
In [ ]:
metadata_revision.update_hdf5_file_with_review(hdf5_file_path,yaml_file_path)
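Because this step updates the original file, it may be worth keeping a copy of it before running the update. The sketch below shows one way to do that with the standard library. The helper name backup_file and the .bak suffix are assumptions, and the demo uses a placeholder file rather than a real HDF5 file.

```python
import os
import shutil
import tempfile


def backup_file(file_path, suffix=".bak"):
    """Copy file_path to file_path + suffix and return the backup path.

    Hypothetical helper; copy2 also preserves file metadata such as timestamps.
    """
    backup_path = file_path + suffix
    shutil.copy2(file_path, backup_path)
    return backup_path


# Demo with placeholder bytes (not a real HDF5 file).
demo_dir = tempfile.mkdtemp()
demo_file = os.path.join(demo_dir, "demo.h5")
with open(demo_file, "wb") as f:
    f.write(b"placeholder bytes")

backup_path = backup_file(demo_file)
print(os.path.exists(backup_path))
```

In the notebook, backup_file(hdf5_file_path) would be called before update_hdf5_file_with_review.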