c876e925a7
Modified code to point to the new instrument folders location. Also upgraded the code to accept either a user-specified location or the default location.
Florez Ospina Juan Felipe2024-08-12 13:40:01 +02:00
8f7f14ab68
Removed timestamp configuration attributes from ACSM_TOFWARE, because they can clutter a configuration file.
Florez Ospina Juan Felipe2024-08-08 11:24:41 +02:00
ae1e3bfc23
Moved ext_to_reader_dict to g5505_file_reader.py and replaced reader selection with one based on g5505_reader.select_file_reader(hdf5_file_path).
Florez Ospina Juan Felipe2024-08-07 16:30:36 +02:00
4e669b3eee
Moved the hdf5_file_path-to-file-reader mapping and the extension definitions to g5505_file_reader_module.py. Created functions to compute the file_reader key from a file's path in the hdf5 file and to select the reader based on that key. This should enable more modular file reader selection.
Florez Ospina Juan Felipe2024-08-07 16:21:22 +02:00
3430627494
Modified reader to output table_preamble as a dataset rather than as attributes of the file. I believe this is better for readability of the metadata, given that those preambles can sometimes contain large amounts of text.
Florez Ospina Juan Felipe2024-08-02 14:37:06 +02:00
a06e28291c
Added attribute insertion order tracking at the root level and reorganized a few import statements.
Florez Ospina Juan Felipe2024-07-17 08:41:40 +02:00
f04f5eaaf9
Made the column-name-to-description assignment more robust; however, it may be a bit slower than before.
Florez Ospina Juan Felipe2024-07-10 13:31:47 +02:00
73beb83278
Moved parse_attribute() from ..review_lib.py into ...utils.py and propagated (refactored) the changes to the respective modules.
Florez Ospina Juan Felipe2024-07-10 11:32:00 +02:00
2ce925735d
Modified the returned datetime output to a format without colons, since colons can be problematic in file names.
Florez Ospina Juan Felipe2024-07-10 09:47:56 +02:00
0a0b4ac41d
Moved a few functions from ...reader.py and hdf5_lib.py into ..utils.py, and refactored accordingly.
Florez Ospina Juan Felipe2024-07-10 09:19:30 +02:00
0c74c52e09
Removed smogchamber reader because its functionality is now integrated into g5505_file_reader.py.
Florez Ospina Juan Felipe2024-07-09 16:13:01 +02:00
cb7d914908
Cleaned up code and modified def create_hdf5_file_from_dataframe to create the group hierarchy implicitly from the path rather than recursively.
Florez Ospina Juan Felipe2024-07-08 15:24:48 +02:00
92eca4d79e
Moved remaining git operations in metadata_review_lib.py to git_ops.py and refactored accordingly.
Florez Ospina Juan Felipe2024-07-05 15:46:20 +02:00
cedfe614e7
Implemented an input argument to enable appending information to existing attributes, whose values must be either strings or lists.
Florez Ospina Juan Felipe2024-06-20 15:32:33 +02:00
106795ae59
Added a few lines to detect the existence of the file and change the file mode from 'w' to 'a' based on that information.
Florez Ospina Juan Felipe2024-06-20 09:03:47 +02:00
498a51cbc6
Updated function to add project level metadata at the root group of the hdf5 file.
Florez Ospina Juan Felipe2024-06-19 18:31:11 +02:00
06c5c6d84b
Added a method to the MetadataHarvester class to collect project-level metadata.
Florez Ospina Juan Felipe2024-06-19 18:30:02 +02:00
a6868d985d
Fixed bug regarding datetime-to-str column conversion in dataframe by using .map(str) (an element-wise operation) as opposed to .apply(str).
Florez Ospina Juan Felipe2024-06-18 09:21:46 +02:00
c68e800967
Incorporated dataframe_to_np_structured_array(df: pd.DataFrame) from another module.
Florez Ospina Juan Felipe2024-06-16 18:39:30 +02:00
e4de4edf28
Incorporated dataframe_to_np_structured_array(df: pd.DataFrame) from another module.
Florez Ospina Juan Felipe2024-06-16 18:26:12 +02:00
2d4ecec806
Moved dataframe_to_np_structured_array(df: pd.DataFrame) to src/g5505_utils.py. This is a more generic function that can be used more broadly across modules.
Florez Ospina Juan Felipe2024-06-16 18:25:08 +02:00
0fb14b7c6c
Developed a metadata harvesting object to facilitate metadata collection throughout the code.
Florez Ospina Juan Felipe2024-06-13 15:47:02 +02:00
f43d86e729
Modified a few variable values in yaml files so that they are within expected values.
Florez Ospina Juan Felipe2024-06-13 15:45:39 +02:00
9ab9aa49c4
Abstracted a code snippet from def create_hdf5_file_from_filesystem_path(..) as transfer_file_dict_to_hdf5() so that it can be reused.
Florez Ospina Juan Felipe2024-06-13 15:44:01 +02:00
e7ed6145f0
Implemented a data extraction module to access data from an hdf5 file in the form of dataframes.
Florez Ospina Juan Felipe2024-06-11 10:38:04 +02:00
a410bde23e
Removed the split of data tables into categorical and numerical variables; numbering is now only introduced to disambiguate repeated columns.
Florez Ospina Juan Felipe2024-06-10 16:18:51 +02:00
1ec7ad76ff
Removed additional numbering from some instrument specifications. Numbers are now only added if the column names are ambiguous.
Florez Ospina Juan Felipe2024-06-10 16:14:13 +02:00
197ad0288a
Updated file reader and data integration with datastart and dataend properties.
Florez Ospina Juan Felipe2024-06-04 13:37:20 +02:00
9dcc757acc
Renamed folder src/instrument_descriptions/ to src/intruments/ and moved text_data_sources.yaml into it.
Florez Ospina Juan Felipe2024-06-04 10:54:09 +02:00
a6ddb24eeb
Applied .strip to column names to remove unwanted characters (\r|\t|\n) and added a units description to timestamps.
Florez Ospina Juan Felipe2024-06-04 09:57:37 +02:00
014bd14fcd
Modified temperature units from °C to Celsius for simpler string encoding. It seems the ascii codec cannot encode the ° character.
Florez Ospina Juan Felipe2024-06-04 09:44:09 +02:00
385267a98f
Updated treemap visualization to select only root metadata, which is of string type.
Florez Ospina Juan Felipe2024-06-03 14:17:42 +02:00
d335836a7d
Updated reader to standardize timestamps to a desired format when possible. The desired format is set in text_data_sources.yaml.
Florez Ospina Juan Felipe2024-06-02 15:59:01 +02:00
3a9aede909
Made def third_update_hdf5_file_with_review more modular by separating data update and git operations, resulting in new functions that can be reused in less restrictive metadata annotation contexts.
Florez Ospina Juan Felipe2024-05-29 15:26:48 +02:00
ef7c6c9efb
Implemented a git operations module for automated git ops, based on subprocess.
Florez Ospina Juan Felipe2024-05-29 15:17:09 +02:00
dad5e082f1
Changed ordering of data integration config files so that they align with our experimental campaign hierarchy.
Florez Ospina Juan Felipe2024-05-28 14:43:32 +02:00
3de6abce50
Added a feature to activate or deactivate data copying before reading the input file. This avoids redundant copying when we are already working on file copies.
Florez Ospina Juan Felipe2024-05-28 14:40:14 +02:00
fd1c6461bb
Updated some of the rename_as metadata for all instruments so that it is more machine-readable and can perhaps be used as an alternative to the original name in future releases.
Florez Ospina Juan Felipe2024-05-28 14:37:43 +02:00
804ea52583
Modified function to return a list of paths when the config_file.yaml integration mode is experimental step.
Florez Ospina Juan Felipe2024-05-28 11:29:32 +02:00
f6a46168ec
Improved parsing from HDF5 attr dict to yaml compatible dict. Now we can parse HDF5 compound attributes (structured np arrays).
Florez Ospina Juan Felipe2024-05-28 11:27:44 +02:00
08d58557df
Fixed bug that prevented analythical_methods composite keywords (e.g., ICAD/HONO) from being matched in instrument configurations.
Florez Ospina Juan Felipe2024-05-28 08:57:57 +02:00
2911416431
Improved modularity of hdf5_file creation by creating a function that copies the input directory and applies directory, file, and extension constraints before the regular directory-to-hdf5 transfer. See def copy_directory_with_contraints(input_dir_path, output_dir_path, select_dir_keywords, select_file_keywords, allowed_file_extensions).
Florez Ospina Juan Felipe2024-05-27 18:15:08 +02:00
77afbbbf8f
Added function to convert list of strings into a np.array of bytes. This is useful to create list-valued attributes in HDF5.
Florez Ospina Juan Felipe2024-05-26 14:56:36 +02:00
88572b44b1
Fixed buggy statement: import datetime ... followed by datetime.now() was corrected to datetime.datetime.now().
Florez Ospina Juan Felipe2024-05-26 12:26:54 +02:00
37071945f5
Removed hdf5 file creation redundancy by creating a helper function create_HDF5_file(date_str,select_file_keywords), which handles variations in date_str and keywords.
Florez Ospina Juan Felipe2024-05-26 12:24:15 +02:00
4dc09339b5
Replaced lambda function with a regular function and an f-string for better readability and debugging.
Florez Ospina Juan Felipe2024-05-26 11:39:40 +02:00
b7f9bfe149
Replaced print statement with logging and a raised exception for better error handling and management.
Florez Ospina Juan Felipe2024-05-26 11:34:20 +02:00
ac37235072
Added function setup_logging to configure logger to record logs in specified output directory.
Florez Ospina Juan Felipe2024-05-26 11:19:54 +02:00