Commit Graph

43 Commits

Author SHA1 Message Date
8d4e24c29b Update the helper function to reflect changes in run_config to process time slices of a dataset 2025-09-29 11:05:34 +02:00
da275cdc97 feat: adapt sp2xr_pipeline and helpers to run multiple times across the same dataset and different time slots but ensuring config settings are the same for the entire dataset 2025-09-29 11:00:14 +02:00
d3a7448883 Improve conversion from original csv or zip files to parquet with more robust schema definition 2025-09-11 16:18:21 +02:00
0a71ca614c feat: modernize all type 2025-09-11 14:49:48 +02:00
a2df98042c refactor: reorganize calibration modules and add type hints 2025-09-11 14:45:06 +02:00
6621236ea4 fix: correct handling of file path structures in different operating systems 2025-09-09 19:13:14 +02:00
f437b1c5fe feat: add the possibility to decide the saving partition schema between date or date/hour 2025-09-09 17:09:38 +02:00
29e2351341 Feat: user can now decide frequency of repartition of dask dataframes after being loaded (both hk and pbp) 2025-09-09 16:03:42 +02:00
b377c36c28 Chore: cleanup old code 2025-09-09 15:03:44 +02:00
3a41fbf387 fix: fix bug that was leading to extremely large dask graphs and move all histogram calculation logic to the distributio.py module 2025-09-09 14:53:19 +02:00
b91380a6db fix: increase Dask worker wait time to prevent premature shutdown 2025-09-09 14:43:14 +02:00
c1243e3b1e feat: improve cluster shutdown and cleanup logic 2025-08-22 18:18:44 +02:00
8a0c1f3305 feat: add TMPDIR for temporary files in cluster jobs 2025-08-22 18:15:39 +02:00
3173a7c83b Cleanup: remove wreck code from src/sp2xr/resample_pbp_hk.py 2025-08-22 16:42:02 +02:00
2da9eb6089 Cleanup: remove wreck code from src/sp2xr/helpers.py 2025-08-22 16:40:26 +02:00
d2a0533a12 Cleanup: remove wreck code from src/sp2xr/schema.py 2025-08-22 16:38:06 +02:00
681de6c203 Fix: add missing import of cast_and_arrow function to sp2xr_pipeline 2025-08-22 16:31:03 +02:00
40bae0e5d2 Refactor: move function _cast_and_arrow to schema.py 2025-08-22 16:19:42 +02:00
43c31728e0 Fix: add control to skip empty ddf when time chunk is empty 2025-08-22 16:06:05 +02:00
8947046049 Feature: remove saving of ddf_pbp_hk_dt files 2025-08-22 16:03:43 +02:00
547c7f3108 Fix: removed default values in parse_args to avoid unexpected behavior when passing run_config settings 2025-08-22 16:01:01 +02:00
9c384f5245 Fix: Removed default values in the load_and_resolve_config function to avoid unexpected behavior when run_config doesn't provide settings 2025-08-22 15:59:48 +02:00
f42a308474 Feature: processing is now divided in time chunks to reduce size of dask graph 2025-08-22 11:37:58 +02:00
d6b3f2028f Fix: bug fixed in conversion from BC mass to diam due to density units mismatch in the config file and default values 2025-08-22 10:48:08 +02:00
ebd14bcbae Fix: typo in the config reading was blocking calibration 2025-08-21 11:39:11 +02:00
d7f778d531 Fix: the histograms were adding lines with NaNs when the corresponding partition was completely empty. Now it is back to the old behavior and no index is added for completely empty partitions. 2025-08-21 11:37:23 +02:00
a2cc520ff2 feat: possibility to choose between running locally and via slurm cluster 2025-08-14 11:48:52 +02:00
554ec26410 Refactor: moved parts of processing code to specific functions in separate modules (join_pbp_with_flow and aggregate_dt) 2025-08-13 17:02:47 +02:00
053d0d5b75 Refactor: separate the workflow for the option to calculate distributions for BC (number and mass), scattering and time delay 2025-08-13 16:26:39 +02:00
1038b18187 Clean up old parts in the code and re-arranged processing order 2025-08-13 16:08:37 +02:00
5f3d25817c Refactor: moved calibration of PbP data to new file and clear separation between run_config and instr_config 2025-08-13 15:28:57 +02:00
6f87b4cc79 Refactor: implemented input run_config.yaml file 2025-08-13 14:09:58 +02:00
fa4cb9e7d4 Refactor: moved some functions to respective modules and implemented the timelag hists 2025-08-12 15:53:48 +02:00
b05853416b Refactor: functions moved from sp2xr_pipeline to corresponding script in the src directory 2025-08-08 16:53:29 +02:00
6fa6fabf03 refactor: moved some functions from toolkit_legacy 2025-08-06 09:00:16 +02:00
63359449d3 refactor: split out single-particle calibration into its own module 2025-08-04 18:43:25 +02:00
ad897be5db feat: pre-populate YAML export with calibration and threshold placeholders 2025-08-04 12:58:17 +02:00
1d8f12ee35 feat: add ini→yaml conversion and tests for SP2-XR instrument parameters 2025-08-04 12:26:03 +02:00
5705d88092 refactor: move file-finding and chunking utilities to helpers.py 2025-08-04 09:24:39 +02:00
5690fcdcaf refactor: split read_csv_files_with_dask into modular load, enrich, and save functions 2025-07-25 20:28:24 +02:00
399df18e81 test: add real-data tests for PbP and HK ZIP input files 2025-07-25 15:33:33 +02:00
d8c0a4f6c7 style: apply pre-commit fixes globally 2025-07-25 12:27:12 +02:00
b443aa5db8 style: Ruff auto‑fixes in SP2XR_toolkit 2025-07-25 12:12:38 +02:00