38 Commits

Author SHA1 Message Date
35f2876e62 Update config files 2025-11-21 15:00:26 +01:00
6ab90b1a56 New handling of conversion from CSV/ZIP to Parquet via config file 2025-11-19 17:00:07 +01:00
2a5c4aec17 Remove redundant prints 2025-11-19 16:01:15 +01:00
8d9f5bf852 Ints are now inferred as floats to handle potential NA values 2025-11-19 15:43:58 +01:00
5a52ca2fdd Add local option for converting csv to parquet 2025-11-17 16:07:36 +01:00
0717931c6d Update default path for config files 2025-11-17 15:50:06 +01:00
3431bc8a4d Change date format to string for compatibility with Windows systems 2025-09-30 00:31:59 +02:00
cde421edda cleanup config organization 2025-09-30 00:23:12 +02:00
b97be2dff3 Remove the dask client restart at the beginning of each time chunk processing 2025-09-29 11:03:13 +02:00
da275cdc97 feat: adapt sp2xr_pipeline and helpers to run multiple times across the same dataset and different time slots but ensuring config settings are the same for the entire dataset 2025-09-29 11:00:14 +02:00
872d2c5ac4 fix: Add retry logic and consistent partitioning for distributed processing 2025-09-12 10:29:15 +02:00
0829f1908e fix: fix import in sp2xr_pipeline.py after the changes in the calibration modules 2025-09-11 14:54:07 +02:00
f437b1c5fe feat: add the possibility to decide the saving partition schema between date or date/hour 2025-09-09 17:09:38 +02:00
29e2351341 Feat: user can now decide frequency of repartition of dask dataframes after being loaded (both hk and pbp) 2025-09-09 16:03:42 +02:00
e946d4ff94 Chore: remove the scattering of the mass, size, time delay bins across dask workers outside of the client definition 2025-09-09 15:07:47 +02:00
b377c36c28 Chore: cleanup old code 2025-09-09 15:03:44 +02:00
3a41fbf387 fix: fix bug that was leading to extremely large dask graphs and move all histogram calculation logic to the distributio.py module 2025-09-09 14:53:19 +02:00
0268a5460c fix: fix parquet saving of distributions (specify engine, write metadata, ...) 2025-08-25 15:07:13 +02:00
c1243e3b1e feat: improve cluster shutdown and cleanup logic 2025-08-22 18:18:44 +02:00
7bd42c22a2 feat: add sbatch file to run via Slurm 2025-08-22 18:13:50 +02:00
55efbc74a2 Cleanup: remove wreck code from scripts/sp2xr_apply_calibration.py 2025-08-22 16:43:58 +02:00
4696c2cbb9 Cleanup: Remove wreck code from sp2xr_pipeline 2025-08-22 16:36:08 +02:00
681de6c203 Fix: add missing import of cast_and_arrow function to sp2xr_pipeline 2025-08-22 16:31:03 +02:00
43c31728e0 Fix: add control to skip empty ddf when time chunk is empty 2025-08-22 16:06:05 +02:00
f42a308474 Feature: processing is now divided in time chunks to reduce size of dask graph 2025-08-22 11:37:58 +02:00
176dde251f Refactor: remove name key from BC_hist_configs dictionary 2025-08-13 17:22:47 +02:00
554ec26410 Refactor: moved parts of processing code to specific functions in separate modules (join_pbp_with_flow and aggregate_dt) 2025-08-13 17:02:47 +02:00
053d0d5b75 Refactor: separate the workflow for the option to calculate distributions for BC (number and mass), scattering and time delay 2025-08-13 16:26:39 +02:00
1038b18187 Clean up old parts in the code and re-arranged processing order 2025-08-13 16:08:37 +02:00
5f3d25817c Refactor: moved calibration of PbP data to new file and clear separation between run_config and instr_config 2025-08-13 15:28:57 +02:00
6f87b4cc79 Refactor: implemented input run_config.yaml file 2025-08-13 14:09:58 +02:00
fa4cb9e7d4 Refactor: moved some functions to respective modules and implemented the timelag hists 2025-08-12 15:53:48 +02:00
7e736803c1 Refactor: working version of sp2xr_pipeline but not yet polished 2025-08-08 11:54:32 +02:00
3e620c80f0 refactor: first draft for the new sp2xr processing pipeline 2025-08-06 08:56:32 +02:00
63359449d3 refactor: split out single-particle calibration into its own module 2025-08-04 18:43:25 +02:00
f5c5420209 feat: add CLI utility to convert SP2-XR .ini files to editable YAML configs 2025-08-04 12:33:09 +02:00
83c95a9cdd feat: set default walltime dynamically based on SLURM partition 2025-08-04 11:42:13 +02:00
a1595b1e86 feat: add CLI script to convert SP2XR zip/CSV files to parquet using Dask 2025-08-04 09:40:00 +02:00