Commit Graph

  • 35f2876e62 Update config files main bertoz_b 2025-11-21 15:00:26 +01:00
  • dfcb0a67fc Fix some values bertoz_b 2025-11-21 11:02:58 +01:00
  • 29d4197f45 Add ipykernel and matplotlib as optional dependencies bertoz_b 2025-11-21 10:51:59 +01:00
  • e362624ee4 Fix data path and improve plot layout bertoz_b 2025-11-21 10:44:03 +01:00
  • faf255feee Update processed data with correct calibration bertoz_b 2025-11-21 10:36:02 +01:00
  • 10e6a8b31e Add notebook for data visualization bertoz_b 2025-11-21 09:37:58 +01:00
  • 6ab90b1a56 New handling of conversion from CSV/ZIP to Parquet via config file bertoz_b 2025-11-19 17:00:07 +01:00
  • 2a5c4aec17 Remove redundant prints bertoz_b 2025-11-19 16:01:15 +01:00
  • 8d9f5bf852 Integers are now inferred as floats to handle potential NA values bertoz_b 2025-11-19 15:43:58 +01:00
  • fc9cf7c861 Remove warnings for log10(0) in histograms calculation bertoz_b 2025-11-17 16:44:14 +01:00
  • 5a52ca2fdd Add local option for converting csv to parquet bertoz_b 2025-11-17 16:07:36 +01:00
  • 0717931c6d Update default path for config files bertoz_b 2025-11-17 15:43:35 +01:00
  • 71db651ba4 fix: Fix version number to v2.0.0 bertoz_b 2025-09-30 09:19:07 +02:00
  • cbe0e3c484 Rename calibration example notebook bertoz_b 2025-09-30 01:17:35 +02:00
  • b50c96ad57 Update version number bertoz_b 2025-09-30 01:01:03 +02:00
  • 9f3f50b151 Update toml file to include bokeh for dask dashboard visualization bertoz_b 2025-09-30 00:59:39 +02:00
  • 566b728db2 Update readme and documentation bertoz_b 2025-09-30 00:58:43 +02:00
  • ba9670b1df Remove old test data files bertoz_b 2025-09-30 00:56:17 +02:00
  • bea19f5ed8 Reorganize data for testing bertoz_b 2025-09-30 00:51:21 +02:00
  • ae2a0731c1 Silence error for log10(0) when expected bertoz_b 2025-09-30 00:33:01 +02:00
  • 3431bc8a4d Change date format to string for compatibility with Windows systems bertoz_b 2025-09-30 00:31:59 +02:00
  • d46f3319f3 Remove toolkit_legacy.py and references to it bertoz_b 2025-09-30 00:27:40 +02:00
  • cde421edda cleanup config organization bertoz_b 2025-09-30 00:23:12 +02:00
  • 40ba49a61f update gitignore bertoz_b 2025-09-29 11:12:38 +02:00
  • 02e913f24d update gitignore bertoz_b 2025-09-29 11:10:19 +02:00
  • 00464994f4 Update dask client definition with serializer and deserializer bertoz_b 2025-09-29 11:08:40 +02:00
  • c3f23a873a Update the delete partition function to delete also the general metadata to prevent errors when rewriting parquet files bertoz_b 2025-09-29 11:07:20 +02:00
  • 8d4e24c29b Update the helper function to reflect changes in run_config to process time slices of a dataset bertoz_b 2025-09-29 11:05:34 +02:00
  • b97be2dff3 Remove the dask client restart at the beginning of each time chunk processing bertoz_b 2025-09-29 11:03:13 +02:00
  • da275cdc97 feat: adapt sp2xr_pipeline and helpers to run multiple times across the same dataset and different time slots but ensuring config settings are the same for the entire dataset bertoz_b 2025-09-29 11:00:14 +02:00
  • 203bd9d740 feat: add to run_config.yaml number of processes for the dask cluster and option to select start and end dates for processing bertoz_b 2025-09-29 10:09:04 +02:00
  • af814498bf Update .gitignore bertoz_b 2025-09-29 09:56:11 +02:00
  • 872d2c5ac4 fix: Add retry logic and consistent partitioning for distributed processing bertoz_b 2025-09-12 10:26:11 +02:00
  • d3a7448883 Improve conversion from original csv or zip files to parquet with more robust schema definition bertoz_b 2025-09-11 16:18:21 +02:00
  • 0829f1908e fix: fix import in sp2xr_pipeline.py after the changes in the calibration modules bertoz_b 2025-09-11 14:54:07 +02:00
  • 0a71ca614c feat: modernize all type bertoz_b 2025-09-11 14:49:48 +02:00
  • a2df98042c refactor: reorganize calibration modules and add type hints bertoz_b 2025-09-11 14:45:06 +02:00
  • 755656f8c7 Update pyproject.toml bertoz_b 2025-09-11 12:26:09 +02:00
  • a0666be19f Remove example_processing_code.py bertoz_b 2025-09-11 12:01:39 +02:00
  • 641871a567 test: add test for path extraction from file directory bertoz_b 2025-09-09 19:14:30 +02:00
  • 6621236ea4 fix: correct handling of file path structures in different operating systems bertoz_b 2025-09-09 19:13:14 +02:00
  • f437b1c5fe feat: add the possibility to decide the saving partition schema between date or date/hour bertoz_b 2025-09-09 17:09:38 +02:00
  • 29e2351341 Feat: user can now decide frequency of repartition of dask dataframes after being loaded (both hk and pbp) bertoz_b 2025-09-09 16:03:42 +02:00
  • e946d4ff94 Chore: remove the scattering of the mass, size, time delay bins across dask workers outside of the client definition bertoz_b 2025-09-09 15:07:47 +02:00
  • b377c36c28 Chore: cleanup old code bertoz_b 2025-09-09 15:03:44 +02:00
  • 3a41fbf387 fix: fix bug that was leading to extremely large dask graphs and move all histogram calculation logic to the distributio.py module bertoz_b 2025-09-09 14:53:19 +02:00
  • b91380a6db fix: increase Dask worker wait time to prevent premature shutdown bertoz_b 2025-09-09 14:43:14 +02:00
  • 0e932e9a70 test: add parquet files to use for testing bertoz_b 2025-09-09 14:25:13 +02:00
  • 0268a5460c fix: fix parquet saving of distributions (specify engine, write metadata, ...) bertoz_b 2025-08-25 15:07:13 +02:00
  • 21e14ae2f1 chore: update run config bertoz_b 2025-08-22 18:20:26 +02:00
  • c1243e3b1e feat: improve cluster shutdown and cleanup logic bertoz_b 2025-08-22 18:18:44 +02:00
  • 8a0c1f3305 feat: add TMPDIR for temporary files in cluster jobs bertoz_b 2025-08-22 18:15:39 +02:00
  • 7bd42c22a2 feat: add sbatch file to run via Slurm bertoz_b 2025-08-22 18:13:50 +02:00
  • 55efbc74a2 Cleanup: remove wreck code from scripts/sp2xr_apply_calibration.py bertoz_b 2025-08-22 16:43:58 +02:00
  • 3173a7c83b Cleanup: remove wreck code from src/sp2xr/resample_pbp_hk.py bertoz_b 2025-08-22 16:42:02 +02:00
  • 2da9eb6089 Cleanup: remove wreck code from src/sp2xr/helpers.py bertoz_b 2025-08-22 16:40:26 +02:00
  • d2a0533a12 Cleanup: remove wreck code from src/sp2xr/schema.py bertoz_b 2025-08-22 16:38:06 +02:00
  • 4696c2cbb9 Cleanup: Remove wreck code from sp2xr_pipeline bertoz_b 2025-08-22 16:36:08 +02:00
  • 681de6c203 Fix: add missing import of cast_and_arrow function to sp2xr_pipeline bertoz_b 2025-08-22 16:31:03 +02:00
  • 40bae0e5d2 Refactor: move function _cast_and_arrow to schema.py bertoz_b 2025-08-22 16:19:42 +02:00
  • 2704128e8d Test notebook moved out of project bertoz_b 2025-08-22 16:11:18 +02:00
  • 43c31728e0 Fix: add control to skip empty ddf when time chunk is empty bertoz_b 2025-08-22 16:06:05 +02:00
  • 8947046049 Feature: remove saving of ddf_pbp_hk_dt files bertoz_b 2025-08-22 16:03:43 +02:00
  • 547c7f3108 Fix: removed default values in parse_args to avoid unexpected behavior when passing run_config settings bertoz_b 2025-08-22 16:01:01 +02:00
  • 9c384f5245 Fix: Removed default values in the load_and_resolve_config function to avoid unexpected behavior when run_config doesn't provide settings bertoz_b 2025-08-22 15:59:48 +02:00
  • f42a308474 Feature: processing is now divided in time chunks to reduce size of dask graph bertoz_b 2025-08-22 11:37:58 +02:00
  • d6b3f2028f Fix: bug fixed in conversion from BC mass to diam due to density units mismatch in the config file and default values bertoz_b 2025-08-22 10:48:08 +02:00
  • bbd21ba7b9 Chore: add test data folder to gitignore bertoz_b 2025-08-21 11:58:39 +02:00
  • bf0e663449 Test: add temporary notebook for tests bertoz_b 2025-08-21 11:57:51 +02:00
  • ebd14bcbae Fix: typo in the config reading was blocking calibration bertoz_b 2025-08-21 11:39:11 +02:00
  • d7f778d531 Fix: the histograms were adding lines with NaNs when the corresponding partition was completely empty. Now it is back to the old behavior and no index is added for completely empty partitions. bertoz_b 2025-08-21 11:37:23 +02:00
  • 063b01e73f Test: PbP and HK parquet files added for testing bertoz_b 2025-08-20 19:08:28 +02:00
  • a2cc520ff2 feat: possibility to choose between running locally and via slurm cluster bertoz_b 2025-08-14 11:48:52 +02:00
  • 18b8635147 chore: moved config files part 2 bertoz_b 2025-08-14 10:42:44 +02:00
  • d470f2e811 chore: moved config files bertoz_b 2025-08-14 10:41:44 +02:00
  • 176dde251f Refactor: remove name key from BC_hist_configs dictionary bertoz_b 2025-08-13 17:22:47 +02:00
  • 554ec26410 Refactor: moved parts of processing code to specific functions in separate modules (join_pbp_with_flow and aggregate_dt) bertoz_b 2025-08-13 17:02:47 +02:00
  • 053d0d5b75 Refactor: separate the workflow for the option to calculate distributions for BC (number and mass), scattering and time delay bertoz_b 2025-08-13 16:26:39 +02:00
  • 1038b18187 Clean up old parts in the code and re-arranged processing order bertoz_b 2025-08-13 16:08:37 +02:00
  • 5f3d25817c Refactor: moved calibration of PbP data to new file and clear separation between run_config and instr_config bertoz_b 2025-08-13 15:28:57 +02:00
  • 6f87b4cc79 Refactor: implemented input run_config.yaml file bertoz_b 2025-08-13 14:09:58 +02:00
  • fa4cb9e7d4 Refactor: moved some functions to respective modules and implemented the timelag hists bertoz_b 2025-08-12 15:53:48 +02:00
  • b05853416b Refactor: functions moved from sp2xr_pipeline to corresponding script in the src directory bertoz_b 2025-08-08 16:53:29 +02:00
  • 7e736803c1 Refactor: working version of sp2xr_pipeline but not yet polished bertoz_b 2025-08-08 11:54:32 +02:00
  • 6fa6fabf03 refactor: moved some functions from toolkit_legacy bertoz_b 2025-08-06 09:00:16 +02:00
  • 3e620c80f0 refactor: first draft for the new sp2xr processing pipeline bertoz_b 2025-08-06 08:56:32 +02:00
  • 63359449d3 refactor: split out single-particle calibration into its own module bertoz_b 2025-08-04 18:43:25 +02:00
  • 74c5a77b48 test: manually added case-specific instrument_config parameters bertoz_b 2025-08-04 13:03:24 +02:00
  • ad897be5db feat: pre-populate YAML export with calibration and threshold placeholders bertoz_b 2025-08-04 12:58:17 +02:00
  • f5c5420209 feat: add CLI utility to convert SP2-XR .ini files to editable YAML configs bertoz_b 2025-08-04 12:33:09 +02:00
  • 1d8f12ee35 feat: add ini→yaml conversion and tests for SP2-XR instrument parameters bertoz_b 2025-08-04 12:26:03 +02:00
  • 83c95a9cdd feat: set default walltime dynamically based on SLURM partition bertoz_b 2025-08-04 11:42:13 +02:00
  • 2ca7704b7e chore: add bokeh for Dask dashboard support bertoz_b 2025-08-04 10:43:44 +02:00
  • 6c3bb8ba2f chore: add dask[distributed] and dask-jobqueue to project dependencies bertoz_b 2025-08-04 09:52:13 +02:00
  • a1595b1e86 feat: add CLI script to convert SP2XR zip/CSV files to parquet using Dask bertoz_b 2025-08-04 09:40:00 +02:00
  • 5705d88092 refactor: move file-finding and chunking utilities to helpers.py bertoz_b 2025-08-04 09:24:39 +02:00
  • 5690fcdcaf refactor: split read_csv_files_with_dask into modular load, enrich, and save functions bertoz_b 2025-07-25 20:28:24 +02:00
  • 399df18e81 test: add real-data tests for PbP and HK ZIP input files bertoz_b 2025-07-25 15:33:33 +02:00
  • 06dd589bc7 refactor: rename and update schema config generator to support YAML output bertoz_b 2025-07-25 15:27:01 +02:00
  • ecd08d4c02 chore(meta): update config.yaml with correct column types bertoz_b 2025-07-25 14:51:48 +02:00