|
|
35f2876e62
|
Update config files
|
2025-11-21 15:00:26 +01:00 |
|
|
|
dfcb0a67fc
|
Fix some values
|
2025-11-21 11:02:58 +01:00 |
|
|
|
29d4197f45
|
Add ipykernel and matplotlib as optional dependencies
|
2025-11-21 10:51:59 +01:00 |
|
|
|
e362624ee4
|
Fix data path and improve plot layout
|
2025-11-21 10:44:03 +01:00 |
|
|
|
faf255feee
|
Update processed data with correct calibration
|
2025-11-21 10:36:02 +01:00 |
|
|
|
10e6a8b31e
|
Add notebook for data visualization
|
2025-11-21 09:37:58 +01:00 |
|
|
|
6ab90b1a56
|
New handling of conversion from CSV/ZIP to Parquet via config file
|
2025-11-19 17:00:07 +01:00 |
|
|
|
2a5c4aec17
|
Remove redundant prints
|
2025-11-19 16:01:15 +01:00 |
|
|
|
8d9f5bf852
|
Ints are now inferred as floats to handle potential NA values
|
2025-11-19 15:43:58 +01:00 |
|
|
|
fc9cf7c861
|
Remove warnings for log10(0) in histogram calculation
|
2025-11-17 16:44:14 +01:00 |
|
|
|
5a52ca2fdd
|
Add local option for converting csv to parquet
|
2025-11-17 16:07:36 +01:00 |
|
|
|
0717931c6d
|
Update default path for config files
|
2025-11-17 15:50:06 +01:00 |
|
|
|
71db651ba4
|
fix: Fix version number to v2.0.0
|
2025-09-30 09:19:07 +02:00 |
|
|
|
cbe0e3c484
|
Rename calibration example notebook
|
2025-09-30 01:17:35 +02:00 |
|
|
|
b50c96ad57
|
Update version number
|
2025-09-30 01:01:03 +02:00 |
|
|
|
9f3f50b151
|
Update toml file to include bokeh for dask dashboard visualization
|
2025-09-30 00:59:39 +02:00 |
|
|
|
566b728db2
|
Update readme and documentation
|
2025-09-30 00:58:43 +02:00 |
|
|
|
ba9670b1df
|
Remove old test data files
|
2025-09-30 00:56:17 +02:00 |
|
|
|
bea19f5ed8
|
Reorganize data for testing
|
2025-09-30 00:51:21 +02:00 |
|
|
|
ae2a0731c1
|
Silence error for log10(0) when expected
|
2025-09-30 00:33:01 +02:00 |
|
|
|
3431bc8a4d
|
Change date format to string for compatibility with Windows systems
|
2025-09-30 00:31:59 +02:00 |
|
|
|
d46f3319f3
|
Remove toolkit_legacy.py and references to it
|
2025-09-30 00:27:40 +02:00 |
|
|
|
cde421edda
|
cleanup config organization
|
2025-09-30 00:23:12 +02:00 |
|
|
|
40ba49a61f
|
update gitignore
|
2025-09-29 11:12:38 +02:00 |
|
|
|
02e913f24d
|
update gitignore
|
2025-09-29 11:10:19 +02:00 |
|
|
|
00464994f4
|
Update dask client definition with serializer and deserializer
|
2025-09-29 11:08:40 +02:00 |
|
|
|
c3f23a873a
|
Update the delete partition function to also delete the general metadata, preventing errors when rewriting parquet files
|
2025-09-29 11:07:20 +02:00 |
|
|
|
8d4e24c29b
|
Update the helper function to reflect changes in run_config to process time slices of a dataset
|
2025-09-29 11:05:34 +02:00 |
|
|
|
b97be2dff3
|
Remove the dask client restart at the beginning of each time chunk processing
|
2025-09-29 11:03:13 +02:00 |
|
|
|
da275cdc97
|
feat: adapt sp2xr_pipeline and helpers to run multiple times across the same dataset and different time slots but ensuring config settings are the same for the entire dataset
|
2025-09-29 11:00:14 +02:00 |
|
|
|
203bd9d740
|
feat: add to run_config.yaml number of processes for the dask cluster and option to select start and end dates for processing
|
2025-09-29 10:09:04 +02:00 |
|
|
|
af814498bf
|
Update .gitignore
|
2025-09-29 09:56:11 +02:00 |
|
|
|
872d2c5ac4
|
fix: Add retry logic and consistent partitioning for distributed processing
|
2025-09-12 10:29:15 +02:00 |
|
|
|
d3a7448883
|
Improve conversion from original csv or zip files to parquet with more robust schema definition
|
2025-09-11 16:18:21 +02:00 |
|
|
|
0829f1908e
|
fix: fix import in sp2xr_pipeline.py after the changes in the calibration modules
|
2025-09-11 14:54:07 +02:00 |
|
|
|
0a71ca614c
|
feat: modernize all type
|
2025-09-11 14:49:48 +02:00 |
|
|
|
a2df98042c
|
refactor: reorganize calibration modules and add type hints
|
2025-09-11 14:45:06 +02:00 |
|
|
|
755656f8c7
|
Update pyproject.toml
|
2025-09-11 12:36:21 +02:00 |
|
|
|
a0666be19f
|
Remove example_processing_code.py
|
2025-09-11 12:01:39 +02:00 |
|
|
|
641871a567
|
test: add test for path extraction from file directory
|
2025-09-09 19:14:30 +02:00 |
|
|
|
6621236ea4
|
fix: correct handling of file path structures in different operating systems
|
2025-09-09 19:13:14 +02:00 |
|
|
|
f437b1c5fe
|
feat: add the possibility to decide the saving partition schema between date or date/hour
|
2025-09-09 17:09:38 +02:00 |
|
|
|
29e2351341
|
Feat: user can now decide frequency of repartition of dask dataframes after being loaded (both hk and pbp)
|
2025-09-09 16:03:42 +02:00 |
|
|
|
e946d4ff94
|
Chore: remove the scattering of the mass, size, time delay bins across dask workers outside of the client definition
|
2025-09-09 15:07:47 +02:00 |
|
|
|
b377c36c28
|
Chore: cleanup old code
|
2025-09-09 15:03:44 +02:00 |
|
|
|
3a41fbf387
|
fix: fix bug that was leading to extremely large dask graphs and move all histogram calculation logic to the distributio.py module
|
2025-09-09 14:53:19 +02:00 |
|
|
|
b91380a6db
|
fix: increase Dask worker wait time to prevent premature shutdown
|
2025-09-09 14:43:14 +02:00 |
|
|
|
0e932e9a70
|
test: add parquet files to use for testing
|
2025-09-09 14:25:13 +02:00 |
|
|
|
0268a5460c
|
fix: fix parquet saving of distributions (specify engine, write metadata, ...)
|
2025-08-25 15:07:13 +02:00 |
|
|
|
21e14ae2f1
|
chore: update run config
|
2025-08-22 18:20:26 +02:00 |
|