SP2XR - Single Particle Soot Photometer Extended Range Toolkit

A comprehensive Python toolkit for analyzing SP2-XR (Single Particle Soot Photometer Extended Range, Droplet Measurement Technologies) data, providing calibration and data processing for black carbon (BC) aerosol measurements.

Some functions in the calibration part of the toolkit have been adapted from the Igor SP2 toolkit.

Some helper functions have been adapted from Rob Modini's first implementation of the analysis toolkit for the SP2-XR.

Overview

The SP2-XR is a scientific instrument that measures individual BC particles in real time, providing data on their mass and mixing state.

A high-power Nd:YAG laser at 1064 nm illuminates each aerosol particle drawn into the optical region of the instrument. Single-particle black carbon mass is measured via laser-induced incandescence (LII). Particles composed of refractory absorbing carbon (i.e., black carbon) absorb the laser energy and heat up, eventually vaporizing and incandescing (emitting thermal radiation). The intensity of that emission is proportional to the mass of the incandescing BC (via a set of calibration constants). All particles, including BC-free particles (i.e., particles without a detectable incandescence signal), scatter the laser light, providing a measurement of their optical diameter (via a set of calibration constants).
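
As a rough illustration of how such calibration constants are applied, the sketch below converts an incandescence peak height to BC mass via a quadratic fit and derives the mass-equivalent diameter. The coefficient values and function names are illustrative assumptions, not the toolkit's actual calibration; the void-free BC density of 1.8 g/cm³ is a commonly used value.

```python
import numpy as np

# Illustrative calibration: BC mass (fg) as a quadratic function of the
# incandescence peak height. These coefficients are placeholders, not the
# values a real calibration fit would yield.
INCAND_COEFS = (0.0, 1.5e-4, 2.0e-9)  # c0, c1, c2

def bc_mass_fg(peak_height):
    """Convert an incandescence peak height to BC mass in femtograms."""
    c0, c1, c2 = INCAND_COEFS
    return c0 + c1 * peak_height + c2 * peak_height ** 2

def mass_equivalent_diameter_nm(mass_fg, rho_g_cm3=1.8):
    """Mass-equivalent diameter (nm), assuming void-free BC density rho."""
    # V [cm^3] = m [fg] / rho * 1e-15, and 1 cm^3 = 1e21 nm^3,
    # so D [nm] = (6 * m * 1e6 / (pi * rho))^(1/3)
    return (6.0 * mass_fg * 1e6 / (np.pi * rho_g_cm3)) ** (1.0 / 3.0)
```

The scattering calibration works analogously, mapping the scattering peak height of BC-free particles to an optical diameter.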

This repository contains Python functions and scripts to process and analyze SP2-XR data files, including:

  • Converting raw data files (*Pbp* and *hk* files in .csv/.zip formats) to Parquet files indexed by time
  • Applying scattering and incandescence calibrations
  • Processing single-particle data with quality-control flags (e.g., saturation, FWHM outside the accepted range, ...)
  • Calculating number and mass concentrations
  • Calculating size distributions for BC-containing particles as a function of their mass-equivalent diameter (dNdlogDmev, dMdlogDmev), or for particles without detectable BC content as a function of their optical diameter (dNdlogDsc)
  • Mixing state analysis based on the time delay method (Moteki and Kondo, 2007)
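
The time-delay method in the last bullet can be sketched in a few lines: a thick coating must evaporate before the BC core incandesces, so the incandescence peak lags the scattering peak more for thickly coated particles. The column names and threshold below are illustrative assumptions, not the toolkit's actual schema or settings.

```python
import pandas as pd

# Illustrative per-particle peak times in microseconds.
particles = pd.DataFrame({
    "t_scatter_peak": [10.0, 12.0, 11.0],
    "t_incand_peak":  [10.5, 15.0, 11.2],
})

# Time-delay method (Moteki and Kondo, 2007): classify particles by the
# lag between scattering and incandescence peaks.
particles["delay_us"] = particles["t_incand_peak"] - particles["t_scatter_peak"]
DELAY_THRESHOLD_US = 2.0  # instrument/campaign dependent; placeholder value
particles["thickly_coated"] = particles["delay_us"] > DELAY_THRESHOLD_US
```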

The toolkit is designed for parallel processing to handle large datasets efficiently.
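
The toolkit itself lists Dask among its requirements; as a generic, library-agnostic sketch of the same per-file parallelism idea, the stdlib version below fans a worker function out over the input files. The path and worker body are placeholders, not the toolkit's actual pipeline code.

```python
from concurrent.futures import ProcessPoolExecutor
from pathlib import Path

def process_one_file(path):
    """Stand-in for the per-file step (parse, calibrate, apply QC flags).
    Here it only returns the file name."""
    return path.name

# Placeholder directory; globbing a nonexistent path simply yields no files.
files = sorted(Path("/path/to/parquet").glob("*.parquet"))
with ProcessPoolExecutor(max_workers=4) as pool:
    # Each file is processed independently, so files map cleanly to workers.
    results = list(pool.map(process_one_file, files))
```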

Quick Start

Installation

git clone <repository-url>
cd SP2XR_code
pip install -e .

Basic Usage

# 1. Generate configuration from your data
python scripts/sp2xr_generate_config.py /path/to/data --mapping

# 2. Convert CSV/ZIP to Parquet
python scripts/sp2xr_csv2parquet.py \
    --source /path/to/csv \
    --target /path/to/parquet \
    --config config_with_mapping.yaml \
    --local

# 3. Run processing pipeline
python scripts/sp2xr_pipeline.py --config your_config.yaml

Repository Structure

SP2XR_code/
├── src/sp2xr/           # Main package source code
├── scripts/             # Command-line processing scripts
├── docs/                # Detailed documentation
├── meta_files/          # Configuration templates
├── tests/               # Test suite
└── calibration_workflow.ipynb  # Interactive calibration

Core Scripts

  • sp2xr_csv2parquet.py - Data format conversion with column mapping
  • sp2xr_pipeline.py - Complete processing pipeline with distributed computing
  • sp2xr_generate_config.py - Auto-generate configuration files
  • sp2xr_apply_calibration.py - Apply instrument calibrations

Documentation

  • Installation Guide - Setup and dependencies
  • Configuration Guide - Schema and calibration files
  • Processing Workflow - Step-by-step data processing
  • Scripts Reference - Detailed script usage
  • API Reference - Function documentation
  • Usage Examples - Code examples and workflows
  • CSV to Parquet Conversion - Detailed conversion guide

Requirements

  • Python >= 3.9
  • Dask, pandas, numpy, pyyaml
  • Optional: Jupyter for interactive calibration

License

See LICENSE.md for license information.
