added PyScripts/nexus_muon_validator.py to nexus.

2026-03-07 14:50:57 +01:00
parent 3255e9ef6d
commit d1d2c99bab
3 changed files with 1680 additions and 0 deletions
@@ -0,0 +1,126 @@
# Installation
## Requirements
| Package | Version | Purpose |
|---------|---------|---------|
| Python | ≥ 3.9 | Runtime |
| h5py | ≥ 3.0 | Read HDF5 files (NeXus Version 2) |
| pyhdf | ≥ 0.10 | Read HDF4 files (NeXus Version 1) — optional |
| pdfplumber | ≥ 0.9 | Extract schema from instrument definition PDF (`--pdf` mode) — optional |
---
## Install h5py (required)
h5py provides HDF5 support and is needed for all modern muon NeXus files (Version 2,
written since ~2020).
```bash
pip install h5py
```
Or via your system package manager:
```bash
# Fedora / RHEL
sudo dnf install python3-h5py
# Ubuntu / Debian
sudo apt install python3-h5py
# macOS (Homebrew)
brew install hdf5
pip install h5py
```
---
## Install pyhdf (optional, HDF4 / Version 1 files only)
pyhdf is only needed for reading old HDF4-format files (NeXus Version 1, written
before ~2011 by the MCS software at ISIS). If you only work with modern HDF5 files,
you can skip this step.
pyhdf requires the HDF4 C library to be present on the system.
### Linux
```bash
# Fedora / RHEL
sudo dnf install python3-devel hdf hdf-devel
pip install pyhdf
# Ubuntu / Debian
sudo apt install python3-dev libhdf4-dev
pip install pyhdf
```
> **Note (GCC 14+ / Fedora 40+):** pyhdf may fail to build with a
> `-Wincompatible-pointer-types` error. Work around it with:
> ```bash
> CFLAGS="-Wno-incompatible-pointer-types -Wno-discarded-qualifiers" pip install pyhdf
> ```
### macOS
```bash
brew install hdf4
pip install pyhdf
```
### Windows
Pre-built wheels are available on PyPI for some Python / Windows combinations:
```bash
pip install pyhdf
```
If no wheel is available, consider using a conda environment:
```bash
conda install -c conda-forge pyhdf
```
---
## Install pdfplumber (optional, PDF-driven validation only)
pdfplumber is only needed when you use the `--pdf` option to validate files against
a specific revision of the instrument definition PDF.
```bash
pip install pdfplumber
```
Or via your system package manager (if available):
```bash
# Fedora / RHEL
sudo dnf install python3-pdfplumber # may not be in all repos
# Ubuntu / Debian
sudo apt install python3-pdfplumber # may not be in all repos
# macOS (install with pip)
pip install pdfplumber
```
---
## Verify the installation
```bash
python3 -c "import h5py; print('h5py', h5py.__version__)"
python3 -c "import pyhdf; print('pyhdf ok')" # optional — HDF4 support
python3 -c "import pdfplumber; print('pdfplumber ok')" # optional — PDF-driven mode
```
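The same check can be scripted. The sketch below uses the standard library's `importlib.util.find_spec` to report which packages can be located without importing them. The helper names `is_available` and `report_dependencies` are illustrative and not part of the validator.

```python
import importlib.util

# Module names from the requirements table; True marks the required one.
DEPENDENCIES = {"h5py": True, "pyhdf": False, "pdfplumber": False}


def is_available(module_name: str) -> bool:
    """Return True if the top-level module can be located by this interpreter."""
    return importlib.util.find_spec(module_name) is not None


def report_dependencies(deps=DEPENDENCIES) -> dict:
    """Map each module name to 'ok', 'MISSING (required)' or 'missing (optional)'."""
    report = {}
    for name, required in deps.items():
        if is_available(name):
            report[name] = "ok"
        else:
            report[name] = "MISSING (required)" if required else "missing (optional)"
    return report


if __name__ == "__main__":
    for name, status in report_dependencies().items():
        print(f"{name:12s} {status}")
```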
---
## No installation needed
The validator is a single self-contained script: it needs no build step and no
package installation of its own. Place `nexus_muon_validator.py` anywhere on
your system and run it with Python.
@@ -0,0 +1,223 @@
# Usage — nexus_muon_validator.py
Validates muon NeXus HDF4/5 files against the ISIS Muon Instrument Definitions
(Version 1 and Version 2 / *muonTD*).
Two validation modes are available:
- **Hardcoded mode** (default) — built-in rules based on the 2026 rev 11 spec.
No extra dependencies beyond `h5py` (and `pyhdf` for HDF4 files).
- **PDF-driven mode** (`--pdf`) — rules are extracted live from a
`nexus_instrument_definitions_*.pdf` that you supply. Requires `pdfplumber`.
Reference document:
*NeXus Instrument Definitions for Muon Data*, S. Cottrell, 21 January 2026
(`nexus_instrument_definitions_for_muon_data_2026_rev11.pdf`)
---
## Basic invocation
```bash
python3 nexus_muon_validator.py <file.nxs> [<file2.nxs> ...]
```
Validate one or more files in a single call:
```bash
python3 nexus_muon_validator.py run001.nxs run002.nxs run003.nxs
```
---
## Command-line options
| Option | Description |
|--------|-------------|
| `--pdf <def.pdf>` | Parse schema from a NeXus instrument definition PDF and validate against it |
| `--list-schema` | Print the schema extracted from `--pdf` and exit (no files needed) |
| `-v`, `--verbose` | Also show INFO-level findings (optional fields, format info) |
| `--errors-only` | Show only ERROR-level issues; suppress warnings |
| `-h`, `--help` | Show built-in help and exit |
---
## Severity levels
| Level | Meaning |
|---------|---------|
| `ERROR` | A field required by the specification is missing or unreadable. |
| `WARNING` | A field has an unexpected value, a legacy name, or a shape inconsistency. |
| `INFO` | An optional field recommended by the specification is absent, or an informational detail such as the file format is reported (shown only with `-v`). |
---
## Exit codes
| Code | Meaning |
|------|---------|
| `0` | Validation passed — no ERRORs found |
| `1` | At least one ERROR was reported |
| `2` | File could not be opened or is not a recognised NeXus format |
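
When the validator is driven from Python rather than from a shell, the exit codes above can be mapped to a status string. This is a minimal sketch: the `validate` wrapper and its default script path are assumptions (it expects `nexus_muon_validator.py` in the current directory), and only the three codes in the table are taken from the documentation.

```python
import subprocess
import sys

# Meanings copied from the exit-code table.
EXIT_MEANINGS = {
    0: "passed",
    1: "validation errors",
    2: "unreadable or unrecognised file",
}


def describe_exit_code(code: int) -> str:
    """Translate a validator exit code into a short status string."""
    return EXIT_MEANINGS.get(code, f"unexpected exit code {code}")


def validate(path: str, script: str = "nexus_muon_validator.py") -> str:
    """Run the validator on one file (hypothetical wrapper) and return its status."""
    result = subprocess.run(
        [sys.executable, script, "--errors-only", path],
        capture_output=True,
        text=True,
    )
    return describe_exit_code(result.returncode)
```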
---
## Examples
**Validate a single file (errors and warnings only):**
```bash
python3 nexus_muon_validator.py EMU00139040.nxs
```
**Validate a whole directory of runs:**
```bash
python3 nexus_muon_validator.py /data/musr/2025/*.nxs
```
**Show full detail including optional fields:**
```bash
python3 nexus_muon_validator.py -v EMU00139040.nxs
```
**Show only hard errors (useful in scripts):**
```bash
python3 nexus_muon_validator.py --errors-only EMU00139040.nxs
echo "Exit code: $?"
```
**Use in a shell script with exit-code checking:**
```bash
#!/bin/bash
# The validator exits with 0 (passed), 1 (ERRORs found) or 2 (file unreadable).
if ! python3 nexus_muon_validator.py --errors-only "$1"; then
    echo "Validation failed for $1"
    exit 1
fi
```
**Validate against a specific revision of the instrument definition PDF:**
```bash
python3 nexus_muon_validator.py \
--pdf nexus_instrument_definitions_for_muon_data_2026_rev11.pdf \
EMU00139040.nxs
```
**Inspect the schema extracted from a PDF (no files needed):**
```bash
python3 nexus_muon_validator.py \
--pdf nexus_instrument_definitions_for_muon_data_2026_rev11.pdf \
--list-schema
```
Example `--list-schema` output:
```
Parsed schema from: nexus_instrument_definitions_for_muon_data_2026_rev11.pdf …
→ 35 NX classes found (42 version entries)
NXdata v1 required=0 optional=8 attrs=18
NXdata v2 required=2 optional=6 attrs=9
NXdetector v1 required=0 optional=4 attrs=4
NXdetector v2 required=4 optional=28 attrs=34
NXentry v1 required=0 optional=18 attrs=1
NXentry v2 required=11 optional=18 attrs=9
...
```
---
## What is checked
### File format
- Detects HDF5 (via `h5py`) or HDF4 (via `pyhdf`) automatically.
- HDF4 files are Version 1 by definition; HDF5 files may be Version 1 or 2.
- Reports an error if the format is unrecognised or the file cannot be opened.
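
For a quick pre-check without `h5py` or `pyhdf`, the container type can also be guessed from the file's leading bytes. This is not how the validator itself detects the format (it relies on the two libraries). The sketch below is an alternative technique that compares the start of the file with the standard HDF5 and HDF4 signatures. It only looks at offset 0, so an HDF5 file with a user block would be reported as `unknown`. The function name `sniff_container` is hypothetical.

```python
# Format signatures at the start of the file.
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"
HDF4_SIGNATURE = b"\x0e\x03\x13\x01"


def sniff_container(path: str) -> str:
    """Return 'HDF5', 'HDF4' or 'unknown' based on the first bytes of the file."""
    with open(path, "rb") as handle:
        head = handle.read(8)
    if head.startswith(HDF5_SIGNATURE):
        return "HDF5"
    if head.startswith(HDF4_SIGNATURE):
        return "HDF4"
    return "unknown"
```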
### Version detection
The instrument definition version is detected automatically:
| Condition | Detected version |
|-----------|-----------------|
| HDF4 file | **Version 1** (always) |
| HDF5: entry `definition` = `muonTD` or `pulsedTD`, or `IDF_version` = 2 | **Version 2** |
| HDF5: entry group named `run` (NXentry), no `definition` field | **Version 1** |
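
The detection table can be read as a small decision function. The sketch below works on values that have already been read from the file, so it has no HDF dependency. The function name and its parameters are illustrative, and returning `None` when no rule matches is an assumption, since the table does not say what happens in that case.

```python
from typing import Iterable, Optional


def detect_idf_version(
    container: str,
    definition: Optional[str] = None,
    idf_version: Optional[int] = None,
    entry_names: Iterable[str] = (),
) -> Optional[int]:
    """Apply the detection table; return 1, 2, or None if no rule matches."""
    if container == "HDF4":
        return 1  # HDF4 files are always Version 1
    if container == "HDF5":
        if definition in ("muonTD", "pulsedTD") or idf_version == 2:
            return 2
        if definition is None and "run" in list(entry_names):
            return 1
    return None
```

For example, `detect_idf_version("HDF5", definition="pulsedTD")` yields `2`, matching the legacy-name case in the sample output.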
### Version 1 checks (HDF4 or HDF5 `NXfile` / `NXentry`)
Covers the original muon instrument definition (MCS/RAL, 2001).
- Root attribute: `@NeXus_version` (WARNING if absent)
- NXentry (`run`): `IDF_version`, `program_name`, `number`, `title`, `notes`,
`analysis`, `lab`, `beamline`, `start_time`, `stop_time`, `switching_states`
- NXuser: `name`, `experiment_number`
- NXsample: `temperature` (+`@units`), `magnetic_field` (+`@units`)
- NXinstrument: `name`
- NXdetector: `number`; optional `deadtimes` (+`@units`, `@available`),
`angles` (+`@coordinate_system`, `@available`)
- NXcollimator: `type`
- NXbeam: `total_counts` (+`@units`)
- NXdata (`histogram_data_1`): `counts` (+`@units`, `@signal`,
`@t0_bin`, `@first_good_bin`, `@last_good_bin`),
`resolution` (+`@units`), `time_zero` (+`@units`, `@available`),
`raw_time` (+`@axis`, `@primary`, `@units`)
### Version 2 checks (`NXroot` / `muonTD`)
Covers the revised muon instrument definition (ISIS, 2011–2026).
- Root attributes: `@file_name` (required), `@file_time` (required)
- At least one `raw_data_N` NXentry must be present
- NXentry: `IDF_version` (= 2), `definition` (= `muonTD`), `run_number`,
`title`, `start_time`, `end_time`, `experiment_identifier`
- NXsample: `name`
- NXinstrument: `name`
- NXsource: `name`, `type`, `probe`
- NXdetector (`detector_*`): `counts` (+`@signal`, `@axes`, `@long_name`),
`raw_time` (+`@units`), `spectrum_index`
- NXdata (`detector_*`): `counts` (+`@signal`, `@axes`), `raw_time` (+`@units`)
- NXuser (`user_1`): `name`
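
A presence check of this kind can be sketched as a membership test against a group-like mapping. The tuple below copies the NXentry names from the list above. The helper `missing_fields` is hypothetical and is not taken from the validator's source. It accepts any mapping that supports the `in` operator, which includes a plain `dict` and, as far as I know, an `h5py.Group`.

```python
from typing import Iterable, List, Mapping

# Required NXentry datasets for Version 2, as listed in this section.
V2_ENTRY_REQUIRED = (
    "IDF_version", "definition", "run_number", "title",
    "start_time", "end_time", "experiment_identifier",
)


def missing_fields(group: Mapping, required: Iterable[str]) -> List[str]:
    """Return the required names that are absent from a group-like mapping."""
    return [name for name in required if name not in group]


# Stand-in for an entry group; with h5py one would pass e.g. f["raw_data_1"].
entry = {"IDF_version": 2, "definition": "muonTD", "run_number": 139040, "title": "test"}
print(missing_fields(entry, V2_ENTRY_REQUIRED))
```

Each missing name would correspond to one `ERROR` finding under the severity rules described earlier.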
### Dimensional consistency checks
- `raw_time` shape must be `(ntc,)` (bin centres) or `(ntc+1,)` (bin boundaries),
where `ntc` is the last dimension of `counts`.
- `corrected_time` shape must be `(ntc,)`.
- `spectrum_index` shape must be `(ns,)`, matching the second-to-last dimension
of `counts`.
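
These rules depend only on array shapes, so they can be illustrated with plain tuples. The sketch below is one possible reading of the three rules, with `ntc` and `ns` taken from the last and second-to-last dimensions of `counts`. The function name and message wording are illustrative.

```python
from typing import List, Optional, Tuple

Shape = Tuple[int, ...]


def check_shapes(
    counts: Shape,
    raw_time: Shape,
    spectrum_index: Optional[Shape] = None,
    corrected_time: Optional[Shape] = None,
) -> List[str]:
    """Return a list of shape problems; an empty list means all rules hold."""
    if not counts:
        return ["counts is a scalar; no time dimension to compare against"]
    problems = []
    ntc = counts[-1]  # number of time channels
    # raw_time may hold bin centres (ntc) or bin boundaries (ntc + 1).
    if raw_time not in ((ntc,), (ntc + 1,)):
        problems.append(f"raw_time shape {raw_time} is neither ({ntc},) nor ({ntc + 1},)")
    if corrected_time is not None and corrected_time != (ntc,):
        problems.append(f"corrected_time shape {corrected_time} is not ({ntc},)")
    if spectrum_index is not None:
        if len(counts) < 2:
            problems.append("counts has no spectrum dimension")
        elif spectrum_index != (counts[-2],):
            problems.append(f"spectrum_index shape {spectrum_index} is not ({counts[-2]},)")
    return problems
```

With an `h5py` file open, the arguments would come from each dataset's `.shape` attribute.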
### Legacy / transitional handling
The validator distinguishes real errors from known historical deviations:
| Observed value | Expected (spec) | Reported as |
|----------------|-----------------|-------------|
| `pulsedTD` | `muonTD` | WARNING — legacy name, used in files written before rev 8 |
| `time_of_flight` | `raw_time` | WARNING — legacy dataset name used in files before ~2020 |
| `muons` | `positive muons` or `negative muons` | WARNING — non-specific probe label |
| `n/a` for `type` or `probe` in NXsource | specific string | WARNING |
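
One way to picture this table is as a lookup keyed by field name and observed value. The sketch below is an interpretation, not the validator's code: it assumes `pulsedTD` applies to the `definition` field and `muons` to the NXsource `probe` field, which the table implies but does not state. The names `LEGACY_VALUES`, `LEGACY_DATASET_NAMES`, `legacy_warning` and `current_dataset_name` are hypothetical.

```python
from typing import Optional

# (field, observed value) -> warning text, following the table above.
LEGACY_VALUES = {
    ("definition", "pulsedTD"): "legacy name; current spec expects 'muonTD'",
    ("probe", "muons"): "non-specific probe label; expected 'positive muons' or 'negative muons'",
    ("probe", "n/a"): "placeholder value; a specific string is expected",
    ("type", "n/a"): "placeholder value; a specific string is expected",
}

# Legacy dataset name -> current name.
LEGACY_DATASET_NAMES = {"time_of_flight": "raw_time"}


def legacy_warning(field: str, value: str) -> Optional[str]:
    """Return a WARNING text for a known historical deviation, else None."""
    return LEGACY_VALUES.get((field, value))


def current_dataset_name(name: str) -> str:
    """Map a legacy dataset name to its current equivalent (identity otherwise)."""
    return LEGACY_DATASET_NAMES.get(name, name)
```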
---
## Sample output
```
========================================================================
File: EMU00139040.nxs
========================================================================
[WARNING] /raw_data_1/definition → Value is 'pulsedTD' (legacy name);
current spec (rev≥8) requires 'muonTD'
========================================================================
Summary: 0 error(s), 1 warning(s)
========================================================================
```
With `--verbose`:
```
========================================================================
File: EMU00139040.nxs
========================================================================
[WARNING] /raw_data_1/definition → Value is 'pulsedTD' (legacy name); ...
[INFO ] / → File format: HDF5
[INFO ] / → Detected muon NeXus instrument definition version: 2
[INFO ] /raw_data_1/instrument → Optional group 'beamline' not present
[INFO ] /raw_data_1/sample → Optional dataset 'magnetic_field_state' not present
...
========================================================================
Summary: 0 error(s), 1 warning(s), 13 info(s)
========================================================================
```
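If the report needs to be post-processed, the finding lines can be parsed with a regular expression. The pattern below is inferred only from the sample output shown here (level in brackets, HDF path, `→`, message) and may not cover every line the validator prints. Wrapped continuation lines and the summary line are skipped. The function name `parse_report` is hypothetical.

```python
import re
from collections import Counter
from typing import List, Tuple

# Matches lines such as "[INFO ] /raw_data_1/sample → Optional dataset ... not present".
FINDING_RE = re.compile(r"^\[\s*(ERROR|WARNING|INFO)\s*\]\s+(\S+)\s+→\s+(.*)$")


def parse_report(text: str) -> List[Tuple[str, str, str]]:
    """Extract (level, path, message) tuples from validator output."""
    findings = []
    for line in text.splitlines():
        match = FINDING_RE.match(line.strip())
        if match:
            findings.append(match.groups())
    return findings


def count_levels(findings: List[Tuple[str, str, str]]) -> Counter:
    """Count findings per severity level."""
    return Counter(level for level, _path, _message in findings)
```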
File diff suppressed because it is too large