Jungfraujoch/compression/JFJochZstdHuffCompressor.h
leonarski_f and Claude Opus 4.8 7e7a73062c Compression: add BSHUF_ZSTD_RLE_HUFF (RLE runs + Huffman literals)
New CompressionAlgorithm that emits a standard Zstandard frame: zero/0xFF runs
become RLE_Blocks (like BSHUF_ZSTD_RLE) and literal regions become
Compressed_Blocks with per-block adaptive Huffman literals and no sequences
(Number_of_Sequences=0). Short runs are absorbed into the literal stream;
incompressible literals fall back to Raw_Blocks so the worst case stays within
ZSTD_compressBound.

The Huffman tree + bitstream are produced by zstd's own HUF_compress{1,4}X_repeat
(the same calls ZSTD_compressLiterals uses); only the frame/block/literals-section
framing is hand-written, with comments citing zstd_compression_format.md so it can
be checked clause by clause. Output decodes with stock ZSTD_decompress, so no
reader changes are needed (decode routes like BSHUF_ZSTD).
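
As an illustration of the hand-written framing, here is a minimal sketch of the 3-byte Block_Header and an RLE_Block as laid out in zstd_compression_format.md. The names `BlockType`, `append_block_header` and `append_rle_block` are hypothetical and are not taken from this commit; the actual `blk_hdr` / `emit_run` helpers may differ. Splitting runs that exceed the block maximum (at most 128 KiB) is not shown.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Block_Type values from the Zstandard format description.
enum class BlockType : uint32_t { Raw = 0, RLE = 1, Compressed = 2 };

// Appends the 3-byte little-endian Block_Header:
//   bit 0      Last_Block
//   bits 1-2   Block_Type
//   bits 3-23  Block_Size
// (hypothetical helper, for illustration only)
void append_block_header(std::vector<uint8_t> &out, bool last, BlockType type, uint32_t size) {
    assert(size < (1u << 21));  // Block_Size is a 21-bit field
    const uint32_t h = (last ? 1u : 0u) | (static_cast<uint32_t>(type) << 1) | (size << 3);
    out.push_back(static_cast<uint8_t>(h));
    out.push_back(static_cast<uint8_t>(h >> 8));
    out.push_back(static_cast<uint8_t>(h >> 16));
}

// An RLE_Block is the header followed by exactly one byte of content;
// Block_Size holds the regenerated length, i.e. how often that byte repeats.
void append_rle_block(std::vector<uint8_t> &out, bool last, uint8_t value, uint32_t run_length) {
    append_block_header(out, last, BlockType::RLE, run_length);
    out.push_back(value);
}
```

With this layout a run of 1000 zero bytes costs 4 bytes in the frame, which is why long zero/0xFF runs are worth emitting as separate blocks.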

On sparse diffraction this gives ~12% smaller files than bitshuffle/LZ4 at about
the same end-to-end speed, sitting between LZ4 and full ZSTD; for maximum ratio
use BSHUF_ZSTD. Robust on any input: tests round-trip pure zeros, Poisson(10),
Mersenne-Twister noise (checked against the size bound), an extreme-sparsity mask,
and a real lyso image through stock ZSTD_decompress.

API: exposed as "bszstd_rlehuf"; regenerate the Python/TS clients (update_version.sh)
to surface the new value there.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-27 14:41:46 +02:00


// SPDX-FileCopyrightText: 2024 Filip Leonarski, Paul Scherrer Institute <filip.leonarski@psi.ch>
// SPDX-License-Identifier: GPL-3.0-only
#pragma once
#include <cstdint>
#include <cstddef>
#include <vector>
// Produces a STANDARD Zstandard frame from bitshuffled data, decodable by stock ZSTD_decompress:
//   - zero / 0xFF runs         -> RLE_Blocks (cheap, like JFJochZstdCompressor)
//   - literal regions          -> Compressed_Blocks with Huffman literals (no sequences); short
//                                 runs are absorbed into the literal stream
//   - incompressible literals  -> Raw_Blocks (bounded worst case)
// Faster than full ZSTD (no match search) and better ratio than the plain RLE compressor (it
// entropy-codes the literals). The Huffman table is built/reused per block from that block's own
// literals via zstd's HUF_compress*X_repeat, so it is robust on any input (random, Poisson, masks,
// zeros) with no trained tables.
class JFJochZstdHuffCompressor {
    std::vector<uint8_t> out;       // assembled frame
    std::vector<uint8_t> literals;  // literal bytes in stream order (incl. absorbed short runs)
    std::vector<uint8_t> hufbuf;    // scratch for one Huffman-coded literal chunk
    std::vector<size_t> ctable;     // HUF_CElt[] (size_t-aligned)
    std::vector<uint64_t> entwksp;  // HUF compression workspace
    unsigned repeat_state = 0;      // HUF_repeat across literal blocks within the current frame

    struct Seg { uint8_t type; size_t bytes; size_t lit_off; }; // type 0=run0, 1=runFF, 2=literals
    std::vector<Seg> segs;

    void put_le(uint64_t v, int nbytes);
    size_t blk_hdr(uint32_t type, uint32_t size);
    void emit_run(uint8_t value, size_t nbytes, size_t &last_off);
    void emit_lit_chunk(const uint8_t *lits, size_t n, size_t &last_off);
public:
    JFJochZstdHuffCompressor();

    // src = bitshuffled block (src_size bytes, a multiple of 8). Writes one zstd frame to dst and
    // returns its size. dst must hold at least ZSTD_compressBound(src_size) + 12 bytes.
    size_t Compress(uint8_t *dst, const uint64_t *src, size_t src_size);
};
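
The header does not show how the input is split into segments, so the following is only a sketch of one possible way to do it, assuming a byte-wise scan and a hypothetical `min_run` threshold below which a run is absorbed into the neighbouring literals. `Segment` and `split_runs` are illustrative names; the real implementation takes `const uint64_t *` input and may well scan 8 bytes at a time and use a different threshold.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Mirrors the type codes of Seg in the header: 0 = run of 0x00, 1 = run of 0xFF, 2 = literals.
struct Segment {
    uint8_t type;
    size_t offset;  // start position in the input
    size_t bytes;   // segment length
};

// Splits src[0..n) into long 0x00 / 0xFF runs and literal regions.
// Runs shorter than min_run (a hypothetical tuning parameter) and runs of any
// other byte value are merged into the adjacent literal segment.
std::vector<Segment> split_runs(const uint8_t *src, size_t n, size_t min_run) {
    std::vector<Segment> segs;
    size_t i = 0;
    while (i < n) {
        const uint8_t b = src[i];
        size_t j = i + 1;
        while (j < n && src[j] == b) ++j;  // find the maximal run of identical bytes
        const size_t len = j - i;
        const bool long_run = (b == 0x00 || b == 0xFF) && len >= min_run;
        if (long_run) {
            segs.push_back({static_cast<uint8_t>(b == 0x00 ? 0 : 1), i, len});
        } else if (!segs.empty() && segs.back().type == 2) {
            segs.back().bytes += len;      // extend the preceding literal segment
        } else {
            segs.push_back({2, i, len});   // start a new literal segment
        }
        i = j;
    }
    return segs;
}
```

Each long run would then map to one or more RLE_Blocks, and each literal segment to a Compressed_Block or, if Huffman coding does not help, a Raw_Block.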