Replace the fixed-element DefaultBlockSize with a byte target divided by elem_size to get the block element count, so the per-block working set (and thus cache behaviour) stays constant across pixel bit depths instead of halving from 8- to 16- to 32-bit.

The target is per-algorithm, following the measured sweet spots on sparse data: LZ4 wants a small, cache-resident block for throughput (16 kB), ZSTD/RLE want a large block for ratio (128 kB). The gap is widest on extreme-sparsity inputs such as the uint32 pixel_mask, where large-block ZSTD reaches 100-1800x vs ~160x for LZ4.

The block size is read back per-dataset from the bitshuffle stream header (block_size = header_bytes / elem_size) and the HDF5 filter params, so the decompressor and external readers (XDS/Neggia/Durin/CrystFEL) need no change.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
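The byte-target rule described in the commit message can be sketched as follows. This is an illustration only, not the project's implementation: `SketchAlgorithm`, `TargetBlockBytes`, `BlockElements` and `BlockElementsFromHeader` are hypothetical names, 1 kB is assumed to mean 1024 bytes, and the rounding down to a multiple of 8 elements reflects bitshuffle's block-size requirement rather than anything stated in the commit.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for CompressionAlgorithm (illustration only).
enum class SketchAlgorithm { BshufLz4, BshufZstd, BshufZstdRle };

// Per-algorithm byte target: 16 kB for LZ4, 128 kB for ZSTD and ZSTD/RLE.
// Assumption: kB is taken as 1024 bytes.
constexpr std::size_t TargetBlockBytes(SketchAlgorithm a) {
    return (a == SketchAlgorithm::BshufLz4) ? 16 * 1024 : 128 * 1024;
}

// Block size in elements = byte target / element size, so the per-block
// working set in bytes stays the same for 1-, 2- and 4-byte pixels.
constexpr std::size_t BlockElements(SketchAlgorithm a, std::size_t elem_size) {
    if (elem_size == 0) return 0;            // guard against division by zero
    std::size_t n = TargetBlockBytes(a) / elem_size;
    return n - (n % 8);                      // assumption: keep a multiple of 8 elements
}

// Reader side: recover the element count from the block size in bytes
// stored in the stream header (block_size = header_bytes / elem_size).
constexpr std::size_t BlockElementsFromHeader(std::uint32_t header_bytes,
                                              std::size_t elem_size) {
    return elem_size == 0 ? 0 : header_bytes / elem_size;
}
```

Under these assumptions an LZ4 block holds 16384, 8192 or 4096 elements for 8-, 16- and 32-bit pixels respectively, which is always 16 kB of data per block, and a reader that divides the header value by elem_size gets the same element count back without knowing which target was used.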
21 lines
869 B
C++
// SPDX-FileCopyrightText: 2024 Filip Leonarski, Paul Scherrer Institute <filip.leonarski@psi.ch>
// SPDX-License-Identifier: GPL-3.0-only

#include <bitshuffle/bitshuffle.h>

#include "JFJochCompressor.h"
#include "MaxCompressedSize.h"

// Upper bound (in bytes) on the compressed size of pixels_number elements
// of pixel_depth bytes each, for the given algorithm.
int64_t MaxCompressedSize(CompressionAlgorithm algorithm, int64_t pixels_number, uint16_t pixel_depth) {
    switch (algorithm) {
        case CompressionAlgorithm::BSHUF_LZ4:
            // +12 leaves room for the bitshuffle stream header
            return bshuf_compress_lz4_bound(pixels_number, pixel_depth, JFJochBitShuffleCompressor::BlockSize(algorithm, pixel_depth)) + 12;
        case CompressionAlgorithm::BSHUF_ZSTD:
        case CompressionAlgorithm::BSHUF_ZSTD_RLE:
            // +12 leaves room for the bitshuffle stream header
            return bshuf_compress_zstd_bound(pixels_number, pixel_depth, JFJochBitShuffleCompressor::BlockSize(algorithm, pixel_depth)) + 12;
        default:
            // No compression: the raw buffer size
            return pixels_number * pixel_depth;
    }
}