23d27f30c4
Decouple the raw-pixel side of JFJochHDF5Reader from the rest as the first step toward swappable per-dataset metadata snapshots.

- HDF5ImageLocator: single owner of the legacy/VDS/contiguous layout resolution plus a persistent open-file cache, replacing the four duplicated resolvers (GetImageLocation, ReadSpots, ReadReflections) and their per-call file caches. Also hosts the source-mapping logic (former GetHDF5DataSource body).
- HDF5ImageSource: raw-pixel reading (locator + LoadImageDataset); the part whose links to files stay fixed while the metadata master may change.
- JFJochHDF5Reader keeps a thin GetImageLocation/GetRawImage/GetHDF5DataSource that delegate to image_source_; the six layout members are gone, parsed into a local Layout handed to the source at the end of ReadFile. Cache cleared on Close().

Verified: tests/jfjoch_test [HDF5] (79 cases / 1775 assertions), and jfjoch_process/azint/extract_hkl/scale relink unchanged.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
61 lines
2.8 KiB
C++
// SPDX-FileCopyrightText: 2026 Filip Leonarski, Paul Scherrer Institute <filip.leonarski@psi.ch>
// SPDX-License-Identifier: GPL-3.0-only

#pragma once

#include <cstddef>   // size_t
#include <cstdint>   // uint32_t, int64_t, uint64_t
#include <map>
#include <memory>
#include <optional>
#include <string>
#include <vector>

#include "../writer/HDF5Objects.h"       // HDF5ReadOnlyFile, HDF5VirtualDatasetMapping, HDF5DataSetLayout
#include "../common/JFJochMessages.h"    // FileWriterFormat, HDF5DataSourceMessage

// Turns a global image number into the HDF5 file + local index that physically holds its pixels,
// for all three on-disk layouts (legacy linked data files, VDS, contiguous/integrated). This is
// the part of the reader whose links to files stay constant: it knows where the raw images
// live, independent of which master file the per-image metadata is read from.
//
// Open data-file handles are cached, so scanning many images (e.g. reprocessing) does not reopen
// the same file on every read. HDF5 is not thread-safe, so every call must be made with the
// global hdf5_mutex held by the caller; the locator does no locking of its own.
class HDF5ImageLocator {
public:
    struct Location {
        std::shared_ptr<HDF5ReadOnlyFile> file;
        uint32_t local_index = 0;
    };

    // Layout description, filled by the reader once the master file has been parsed. All paths
    // are absolute: legacy data files and VDS mapping filenames are resolved relative to the
    // master before being handed over, so the locator never deals with relative paths.
    struct Layout {
        FileWriterFormat format = FileWriterFormat::NoFile;
        HDF5DataSetLayout data_layout = HDF5DataSetLayout::CONTIGUOUS;
        std::shared_ptr<HDF5ReadOnlyFile> master_file;
        std::string master_filename;
        std::vector<std::string> legacy_files;
        size_t images_per_file = 1;
        std::vector<HDF5VirtualDatasetMapping> vds_mappings;
    };

    void Configure(Layout layout);
    void Clear();

    // Resolve a global image number to {file, local index}. Throws if the image is not covered
    // by the layout. Does not bounds-check against the total image count - the caller does that.
    Location Resolve(int64_t global_image) const;

    // Source mapping for re-writing a derived file (e.g. _process.h5) so it links back to the
    // original pixel sources rather than to a master. total_images is supplied by the caller.
    std::vector<HDF5DataSourceMessage> GetSourceMapping(uint64_t first_image,
                                                        std::optional<uint64_t> image_count,
                                                        uint64_t total_images) const;

private:
    Layout layout_;
    mutable std::map<std::string, std::shared_ptr<HDF5ReadOnlyFile> > file_cache_;
    std::shared_ptr<HDF5ReadOnlyFile> OpenCached(const std::string &path) const;
};
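
The header above only declares `Resolve` and `OpenCached`; their bodies are not shown here. As a rough illustration of the idea for the legacy layout, the sketch below maps a global image number to a file and local index and memoizes "opened" files by path. Everything in it is hypothetical: `FakeFile`, `SketchLocation`, `SketchLegacyLocator` and `OpenCount` are stand-ins invented for this example, and the division/modulo by `images_per_file` is an assumption about how legacy data files are filled, not something the header confirms.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for HDF5ReadOnlyFile: it only remembers its path.
struct FakeFile {
    std::string path;
};

// Hypothetical analogue of HDF5ImageLocator::Location.
struct SketchLocation {
    std::shared_ptr<FakeFile> file;
    uint32_t local_index = 0;
};

// Sketch of a legacy-layout resolver with a path-keyed handle cache.
// Like the real locator, it does no locking of its own.
class SketchLegacyLocator {
public:
    SketchLegacyLocator(std::vector<std::string> files, size_t images_per_file)
        : files_(std::move(files)), images_per_file_(images_per_file) {
        if (images_per_file_ == 0)
            throw std::invalid_argument("images_per_file must be positive");
    }

    // Assumption: image N lives in file N / images_per_file at index N % images_per_file.
    SketchLocation Resolve(int64_t global_image) const {
        if (global_image < 0)
            throw std::out_of_range("negative image number");
        const uint64_t image = static_cast<uint64_t>(global_image);
        const uint64_t file_index = image / images_per_file_;
        if (file_index >= files_.size())
            throw std::out_of_range("image not covered by the layout");

        SketchLocation loc;
        loc.file = OpenCached(files_[static_cast<size_t>(file_index)]);
        loc.local_index = static_cast<uint32_t>(image % images_per_file_);
        return loc;
    }

    // Number of distinct files "opened" so far; exists only to make the caching observable.
    size_t OpenCount() const { return open_count_; }

private:
    std::shared_ptr<FakeFile> OpenCached(const std::string &path) const {
        auto it = cache_.find(path);
        if (it != cache_.end())
            return it->second;              // cache hit: reuse the existing handle
        auto file = std::make_shared<FakeFile>();
        file->path = path;
        ++open_count_;
        cache_.emplace(path, file);
        return file;
    }

    std::vector<std::string> files_;
    size_t images_per_file_;
    mutable std::map<std::string, std::shared_ptr<FakeFile>> cache_;
    mutable size_t open_count_ = 0;
};
```

With two files of 100 images each, resolving image 150 in this sketch yields the second file at local index 50, and resolving further images from that file reuses the cached handle instead of opening it again. The cache members are `mutable` for the same reason as `file_cache_` in the header: resolving is logically const even though it may populate the cache.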