IDEAR Project Name
IDEAR is a project template designed to support reproducible data analysis and collaboration across people and computing systems. Built on JupyterLab, Docker, and DIMA, it provides a consistent environment for integrating and curating multi-instrument data in HDF5 format, along with standardized metadata annotation and analysis workflows to improve data exchange and reuse.
Figure: IDEAR landing page and interface overview.
Key Features
- Reproducible and portable computational environment
- Integration of multi-instrument data (HDF5)
- Metadata annotation and curation workflows
- JupyterLab-based user interface for interactive analysis
- Improved data exchange and reuse across teams and systems
- (Optional) network-mounted input for seamless integration with shared drives
Output Format
- Self-describing HDF5 files, including:
  - Project-level, contextual, and data lineage metadata
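Because the output files are self-describing, their structure and metadata can be inspected without any project code. The sketch below assumes the HDF5 command-line tools (h5ls, h5dump) are installed, which this README does not state; the function name show_hdf5_structure and the file path in the usage note are hypothetical.

```shell
# Hypothetical helper: print the group/dataset tree and the attributes
# (metadata) of an HDF5 file. Assumes the HDF5 command-line tools are on PATH.
show_hdf5_structure() {
  local h5_file="$1"
  if [ ! -f "$h5_file" ]; then
    echo "no such file: $h5_file" >&2
    return 1
  fi
  # -r lists groups and datasets recursively.
  h5ls -r "$h5_file"
  # -A prints the header and the values of attributes only, not the dataset contents.
  h5dump -A "$h5_file"
}
```

For example, `show_hdf5_structure data/<your-file>.h5`, run from Git Bash or another POSIX shell.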
Extensibility
New instruments can be supported by extending the file parsing capabilities in the dima/ module.
Repository Structure
- data/: Input and output datasets (mounted volume)
- figures/: Output visualizations (mounted volume)
- notebooks/: Jupyter notebooks for processing and metadata integration
- scripts/: Supplementary processing logic
- dima/: Metadata and HDF5 schema utilities (persisted module)
- Dockerfile: Container image definition
- docker-compose.yaml: Local and networked deployment options
- env_setup.sh: Optional local environment bootstrap
- CITATION.cff, LICENCE, README.md, .gitignore, .dockerignore: Project metadata and config
- campaignDescriptor.yaml: Campaign-specific metadata
Getting Started
Requirements
For Docker-based usage:
- Docker Desktop
- Git Bash (for running shell scripts on Windows)
Optional for local (non-Docker) usage:
- Conda (miniconda or anaconda)
If accessing network drives (e.g., PSI):
- PSI credentials with access to mounted network shares
Clone the Repository
git clone --recurse-submodules <your-repo-url>
cd <your-repo-name>
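If the repository was already cloned without --recurse-submodules, its submodules can be fetched afterwards. A minimal sketch, assuming git is installed and a POSIX shell such as Git Bash; the function name fetch_submodules is hypothetical and used only for illustration.

```shell
# Hypothetical helper: fetch any submodules of an existing clone.
# $1 is the path to the cloned repository (defaults to the current directory).
fetch_submodules() {
  local repo_dir="${1:-.}"
  # --init registers the submodules listed in .gitmodules;
  # --recursive also descends into nested submodules.
  git -C "$repo_dir" submodule update --init --recursive
}
```

Calling `fetch_submodules .` from the repository root should leave the clone in the same state as a clone made with --recurse-submodules.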
Run with Docker
This toolkit includes a containerized JupyterLab computational environment and a welcome landing page to support reanalysis and reuse of research data in the IDEAR project.
- Open PowerShell (as Administrator) and navigate to the your-repo-name repository.
- Create a .env file in the root of your-repo-name/.
- Securely store your network drive access credentials in the .env file by adding the following lines:
  CIFS_USER=<your-username>
  CIFS_PASS=<your-password>
  JUPYTER_TOKEN=my-token
  NETWORK_MOUNT=//your-server/your-share
  To protect your credentials:
  - Do not share the .env file with others.
  - Ensure the file is excluded from version control by adding .env to your .gitignore and .dockerignore files.
- Open Docker Desktop, then build the container image:
  docker build -f Dockerfile -t idear_processor .
- Start the environment:
  - Locally without network drive mount: Regardless of the value in .env, NETWORK_MOUNT defaults to <your-repo-name>/data/.
    docker compose up idear_processor
  - With network drive mount:
    docker compose up idear_processor_networked
- Access JupyterLab (Welcome Page): Once the container is running, open:
  Log in using your token (default: my-token). JupyterLab will automatically open WELCOME.ipynb, which contains the onboarding guidelines for this IDEAR project.
- Stop the app: In the previously opened PowerShell terminal, press Ctrl + C. After the container has stopped, remove it with:
  docker rm $(docker ps -aq --filter ancestor=idear_processor)
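Before starting the containers, it can help to confirm that the .env file from the steps above defines the expected keys. The sketch below is not part of the repository: the function name check_env_keys is hypothetical, the key list is taken from the .env example above, and it assumes a POSIX shell such as Git Bash.

```shell
# Hypothetical helper: report which expected keys are missing from an env file.
# Prints one "missing: KEY" line per absent key and returns non-zero if any is absent.
check_env_keys() {
  local env_file="$1"
  local missing=0
  local key
  for key in CIFS_USER CIFS_PASS JUPYTER_TOKEN NETWORK_MOUNT; do
    # A key counts as present if some line starts with "KEY=".
    if ! grep -q "^${key}=" "$env_file"; then
      echo "missing: ${key}"
      missing=1
    fi
  done
  return "$missing"
}
```

Run `check_env_keys .env` from the repository root. This README does not say whether every key is required for the local (non-networked) service, so a reported key is a prompt to check the file, not necessarily an error.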
(Optional) Set Up the Python Environment
Required only if you plan to run the toolkit outside of Docker