Summary and Setup

This lesson is about Interoperability in Climate and Atmospheric Sciences. The value of scientific data depends not only on its scientific content but on how easily it can be found, accessed, integrated, and reused by others, whether they are human researchers or automated computational workflows.

This course focuses on how to create first-class research outputs using the NetCDF format and publish them through the 4TU.ResearchData repository. By following community best practices, these datasets can:

  • be easily found through rich, machine-actionable metadata,

  • be reliably accessed using open standards and stable identifiers,

  • be seamlessly integrated with other datasets, and

  • be confidently reused.

Throughout this course, you will learn how to produce NetCDF datasets that meet these standards: datasets that are not only scientifically valuable today, but that remain accessible, interoperable, and reusable for years to come.

Target audience


This lesson is intended for researchers in the climate and atmospheric sciences who handle multidimensional NetCDF datasets and intend to make their data and software more reusable by others.

Leo’s challenge: combining climate data (use case)


Leo is studying extreme heatwaves in Europe. He wants to compare his climate model results with satellite observations, urban sensor data, and aircraft measurements.

He starts searching across platforms like the Copernicus Climate Data Store, NASA EarthData, and 4TU.ResearchData. At first, everything seems available. But once he begins working with the data, problems appear: data is spread across different repositories with different access methods, files come in many formats (NetCDF, CSV, GeoTIFF, Excel), and variable names, units, and metadata are inconsistent or unclear.

Instead of focusing on heatwaves, Leo spends days just trying to understand and prepare the data. His problem is not a lack of data or tools. It is a lack of interoperability:

  • Data was not created using shared standards
  • Metadata is not machine-readable or consistent
  • Datasets are difficult to combine across sources

If datasets followed community practices, Leo could:

  • Find data faster
  • Access it programmatically
  • Combine datasets without manual cleanup
  • Focus on science instead of data wrangling

This is why interoperability matters: it turns data into something that can be easily reused, combined, and trusted. This lesson helps researchers in climate and atmospheric sciences recognize and apply this essential aspect of modern research.

Learning objectives


By the end of this lesson, learners will be able to:

  • Analyze climate and atmospheric datasets to distinguish interoperable from non-interoperable systems across structural, semantic, and technical layers.

  • Analyze a NetCDF dataset to evaluate how its data model, dimensions, variables, and metadata organization enable structural interoperability.

  • Evaluate the semantic interoperability of a NetCDF dataset using CF Conventions and explain how shared vocabularies enable machine-actionable meaning.

  • Apply OPeNDAP to access and subset remote NetCDF datasets, distinguishing between metadata retrieval and data transfer in distributed infrastructures.

  • Apply and analyze REST API principles to programmatically create and manage repository metadata, explaining how APIs operationalize technical interoperability.

  • Analyze how cloud-native data layouts (NetCDF vs Zarr) affect performance, scalability, and structural interoperability in distributed environments.

  • Evaluate a research data infrastructure against AI-readiness requirements by linking structural, semantic, and technical interoperability components to scalable machine learning workflows.
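
Several of these objectives revolve around machine-actionable metadata. The idea can be sketched with plain Python dictionaries: CF Conventions expect each variable to carry attributes such as `standard_name` and `units`, and software can check for them automatically. The attribute names below come from CF; the helper function itself is a hypothetical illustration, not part of any library.

PYTHON

```python
# Illustrative sketch: CF-style attribute checks on plain dictionaries.
# REQUIRED_CF_ATTRS and missing_cf_attrs are our own illustrative names.

REQUIRED_CF_ATTRS = ("standard_name", "units")

def missing_cf_attrs(var_attrs):
    """Return the CF attributes a variable is missing."""
    return [a for a in REQUIRED_CF_ATTRS if a not in var_attrs]

# A well-described temperature variable, as CF would expect it:
good = {"standard_name": "air_temperature", "units": "K",
        "long_name": "Near-surface air temperature"}

# A variable like the ones that slowed Leo down:
bad = {"units": "degC??"}  # no standard_name, ambiguous units

print(missing_cf_attrs(good))  # []
print(missing_cf_attrs(bad))   # ['standard_name']
```

A check like this is what "machine-actionable" means in practice: a program, not a human, decides whether the metadata is usable.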

Prerequisite

To follow this lesson, learners should already have:

  • Working knowledge of Python (able to write and execute short scripts)
  • Awareness of the NetCDF format

Project Setup


Create a working directory for this course:

BASH

cd ~/Desktop
mkdir Interoperability_climate_sciences
cd Interoperability_climate_sciences

Software Setup


We will use JupyterLab for live coding and exercises.

This course requires:

  • A Python 3 environment
  • A Unix-like terminal
  • Several Python libraries (installed via requirements.txt)

Follow the steps below carefully.


1. Install Python 3 (Required)


Download Python from:

👉 https://www.python.org/downloads/

This course was tested with Python 3.11, but any supported version should work: https://devguide.python.org/versions/#versions

⚠️ Python 2.7 is not supported.


Verify Installation

Open a terminal and run:

BASH

python3 --version   # macOS / Linux
python --version    # Windows

Expected output (example):

BASH

Python 3.11.4

You can also start Python interactively:

BASH

python3   # or python on Windows

Exit with:

BASH

exit()

or press CTRL+D.
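
The same version requirement can be checked from inside Python using only the standard library. The `(3, 9)` floor below is our assumption; the course was tested with Python 3.11, and any currently supported Python 3 should work.

PYTHON

```python
# Check the running interpreter from inside Python (stdlib only).
# The minimum version tuple is an assumption for illustration.
import sys

def python_is_supported(version_info=sys.version_info, minimum=(3, 9)):
    """Return True if the interpreter meets the minimum version."""
    return tuple(version_info[:2]) >= minimum

print(sys.version.split()[0], "supported:", python_is_supported())
```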


2. Set Up the Python Environment


We will:

  1. Create a virtual environment
  2. Define dependencies in requirements.txt
  3. Install all libraries in one step

Step 1 — Create a Virtual Environment

BASH

python3 -m venv nes-course-env

Activate it:

  • macOS / Linux

    BASH

    source nes-course-env/bin/activate
  • Windows (PowerShell)

    POWERSHELL

    nes-course-env\Scripts\Activate.ps1

You should now see (nes-course-env) in your terminal prompt.
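
If the prompt is ambiguous, you can also confirm from inside Python that a virtual environment is active: in a venv, `sys.prefix` points at the environment while `sys.base_prefix` points at the base installation. The helper name below is ours.

PYTHON

```python
# Detect an active virtual environment using only the standard library.
import sys

def in_virtualenv():
    """Return True when running inside a virtual environment."""
    return sys.prefix != sys.base_prefix

print("virtualenv active:", in_virtualenv())
```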


Step 2 — Create requirements.txt

Make sure you are in your project folder:

BASH

cd ~/Desktop/Interoperability_climate_sciences

Create a file named:

BASH

touch requirements.txt

Add the following content:

TXT

# Core scientific stack
xarray
netCDF4
pydap
matplotlib
scipy

# Cloud-native data access
zarr
kerchunk
fsspec[http]
h5netcdf
h5py

# Interactive environment
jupyterlab
ipykernel

Step 3 — Install Dependencies

Upgrade pip and install all packages:

BASH

pip install --upgrade pip
pip install -r requirements.txt

Step 4 — Verify the Installation

Check that the key packages import correctly:

BASH

python -c "import xarray, netCDF4, pydap, zarr, kerchunk, fsspec; print('All good')"
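
Beyond the one-line import check, the standard library's `importlib.metadata` can report which versions were actually installed, handling missing packages gracefully. The helper function name is ours.

PYTHON

```python
# Report installed versions of the course dependencies (stdlib only).
# importlib.metadata queries installed distributions (Python 3.8+).
from importlib.metadata import version, PackageNotFoundError

def installed_versions(packages):
    """Map each package name to its installed version, or None if absent."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found

report = installed_versions(["xarray", "netCDF4", "pydap",
                             "zarr", "kerchunk", "fsspec"])
for name, ver in report.items():
    print(f"{name}: {ver or 'NOT INSTALLED'}")
```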

Step 5 — Register the Environment in Jupyter

BASH

python -m ipykernel install --user --name nes-course-env --display-name "NES Course (Python)"

Step 6 — Launch JupyterLab

BASH

jupyter lab

In JupyterLab:

  • Open a notebook
  • Select kernel: “NES Course (Python)”

3. Unix Terminal (Required for API Episodes)


You will need a Unix-like terminal.

Linux

Use the default terminal.

macOS

Use the default Terminal app.

Windows

Install one of:

  • Git Bash (included with Git for Windows)
  • Windows Subsystem for Linux (WSL)

4. API Command-Line Tools (Required for REST API Episodes)


yq (Required)

YAML processor for working with metadata.

Linux

BASH

sudo apt-get update
sudo apt-get install -y yq

macOS

BASH

brew install yq

Windows (PowerShell)

Install Scoop:

POWERSHELL

Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Scope CurrentUser
Invoke-RestMethod -Uri https://get.scoop.sh | Invoke-Expression

Then install yq:

POWERSHELL

scoop install yq

jq (Required)

JSON processor for formatting API output.

Linux

BASH

sudo apt-get update
sudo apt-get install -y jq

macOS

BASH

brew install jq

Windows

POWERSHELL

scoop install main/jq

Verify Installation

BASH

yq --version
jq --version
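
If jq is unavailable, the pretty-printing it provides in the API episodes can also be done with Python's standard library `json` module. The sample record below is made up for illustration.

PYTHON

```python
# jq-style pretty-printing of a JSON API response using the stdlib.
import json

raw = '{"title": "Heatwave dataset", "format": "NetCDF", "year": 2024}'
record = json.loads(raw)                      # parse the response text
pretty = json.dumps(record, indent=2, sort_keys=True)
print(pretty)
```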