Reference
Last updated on 2025-12-16
Reference Materials
- Wilkinson, M. D., Dumontier, M., Aalbersberg, I. J., Appleton, G., Axton, M., Baak, A., … & Mons, B. (2016). The FAIR Guiding Principles for scientific data management and stewardship. Scientific Data, 3(1), 1-9.
- European Commission (Ed.). (2004). European interoperability framework for pan-European eGovernment services. Publications Office.
- European Commission. Directorate General for Research and Innovation. & EOSC Executive Board. (2021). EOSC interoperability framework: Report from the EOSC Executive Board Working Groups FAIR and Architecture. Publications Office. https://data.europa.eu/doi/10.2777/620649
- Open Geospatial Consortium (OGC) Registry for Accessible Identifiers of Names and Basic Ontologies for the Web (RAINBOW)
Glossary
- Interoperability
The ability of data, tools, and systems to work together automatically and reliably with minimal manual effort.
- Semantic Interoperability
Shared meaning across datasets achieved through standardized vocabularies, units, and metadata conventions.
- Structural Interoperability
Shared representation achieved through common file formats, data models, and predictable array structures.
- Technical Interoperability
Shared access achieved through standardized interfaces, protocols, and machine-readable mechanisms.
- Metadata
Information describing data (e.g., units, variable names, coordinates, attributes) that enables interpretation and reuse.
- CF Conventions
The Climate and Forecast (CF) metadata conventions: a widely used standard that defines standard names, units, coordinate systems, and grid information for variables.
- Community Format
A data format widely adopted and maintained by a scientific community (e.g., NetCDF, Zarr, Parquet).
- NetCDF
A self-describing community format for multidimensional scientific data, widely used in climate and atmospheric sciences.
- Zarr
A cloud-native, chunked, and distributed storage format optimized for large-scale scientific datasets.
- Parquet
A columnar data format optimized for efficient access to tabular or metadata-rich information.
- API (Application Programming Interface)
A standardized mechanism that enables software systems to communicate programmatically using defined rules (e.g., requests sent over HTTP with responses encoded as JSON).
- OPeNDAP
A protocol that enables remote access to subsets of scientific datasets without downloading entire files.
- THREDDS
A data server that exposes scientific datasets through standardized protocols such as OPeNDAP and HTTP.
- Catalog
A machine-readable index of datasets describing what exists, where it is located, and how it can be accessed (e.g., STAC, Intake-ESM, ESGF).
- Cloud-Native Layout
A storage pattern in which datasets are chunked and stored as individual objects in cloud storage, enabling parallel and scalable access.
- Reanalysis
A consistent, long-term dataset produced by combining models and observations through data assimilation.
- In-situ Data
Observational data collected directly by instruments located within the environment being studied (e.g., stations, buoys, radiosondes).
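The Metadata, CF Conventions, and Semantic Interoperability entries above can be illustrated with a minimal sketch. It assumes two hypothetical datasets represented as plain Python dictionaries. The attribute names `standard_name` and `units` and the values `air_temperature`, `K`, and `degC` follow the CF Conventions, while the dataset contents and the helper functions `to_kelvin` and `comparable` are invented for illustration only.

```python
# Two hypothetical datasets describing the same quantity in different units.
# The attribute names "standard_name" and "units" follow the CF Conventions;
# the values themselves are invented for illustration.
station = {
    "values": [15.0, 17.5, 20.0],
    "attrs": {"standard_name": "air_temperature", "units": "degC"},
}
reanalysis = {
    "values": [288.15, 290.65, 293.15],
    "attrs": {"standard_name": "air_temperature", "units": "K"},
}

# Offset (in kelvin) to add when converting each supported unit to kelvin.
OFFSET_TO_KELVIN = {"K": 0.0, "degC": 273.15}


def to_kelvin(variable):
    """Return the variable's values in kelvin, relying only on its metadata."""
    units = variable["attrs"]["units"]
    if units not in OFFSET_TO_KELVIN:
        raise ValueError(f"Unsupported units: {units!r}")
    return [round(v + OFFSET_TO_KELVIN[units], 2) for v in variable["values"]]


def comparable(a, b):
    """Treat two variables as comparable when they share a standard name."""
    return a["attrs"]["standard_name"] == b["attrs"]["standard_name"]


# Because both datasets declare what they measure and in which units,
# they can be aligned without any manual inspection of the values.
if comparable(station, reanalysis):
    differences = [
        round(s - r, 2)
        for s, r in zip(to_kelvin(station), to_kelvin(reanalysis))
    ]
    print(differences)
```

In practice a library that understands CF metadata would likely handle unit conversion; the point of the sketch is only that shared attribute names make the comparison automatic.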