Context
The Arctic Hub provides more than 100 Earth Observation datasets from several originating centers (more info on data available on Arctic Hub).
In this article, we describe the formats in which the data are provided and show how to open each of them using Python.
Table of data formats
| Observation domain | Data format | 
| Atmosphere | GRIB and NetCDF* |
| Climate Change | GRIB and NetCDF* |
| Emergency | GRIB and NetCDF |
| Land | GRIB, GeoTIFF and NetCDF* |
| Marine | NetCDF | 
* The NetCDF format is experimental for these services.
In the following examples, we will use different datasets over Italy.
NetCDF format
Let's see here how to open a NetCDF data file.
We will focus on the atmospheric temperature of January 1978, provided in the product ERA5 hourly data on pressure levels from 1940 to present (datasetID = EO:ECMWF:DAT:REANALYSIS_ERA5_PRESSURE_LEVELS).
The main packages we use are:
- xarray: to open the dataset 
- matplotlib: to customize our map
- cartopy: to add coastlines and a map projection
This simple code gives access to all the information (dimensions, coordinates, variables, attributes) in the NetCDF file:
import xarray as xr
dataset = xr.open_dataset("ERA5_CAMS_1978.nc")
dataset
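The same information can also be read programmatically. The sketch below is illustrative only: it builds a small synthetic dataset in memory (the grid, the values and the title attribute are made up) so that it runs without the downloaded file. With the real file, the same attributes would be read from the dataset opened above.

```python
import numpy as np
import xarray as xr

# Synthetic stand-in for the downloaded file: a tiny temperature field "t".
times = np.array(["1978-01-01T00", "1978-01-01T12"], dtype="datetime64[ns]")
lats = np.array([47.0, 42.0, 37.0])
lons = np.array([7.0, 12.0, 17.0, 22.0])
values = 270.0 + np.arange(2 * 3 * 4, dtype=float).reshape(2, 3, 4)

dataset = xr.Dataset(
    data_vars={"t": (("time", "latitude", "longitude"), values, {"units": "K"})},
    coords={"time": times, "latitude": lats, "longitude": lons},
    attrs={"title": "synthetic example"},
)

dim_sizes = dict(dataset.sizes)            # dimension name -> length
variable_names = list(dataset.data_vars)   # names of the data variables
units = dataset["t"].attrs["units"]        # attribute attached to one variable

print(dim_sizes)
print(variable_names)
print(units)
```

Reading these values in code is handy when a script has to adapt to files whose dimension or variable names differ.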
Now, to generate a quick map, just call the .plot() method on the selected variable, as follows:
time = "1978-01-01"
dataset.t.sel(time=time).plot()
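One point worth checking before plotting: xarray draws a map only when the selection is two-dimensional. If a file holds several hourly steps, a date-only string can match all of them, the time dimension is kept, and .plot() falls back to a histogram. The sketch below illustrates this on synthetic data (the grid and values are assumptions, not the downloaded product).

```python
import numpy as np
import xarray as xr

# Synthetic temperature field with three time steps (values are made up).
times = np.array(
    ["1978-01-01T00", "1978-01-01T12", "1978-01-02T00"], dtype="datetime64[ns]"
)
temperature = xr.DataArray(
    270.0 + np.arange(3 * 2 * 3, dtype=float).reshape(3, 2, 3),
    dims=("time", "latitude", "longitude"),
    coords={"time": times, "latitude": [45.0, 40.0], "longitude": [8.0, 12.0, 16.0]},
    name="t",
)

# A date-only string matches every step on that day: the time dimension is kept.
whole_day = temperature.sel(time="1978-01-01")

# A full timestamp matches exactly one step: the result is a 2-D field.
one_step = temperature.sel(time="1978-01-01T12:00")

print(whole_day.dims)
print(one_step.dims)
```

If the real file also has a pressure-level dimension, it would need to be reduced to a single level in the same way before mapping.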
We can then use matplotlib to enhance the plot. We are visualizing here the atmospheric temperature, and we add the coasts and better georeference the map:
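import matplotlib.pyplot as plt
import cartopy.crs as ccrs          # map projections
import cartopy.feature as cfeature  # land, coastlines and other map features
# create the figure and a georeferenced set of axes
f = plt.figure(figsize=(15,10))
ax = plt.axes(projection=ccrs.PlateCarree())
# add the coastlines and the land areas
ax.coastlines()
ax.add_feature(cfeature.LAND, zorder=1, edgecolor='k')
# plot the temperature on the current axes and add a title
dataset.t.sel(time=time).plot()
plt.title(f"Atmospheric Temperature (K) on {time}", size = 15)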
GRIB format
For the GRIB data format, we downloaded the ERA5-Land hourly data from 1950 to present product (datasetID = "EO:ECMWF:DAT:REANALYSIS_ERA5_LAND"), and its leaf area index (lai_hv) for the first week of January 2022.
Reading GRIB files with xarray requires the cfgrib package, which we recommend installing via conda:
conda install -c conda-forge cfgrib
As for the data in NetCDF format, we use xarray to open the data:
import xarray as xr
grib = xr.load_dataset("mydirectory/ERA_CLMS_2022.grib", engine = "cfgrib")
grib
Note: it is important here to specify the engine type engine = "cfgrib".
This command opens the .grib file and displays a summary of the downloaded data.
WEkEO Pro Tip: make sure you have the cfgrib package installed, otherwise the error message "ValueError: unrecognized engine cfgrib must be one of: ['netcdf4', 'scipy', 'store']" will be displayed.
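To catch a missing installation before opening the file, a quick check can be run first. This is a minimal sketch using only the Python standard library; is_installed is a hypothetical helper name introduced here for illustration, not part of xarray or cfgrib.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be found, without importing it."""
    return importlib.util.find_spec(module_name) is not None


if not is_installed("cfgrib"):
    print("cfgrib is missing; install it with: conda install -c conda-forge cfgrib")
```

The check only confirms that the package can be located; it does not verify that its underlying GRIB decoding library works.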
Finally, a simple line of code to plot and view the data:
grib.lai_hv.sel(time="2022-01-01").plot()
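Since the download covers a full week, it can also be useful to select a range of dates rather than a single one. The sketch below shows the idea on synthetic daily data (the grid and values are assumptions; the real GRIB file may be organised differently, for example with additional dimensions).

```python
import numpy as np
import xarray as xr

# Seven synthetic daily steps; the value of each day equals its index (0..6).
days = np.arange("2022-01-01", "2022-01-08", dtype="datetime64[D]").astype(
    "datetime64[ns]"
)
values = np.arange(days.size, dtype=float)[:, None, None] + np.zeros((days.size, 2, 2))
lai_hv = xr.DataArray(
    values,
    dims=("time", "latitude", "longitude"),
    coords={"time": days, "latitude": [45.0, 44.9], "longitude": [9.0, 9.1]},
    name="lai_hv",
)

# Label-based slices include both end points.
first_days = lai_hv.sel(time=slice("2022-01-01", "2022-01-03"))

# Averaging over time leaves a 2-D field that can be mapped with .plot().
mean_lai = first_days.mean(dim="time")

print(first_days.sizes["time"])
print(mean_lai.dims)
```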
GeoTIFF format
For the GeoTIFF data format, we downloaded the Fractional Snow Cover (raster 20m) 2016-present, Europe, daily, Jul. 2020 product (datasetID = "EO:CRYO:DAT:HRSI:FSC").
Using the rasterio package, we will loop through a folder containing .tif files, open each one, print some basic information, and display the images.
import os
import rasterio as rs
import rasterio.plot
# set the directory where .tif files are stored
data_dir = './tiff_files'
all_tiff = []
# loop to open and plot all .tif files
for path in os.listdir(data_dir):
    # skip sub-directories and keep regular files only
    if os.path.isfile(os.path.join(data_dir, path)):
        all_tiff.append(path)
        print('file_name = ', path) # file's name
        with rs.open(os.path.join(data_dir, path)) as file:
            print("data info : ", file.profile) # file's information
            rasterio.plot.show(file) # plot
# list of all the files that were processed
print(all_tiff)
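The loop above processes every file in the folder, so it would also try to open files that are not GeoTIFFs. As a hedged alternative, the sketch below collects only files with a .tif or .tiff extension, using the standard library; list_tiff_files is a hypothetical helper name introduced for illustration.

```python
from pathlib import Path


def list_tiff_files(directory):
    """Return the sorted .tif/.tiff files found directly inside `directory`."""
    folder = Path(directory)
    if not folder.is_dir():
        return []  # nothing to list if the folder does not exist
    return sorted(
        p
        for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() in {".tif", ".tiff"}
    )


for tiff_path in list_tiff_files("./tiff_files"):
    print(tiff_path.name)
```

Each returned path could then be passed to the opening and plotting steps of the loop above.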
Thus, for each GeoTIFF in the data_dir directory, we will obtain:
- The file name: 
file_name = FSC_20250121T082000_S2B_T37SCA_V102_1_CLD.tif
- General file information, such as the data format, data type, crs, and more: 
data info :  {'driver': 'GTiff', 'dtype': 'uint8', 'nodata': None, 'width': 5490, 'height': 5490, 'count': 1, 'crs': CRS.from_wkt('...'), 'transform': Affine(20.0, 0.0, 300000.0, 0.0, -20.0, 4100040.0), 'blockxsize': 1024, 'blockysize': 1024, 'tiled': True, 'compress': 'deflate', 'interleave': 'band'}
- The basic generated image: 
Note: rasterio also exposes some information about the .tif file as attributes of the opened dataset (named file in the code above):
- file.bounds: spatial bounding box
- file.count: number of bands
- file.width: number of columns of the raster dataset
- file.height: number of rows of the raster dataset
- file.crs: coordinate reference system
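To see how these attributes relate to each other, the bounding box can be recomputed by hand from the width, height and transform printed in the profile above. The sketch below uses plain Python and the usual affine convention (x = a*col + b*row + c, y = d*col + e*row + f); pixel_to_xy is a hypothetical helper, and the coordinates are expressed in the units of the file's CRS.

```python
# Values copied from the profile printed above.
width, height = 5490, 5490
a, b, c = 20.0, 0.0, 300000.0      # x = a * col + b * row + c
d, e, f = 0.0, -20.0, 4100040.0    # y = d * col + e * row + f


def pixel_to_xy(col, row):
    """Map a pixel corner (col, row) to coordinates in the file's CRS."""
    return (a * col + b * row + c, d * col + e * row + f)


left, top = pixel_to_xy(0, 0)                  # upper-left corner of the raster
right, bottom = pixel_to_xy(width, height)     # lower-right corner of the raster

print(left, bottom, right, top)
```

The 20.0 and -20.0 terms are the pixel size: 20 units per column to the east and 20 units per row to the south, consistent with the 20m resolution of the product.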
For more information about this Python package, please consult the rasterio documentation page.
What's next?
Please let us know if anything is missing or confusing that would need modification or more explanation! Feel free to contact us:
- through a chat session available in the bottom right corner of the page 
- via our contact webpage 
- via e-mail to our support team (supportATarctic.hub.copernicus.eu) 






