07 Demo: ERA5 Data Download#

UW Geospatial Data Analysis
CEE467/CEWA567
David Shean

Climate reanalysis#

Nice introduction: https://climate.copernicus.eu/climate-reanalysis

“Climate reanalyses combine past observations with models to generate consistent time series of multiple climate variables. Reanalyses are among the most-used datasets in the geophysical sciences. They provide a comprehensive description of the observed climate as it has evolved during recent decades, on 3D grids at sub-daily intervals.”

ERA5#

ERA5 = “ECMWF ReAnalysis 5”
ECMWF = “European Centre for Medium-Range Weather Forecasts”

https://www.ecmwf.int/en/forecasts/dataset/ecmwf-reanalysis-v5

“ERA5 provides hourly estimates of a large number of atmospheric, land and oceanic climate variables. The data cover the Earth on a 30km grid and resolve the atmosphere using 137 levels from the surface up to a height of 80km.”

“ERA5 combines vast amounts of historical observations into global estimates using advanced modelling and data assimilation systems.”

Variables#

Hundreds of output variables for each hourly timestep. See a list of all of the available variables:

  • https://apps.ecmwf.int/codes/grib/param-db

Resolution#

The ERA5 HRES (High Resolution) data have a native resolution of 0.28125° (~31 km)

  • https://confluence.ecmwf.int/display/CKB/ERA5%3A+What+is+the+spatial+reference

The ERA5-Land data have a native resolution of 9 km (~0.08°)

  • https://confluence.ecmwf.int/display/CKB/ERA5-Land%3A+data+documentation
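
As a rough check on the degree and kilometer values quoted above, the sketch below converts between the two along a great circle. It assumes a spherical Earth with a mean radius of 6371 km (an assumption for illustration, not a value taken from the ERA5 documentation), and the constant and function names are hypothetical.

```python
import math

# Assumption: spherical Earth with mean radius 6371 km
EARTH_RADIUS_KM = 6371.0
KM_PER_DEGREE = 2 * math.pi * EARTH_RADIUS_KM / 360  # ~111.2 km per degree

def degrees_to_km(deg):
    """Great-circle distance (km) spanned by an angle in degrees."""
    return deg * KM_PER_DEGREE

def km_to_degrees(km):
    """Angle in degrees spanned by a great-circle distance in km."""
    return km / KM_PER_DEGREE

era5_km = degrees_to_km(0.28125)
era5_land_deg = km_to_degrees(9)
print(f'ERA5 HRES: 0.28125 deg is about {era5_km:.1f} km')
print(f'ERA5-Land: 9 km is about {era5_land_deg:.3f} deg')
```

These distances apply along a meridian or at the equator; east-west spacing in kilometers shrinks toward the poles.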

How many grid cells are required to store one variable (like temperature) for the full 72-year record at hourly resolution?#

#Space: 0.25° cells (4 per degree) in longitude and latitude, times 137 vertical levels
s = 360*180*4*4*137
s
142041600
#Time: number of hourly timesteps in 72 years
t = 72*365.25*24
t
631152.0
s*t
89649839923200.0
f'{s*t:e}'
'8.964984e+13'
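
To put that count in terms of storage, the sketch below assumes each value is stored as an uncompressed 32-bit float (4 bytes). That is an assumption for illustration only; real files are typically packed or compressed, so actual sizes will differ.

```python
# Recompute the counts from the cells above
n_space = 360 * 180 * 4 * 4 * 137   # 0.25 deg cells x 137 levels
n_time = 72 * 365.25 * 24           # hourly timesteps in 72 years
n_values = n_space * n_time

# Assumption: 4 bytes per value (float32), no packing or compression
bytes_per_value = 4
total_bytes = n_values * bytes_per_value
total_tb = total_bytes / 1e12       # terabytes (1 TB = 1e12 bytes)
print(f'{total_tb:.0f} TB for one variable')
```

Under these assumptions, a single 3D variable is on the order of hundreds of terabytes, which is why we work with subsets.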

Additional notes on the ERA5 grid#

The model is actually run using a “reduced gaussian grid” with quasi-equal spacing across the planet:

  • https://confluence.ecmwf.int/display/CKB/ERA5%3A+What+is+the+spatial+reference

  • https://www.ecmwf.int/sites/default/files/elibrary/2016/17262-new-grid-ifs.pdf

These values are then interpolated to the regular grid of 0.25° cells in the netCDF files. We will revisit this issue during the exercises.
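
One way to see why a regular latitude-longitude grid is not equally spaced on the ground: the east-west width of a 0.25° cell shrinks with the cosine of latitude. The sketch below assumes a spherical Earth with radius 6371 km, which is a simplification, and the function name is hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # Assumption: spherical Earth

def cell_width_km(lat_deg, cell_deg=0.25):
    """East-west width (km) of a regular lat/lon cell at a given latitude."""
    km_per_deg_lon = (2 * math.pi * EARTH_RADIUS_KM / 360) * math.cos(math.radians(lat_deg))
    return cell_deg * km_per_deg_lon

# Cell width at a few sample latitudes
for lat in [0, 30, 60, 85]:
    print(f'{lat:2d} deg latitude: {cell_width_km(lat):5.1f} km wide')
```

At 60° latitude the cells are half as wide as at the equator, so a regular grid oversamples high latitudes relative to a quasi-equal-spacing grid.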

Data Availability#

From CDS (Climate Data Store)#

For future reference, you can access the ERA5 data directly! The CDS API allows you to request subsets of ERA5 products for a desired spatial extent, time period, time interval, etc.:

  • https://cds.climate.copernicus.eu/api-how-to

  • https://confluence.ecmwf.int/display/CKB/How+to+download+ERA5
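
A minimal sketch of what such a request might look like with the cdsapi Python package. The dataset name, the request keys (in particular 'data_format', which older versions of the CDS called 'format'), the bounding box, and the output filename are assumptions for illustration; copy the exact request from the "Show API request" option on the dataset's CDS page before relying on it. The helper function names are hypothetical.

```python
import calendar

def build_era5_request(year, month, north, west, south, east):
    """Return (dataset_name, request_dict) for 6-hourly 2 m temperature over a bounding box."""
    n_days = calendar.monthrange(year, month)[1]  # number of days in this month
    dataset = 'reanalysis-era5-single-levels'     # assumed dataset name
    request = {
        'product_type': ['reanalysis'],
        'variable': ['2m_temperature'],
        'year': [str(year)],
        'month': [f'{month:02d}'],
        'day': [f'{d:02d}' for d in range(1, n_days + 1)],
        'time': [f'{h:02d}:00' for h in range(0, 24, 6)],
        'area': [north, west, south, east],       # N, W, S, E in degrees
        'data_format': 'netcdf',                  # assumed key name, see note above
    }
    return dataset, request

def submit_request(dataset, request, target):
    """Submit the request; needs cdsapi installed and an API key in ~/.cdsapirc."""
    import cdsapi
    client = cdsapi.Client()
    client.retrieve(dataset, request, target)

# Hypothetical bounding box roughly covering Washington state
dataset, request = build_era5_request(2021, 6, 49.5, -125, 45, -116.5)
print(dataset)
print(request)
# submit_request(dataset, request, 'WA_2t_202106.nc')  # uncomment once your API key is set up
```

Building the request separately from submitting it lets you inspect the dictionary before waiting on the server-side queue.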

Some commonly used products are also available on Amazon S3#

  • https://registry.opendata.aws/ecmwf-era5/

Shortcut: download sample datasets#

We could submit requests directly to the CDS API, but that requires creating an account and using a unique API key. The server-side processing and download can also take roughly 5-40 minutes per dataset.

For this lab, I submitted some requests to prepare sample ERA5 datasets. The scripts are available in the cds_scripts subdirectory. As a shortcut, we will download the resulting datasets, which were staged in a public data archive.

Zenodo#

Zenodo is a great, free, permanent data archiving solution: https://about.zenodo.org/

Lab07 Zenodo record#

  • https://zenodo.org/record/6302343

  • Three main files needed for the Lab07 notebooks. Original datasets from CDS are also archived.

    • Notebook 1: ‘climatology_0.25g_ea_2t.nc’, ‘1month_anomaly_Global_ea_2t.nc’

    • Notebook 2: ‘WA_ERA5-Land_hourly_1950-2022_6hr.nc’

Check disk space!#

  • Before running, open a terminal on the hub and run the following command: df -h ~. It should report something like this:

Filesystem      Size  Used Avail Use% Mounted on
/dev/sdf         50G   41G  9.7G  81% /home/jovyan
  • You will need ~4.5 GB available for these data products

  • If you don’t have that, you can go back and delete some of the products from previous labs that are no longer needed, or can be easily downloaded again
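
The same check can be sketched in Python with the standard library. The ~4.5 GB threshold comes from the note above; the variable and function names are hypothetical. Note that df -h reports sizes in powers of 1024, while this sketch uses 1 GB = 1e9 bytes, so the numbers will differ slightly.

```python
import shutil
from pathlib import Path

required_gb = 4.5  # approximate space needed for the data products

def free_space_gb(path):
    """Free space (GB, 1e9 bytes) on the filesystem containing path."""
    return shutil.disk_usage(path).free / 1e9

available_gb = free_space_gb(Path.home())
if available_gb < required_gb:
    print(f'Only {available_gb:.1f} GB free: remove some old files before downloading')
else:
    print(f'{available_gb:.1f} GB free: enough space for the downloads')
```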

!df -h ~
Filesystem      Size  Used Avail Use% Mounted on
/dev/sdc       1007G  136G  821G  15% /
import os
from pathlib import Path
era5_data_dir = f'{Path.home()}/gda_demo_data/era5_data'

#Create the output directory if it does not already exist
os.makedirs(era5_data_dir, exist_ok=True)
base_url = 'https://zenodo.org/record/6302343/files/'
fn_list = ['climatology_0.25g_ea_2t.nc',
           '1month_anomaly_Global_ea_2t.nc',
           #'WA_ERA5-Land_hourly_1950-2022_6hr.nc'
           ]
url_list = [base_url+fn for fn in fn_list]
#For parallel download from command line:
#url_list_str = ' '.join(url_list)
url_list
['https://zenodo.org/record/6302343/files/climatology_0.25g_ea_2t.nc',
 'https://zenodo.org/record/6302343/files/1month_anomaly_Global_ea_2t.nc']
for url in url_list:
    !wget -nc -P {era5_data_dir} {url}
--2025-02-18 18:58:00--  https://zenodo.org/record/6302343/files/climatology_0.25g_ea_2t.nc
Resolving zenodo.org (zenodo.org)... 188.185.45.92, 188.185.48.194, 188.185.43.25, ...
Connecting to zenodo.org (zenodo.org)|188.185.45.92|:443... connected.
HTTP request sent, awaiting response... 301 MOVED PERMANENTLY
Location: /records/6302343/files/climatology_0.25g_ea_2t.nc [following]
--2025-02-18 18:58:00--  https://zenodo.org/records/6302343/files/climatology_0.25g_ea_2t.nc
Reusing existing connection to zenodo.org:443.
HTTP request sent, awaiting response... 200 OK
Length: 49874647 (48M) [application/octet-stream]
Saving to: ‘/home/eric/gda_demo_data/era5_data/climatology_0.25g_ea_2t.nc’

climatology_0.25g_e 100%[===================>]  47.56M  10.2MB/s    in 5.6s    

2025-02-18 18:58:06 (8.51 MB/s) - ‘/home/eric/gda_demo_data/era5_data/climatology_0.25g_ea_2t.nc’ saved [49874647/49874647]

--2025-02-18 18:58:06--  https://zenodo.org/record/6302343/files/1month_anomaly_Global_ea_2t.nc
Resolving zenodo.org (zenodo.org)... 188.185.48.194, 188.185.43.25, 188.185.45.92, ...
Connecting to zenodo.org (zenodo.org)|188.185.48.194|:443... connected.
HTTP request sent, awaiting response... 301 MOVED PERMANENTLY
Location: /records/6302343/files/1month_anomaly_Global_ea_2t.nc [following]
--2025-02-18 18:58:07--  https://zenodo.org/records/6302343/files/1month_anomaly_Global_ea_2t.nc
Reusing existing connection to zenodo.org:443.
HTTP request sent, awaiting response... 200 OK
Length: 2147121539 (2.0G) [application/octet-stream]
Saving to: ‘/home/eric/gda_demo_data/era5_data/1month_anomaly_Global_ea_2t.nc’

1month_anomaly_Glob 100%[===================>]   2.00G  19.4MB/s    in 1m 57s  

2025-02-18 19:00:04 (17.6 MB/s) - ‘/home/eric/gda_demo_data/era5_data/1month_anomaly_Global_ea_2t.nc’ saved [2147121539/2147121539]
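
If wget is not available on your system, a pure-Python alternative could look like the sketch below. It uses only the standard library and skips files that already exist, similar to wget -nc. The function name is hypothetical, and this is not the approach used to produce the output above.

```python
import os
import urllib.request

def download_if_missing(url, out_dir):
    """Download url into out_dir unless the file already exists (like wget -nc)."""
    os.makedirs(out_dir, exist_ok=True)
    out_fn = os.path.join(out_dir, url.split('/')[-1])
    if os.path.exists(out_fn):
        print(f'Found existing file: {out_fn}')
    else:
        print(f'Downloading: {url}')
        urllib.request.urlretrieve(url, out_fn)
    return out_fn

# Usage with the variables defined in the cells above:
# for url in url_list:
#     download_if_missing(url, era5_data_dir)
```

As with wget -nc, an interrupted download leaves a partial file that will be skipped on the next run, so delete incomplete files before retrying.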
!ls -lh $era5_data_dir
total 2.1G
-rw-r--r-- 1 eric eric 2.0G Feb 18 19:00 1month_anomaly_Global_ea_2t.nc
-rw-r--r-- 1 eric eric  48M Feb 18 18:58 climatology_0.25g_ea_2t.nc
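
A quick way to confirm that the downloads completed is to compare file sizes against the byte counts reported in the wget "Length:" lines above. This is a minimal sketch with a hypothetical function name; it checks size only, not file contents.

```python
import os

# Byte counts reported in the wget "Length:" lines
expected_sizes = {
    'climatology_0.25g_ea_2t.nc': 49874647,
    '1month_anomaly_Global_ea_2t.nc': 2147121539,
}

def check_sizes(data_dir, expected):
    """Return {filename: True/False} for whether each file exists with the expected size."""
    results = {}
    for fn, size in expected.items():
        path = os.path.join(data_dir, fn)
        results[fn] = os.path.exists(path) and os.path.getsize(path) == size
    return results

# Usage with the directory defined in the cells above:
# check_sizes(era5_data_dir, expected_sizes)
```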