下载ERA5-Land数据及数据处理（每小时转日数据）

网站：Copernicus Climate Data Store | Copernicus Climate Data Store数据目标：2013-2020年温度日数据一、数据下载在网站上注册后，点击Datasets，以“ERA5-Land”搜索，选择“ERA5-Land hourly data from 1950 to present”1. 手动下载每小时数据点击Download data，选择

是一个橙子呀

15128人浏览 · 2022-02-26 16:18:10

是一个橙子呀 · 2022-02-26 16:18:10 发布

网站：Copernicus Climate Data Store | Copernicus Climate Data Store

数据目标：2013-2020年温度日数据

一、数据下载

在网站上注册后，点击Datasets，以“ERA5-Land”搜索，选择“ERA5-Land hourly data from 1950 to present”

1. 手动下载每小时数据

点击Download data，选择需要的时间和地理范围，点击Submit Form，可以在Your Requests中查看请求进程，数据量越大等待时间越久。

2. python实现自动下载

（1）cdsapi库的安装

cdsapi库的安装可以移步官方教程How to install and use CDS API on Windows - Copernicus Knowledge Base - ECMWF Confluence Wiki

如果安装了Pycharm，可以点击File->Settings->Python Interpreter，点击➕加号，搜索【cdsapi】点击安装即可

（2）保存.cdsapirc文件

点击这个链接 How to use the CDS API | Copernicus Climate Data Store

复制下图右方的url和Key，新建文本文件，放入，再修改文件拓展名为【.cdsapirc】

将该文件放入C盘->用户->用户名，文件夹内。

（3）python批量下载

代码主体可从官网上获取，下方是实现循环批量下载月数据

# coding=utf-8
import cdsapi

c = cdsapi.Client()
for year in range(2013, 2021):
    for month in range(1, 13):
        outpath = 'E:\\test\\' + str(year) + str(month).zfill(2) + '.nc'
        print(outpath)
        c.retrieve(
            'reanalysis-era5-land',
            {
                'variable': '2m_temperature',
                'year': str(year),
                'month': [
                    str(month).zfill(2)
                ],
                'day': [
                    '01', '02', '03',
                    '04', '05', '06',
                    '07', '08', '09',
                    '10', '11', '12',
                    '13', '14', '15',
                    '16', '17', '18',
                    '19', '20', '21',
                    '22', '23', '24',
                    '25', '26', '27',
                    '28', '29', '30',
                    '31',
                ],
                'time': [
                    '00:00', '01:00', '02:00',
                    '03:00', '04:00', '05:00',
                    '06:00', '07:00', '08:00',
                    '09:00', '10:00', '11:00',
                    '12:00', '13:00', '14:00',
                    '15:00', '16:00', '17:00',
                    '18:00', '19:00', '20:00',
                    '21:00', '22:00', '23:00',
                ],
                'area': [
                    60, 70, 0,
                    140,
                ],
                'format': 'netcdf',
            },
            outpath)

批量下载参考：

https://blog.csdn.net/qq_34734252/article/details/108781538

http://bbs.06climate.com/forum.php?mod=viewthread&tid=100616&highlight=ERA5

3. 手动下载日数据

这里用到官网上自带的应用：

Daily statistics calculated from ERA5 data

不过每次只能计算一个月的

二、数据处理

1. 每小时数据取平均得到每日数据

# !usr/bin/env python
# -*- coding: utf-8 -*-

import os, sys
import netCDF4 as nc
import numpy as np
import arcpy as ap
from arcpy.sa import *
import datetime
import gdal,osr

def dayextract_nc(nc_file, output_dir):
    dataset = nc.Dataset(nc_file)
    Lat = dataset.variables["latitude"][:]
    Lon = dataset.variables["longitude"][:]

    # 影像的左上角和右下角坐标
    LonMin, LatMax, LonMax, LatMin = [Lon.min(), Lat.max(), Lon.max(), Lat.min()]

    # 分辨率计算
    N_Lat = len(Lat)
    N_Lon = len(Lon)
    Lon_Res = (LonMax - LonMin) / (float(N_Lon) - 1)
    Lat_Res = (LatMax - LatMin) / (float(N_Lat) - 1)
    cols = N_Lon
    rows = N_Lat

    # 设置影像的显示范围
    geo = (LonMin, Lon_Res, 0, LatMax, 0, -Lat_Res)

    # 构造projection
    src_srs = osr.SpatialReference()
    src_srs.ImportFromEPSG(4326)  # 定义输出的坐标系为"WGS 84"，AUTHORITY["EPSG","4326"]
    src_srs_wkt = src_srs.ExportToWkt()   # 给新建图层赋予投影信息

    st = datetime.datetime(1900, 1, 1, 0, 0)
    time1 = []

    for n in time:
        time1.append(str(st + datetime.timedelta(hours=int(n))))

    key = np.array(dataset.variables['t2m'][0, :, :])

    ## 测试是否需要缩放和偏移，结果表明不需要
    # print(dataset.variables['t2m'].set_auto_maskandscale(True))
    # print(dataset.variables['t2m'][0, :, :])  # data1
    # print(dataset.variables['t2m'].set_auto_maskandscale(False))  # 设置缩放和掩膜数组关闭
    # print(dataset.variables['t2m'][0, :, :])  # 关闭后打印数据，得出的数据全部为整数
    # 将关闭后的数据应用缩放和偏移转换
    # print(dataset.variables['t2m'][0, :, :] * dataset.variables['t2m'].scale_factor + dataset.variables['t2m'].add_offset)  # data2

    dayarray = np.zeros(key.shape, key.dtype)
    dayarray1 = np.zeros(key.shape, key.dtype)

    hour = 0
    a = 0
    for i in range(0, len(time)):
        dayarray = dayarray + np.array(dataset.variables['t2m'][i, :, :])
        hour = hour + 1
        if hour % 24 == 0:
            numarray = np.ones(dayarray.shape, dayarray.dtype) * 24
            dayarray = dayarray / numarray
            out_file = os.path.join(output_dir, time1[a][0:10] + ".tif")
            print(out_file)
            write_tiff(out_file, geo, src_srs_wkt, rows, cols, dayarray)
            a = a + 24
            dayarray = dayarray1

这里有一个点，我用matlab查看Nc数据时发现，有scale_factor和offset_factor，就在想这个数据是否是真实值还是需要处理的，于是测试了一下是否需要偏移和缩放。

一般来说如果值是整数有可能需要，如果是小数则不需要

测试是否需要缩放和偏移参考：

https://blog.csdn.net/Will_Zhan/article/details/105388609

2. 转tif

def nc_to_tif(nc_file, output_dir):
    nc_file_name = os.path.basename(nc_file)

    # 提取nc文件
    dayextract_nc(nc_file, output_dir)

def write_tiff(tiff_file, geo, proj, rows, cols, data_array):

    driver = gdal.GetDriverByName("Gtiff")

    outdataset = driver.Create(tiff_file, cols, rows, 1, gdal.GDT_Float32)
    outdataset.SetGeoTransform(geo)
    outdataset.SetProjection(proj)

    band = outdataset.GetRasterBand(1)
    band.WriteArray(data_array)