Working with the ECCC GeoAPI to access weather station data

Environment and Climate Change Canada (ECCC) hosts a data server compatible with the GeoAPI standard. This notebook shows how to send requests for daily climate station data and display the results.

Climate stations

The server holds different collections, and requests are made to a particular collection. Here we’ll start with the climate-stations collection, which holds metadata about available stations, but no actual meteorological data. Useful queryable fields in this collection include DLY_FIRST_DATE and DLY_LAST_DATE, ENG_PROV_NAME, LATITUDE, LONGITUDE, ELEVATION and STATION_NAME, among many others.
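The list of queryable fields can itself be requested: OGC API servers typically expose a queryables endpoint for each collection. The sketch below only composes that URL with the standard library (fetching it would require network access):

```python
from urllib.parse import urljoin

# Each collection typically advertises its filterable fields at .../queryables.
host = "https://api.weather.gc.ca/"
queryables_url = urljoin(host, "collections/climate-stations/queryables")
print(queryables_url)
# → https://api.weather.gc.ca/collections/climate-stations/queryables
```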

Creating a request to the server for data

Let’s start by showing a map of all available station locations in Nova Scotia. To do so, we first need to compose a URL request. The request includes the address of the server, the collection, then a query to filter the results.

import os

os.environ["USE_PYGEOS"] = "0"  # force GeoPandas to use Shapely rather than PyGEOS

import urllib

import geopandas as gpd
from urlpath import URL

# Compose the request
host = URL("https://api.weather.gc.ca")
climate_stations = host / "collections" / "climate-stations" / "items"
url = climate_stations.with_query({"ENG_PROV_NAME": "NOVA-SCOTIA"})
print(url)

# Send the request to the server
resp = url.get()
resp
https://api.weather.gc.ca/collections/climate-stations/items?ENG_PROV_NAME=NOVA-SCOTIA
<Response [200]>

The response from the server is a Response class instance. What we’re interested in is the content of this response, which in this case is a GeoJSON document.

# NBVAL_IGNORE_OUTPUT

resp.content[:100]
b'{"type":"FeatureCollection","features":[{"type":"Feature","properties":{"STN_ID":6289,"STATION_NAME"'

We’ll open the geoJSON using geopandas. We have a few options to do this:

  • Load the response’s content using json.loads, then create a GeoDataFrame using the from_features class method;

  • Save the response content to a file on disk, then open using geopandas.read_file;

  • Wrap the response content in an in-memory file object (io.BytesIO, since the content is bytes);

  • Call geopandas.read_file(url) to let geopandas handle the data download.

Here we’ll use the last option, as it’s the simplest. Note that the first method discards the feature id, which seems to cause problems with the folium visualization below.

# NBVAL_IGNORE_OUTPUT

# The first approach would look like this:
# import json
# stations = gpd.GeoDataFrame.from_features(json.loads(resp.content))

with urllib.request.urlopen(url=str(url)) as req:
    stations = gpd.read_file(filename=req, engine="pyogrio")
stations.head()
id STN_ID STATION_NAME PROV_STATE_TERR_CODE ENG_PROV_NAME FRE_PROV_NAME COUNTRY LATITUDE LONGITUDE TIMEZONE ... HLY_FIRST_DATE HLY_LAST_DATE DLY_FIRST_DATE DLY_LAST_DATE MLY_FIRST_DATE MLY_LAST_DATE HAS_MONTHLY_SUMMARY HAS_NORMALS_DATA HAS_HOURLY_DATA geometry
0 8200100 6289 ANNAPOLIS ROYAL NS NOVA SCOTIA NOUVELLE-ÉCOSSE CAN 444500000 -653100000 AST ... NaT NaT 1914-04-01 2007-06-04 1914-01-01 2006-02-01 Y N N POINT (-65.51667 44.75000)
1 8200400 6298 BARRIE BROOK NS NOVA SCOTIA NOUVELLE-ÉCOSSE CAN 453900000 -612600000 AST ... NaT NaT 1950-02-01 1972-01-31 1950-01-01 1972-12-01 Y N N POINT (-61.43333 45.65000)
2 8200575 6302 BEDFORD NS NOVA SCOTIA NOUVELLE-ÉCOSSE CAN 444400000 -634000000 AST ... NaT NaT 1954-12-01 1966-05-31 1955-01-01 1966-12-01 Y N N POINT (-63.66667 44.73333)
3 8200579 6303 BIG INTERVALE NS NOVA SCOTIA NOUVELLE-ÉCOSSE CAN 465000000 -603700000 AST ... NaT NaT 1984-09-01 1992-12-31 1984-01-01 1992-12-01 Y N N POINT (-60.61667 46.83333)
4 8200630 6309 CAMBRIDGE STATION NS NOVA SCOTIA NOUVELLE-ÉCOSSE CAN 450400000 -643700000 AST ... NaT NaT 1973-10-01 1978-09-30 1973-01-01 1978-12-01 Y N N POINT (-64.61667 45.06667)

5 rows × 34 columns

Filter stations

Now let’s say we want to keep only the stations that were in operation for at least 50 years. We’ll create a new column n_days and filter on it.

# NBVAL_IGNORE_OUTPUT

import pandas as pd

# Create a datetime.Timedelta object from the subtraction of two dates.
delta = pd.to_datetime(stations["DLY_LAST_DATE"]) - pd.to_datetime(
    stations["DLY_FIRST_DATE"]
)

# Get the number of days in the time delta
stations["n_days"] = delta.dt.days

# Compute condition
over_50 = stations["n_days"] > 50 * 365.25

# Index the data frame using the condition
select = stations[over_50]
select.head()
id STN_ID STATION_NAME PROV_STATE_TERR_CODE ENG_PROV_NAME FRE_PROV_NAME COUNTRY LATITUDE LONGITUDE TIMEZONE ... HLY_LAST_DATE DLY_FIRST_DATE DLY_LAST_DATE MLY_FIRST_DATE MLY_LAST_DATE HAS_MONTHLY_SUMMARY HAS_NORMALS_DATA HAS_HOURLY_DATA geometry n_days
0 8200100 6289 ANNAPOLIS ROYAL NS NOVA SCOTIA NOUVELLE-ÉCOSSE CAN 444500000 -653100000 AST ... NaT 1914-04-01 2007-06-04 1914-01-01 2006-02-01 Y N N POINT (-65.51667 44.75000) 34032
9 8201410 6336 DEMING NS NOVA SCOTIA NOUVELLE-ÉCOSSE CAN 451259007 -611040090 AST ... NaT 1956-10-01 2011-12-31 1956-01-01 2006-02-01 Y Y N POINT (-61.17780 45.21639) 20179
29 8203400 6399 MALAY FALLS NS NOVA SCOTIA NOUVELLE-ÉCOSSE CAN 445900000 -622900000 AST ... 2000-08-31 23:00:00 1950-02-01 2000-08-31 1950-01-01 2000-08-01 Y N N POINT (-62.48333 44.98333) 18474
45 8205698 6485 SYDNEY NS NOVA SCOTIA NOUVELLE-ÉCOSSE CAN 460900000 -601200000 AST ... NaT 1870-01-01 1941-03-31 1870-01-01 1941-12-01 Y N N POINT (-60.20000 46.15000) 26021
50 8206300 6506 WHITEHEAD NS NOVA SCOTIA NOUVELLE-ÉCOSSE CAN 451300000 -611100000 AST ... NaT 1883-12-01 1960-06-30 1883-01-01 1960-12-01 Y N N POINT (-61.18333 45.21667) 27970

5 rows × 35 columns

Map the data

We can then simply map the locations of stations with at least 50 years of data using the explore method. This displays an interactive base map and overlays the station locations; clicking on a station marker displays that station’s information.

On top of this map, we’ll add controls to draw a rectangle. To use the drawing tool, click on the square in the menu on the left-hand side, then click and drag to draw a rectangle over the area of interest. Once that’s done, click the Export button on the right of the map. This downloads a file called data.geojson.

from folium.plugins import Draw

# Add control to draw a rectangle, and an export button.
draw_control = Draw(
    draw_options={
        "polyline": False,
        "poly": False,
        "circle": False,
        "polygon": False,
        "marker": False,
        "circlemarker": False,
        "rectangle": True,
    },
    export=True,
)

# The map library Folium chokes on columns including time stamps, so we first select the data to plot.
m = select[["geometry", "n_days"]].explore("n_days")
draw_control.add_to(m)
m
(Interactive map with station markers color-coded by n_days and a rectangle-drawing control.)

Filter stations using bounding box

Next, we’ll use the bounding box drawn on the map to select a subset of stations. We first open the data.geojson file downloaded to disk, create a shapely geometry from it, and use it to filter the stations.
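Under the hood, GeoPandas’ within applies shapely’s point-in-polygon predicate element-wise. A minimal sketch with made-up coordinates (shapely only, no downloaded file needed):

```python
from shapely.geometry import Point, box

# A rectangle built from (minx, miny, maxx, maxy) bounds, as a drawn
# bbox would be, and two hypothetical station locations.
rect = box(-62.0, 44.5, -60.0, 46.0)
points = [Point(-61.2, 45.2), Point(-65.5, 44.7)]

# `within` is the predicate GeoPandas applies element-wise in select.within(...).
mask = [p.within(rect) for p in points]
print(mask)  # → [True, False]
```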

# NBVAL_IGNORE_OUTPUT

# Adjust directory if running this locally.
# rect = gpd.read_file("~/Downloads/data.geojson")

# Here we're using an existing file so the notebook runs without user interaction.
rect = gpd.read_file(filename="./data.geojson", engine="pyogrio")

# Filter stations DataFrame using bbox
inbox = select.within(rect.loc[0].geometry)

print("Number of stations within subregion: ", sum(inbox))
sub_select = select[inbox]
sub_select.head()
Number of stations within subregion:  9
id STN_ID STATION_NAME PROV_STATE_TERR_CODE ENG_PROV_NAME FRE_PROV_NAME COUNTRY LATITUDE LONGITUDE TIMEZONE ... HLY_LAST_DATE DLY_FIRST_DATE DLY_LAST_DATE MLY_FIRST_DATE MLY_LAST_DATE HAS_MONTHLY_SUMMARY HAS_NORMALS_DATA HAS_HOURLY_DATA geometry n_days
9 8201410 6336 DEMING NS NOVA SCOTIA NOUVELLE-ÉCOSSE CAN 451259007 -611040090 AST ... NaT 1956-10-01 2011-12-31 1956-01-01 2006-02-01 Y Y N POINT (-61.17780 45.21639) 20179
45 8205698 6485 SYDNEY NS NOVA SCOTIA NOUVELLE-ÉCOSSE CAN 460900000 -601200000 AST ... NaT 1870-01-01 1941-03-31 1870-01-01 1941-12-01 Y N N POINT (-60.20000 46.15000) 26021
50 8206300 6506 WHITEHEAD NS NOVA SCOTIA NOUVELLE-ÉCOSSE CAN 451300000 -611100000 AST ... NaT 1883-12-01 1960-06-30 1883-01-01 1960-12-01 Y N N POINT (-61.18333 45.21667) 27970
128 8201000 6329 COLLEGEVILLE NS NOVA SCOTIA NOUVELLE-ÉCOSSE CAN 452900000 -620100000 AST ... NaT 1916-06-01 2016-09-30 1916-01-01 2006-02-01 Y Y N POINT (-62.01667 45.48333) 36646
155 8205600 6481 STILLWATER NS NOVA SCOTIA NOUVELLE-ÉCOSSE CAN 451100000 -620000000 AST ... NaT 1915-12-01 1979-10-31 1915-01-01 1979-12-01 Y N N POINT (-62.00000 45.18333) 23345

5 rows × 35 columns

Request meteorological data

Now we’ll make a request for actual meteorological data from the stations filtered above. For this, we’ll use the Daily Climate Observations collection (climate-daily). Here we’re picking just one station, but we could easily loop over each station.
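Looping over several stations amounts to building one request URL per CLIMATE_IDENTIFIER. This sketch only composes the URLs (no requests are sent); the ids are stations from the filtered table above:

```python
# Build one climate-daily request URL per station id (no network access here).
base = "https://api.weather.gc.ca/collections/climate-daily/items"
station_ids = ["8201410", "8205698", "8206300"]  # from the filtered stations above

urls = [f"{base}?CLIMATE_IDENTIFIER={sid}&limit=365" for sid in station_ids]
for u in urls:
    print(u)
```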

# NBVAL_IGNORE_OUTPUT

coll = host / "collections" / "climate-daily" / "items"
station_id = "8201410"

# Restricting the number of entries returned to keep things fast.
url = str(coll.with_query({"CLIMATE_IDENTIFIER": station_id, "limit": 365}))
print("Request: ", url)
with urllib.request.urlopen(url=str(url)) as req:
    data = gpd.read_file(filename=req, engine="pyogrio")
data.head()
Request:  https://api.weather.gc.ca/collections/climate-daily/items?CLIMATE_IDENTIFIER=8201410&limit=365
id SNOW_ON_GROUND_FLAG SNOW_ON_GROUND SPEED_MAX_GUST TOTAL_SNOW_FLAG SPEED_MAX_GUST_FLAG LOCAL_MONTH LOCAL_DAY MIN_TEMPERATURE TOTAL_PRECIPITATION ... TOTAL_RAIN CLIMATE_IDENTIFIER COOLING_DEGREE_DAYS_FLAG MAX_REL_HUMIDITY MIN_TEMPERATURE_FLAG DIRECTION_MAX_GUST PROVINCE_CODE MAX_TEMPERATURE_FLAG MAX_TEMPERATURE geometry
0 8201410.1996.6.13 None 0.0 None None None 6 13 9.0 0.0 ... 0.0 8201410 None None None None NS None 12.5 POINT (-61.17780 45.21639)
1 8201410.1993.12.17 None 0.0 None None None 12 17 -2.5 0.0 ... 0.0 8201410 None None None None NS None 0.5 POINT (-61.17780 45.21639)
2 8201410.1967.11.6 None 0.0 None None None 11 6 3.9 0.0 ... 0.0 8201410 None None None None NS None 8.9 POINT (-61.17780 45.21639)
3 8201410.2003.1.14 None 45.0 None None None 1 14 -6.0 0.0 ... 0.0 8201410 None None None None NS None -3.5 POINT (-61.17780 45.21639)
4 8201410.2009.1.30 None 5.0 None None None 1 30 -5.5 0.0 ... 0.0 8201410 None None None None NS None -1.0 POINT (-61.17780 45.21639)

5 rows × 36 columns

We can also send a request for data inside a bounding box at a specific date.

bbox = rect.iloc[0].geometry.bounds
print("Bounding box: ", bbox)
url = str(
    coll.with_query(
        {
            "bbox": str(bbox).strip("()"),
            "LOCAL_DATE": "2000-01-01 00:00:00",
            "limit": 100,
        }
    )
)
with urllib.request.urlopen(url=str(url)) as req:
    snapshot = gpd.read_file(filename=req, engine="pyogrio")
Bounding box:  (-62.186675, 44.78125, -59.123882, 47.53125)
from cartopy import crs as ccrs
from matplotlib import pyplot as plt

# Create map projection
proj = ccrs.PlateCarree()

# If using another projection, remember you'll need to reproject the snapshot's coordinates.
# snapshot.to_crs(proj, inplace=True)

# Create figure and axes
fig = plt.figure(figsize=(5, 3))
ax = fig.add_subplot(projection=proj)

# Set the map extent to the bounding box and draw the coastlines
ax.set_extent([bbox[0], bbox[2], bbox[1], bbox[3]])
ax.coastlines()

# Plot markers color-coded according to the temperature recorded.
ax = snapshot.plot(column="MEAN_TEMPERATURE", ax=ax, cmap=plt.cm.viridis, legend=True)

# Add a label to the colorbar
cax = ax.figure.axes[-1]
cax.set_ylabel("Mean temperature [°C]")
(Figure: coastline map with station markers color-coded by mean temperature, with a colorbar labeled “Mean temperature [°C]”.)

Another useful filter is on dates and times. If we only want data in a given period, we simply create a request with the datetime argument, using a / to separate the start and end dates. You may leave the start or end date open-ended by using .. instead of a datetime string.
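The interval syntax can be sketched without sending a request; urlencode takes care of escaping the / separator, and .. marks an open end:

```python
from urllib.parse import urlencode

base = "https://api.weather.gc.ca/collections/climate-daily/items"

# Closed interval: "start/end"
closed = urlencode({
    "CLIMATE_IDENTIFIER": "8201410",
    "datetime": "1990-01-01 00:00:00/1991-01-01 00:00:00",
})

# Open-ended start: ".." instead of a datetime string
open_start = urlencode({
    "CLIMATE_IDENTIFIER": "8201410",
    "datetime": "../1991-01-01 00:00:00",
})

print(f"{base}?{closed}")
print(f"{base}?{open_start}")
```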

url = str(
    coll.with_query(
        {
            "CLIMATE_IDENTIFIER": station_id,
            "datetime": "1990-01-01 00:00:00/1991-01-01 00:00:00",
        }
    )
)
print(url)
with urllib.request.urlopen(url=str(url)) as req:
    gdf = gpd.read_file(filename=req, engine="pyogrio")
https://api.weather.gc.ca/collections/climate-daily/items?CLIMATE_IDENTIFIER=8201410&datetime=1990-01-01+00%3A00%3A00%2F1991-01-01+00%3A00%3A00
# Convert the datetime string to a datetime object
gdf["LOCAL_DATE"] = pd.to_datetime(gdf["LOCAL_DATE"])

# Create a time series out of the column for mean temperature
ts = gdf.set_index("LOCAL_DATE")["MEAN_TEMPERATURE"]
# Plot the time series
ax = ts.plot()
ax.set_xlabel("Time")
ax.set_ylabel("Mean temperature [°C]")
ax.set_title(gdf.iloc[0]["STATION_NAME"])
plt.show()
(Figure: 1990 daily mean temperature time series at station DEMING.)

Adjusted and Homogenized Canadian Climate Data (AHCCD)

The Adjusted and Homogenized Canadian Climate Data (AHCCD) datasets from ECCC are climate station data adjusted to account for discontinuities in the record, such as instrument relocation. The collections related to these datasets are ahccd-stations for station metadata, ahccd-annual, ahccd-monthly and ahccd-seasonal for temporally aggregated time series, and ahccd-trends for trends computed on the data.

Unfortunately, the fields for these datasets differ from those of the climate stations. One strategy to find out which keywords are accepted is to make a query with no filter except limit=1. Another is to go to the collection search page (click on the link printed below) and inspect the column names.

# NBVAL_IGNORE_OUTPUT

# The url to query station metadata - this should behave similarly as `climate-stations`
ahccd_stations = host / "collections" / "ahccd-stations" / "items"
url = ahccd_stations.with_query({"limit": 1})
print(ahccd_stations)
with urllib.request.urlopen(url=str(url)) as req:
    gpd.read_file(filename=req, engine="pyogrio")
https://api.weather.gc.ca/collections/ahccd-stations/items

So if we want to see the stations in Yukon, we have to query with the province__province keyword. Now, how do you know which code to use for provinces? One solution is to go again to the collection search page, zoom in on the area of interest, check the “Only show items by map view” box, then inspect the results.

# NBVAL_IGNORE_OUTPUT

url = ahccd_stations.with_query({"province__province": "YT"})
with urllib.request.urlopen(url=str(url)) as req:
    gpd.read_file(filename=req, engine="pyogrio")

Let’s pick the Dawson station (2100LRP), which seems to have a long record. Again, use the trick above to see which fields are accepted.

# NBVAL_IGNORE_OUTPUT

ahccd_mon = host / "collections" / "ahccd-monthly" / "items"
url = ahccd_mon.with_query({"station_id__id_station": "2100LRP"})
with urllib.request.urlopen(url=str(url)) as req:
    mts = gpd.read_file(filename=req, engine="pyogrio")
mts.head()
id temp_min__temp_min rain_units__pluie_unites snow_units__neige_unites total_precip__precip_totale temp_max__temp_max pressure_station__pression_station wind_speed_units__vitesse_vent_unites date identifier__identifiant ... temp_min_units__temp_min_unites pressure_sea_level__pression_niveau_mer pressure_station_units__pression_station_unites temp_max_units__temp_max_unites snow__neige period_group__groupe_periode total_precip_units__precip_totale_unites period_value__valeur_periode pressure_sea_level_units__pression_niveau_mer_unite geometry
0 2100LRP.1983.03 -24.5 mm mm None -6.8 None kph 1983-03 2100LRP.1983.03 ... C None hPa C None Monthly mm Mar hPa POINT (-139.13000 64.07000)
1 2100LRP.1921.03 -24.7 mm mm None -8.8 None kph 1921-03 2100LRP.1921.03 ... C None hPa C None Monthly mm Mar hPa POINT (-139.13000 64.07000)
2 2100LRP.1992.06 6.4 mm mm None 22.1 None kph 1992-06 2100LRP.1992.06 ... C None hPa C None Monthly mm Jun hPa POINT (-139.13000 64.07000)
3 2100LRP.1938.03 -22.9 mm mm None -4.5 None kph 1938-03 2100LRP.1938.03 ... C None hPa C None Monthly mm Mar hPa POINT (-139.13000 64.07000)
4 2100LRP.1963.09 0.9 mm mm None 12.1 None kph 1963-09 2100LRP.1963.09 ... C None hPa C None Monthly mm Sep hPa POINT (-139.13000 64.07000)

5 rows × 28 columns

Now let’s plot the mean temperature time series. Note that the server does not necessarily return a continuous time series, and when plotting a series with gaps, matplotlib simply draws a straight line across them. To convey the presence of missing values, here we’ll use the asfreq("MS") method to fill gaps in the time series with explicit missing values.
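What asfreq("MS") does can be shown on a tiny synthetic series (the values below are made up; "MS" is pandas’ month-start frequency alias):

```python
import pandas as pd

# A monthly series with February 2000 missing.
idx = pd.to_datetime(["2000-01-01", "2000-03-01", "2000-04-01"])
s = pd.Series([1.0, 3.0, 4.0], index=idx)

# asfreq("MS") reindexes to a complete month-start index, inserting
# NaN for the missing month, so plots show a gap instead of a line.
filled = s.asfreq("MS")
print(filled)
```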

# Set the DataFrame index to a datetime object and sort it
mts.set_index(pd.to_datetime(mts["date"]), inplace=True)
mts.sort_index(inplace=True)

# Convert the temperature to a continuous monthly time series (so missing values are visible in the graphic)
tas = mts["temp_mean__temp_moyenne"].asfreq("MS")

# Mask the large negative sentinel values that flag missing data
tas = tas.mask(tas < -300)

tas.plot(figsize=(12, 3))
plt.ylabel("Mean monthly temperature [°C]")
(Figure: Dawson monthly mean temperature time series, with gaps where values are missing.)