Python API#

Introduction#

The API offers access to different data products, which are outlined in more detail in the Coverage chapter. Complete examples of how to use the API are available in the example folder. To explore all features interactively, you might want to try the CLI.

Available APIs#

The available APIs can be accessed through the top-level API Wetterdienst. It also allows the user to discover the APIs available for each included service:

In [1]: from wetterdienst import Wetterdienst

In [2]: Wetterdienst.discover()
Out[2]: 
{'DWD': ['OBSERVATION', 'MOSMIX', 'RADAR'],
 'ECCC': ['OBSERVATION'],
 'NOAA': ['GHCN'],
 'WSV': ['PEGEL'],
 'EA': ['HYDROLOGY'],
 'NWS': ['OBSERVATION'],
 'EAUFRANCE': ['HUBEAU'],
 'GEOSPHERE': ['OBSERVATION']}

To load any of the available APIs, pass the provider and the network to the Wetterdienst API:

In [3]: from wetterdienst import Wetterdienst

In [4]: API = Wetterdienst(provider="dwd", network="observation")
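The mapping returned by Wetterdienst.discover() is a plain dictionary, so it can be inspected without further API calls. The following is a minimal sketch that works on a copy of the output shown above rather than on a live call; the helper name list_networks is hypothetical and not part of wetterdienst.

```python
# Copy of the Wetterdienst.discover() output shown above; a live call may
# return a different set of providers and networks.
AVAILABLE = {
    "DWD": ["OBSERVATION", "MOSMIX", "RADAR"],
    "ECCC": ["OBSERVATION"],
    "NOAA": ["GHCN"],
    "WSV": ["PEGEL"],
    "EA": ["HYDROLOGY"],
    "NWS": ["OBSERVATION"],
    "EAUFRANCE": ["HUBEAU"],
    "GEOSPHERE": ["OBSERVATION"],
}


def list_networks(available):
    """Flatten the provider -> networks mapping into (provider, network) pairs."""
    return [
        (provider.lower(), network.lower())
        for provider, networks in available.items()
        for network in networks
    ]


pairs = list_networks(AVAILABLE)
for provider, network in pairs:
    # Each pair corresponds to the keyword arguments used in the example above.
    print(f'Wetterdienst(provider="{provider}", network="{network}")')
```

Each printed line has the same shape as the call used to load the DWD observation API.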

Request arguments#

Some of the wetterdienst request arguments, e.g. parameter, resolution and period, are based on enumerations. This allows the user to define them in three different ways:

  • by using the exact enumeration e.g.
    Parameter.CLIMATE_SUMMARY
    
  • by using the enumeration name (our proposed name) e.g.
    "climate_summary" or "CLIMATE_SUMMARY"
    
  • by using the enumeration value (most probably the original name) e.g.
    "kl"
    

This gives the user a lot of flexibility: arguments can be defined either by what they know from the weather service or by what they know from wetterdienst itself.
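How such a three-way lookup could work is sketched below with a stand-in enumeration. ExampleParameter and parse_enumeration are hypothetical names used for illustration only; they are not part of wetterdienst, and the library's own resolution logic may differ.

```python
from enum import Enum


class ExampleParameter(Enum):
    # Member name is the proposed name, value is the provider's original name.
    CLIMATE_SUMMARY = "kl"
    PRECIPITATION_MORE = "more_precip"


def parse_enumeration(enumeration, value):
    """Resolve a member given as enum member, member name or member value."""
    if isinstance(value, enumeration):
        return value
    try:
        # Names are matched case-insensitively, e.g. "climate_summary".
        return enumeration[str(value).upper()]
    except KeyError:
        pass
    # Fall back to the member value; raises ValueError for unknown input.
    return enumeration(value)


resolved = [
    parse_enumeration(ExampleParameter, ExampleParameter.CLIMATE_SUMMARY),
    parse_enumeration(ExampleParameter, "climate_summary"),
    parse_enumeration(ExampleParameter, "CLIMATE_SUMMARY"),
    parse_enumeration(ExampleParameter, "kl"),
]
print(resolved)
```

All four spellings resolve to the same member, which is the behaviour the three bullet points above describe.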

Typical requests are defined by five arguments:

  • parameter

  • resolution

  • period

  • start_date

  • end_date

Only the parameter, start_date and end_date arguments may be needed for a request, as the resolution and period of the data may be fixed (per station or for all data) within individual services. However, if the period is not defined, it is assumed that the user wants data for all available periods, and the request is handled accordingly.

The start_date and end_date arguments can replace the period argument if the period of a weather service is fixed. If both are given, they are combined, so data is only taken from the given period and within the given time span.

Enumerations for resolution and period arguments are given at the main level e.g.

In [5]: from wetterdienst import Resolution, Period

or at the domain specific level e.g.

In [6]: from wetterdienst.provider.dwd.observation import DwdObservationResolution, DwdObservationPeriod

Both enumerations can be used interchangeably. However, the weather service's enumeration is limited to the resolutions and periods that are actually available, while the main-level enumeration is the union of all resolutions and periods found at the different weather services.

Regarding the definition of requested parameters:

Parameters can be requested in three different ways:

  1. Requesting an entire dataset e.g. climate_summary

from wetterdienst.provider.dwd.observation import DwdObservationRequest
request = DwdObservationRequest(
    parameter="kl",
    resolution="daily",  # DWD requests also need a resolution
)
  2. Requesting one parameter of a specific resolution without defining the exact dataset.

For each offered resolution we have created a list of unique parameters drawn from all datasets, e.g. when two datasets contain a somewhat similar parameter, we pre-select the dataset from which the parameter is taken.

from wetterdienst.provider.dwd.observation import DwdObservationRequest
request = DwdObservationRequest(
    parameter="precipitation_height",
    resolution="daily",  # DWD requests also need a resolution
)
  3. Requesting a parameter-dataset tuple

    This gives you complete freedom to request a unique parameter-dataset tuple just as you wish.

from wetterdienst.provider.dwd.observation import DwdObservationRequest
request = DwdObservationRequest(
    parameter=[("precipitation_height", "more_precip"), ("temperature_air_mean_200", "kl")],
    resolution="daily",  # DWD requests also need a resolution
)

Core settings#

Wetterdienst holds core settings in its Settings class. You can import and display the settings like this:

In [7]: from wetterdienst import Settings

In [8]: settings = Settings.default()  # default settings

In [9]: print(settings)
Settings({
    "cache_disable": false,
    "cache_dir": "/home/docs/.cache/wetterdienst",
    "fsspec_client_kwargs": {},
    "humanize": true,
    "tidy": true,
    "si_units": true,
    "skip_empty": false,
    "skip_threshold": 0.95,
    "dropna": false,
    "interp_use_nearby_station_until_km": 1
})

or modify them for your very own request like

In [10]: from wetterdienst import Settings

In [11]: settings = Settings(tidy=False)

In [12]: print(settings)
Settings({
    "cache_disable": false,
    "cache_dir": "/home/docs/.cache/wetterdienst",
    "fsspec_client_kwargs": {},
    "humanize": true,
    "tidy": false,
    "si_units": true,
    "skip_empty": false,
    "skip_threshold": 0.95,
    "dropna": false,
    "interp_use_nearby_station_until_km": 1
})

Settings has four layers from which arguments are sourced:

  • Settings arguments e.g. Settings(tidy=True)

  • environment variables e.g. WD_SCALAR_TIDY = "0"

  • local .env file in the same folder (same as above)

  • default arguments set by us

The arguments are overruled in the above order, meaning:

  • Settings argument overrules environment variable

  • environment variable overrules .env file

  • .env file overrules default argument
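The precedence described above can be sketched with plain dictionaries standing in for the four layers. The function resolve_setting, the DEFAULTS mapping and the rule that "0" means False are assumptions made for this illustration; they do not reproduce how the Settings class is implemented.

```python
# Stand-in for the default arguments; all three default to True.
DEFAULTS = {"tidy": True, "humanize": True, "si_units": True}


def resolve_setting(name, explicit, environ, dotenv, defaults=DEFAULTS):
    """Return the value from the first layer that defines the setting."""
    if name in explicit:
        return explicit[name]
    env_key = f"WD_SCALAR_{name.upper()}"
    for layer in (environ, dotenv):
        if env_key in layer:
            # Assumption: "0" means False, anything else means True.
            return layer[env_key] != "0"
    return defaults[name]


# Hypothetical layer contents.
environ = {"WD_SCALAR_TIDY": "0"}
dotenv = {"WD_SCALAR_TIDY": "1", "WD_SCALAR_HUMANIZE": "0"}

print(resolve_setting("tidy", {"tidy": True}, environ, dotenv))  # argument layer
print(resolve_setting("tidy", {}, environ, dotenv))              # environment layer
print(resolve_setting("humanize", {}, environ, dotenv))          # .env layer
print(resolve_setting("si_units", {}, environ, dotenv))          # default layer
```

Each call is answered by the highest layer that defines the setting, so the environment value for tidy hides the conflicting .env entry.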

The evaluation of environment variables can be skipped by using ignore_env:

In [13]: from wetterdienst import Settings

In [14]: Settings.default()  # similar to Settings(ignore_env=True)
Out[14]: Settings(cache_disable:False,cache_dir:/home/docs/.cache/wetterdienst,fsspec_client_kwargs:{},humanize:True,tidy:True,si_units:True,skip_empty:False,skip_threshold:0.95,dropna:False,interp_use_nearby_station_until_km:1)

and to set it back to standard

In [15]: from wetterdienst import Settings

In [16]: settings = Settings(tidy=False)

In [17]: settings = settings.reset()

The environment variables recognized by our settings are

  • WD_CACHE_DISABLE

  • WD_FSSPEC_CLIENT_KWARGS

  • WD_SCALAR_HUMANIZE

  • WD_SCALAR_TIDY

  • WD_SCALAR_SI_UNITS

  • WD_SCALAR_SKIP_EMPTY

  • WD_SCALAR_SKIP_THRESHOLD

  • WD_SCALAR_DROPNA

  • WD_SCALAR_INTERPOLATION_USE_NEARBY_STATION_UNTIL_KM

Scalar arguments are:

  • tidy can be used to reshape the returned data to a tidy format.

  • humanize can be used to rename parameters to more meaningful names.

  • si_units can be used to convert values to SI units.

  • skip_empty (requires option tidy) can be used to skip empty stations. Empty stations are defined via skip_threshold, which defaults to 0.95 and requires all parameters that are requested (for an entire dataset all of the dataset parameters) to have at least 95 per cent of actual values (relative to start and end date if provided)

  • skip_threshold is used in combination with skip_empty to define when a station is empty, with 1.0 meaning no values per parameter should be missing and e.g. 0.9 meaning 10 per cent of values can be missing

  • dropna (requires option tidy) is used to drop all empty entries thus reducing the workload

  • fsspec_client_kwargs can be used to pass arguments to fsspec, especially for querying data behind a proxy
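The interplay of skip_empty and skip_threshold can be illustrated with a small calculation. This is a sketch under the assumption that missing values are represented as None and that a station counts as empty as soon as one requested parameter falls below the threshold; the function names and the sample series are hypothetical.

```python
def share_of_actual_values(series):
    """Fraction of entries in a series that are not missing."""
    if not series:
        return 0.0
    return sum(value is not None for value in series) / len(series)


def is_empty_station(values_by_parameter, skip_threshold=0.95):
    """A station is treated as empty if any parameter is below the threshold."""
    return any(
        share_of_actual_values(series) < skip_threshold
        for series in values_by_parameter.values()
    )


# Hypothetical station with 20 time steps per parameter.
station = {
    "temperature_air_mean_200": [278.15] * 19 + [None],  # 95 per cent present
    "precipitation_height": [0.0] * 18 + [None, None],   # 90 per cent present
}

print(is_empty_station(station))                      # default threshold 0.95
print(is_empty_station(station, skip_threshold=0.9))  # relaxed threshold
```

With the default threshold the station would be skipped because of the precipitation series; relaxing the threshold to 0.9 keeps it.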

tidy, humanize and si_units all default to True.

If your system is running behind a proxy, you may want to use trust_env like

```python
from wetterdienst import Settings

settings = Settings(fsspec_client_kwargs={"trust_env": True})
```

to allow requesting through a proxy.

Historical Weather Observations#

In the case of the DWD, requests have to be defined by resolution and period (or start_date and end_date, respectively). Use DwdObservationRequest.discover() to discover available parameters based on the given filter arguments.

Stations#

Get station information for a given parameter/dataset, resolution and period.

In [18]: from wetterdienst.provider.dwd.observation import DwdObservationRequest, DwdObservationDataset, DwdObservationPeriod, DwdObservationResolution

In [19]: stations = DwdObservationRequest(
   ....:     parameter=DwdObservationDataset.PRECIPITATION_MORE,
   ....:     resolution=DwdObservationResolution.DAILY,
   ....:     period=DwdObservationPeriod.HISTORICAL
   ....: ).all()
   ....: 

In [20]: df = stations.df

In [21]: print(df.head())
  station_id  ...                state
0      00001  ...    Baden-Württemberg
1      00002  ...  Nordrhein-Westfalen
2      00003  ...  Nordrhein-Westfalen
3      00004  ...  Nordrhein-Westfalen
4      00006  ...    Baden-Württemberg

[5 rows x 8 columns]

The function returns a Pandas DataFrame with information about the available stations.

Filter for specific station ids:

In [22]: from wetterdienst.provider.dwd.observation import DwdObservationRequest, DwdObservationDataset, DwdObservationPeriod, DwdObservationResolution

In [23]: stations = DwdObservationRequest(
   ....:     parameter=DwdObservationDataset.PRECIPITATION_MORE,
   ....:     resolution=DwdObservationResolution.DAILY,
   ....:     period=DwdObservationPeriod.HISTORICAL
   ....: ).filter_by_station_id(station_id=("01048", ))
   ....: 

In [24]: df = stations.df

In [25]: print(df.head())
    station_id                 from_date  ...               name    state
928      01048 1926-04-25 00:00:00+00:00  ...  Dresden-Klotzsche  Sachsen

[1 rows x 8 columns]

Filter for specific station name:

In [26]: from wetterdienst.provider.dwd.observation import DwdObservationRequest, DwdObservationDataset, DwdObservationPeriod, DwdObservationResolution

In [27]: stations = DwdObservationRequest(
   ....:     parameter=DwdObservationDataset.PRECIPITATION_MORE,
   ....:     resolution=DwdObservationResolution.DAILY,
   ....:     period=DwdObservationPeriod.HISTORICAL
   ....: ).filter_by_name(name="Dresden-Klotzsche")
   ....: 

In [28]: df = stations.df

In [29]: print(df.head())
  station_id                 from_date  ...               name    state
0      01048 1926-04-25 00:00:00+00:00  ...  Dresden-Klotzsche  Sachsen

[1 rows x 8 columns]

Values#

Use the DwdObservationRequest class in order to get hold of stations.

In [30]: from wetterdienst.provider.dwd.observation import DwdObservationRequest, DwdObservationDataset, DwdObservationPeriod, DwdObservationResolution

In [31]: from wetterdienst import Settings

In [32]: settings = Settings(tidy=True, humanize=True, si_units=True)

In [33]: request = DwdObservationRequest(
   ....:     parameter=[DwdObservationDataset.CLIMATE_SUMMARY, DwdObservationDataset.SOLAR],
   ....:     resolution=DwdObservationResolution.DAILY,
   ....:     start_date="1990-01-01",
   ....:     end_date="2020-01-01",
   ....:     settings=settings
   ....: ).filter_by_station_id(station_id=[3, 1048])
   ....: 

From here you can query data by station:

In [34]: for result in request.values.query():
   ....:     print(result.df.dropna().head())
   ....: 
  station_id          dataset  ... value quality
1      00003  climate_summary  ...   4.6    10.0
2      00003  climate_summary  ...   5.7    10.0
3      00003  climate_summary  ...   6.7    10.0
4      00003  climate_summary  ...   6.7    10.0
5      00003  climate_summary  ...   7.2    10.0

[5 rows x 6 columns]
  station_id          dataset  ... value quality
3      01048  climate_summary  ...   5.0    10.0
4      01048  climate_summary  ...   9.0    10.0
5      01048  climate_summary  ...   7.0    10.0
6      01048  climate_summary  ...   7.0    10.0
7      01048  climate_summary  ...  10.0    10.0

[5 rows x 6 columns]

Query data all together:

In [35]: df = request.values.all().df.dropna()

In [36]: print(df.head())
  station_id          dataset  ... value quality
0      00003  climate_summary  ...   4.6    10.0
1      00003  climate_summary  ...   5.7    10.0
2      00003  climate_summary  ...   6.7    10.0
3      00003  climate_summary  ...   6.7    10.0
4      00003  climate_summary  ...   7.2    10.0

[5 rows x 6 columns]

This gives us the most options for working with the data: multiple parameters are fetched at once and parsed into a column structure with improved parameter names. Instead of start_date and end_date, you may also want to use period to update your database once in a while with a fixed set of records.
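With tidy=True the values come in long format, with one row per station, parameter and date, as the column listing in the outputs above shows. If one column per parameter is preferred, the long records can be pivoted. The following sketch uses plain Python and hypothetical records instead of a real result; to_wide is not a wetterdienst function.

```python
from collections import defaultdict

# Hypothetical tidy records: one row per station, date and parameter.
tidy_rows = [
    {"station_id": "01048", "date": "2020-01-01", "parameter": "temperature_air_mean_200", "value": 275.15},
    {"station_id": "01048", "date": "2020-01-01", "parameter": "precipitation_height", "value": 0.0},
    {"station_id": "01048", "date": "2020-01-02", "parameter": "temperature_air_mean_200", "value": 276.15},
    {"station_id": "01048", "date": "2020-01-02", "parameter": "precipitation_height", "value": 1.2},
]


def to_wide(rows):
    """Group tidy rows by (station_id, date) with one key per parameter."""
    wide = defaultdict(dict)
    for row in rows:
        wide[(row["station_id"], row["date"])][row["parameter"]] = row["value"]
    return dict(wide)


wide = to_wide(tidy_rows)
for key, parameters in wide.items():
    print(key, parameters)
```

The long format is easier to filter and store, while the wide form is often more convenient for comparing parameters at the same point in time.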

Geospatial support#

Inquire the list of stations by geographic coordinates.

  • Calculate weather stations close to the given coordinates and set of parameters.

  • Select stations by
    • rank (n stations)

    • distance (km, mi,…)

    • bbox

Distance with default (kilometers)

In [37]: from datetime import datetime

In [38]: from wetterdienst.provider.dwd.observation import DwdObservationRequest, DwdObservationDataset, DwdObservationPeriod, DwdObservationResolution

In [39]: stations = DwdObservationRequest(
   ....:     parameter=DwdObservationDataset.TEMPERATURE_AIR,
   ....:     resolution=DwdObservationResolution.HOURLY,
   ....:     period=DwdObservationPeriod.RECENT,
   ....:     start_date=datetime(2020, 1, 1),
   ....:     end_date=datetime(2020, 1, 20)
   ....: )
   ....: 

In [40]: df = stations.filter_by_distance(
   ....:     latlon=(50.0, 8.9),
   ....:     distance=30,
   ....:     unit="km"
   ....: ).df
   ....: 

In [41]: print(df.head())
  station_id                 from_date  ...   state   distance
0      02480 2004-09-01 00:00:00+00:00  ...  Bayern   9.759385
1      04411 2002-01-24 00:00:00+00:00  ...  Hessen  10.156943
2      07341 2005-07-16 00:00:00+00:00  ...  Hessen  12.891318
3      00917 2004-09-01 00:00:00+00:00  ...  Hessen  20.688403
4      01424 2008-08-01 00:00:00+00:00  ...  Hessen  21.680660

[5 rows x 9 columns]

Distance with miles

In [42]: from datetime import datetime

In [43]: from wetterdienst.provider.dwd.observation import DwdObservationRequest, DwdObservationDataset, DwdObservationPeriod, DwdObservationResolution

In [44]: stations = DwdObservationRequest(
   ....:     parameter=DwdObservationDataset.TEMPERATURE_AIR,
   ....:     resolution=DwdObservationResolution.HOURLY,
   ....:     period=DwdObservationPeriod.RECENT,
   ....:     start_date=datetime(2020, 1, 1),
   ....:     end_date=datetime(2020, 1, 20)
   ....: )
   ....: 

In [45]: df = stations.filter_by_distance(
   ....:     latlon=(50.0, 8.9),
   ....:     distance=30,
   ....:     unit="mi"
   ....: ).df
   ....: 

In [46]: print(df.head())
  station_id                 from_date  ...   state   distance
0      02480 2004-09-01 00:00:00+00:00  ...  Bayern   9.759385
1      04411 2002-01-24 00:00:00+00:00  ...  Hessen  10.156943
2      07341 2005-07-16 00:00:00+00:00  ...  Hessen  12.891318
3      00917 2004-09-01 00:00:00+00:00  ...  Hessen  20.688403
4      01424 2008-08-01 00:00:00+00:00  ...  Hessen  21.680660

[5 rows x 9 columns]

Rank selection

In [47]: from datetime import datetime

In [48]: from wetterdienst.provider.dwd.observation import DwdObservationRequest, DwdObservationDataset, DwdObservationPeriod, DwdObservationResolution

In [49]: stations = DwdObservationRequest(
   ....:     parameter=DwdObservationDataset.TEMPERATURE_AIR,
   ....:     resolution=DwdObservationResolution.HOURLY,
   ....:     period=DwdObservationPeriod.RECENT,
   ....:     start_date=datetime(2020, 1, 1),
   ....:     end_date=datetime(2020, 1, 20)
   ....: )
   ....: 

In [50]: df = stations.filter_by_rank(
   ....:     latlon=(50.0, 8.9),
   ....:     rank=5
   ....: ).df
   ....: 

In [51]: print(df.head())
  station_id                 from_date  ...   state   distance
0      02480 2004-09-01 00:00:00+00:00  ...  Bayern   9.759385
1      04411 2002-01-24 00:00:00+00:00  ...  Hessen  10.156943
2      07341 2005-07-16 00:00:00+00:00  ...  Hessen  12.891318
3      00917 2004-09-01 00:00:00+00:00  ...  Hessen  20.688403
4      01424 2008-08-01 00:00:00+00:00  ...  Hessen  21.680660

[5 rows x 9 columns]

Bbox selection

In [52]: from datetime import datetime

In [53]: from wetterdienst.provider.dwd.observation import DwdObservationRequest, DwdObservationDataset, DwdObservationPeriod, DwdObservationResolution

In [54]: stations = DwdObservationRequest(
   ....:     parameter=DwdObservationDataset.TEMPERATURE_AIR,
   ....:     resolution=DwdObservationResolution.HOURLY,
   ....:     period=DwdObservationPeriod.RECENT,
   ....:     start_date=datetime(2020, 1, 1),
   ....:     end_date=datetime(2020, 1, 20)
   ....: )
   ....: 

In [55]: df = stations.filter_by_bbox(
   ....:     left=8.9,
   ....:     bottom=50.0,
   ....:     right=8.91,
   ....:     top=50.01,
   ....: ).df
   ....: 

In [56]: print(df.head())
Empty DataFrame
Columns: [station_id, from_date, to_date, height, latitude, longitude, name, state]
Index: []

The function returns a StationsResult with the list of stations being filtered for distances [in km] to the given coordinates.
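The three selection modes can be reproduced on a small scale to see what each of them does. The sketch below uses the haversine great-circle distance and a handful of hypothetical station coordinates; it is not the library's implementation, which may compute distances differently, and the function names only mirror the methods above for readability.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0
KM_PER_MILE = 1.609344

# Hypothetical stations as id -> (latitude, longitude).
STATIONS = {"A": (50.05, 8.95), "B": (50.2, 9.3), "C": (51.0, 10.0)}


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points in kilometers."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def distances(stations, latlon):
    """Return (station_id, distance_km) pairs sorted by distance."""
    pairs = [(sid, haversine_km(*latlon, *position)) for sid, position in stations.items()]
    return sorted(pairs, key=lambda pair: pair[1])


def filter_by_distance(stations, latlon, distance, unit="km"):
    """Keep stations within the given distance, given in km or mi."""
    limit_km = distance * KM_PER_MILE if unit == "mi" else distance
    return [pair for pair in distances(stations, latlon) if pair[1] <= limit_km]


def filter_by_rank(stations, latlon, rank):
    """Keep the n nearest stations."""
    return distances(stations, latlon)[:rank]


def filter_by_bbox(stations, left, bottom, right, top):
    """Keep stations inside the bounding box (longitude/latitude bounds)."""
    return [
        sid for sid, (lat, lon) in stations.items()
        if left <= lon <= right and bottom <= lat <= top
    ]


print(filter_by_distance(STATIONS, (50.0, 8.9), 30, unit="km"))
print(filter_by_distance(STATIONS, (50.0, 8.9), 30, unit="mi"))
print(filter_by_rank(STATIONS, (50.0, 8.9), 2))
print(filter_by_bbox(STATIONS, left=8.9, bottom=50.0, right=9.0, top=50.1))
```

A very small bounding box, like the one in the example above, can easily contain no station at all, which explains the empty DataFrame.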

Again from here we can jump to the corresponding data:

In [57]: stations = DwdObservationRequest(
   ....:     parameter=DwdObservationDataset.TEMPERATURE_AIR,
   ....:     resolution=DwdObservationResolution.HOURLY,
   ....:     period=DwdObservationPeriod.RECENT,
   ....:     start_date=datetime(2020, 1, 1),
   ....:     end_date=datetime(2020, 1, 20)
   ....: ).filter_by_distance(
   ....:     latlon=(50.0, 8.9),
   ....:     distance=30
   ....: )
   ....: 

In [58]: for result in stations.values.query():
   ....:     print(result.df.dropna().head())
   ....: 
Empty DataFrame
Columns: [station_id, dataset, parameter, date, value, quality]
Index: []
Empty DataFrame
Columns: [station_id, dataset, parameter, date, value, quality]
Index: []
Empty DataFrame
Columns: [station_id, dataset, parameter, date, value, quality]
Index: []
Empty DataFrame
Columns: [station_id, dataset, parameter, date, value, quality]
Index: []
Empty DataFrame
Columns: [station_id, dataset, parameter, date, value, quality]
Index: []
Empty DataFrame
Columns: [station_id, dataset, parameter, date, value, quality]
Index: []

Et voilà: we now have the data for our location and are ready to analyse historical temperature developments.

Interpolation#

Sometimes you might need data for your exact position instead of values measured at the location of a station. For this case, we added the interpolation feature, which allows you to interpolate weather data from the stations around you to your exact location. The function uses the four nearest stations to your given lat/lon point and interpolates the given parameter values. It uses the bilinear interpolation method from the scipy package (interp2d). The interpolation currently only works for DwdObservationRequest and individual parameters. It is still in an early phase and will be improved based on feedback.

The graphic below shows values of the parameter temperature_air_mean_200 from multiple stations measured at the same time. Each blue point represents the position of a station and includes the measured value. The red point represents the position of the interpolation and includes the interpolated value.

usage/docs/img/interpolation.png

Values represented as a table:

Individual station values#

station_id  parameter                 date                       value
02480       temperature_air_mean_200  2022-01-02 00:00:00+00:00  278.15
04411       temperature_air_mean_200  2022-01-02 00:00:00+00:00  277.15
07341       temperature_air_mean_200  2022-01-02 00:00:00+00:00  278.35
00917       temperature_air_mean_200  2022-01-02 00:00:00+00:00  276.25

The interpolated value looks like this:

Interpolated value#

parameter                 date                       value
temperature_air_mean_200  2022-01-02 00:00:00+00:00  277.65
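To make the idea of bilinear interpolation concrete, here is a textbook version on an axis-aligned rectangle. It is a sketch and not the scipy routine used by wetterdienst; the corner positions are hypothetical, so the result is not expected to reproduce the interpolated value from the table above, which depends on the real station coordinates.

```python
def bilinear(x, y, x0, x1, y0, y1, q00, q10, q01, q11):
    """Bilinear interpolation inside the rectangle [x0, x1] x [y0, y1].

    qXY is the value at the corner (xX, yY), e.g. q10 belongs to (x1, y0).
    """
    tx = (x - x0) / (x1 - x0)
    ty = (y - y0) / (y1 - y0)
    # Interpolate along x on the lower and upper edge, then along y.
    lower = q00 * (1 - tx) + q10 * tx
    upper = q01 * (1 - tx) + q11 * tx
    return lower * (1 - ty) + upper * ty


# The four station values from the table, placed on the corners of a
# hypothetical unit square.
corners = dict(q00=278.15, q10=277.15, q01=278.35, q11=276.25)

center = bilinear(0.5, 0.5, 0.0, 1.0, 0.0, 1.0, **corners)
at_corner = bilinear(0.0, 0.0, 0.0, 1.0, 0.0, 1.0, **corners)
print(center, at_corner)
```

At a corner the function returns the value of that corner, and in the center of the square all four values contribute equally.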

The code to execute the interpolation is given below. It currently only works for DwdObservationRequest and individual parameters. Currently the following parameters are supported (more will be added if useful): temperature_air_mean_200, wind_speed, precipitation_height.

In [59]: from wetterdienst.provider.dwd.observation import DwdObservationRequest

In [60]: from wetterdienst import Parameter, Resolution

In [61]: stations = DwdObservationRequest(
   ....:     parameter=Parameter.TEMPERATURE_AIR_MEAN_200,
   ....:     resolution=Resolution.HOURLY,
   ....:     start_date=datetime(2022, 1, 1),
   ....:     end_date=datetime(2022, 1, 20),
   ....: )
   ....: 

In [62]: result = stations.interpolate(latlon=(50.0, 8.9))

In [63]: df = result.df

In [64]: print(df.head())
                       date  ...                   station_ids
0 2022-01-01 00:00:00+00:00  ...  [02480, 04411, 07341, 00917]
1 2022-01-01 01:00:00+00:00  ...  [02480, 04411, 07341, 00917]
2 2022-01-01 02:00:00+00:00  ...  [02480, 04411, 07341, 00917]
3 2022-01-01 03:00:00+00:00  ...  [02480, 04411, 07341, 00917]
4 2022-01-01 04:00:00+00:00  ...  [02480, 04411, 07341, 00917]

[5 rows x 5 columns]

Instead of a latlon, you may alternatively use an existing station id for which to interpolate values in order to get a more complete dataset:

In [65]: from wetterdienst.provider.dwd.observation import DwdObservationRequest

In [66]: from wetterdienst import Parameter, Resolution

In [67]: stations = DwdObservationRequest(
   ....:     parameter=Parameter.TEMPERATURE_AIR_MEAN_200,
   ....:     resolution=Resolution.HOURLY,
   ....:     start_date=datetime(2022, 1, 1),
   ....:     end_date=datetime(2022, 1, 20),
   ....: )
   ....: 

In [68]: result = stations.interpolate_by_station_id(station_id="02480")

In [69]: df = result.df

In [70]: print(df.head())
                       date  ... station_ids
0 2022-01-01 00:00:00+00:00  ...     [02480]
1 2022-01-01 01:00:00+00:00  ...     [02480]
2 2022-01-01 02:00:00+00:00  ...     [02480]
3 2022-01-01 03:00:00+00:00  ...     [02480]
4 2022-01-01 04:00:00+00:00  ...     [02480]

[5 rows x 5 columns]

Summary#

Similar to interpolation, you may sometimes want to combine multiple stations to get a complete list of data. For that purpose you can use .summarize(latlon=(lat, lon)), which goes through the nearest stations and combines their data in a meaningful way.
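One plausible reading of this combination is that, for every timestamp, the value of the nearest station that has one is taken. The sketch below implements this reading on hypothetical series; it is an assumption for illustration, and the function combine_nearest is not part of wetterdienst.

```python
def combine_nearest(series_by_station, stations_by_distance):
    """For each timestamp take the value of the nearest station that has one.

    Returns a mapping timestamp -> (value, station_id); both are None if no
    station provides a value.
    """
    timestamps = sorted({ts for series in series_by_station.values() for ts in series})
    combined = {}
    for ts in timestamps:
        combined[ts] = (None, None)
        for station_id in stations_by_distance:
            value = series_by_station[station_id].get(ts)
            if value is not None:
                combined[ts] = (value, station_id)
                break
    return combined


# Hypothetical hourly values with gaps (None) at both stations.
series = {
    "02480": {"00:00": 278.15, "01:00": None, "02:00": 277.95},
    "04411": {"00:00": 277.15, "01:00": 277.05, "02:00": None},
}

combined = combine_nearest(series, stations_by_distance=["02480", "04411"])
for timestamp, (value, station_id) in combined.items():
    print(timestamp, value, station_id)
```

Keeping the station id next to each value makes it visible which station filled which gap.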

The code to execute the summary is given below. It currently only works for DwdObservationRequest and individual parameters. Currently the following parameters are supported (more will be added if useful): temperature_air_mean_200, wind_speed, precipitation_height.

In [71]: from wetterdienst.provider.dwd.observation import DwdObservationRequest

In [72]: from wetterdienst import Parameter, Resolution

In [73]: stations = DwdObservationRequest(
   ....:     parameter=Parameter.TEMPERATURE_AIR_MEAN_200,
   ....:     resolution=Resolution.HOURLY,
   ....:     start_date=datetime(2022, 1, 1),
   ....:     end_date=datetime(2022, 1, 20),
   ....: )
   ....: 

In [74]: result = stations.summarize(latlon=(50.0, 8.9))

In [75]: df = result.df

In [76]: print(df.head())
                       date                 parameter  ...  distance  station_id
0 2022-01-01 00:00:00+00:00  temperature_air_mean_200  ...  9.759385       02480
1 2022-01-01 01:00:00+00:00  temperature_air_mean_200  ...  9.759385       02480
2 2022-01-01 02:00:00+00:00  temperature_air_mean_200  ...  9.759385       02480
3 2022-01-01 03:00:00+00:00  temperature_air_mean_200  ...  9.759385       02480
4 2022-01-01 04:00:00+00:00  temperature_air_mean_200  ...  9.759385       02480

[5 rows x 5 columns]

Instead of a latlon, you may alternatively use an existing station id for which to summarize values in order to get a more complete dataset:

In [77]: from wetterdienst.provider.dwd.observation import DwdObservationRequest

In [78]: from wetterdienst import Parameter, Resolution

In [79]: stations = DwdObservationRequest(
   ....:     parameter=Parameter.TEMPERATURE_AIR_MEAN_200,
   ....:     resolution=Resolution.HOURLY,
   ....:     start_date=datetime(2022, 1, 1),
   ....:     end_date=datetime(2022, 1, 20),
   ....: )
   ....: 

In [80]: result = stations.summarize_by_station_id(station_id="02480")

In [81]: df = result.df

In [82]: print(df.head())
                       date                 parameter  ...  distance  station_id
0 2022-01-01 00:00:00+00:00  temperature_air_mean_200  ...       0.0       02480
1 2022-01-01 01:00:00+00:00  temperature_air_mean_200  ...       0.0       02480
2 2022-01-01 02:00:00+00:00  temperature_air_mean_200  ...       0.0       02480
3 2022-01-01 03:00:00+00:00  temperature_air_mean_200  ...       0.0       02480
4 2022-01-01 04:00:00+00:00  temperature_air_mean_200  ...       0.0       02480

[5 rows x 5 columns]

SQL support#

Querying data using SQL is provided by an in-memory DuckDB database. In order to explore what is possible, please have a look at the DuckDB SQL introduction.

The result data is provided through a virtual table called data.

from wetterdienst.provider.dwd.observation import DwdObservationRequest, DwdObservationDataset, DwdObservationPeriod, DwdObservationResolution
from wetterdienst import Settings

settings = Settings(tidy=True, humanize=True, si_units=True)  # defaults

stations = DwdObservationRequest(
    parameter=[DwdObservationDataset.TEMPERATURE_AIR],
    resolution=DwdObservationResolution.HOURLY,
    start_date="2019-01-01",
    end_date="2020-01-01",
    settings=settings
).filter_by_station_id(station_id=[1048])

results = stations.values.all()
# With si_units=True temperatures are in Kelvin; 266.15 K corresponds to -7.0 °C.
df = results.filter_by_sql("SELECT * FROM data WHERE parameter='temperature_air_mean_200' AND value < 266.15;")
print(df.head())
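The idea of filtering a table called data with SQL can be tried offline. The sketch below deliberately swaps in sqlite3 from the standard library as a stand-in for DuckDB and uses hypothetical rows with the tidy column layout; the threshold is given in Kelvin on the assumption that si_units is enabled.

```python
import sqlite3

# Hypothetical rows: station_id, dataset, parameter, date, value, quality.
ROWS = [
    ("01048", "temperature_air", "temperature_air_mean_200", "2019-01-21 05:00:00+00:00", 267.05, 3.0),
    ("01048", "temperature_air", "temperature_air_mean_200", "2019-01-21 06:00:00+00:00", 265.85, 3.0),
    ("01048", "temperature_air", "humidity", "2019-01-21 06:00:00+00:00", 84.0, 3.0),
]

connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE data ("
    "station_id TEXT, dataset TEXT, parameter TEXT, date TEXT, value REAL, quality REAL)"
)
connection.executemany("INSERT INTO data VALUES (?, ?, ?, ?, ?, ?)", ROWS)

# Select temperature readings below 266.15 K (-7.0 °C).
cold = connection.execute(
    "SELECT date, value FROM data "
    "WHERE parameter = 'temperature_air_mean_200' AND value < 266.15 "
    "ORDER BY date"
).fetchall()
connection.close()

print(cold)
```

Because the table holds all parameters in one value column, the parameter filter is needed to avoid comparing unrelated quantities against the temperature threshold.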

Data export#

Data can be exported to SQLite, DuckDB, InfluxDB, CrateDB and more targets. A target is identified by a connection string.

Examples:

  • sqlite:///dwd.sqlite?table=weather

  • duckdb:///dwd.duckdb?table=weather

  • influxdb://localhost/?database=dwd&table=weather

  • crate://localhost/?database=dwd&table=weather

from wetterdienst.provider.dwd.observation import (
    DwdObservationRequest,
    DwdObservationDataset,
    DwdObservationPeriod,
    DwdObservationResolution,
)
from wetterdienst import Settings

settings = Settings(tidy=True, humanize=True, si_units=True)  # defaults

stations = DwdObservationRequest(
    parameter=[DwdObservationDataset.TEMPERATURE_AIR],
    resolution=DwdObservationResolution.HOURLY,
    start_date="2019-01-01",
    end_date="2020-01-01",
    settings=settings
).filter_by_station_id(station_id=[1048])

results = stations.values.all()
results.to_target("influxdb://localhost/?database=dwd&table=weather")
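A connection string of this kind bundles the target type, the location and the table name. The sketch below takes the example strings apart with urllib.parse to show which part carries which information; describe_target is a hypothetical helper, and wetterdienst may parse its targets differently.

```python
from urllib.parse import parse_qs, urlsplit


def describe_target(target):
    """Split a connection string into scheme, host, path and query options."""
    parts = urlsplit(target)
    # parse_qs returns lists of values; keep the first value of each option.
    options = {key: values[0] for key, values in parse_qs(parts.query).items()}
    return {
        "scheme": parts.scheme,
        "host": parts.hostname,
        "path": parts.path,
        "database": options.get("database"),
        "table": options.get("table"),
    }


sqlite_target = describe_target("sqlite:///dwd.sqlite?table=weather")
influx_target = describe_target("influxdb://localhost/?database=dwd&table=weather")
print(sqlite_target)
print(influx_target)
```

File-based targets carry the file name in the path, while server-based targets name the host and pass the database as a query option.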

Mosmix#

Get stations for Mosmix:

In [83]: from wetterdienst.provider.dwd.mosmix import DwdMosmixRequest

In [84]: stations = DwdMosmixRequest(
   ....:     parameter="large",
   ....:     mosmix_type="large"
   ....: )  # actually same for small and large
   ....: 

In [85]: print(stations.all().df.head())
  station_id icao_id from_date  ... longitude           name  state
0      01001    ENJA       NaT  ...     -8.67      JAN MAYEN    NaN
1      01008    ENSB       NaT  ...     15.47       SVALBARD    NaN
2      01025     NaN       NaT  ...     18.92        TROMSOE    NaN
3      01028    ENBJ       NaT  ...     19.02       BJORNOYA    NaN
4      01049    ENAT       NaT  ...     23.37  ALTA LUFTHAVN    NaN

[5 rows x 9 columns]

Mosmix forecasts require us to define station_ids and mosmix_type. Furthermore we can also define explicitly the requested parameters.

Get Mosmix-L data (single station file):

In [86]: from wetterdienst.provider.dwd.mosmix import DwdMosmixRequest, DwdMosmixType

In [87]: stations = DwdMosmixRequest(
   ....:     parameter="large",
   ....:     mosmix_type=DwdMosmixType.LARGE
   ....: ).filter_by_station_id(station_id=["01001", "01008"])
   ....: 

In [88]: response =  next(stations.values.query())

In [89]: print(response.stations.df)
  station_id icao_id from_date to_date  ...  latitude  longitude       name state
0      01001    ENJA       NaT     NaT  ...     70.93      -8.67  JAN MAYEN   NaN
1      01008    ENSB       NaT     NaT  ...     78.25      15.47   SVALBARD   NaN

[2 rows x 9 columns]

In [90]: print(response.df)
      station_id dataset  ...     value quality
0          01001   large  ...  100940.0     NaN
1          01001   large  ...  101010.0     NaN
2          01001   large  ...  101080.0     NaN
3          01001   large  ...  101160.0     NaN
4          01001   large  ...  101190.0     NaN
...          ...     ...  ...       ...     ...
28153      01001   large  ...      85.0     NaN
28154      01001   large  ...       NaN     NaN
28155      01001   large  ...       NaN     NaN
28156      01001   large  ...       NaN     NaN
28157      01001   large  ...       NaN     NaN

[28158 rows x 6 columns]

Get Mosmix-L data (all stations file):

In [91]: from wetterdienst.provider.dwd.mosmix import DwdMosmixRequest, DwdMosmixType

In [92]: stations = DwdMosmixRequest(
   ....:     parameter="large",
   ....:     mosmix_type=DwdMosmixType.LARGE,
   ....:     station_group="all_stations"
   ....: ).filter_by_station_id(station_id=["01001", "01008"])
   ....: 

In [93]: response =  next(stations.values.query())

In [94]: print(response.stations.df)
  station_id icao_id from_date to_date  ...  latitude  longitude       name state
0      01001    ENJA       NaT     NaT  ...     70.93      -8.67  JAN MAYEN   NaN
1      01008    ENSB       NaT     NaT  ...     78.25      15.47   SVALBARD   NaN

[2 rows x 9 columns]

In [95]: print(response.df)
      station_id dataset  ...     value quality
0          01001   large  ...  100940.0     NaN
1          01001   large  ...  101010.0     NaN
2          01001   large  ...  101080.0     NaN
3          01001   large  ...  101160.0     NaN
4          01001   large  ...  101190.0     NaN
...          ...     ...  ...       ...     ...
28153      01001   large  ...      85.0     NaN
28154      01001   large  ...       NaN     NaN
28155      01001   large  ...       NaN     NaN
28156      01001   large  ...       NaN     NaN
28157      01001   large  ...       NaN     NaN

[28158 rows x 6 columns]

Radar#

Sites#

Retrieve information about all OPERA radar sites.

In [96]: from wetterdienst.provider.eumetnet.opera.sites import OperaRadarSites

# Acquire information for all OPERA sites.
In [97]: sites = OperaRadarSites().all()

In [98]: print(f"Number of OPERA radar sites: {len(sites)}")
Number of OPERA radar sites: 205

# Acquire information for a specific OPERA site.
In [99]: site_ukdea = OperaRadarSites().by_odimcode("ukdea")

In [100]: print(site_ukdea)
{'number': 1312, 'country': 'United Kingdom', 'countryid': 'EGRR61', 'oldcountryid': 'UK61', 'wmocode': 3859, 'wigosid': None, 'odimcode': 'ukdea', 'location': 'Dean Hill', 'status': True, 'latitude': 51.0307, 'longitude': -1.6534, 'heightofstation': 155.0, 'band': 'C', 'doppler': True, 'polarization': 'D', 'maxrange': 255, 'startyear': 2005, 'heightantenna': 180.0, 'diametrantenna': 3.66, 'beam': 1.0, 'gain': 43.0, 'frequency': 5.627, 'stratus': None, 'cirusnimbus': None, 'wrwp': None}

Retrieve information about the DWD radar sites.

In [101]: from wetterdienst.provider.dwd.radar.api import DwdRadarSites

# Acquire information for a specific site.
In [102]: site_asb = DwdRadarSites().by_odimcode("ASB")

In [103]: print(site_asb)
{'number': 1389, 'country': 'Germany', 'countryid': 'EDZW40', 'oldcountryid': 'DL40', 'wmocode': 10103, 'wigosid': None, 'odimcode': 'deasb', 'location': 'Isle of Borkum', 'status': True, 'latitude': 53.564, 'longitude': 6.7482, 'heightofstation': 4.0, 'band': 'C', 'doppler': True, 'polarization': 'S', 'maxrange': 180, 'startyear': 2018, 'heightantenna': 36.2, 'diametrantenna': 2.4, 'beam': 1.57, 'gain': 41.55, 'frequency': 5.64, 'stratus': None, 'cirusnimbus': None, 'wrwp': None}

Data#

To use DwdRadarValues, you have to provide a DwdRadarParameter, which designates the type of radar data you want to obtain. Radar data is available at different locations within the DWD data repository.

For RADOLAN_CDC-data, the time resolution parameter (either hourly or daily) must be specified.

Either date_times (a list of datetimes or strings) or the start_date and end_date parameters can optionally be specified to obtain data for specific points in time.

For RADOLAN_CDC-data, datetimes are rounded to HH:50min, as the data is packaged for this minute step.
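What such rounding could look like is sketched below. It assumes that rounding means moving a timestamp back to the most recent HH:50 mark; this reading and the function name snap_to_minute_mark are assumptions for illustration, not the library's documented behaviour.

```python
from datetime import datetime, timedelta


def snap_to_minute_mark(timestamp, minute=50):
    """Move a timestamp back to the most recent HH:<minute> mark."""
    if timestamp.minute < minute:
        # The mark of the current hour is still ahead, so use the previous hour.
        timestamp -= timedelta(hours=1)
    return timestamp.replace(minute=minute, second=0, microsecond=0)


noon = snap_to_minute_mark(datetime(2020, 9, 4, 12, 0))
late = snap_to_minute_mark(datetime(2020, 9, 4, 12, 55))
midnight = snap_to_minute_mark(datetime(2020, 9, 4, 0, 10))
print(noon, late, midnight)
```

Under this assumption a request for 12:00 would refer to the file packaged for 11:50, and a timestamp shortly after midnight would fall back to 23:50 of the previous day.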

This is an example of how to acquire RADOLAN_CDC data using wetterdienst and process it using wradlib.

For more examples, please have a look at example/radar/.

from wetterdienst.provider.dwd.radar import DwdRadarValues, DwdRadarParameter, DwdRadarResolution
import wradlib as wrl

radar = DwdRadarValues(
    radar_parameter=DwdRadarParameter.RADOLAN_CDC,
    resolution=DwdRadarResolution.DAILY,
    start_date="2020-09-04T12:00:00",
    end_date="2020-09-04T12:00:00"
)

for item in radar.query():

    # Decode item.
    timestamp, buffer = item

    # Decode data using wradlib.
    data, attributes = wrl.io.read_radolan_composite(buffer)

    # Do something with the data (numpy.ndarray) here.

Caching#

The backbone of wetterdienst uses fsspec caching. In most cases this requires creating a cache directory under /home. If you are not allowed to write to /home, you will run into an OSError. In that case you can set the environment variable WD_CACHE_DIR to define where the caching directory should be created.
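Setting the variable from within Python could look like the sketch below. The directory under the system's temporary folder is a hypothetical choice, and the sketch assumes that the variable has to be set before wetterdienst creates its settings.

```python
import os
import tempfile

# Hypothetical writable location; any directory the process may write to works.
cache_dir = os.path.join(tempfile.gettempdir(), "wetterdienst-cache")
os.makedirs(cache_dir, exist_ok=True)

# Assumption: set the variable before wetterdienst reads its settings.
os.environ["WD_CACHE_DIR"] = cache_dir
print(os.environ["WD_CACHE_DIR"])
```

Alternatively, the variable can be exported in the shell or placed in the local .env file mentioned in the settings section.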

fsspec is used for flexible file caching. It relies on the two libraries requests and aiohttp. aiohttp is used for asynchronous requests and may swallow some errors related to proxies, SSL or similar. Use the fsspec_client_kwargs setting (environment variable WD_FSSPEC_CLIENT_KWARGS) to pass your own client kwargs to fsspec, e.g.

In [104]: from wetterdienst import Settings

In [105]: settings = Settings(fsspec_client_kwargs={"trust_env": True})  # use proxy from environment variables