{ "cells": [ { "cell_type": "markdown", "id": "e9dfe0df-9f1a-4e5d-99d3-81f01a76d3c8", "metadata": { "tags": [] }, "source": [ "# Awesome HRRR \n", "\n", "**Rich Signell (USGS) with Martin Durant (Anaconda), Eskild Eriksen (Quansight)**\n", "\n", "NOAA's High-Resolution Rapid Refresh (HRRR) model is a 3 km model of CONUS, updated every hour and assimilating radar data every 15 minutes. \n", "\n", "![](https://rapidrefresh.noaa.gov/hrrr/hrrrcrefimage)\n", "\n", "**Data**: \n", "* [AWS Public Data](https://registry.opendata.aws/noaa-hrrr-pds/): Available on the Cloud, YAY! Thousands of GRIB2 files each day -- not awesome yet!\n", "\n", "**Goal**:\n", "Make it easy and cloud-performant to access this collection of data. We use Kerchunk and fsspec to extract the metadata from the GRIB2 files and create a virtual Zarr dataset that reads only the compressed data chunks it needs from the original files. \n", "\n", "**Tech Stack**:\n", "* [Nebari](https://www.nebari.dev/) (formerly QHub) for deploying JupyterHub on Kubernetes with [Dask Gateway](https://gateway.dask.org/)\n", "* [fsspec](https://filesystem-spec.readthedocs.io/en/latest/) for representing scientific data format files as a Zarr dataset (ReferenceFileSystem)\n", "* [Kerchunk](https://github.com/fsspec/kerchunk) for creating the ReferenceFileSystem JSON\n", "* [kbatch](https://github.com/kbatch-dev/kbatch) for running notebooks as batch jobs on Kubernetes (includes new `cronjob` feature for running on a schedule)\n", "* Jupyter, Dask, Xarray, Zarr, HoloViz\n", "\n" ] }, { "cell_type": "code", "execution_count": null, "id": "9ac4c885-3e20-473b-aca5-63e611c181f9", "metadata": {}, "outputs": [], "source": [ "import xarray as xr\n", "import hvplot.xarray\n", "import panel as pn\n", "import fsspec\n", "import datetime as dt" ] }, { "cell_type": "markdown", "id": "eac4889f-9979-4016-b492-ac54b3b1289d", "metadata": {}, "source": [ "#### Create Dask Cluster \n", "Here we use Dask Gateway, but any Dask cluster would work." ] }, { 
"cell_type": "code", "execution_count": null, "id": "f0d7312e-ccbf-4962-8320-12a31fbabe16", "metadata": {}, "outputs": [], "source": [ "from dask_gateway import Gateway\n", "# instantiate dask gateway\n", "gateway = Gateway()\n", "\n", "# setup a cluster\n", "options = gateway.cluster_options(use_local_defaults=False)\n", "\n", "options.conda_environment='users/pangeo'\n", "options.profile = 'Medium Worker'\n", "\n", "cluster = gateway.new_cluster(options)\n", "\n", "cluster.scale(30)\n", "\n", "# get the client for the cluster\n", "client = cluster.get_client()" ] }, { "cell_type": "code", "execution_count": 21, "id": "55e2b4d0-2753-4a6e-b14e-43aa3c932a0f", "metadata": {}, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "8d537adc78c14d4db0a9b3a88389a1a3", "version_major": 2, "version_minor": 0 }, "text/plain": [ "VBox(children=(HTML(value='

GatewayCluster

'), HBox(children=(HTML(value='\\n
\\n
<xarray.Dataset>\n",
       "Dimensions:            (valid_time: 90, y: 1059, x: 1799)\n",
       "Coordinates:\n",
       "    heightAboveGround  float64 ...\n",
       "    latitude           (y, x) float64 dask.array<chunksize=(1059, 1799), meta=np.ndarray>\n",
       "    longitude          (y, x) float64 dask.array<chunksize=(1059, 1799), meta=np.ndarray>\n",
       "    step               timedelta64[ns] ...\n",
       "    time               (valid_time) datetime64[ns] dask.array<chunksize=(1,), meta=np.ndarray>\n",
       "  * valid_time         (valid_time) datetime64[ns] 2022-07-10T17:00:00 ... 20...\n",
       "Dimensions without coordinates: y, x\n",
       "Data variables:\n",
       "    d2m                (valid_time, y, x) float32 dask.array<chunksize=(1, 1059, 1799), meta=np.ndarray>\n",
       "    pt                 (valid_time, y, x) float32 dask.array<chunksize=(1, 1059, 1799), meta=np.ndarray>\n",
       "    r2                 (valid_time, y, x) float32 dask.array<chunksize=(1, 1059, 1799), meta=np.ndarray>\n",
       "    sh2                (valid_time, y, x) float32 dask.array<chunksize=(1, 1059, 1799), meta=np.ndarray>\n",
       "    si10               (valid_time, y, x) float32 dask.array<chunksize=(1, 1059, 1799), meta=np.ndarray>\n",
       "    t2m                (valid_time, y, x) float32 dask.array<chunksize=(1, 1059, 1799), meta=np.ndarray>\n",
       "    u10                (valid_time, y, x) float32 dask.array<chunksize=(1, 1059, 1799), meta=np.ndarray>\n",
       "    unknown            (valid_time, y, x) float32 dask.array<chunksize=(1, 1059, 1799), meta=np.ndarray>\n",
       "    v10                (valid_time, y, x) float32 dask.array<chunksize=(1, 1059, 1799), meta=np.ndarray>\n",
       "Attributes:\n",
       "    Conventions:             CF-1.7\n",
       "    GRIB_centre:             kwbc\n",
       "    GRIB_centreDescription:  US National Weather Service - NCEP\n",
       "    GRIB_edition:            2\n",
       "    GRIB_subCentre:          0\n",
       "    history:                 2022-07-13T18:55 GRIB to CDM+CF via cfgrib-0.9.1...\n",
       "    institution:             US National Weather Service - NCEP
" ], "text/plain": [ "\n", "Dimensions: (valid_time: 90, y: 1059, x: 1799)\n", "Coordinates:\n", " heightAboveGround float64 ...\n", " latitude (y, x) float64 dask.array\n", " longitude (y, x) float64 dask.array\n", " step timedelta64[ns] ...\n", " time (valid_time) datetime64[ns] dask.array\n", " * valid_time (valid_time) datetime64[ns] 2022-07-10T17:00:00 ... 20...\n", "Dimensions without coordinates: y, x\n", "Data variables:\n", " d2m (valid_time, y, x) float32 dask.array\n", " pt (valid_time, y, x) float32 dask.array\n", " r2 (valid_time, y, x) float32 dask.array\n", " sh2 (valid_time, y, x) float32 dask.array\n", " si10 (valid_time, y, x) float32 dask.array\n", " t2m (valid_time, y, x) float32 dask.array\n", " u10 (valid_time, y, x) float32 dask.array\n", " unknown (valid_time, y, x) float32 dask.array\n", " v10 (valid_time, y, x) float32 dask.array\n", "Attributes:\n", " Conventions: CF-1.7\n", " GRIB_centre: kwbc\n", " GRIB_centreDescription: US National Weather Service - NCEP\n", " GRIB_edition: 2\n", " GRIB_subCentre: 0\n", " history: 2022-07-13T18:55 GRIB to CDM+CF via cfgrib-0.9.1...\n", " institution: US National Weather Service - NCEP" ] }, "execution_count": 25, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ds" ] }, { "cell_type": "markdown", "id": "2199fcb9-3b85-47ea-80de-e80232c1e2ec", "metadata": {}, "source": [ "Hvplot wants lon [-180,180], not [0,360]:" ] }, { "cell_type": "code", "execution_count": 26, "id": "ad699973-8780-49ac-a7c0-7ce3f74a7147", "metadata": {}, "outputs": [], "source": [ "ds = ds.assign_coords(longitude=(((ds.longitude + 180) % 360) - 180))" ] }, { "cell_type": "code", "execution_count": 27, "id": "baf91e48-4b4d-49ab-ae8a-e52c7458232c", "metadata": {}, "outputs": [], "source": [ "var = 't2m' # Temperature at 2m height" ] }, { "cell_type": "code", "execution_count": 28, "id": "a408c1b6-06a9-43d3-b69a-1b73dc0d9a48", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ 
"2022-07-13 22:00:00\n" ] } ], "source": [ "now = dt.datetime.utcnow().strftime('%Y-%m-%d %H:00:00')\n", "print(now)" ] }, { "cell_type": "markdown", "id": "95869e81-657b-4fb1-9553-b9197c92bbd1", "metadata": {}, "source": [ "With 30 worker cluster, takes 50 seconds to display, and 15 seconds to change the time step\n", "after closing the dask client, it takes 30 seconds to display, 8 seconds to display a time step" ] }, { "cell_type": "code", "execution_count": 29, "id": "171c7153-efb7-4aa0-acac-ec308a691f11", "metadata": {}, "outputs": [ { "data": {}, "metadata": {}, "output_type": "display_data" }, { "data": { "application/vnd.holoviews_exec.v0+json": "", "text/html": [ "
\n", "
\n", "
\n", "" ], "text/plain": [ ":DynamicMap []\n", " :Overlay\n", " .Tiles.I :Tiles [x,y]\n", " .Image.I :Image [longitude,latitude] (t2m)" ] }, "execution_count": 29, "metadata": { "application/vnd.holoviews_exec.v0+json": { "id": "1571" } }, "output_type": "execute_result" } ], "source": [ "da = ds[var].sel(valid_time=now).load() - 273.15\n", "da.hvplot.quadmesh(x='longitude', y='latitude', geo=True, tiles='OSM',\n", " rasterize=True, cmap='turbo', title=now)" ] }, { "cell_type": "markdown", "id": "5fa77bb5-1c95-46c6-a24d-79571b4165da", "metadata": { "tags": [] }, "source": [ "#### Extract a time series at a point\n", "We are reading GRIB2 files, which compress the entire spatial domain as a single chunk. Therefore reading all the time values at a single point actually needs to load and uncompress *all* the data for that variable. " ] }, { "cell_type": "code", "execution_count": 30, "id": "7474541e-383a-4839-a4f2-da4e459a7619", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "CPU times: user 79.1 ms, sys: 9.07 ms, total: 88.2 ms\n", "Wall time: 2.79 s\n" ] }, { "data": {}, "metadata": {}, "output_type": "display_data" }, { "data": { "application/vnd.holoviews_exec.v0+json": "", "text/html": [ "
\n", "
\n", "
\n", "" ], "text/plain": [ ":Curve [valid_time] (t2m)" ] }, "execution_count": 30, "metadata": { "application/vnd.holoviews_exec.v0+json": { "id": "1792" } }, "output_type": "execute_result" } ], "source": [ "%%time\n", "(ds[var][:,223,891]-273.15).hvplot(x='valid_time', grid=True, title='Surface Temp (C): Austin, TX')" ] }, { "cell_type": "markdown", "id": "3a6fa3a9-e413-4ba1-9708-e996f8278804", "metadata": {}, "source": [ "#### The JSON is updated every hour by a kbatch `cronjob` ####\n", "\n", "```yaml\n", "$ more hrrr_best.yaml\n", "\n", "name: hrrr-awesome\n", "image: ghcr.io/iameskild/ogc-demo:latest\n", "command: \n", " - papermill\n", " - hrrr_best_kbatch.ipynb\n", "code: hrrr_best_kbatch.ipynb\n", "schedule: \"53 * * * *\" \n", "\n", "$ kbatch cronjob submit -f hrrr_best.yaml\n", "```" ] }, { "cell_type": "markdown", "id": "7e30a8a6-ad1b-41e3-b77d-c9cf640f2cab", "metadata": {}, "source": [ "#### Close the client and shut down the cluster" ] }, { "cell_type": "code", "execution_count": null, "id": "be1d8cec-b866-4a88-a88a-acd95f50b39c", "metadata": {}, "outputs": [], "source": [ "#client.close(); cluster.shutdown()" ] }, { "cell_type": "code", "execution_count": null, "id": "a0600824-77df-4631-b480-07e10376a5fc", "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "users-pangeo", "language": "python", "name": "conda-env-users-pangeo-py" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.13" }, "widgets": { "application/vnd.jupyter.widget-state+json": { "state": { "0be288d2ef3e4737ad6a0922bef57946": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "DescriptionStyleModel", "state": { "description_width": "" } }, "0eea27cd6af844e48bc453d22885e049": { "model_module": "@jupyter-widgets/controls", 
"model_module_version": "1.5.0", "model_name": "HTMLModel", "state": { "layout": "IPY_MODEL_c3befad4fd754b2ea0c0c546f26ddf7b", "style": "IPY_MODEL_17e7e5cbbb9b4143b50d6e64396d5c5c", "value": "\n
\n\n\n \n \n \n
Workers 30
Threads 60
Memory 240.00 GiB
\n
\n" } }, "17e7e5cbbb9b4143b50d6e64396d5c5c": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "DescriptionStyleModel", "state": { "description_width": "" } }, "201fac89f02349b6b2acbe72998b42f9": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "DescriptionStyleModel", "state": { "description_width": "" } }, "249a49c2bf2745d19b876e40acd05da3": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "ButtonStyleModel", "state": {} }, "2e59ba6c5888411e90a27a4cd84d6b09": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "HBoxModel", "state": { "children": [ "IPY_MODEL_0eea27cd6af844e48bc453d22885e049", "IPY_MODEL_4b9bebb57661456c8641b9a73d17b858" ], "layout": "IPY_MODEL_ad5bf6e47d52444fa552fbb1936ddf1d" } }, "31df01b8363343cd91505b11db0d70b5": { "model_module": "@jupyter-widgets/base", "model_module_version": "1.2.0", "model_name": "LayoutModel", "state": {} }, "3f1983ed4754497fa2d26ff371fcdcdf": { "model_module": "@jupyter-widgets/base", "model_module_version": "1.2.0", "model_name": "LayoutModel", "state": {} }, "41c7ad774d874e70b37f06501d27dd38": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "IntTextModel", "state": { "description": "Minimum", "layout": "IPY_MODEL_6a6fb9398bfe4fb9ba54b3efa99fa8ee", "step": 1, "style": "IPY_MODEL_eaaa4c8f1cb2409e8e4572cf6ac27853" } }, "4b9bebb57661456c8641b9a73d17b858": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "AccordionModel", "state": { "_titles": { "0": "Manual Scaling", "1": "Adaptive Scaling" }, "children": [ "IPY_MODEL_e76b8fa4c53542c48aba66ff1b629d6e", "IPY_MODEL_dcc23361e2784464be49d97bb3ee806b" ], "layout": "IPY_MODEL_c68ef040e684402a8e1c6cc692205bf3", "selected_index": null } }, "55af4f69e0f34a008e369ad7f8406374": { "model_module": 
"@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "HTMLModel", "state": { "layout": "IPY_MODEL_be3e7455fb824a39a4165c3436c39424", "style": "IPY_MODEL_5b64058914174fedb653bed7cf6af5f6", "value": "

Dashboard: https://jupyter.qhub.esipfed.org/gateway/clusters/dev.af59dda21ced4ffba572390fe11a4780/status

\n" } }, "5b64058914174fedb653bed7cf6af5f6": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "DescriptionStyleModel", "state": { "description_width": "" } }, "6456d941dc8344c7ab9195230062a321": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "IntTextModel", "state": { "description": "Maximum", "layout": "IPY_MODEL_6a6fb9398bfe4fb9ba54b3efa99fa8ee", "step": 1, "style": "IPY_MODEL_abbaa29df02b40a2b59fbcffc43cd303" } }, "6a6fb9398bfe4fb9ba54b3efa99fa8ee": { "model_module": "@jupyter-widgets/base", "model_module_version": "1.2.0", "model_name": "LayoutModel", "state": { "width": "150px" } }, "8d537adc78c14d4db0a9b3a88389a1a3": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "VBoxModel", "state": { "children": [ "IPY_MODEL_e493111b36fc474d94c2b19f7f41fe53", "IPY_MODEL_2e59ba6c5888411e90a27a4cd84d6b09", "IPY_MODEL_f94143594e094e8294f5cbdc6f590b8a", "IPY_MODEL_55af4f69e0f34a008e369ad7f8406374" ], "layout": "IPY_MODEL_3f1983ed4754497fa2d26ff371fcdcdf" } }, "abbaa29df02b40a2b59fbcffc43cd303": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "DescriptionStyleModel", "state": { "description_width": "" } }, "ad5bf6e47d52444fa552fbb1936ddf1d": { "model_module": "@jupyter-widgets/base", "model_module_version": "1.2.0", "model_name": "LayoutModel", "state": {} }, "add02d8fbcb048c2b5a308656443a7e4": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "ButtonStyleModel", "state": {} }, "ba49ce3fecc644d190ac054103004326": { "model_module": "@jupyter-widgets/base", "model_module_version": "1.2.0", "model_name": "LayoutModel", "state": {} }, "be3e7455fb824a39a4165c3436c39424": { "model_module": "@jupyter-widgets/base", "model_module_version": "1.2.0", "model_name": "LayoutModel", "state": {} }, "c06c9ed068c446d7afcd6c5b9463f70f": { "model_module": 
"@jupyter-widgets/base", "model_module_version": "1.2.0", "model_name": "LayoutModel", "state": {} }, "c3befad4fd754b2ea0c0c546f26ddf7b": { "model_module": "@jupyter-widgets/base", "model_module_version": "1.2.0", "model_name": "LayoutModel", "state": { "min_width": "150px" } }, "c68ef040e684402a8e1c6cc692205bf3": { "model_module": "@jupyter-widgets/base", "model_module_version": "1.2.0", "model_name": "LayoutModel", "state": { "min_width": "500px" } }, "c9016a30b7894e929c8b7d4288d8babb": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "IntTextModel", "state": { "description": "Workers", "layout": "IPY_MODEL_6a6fb9398bfe4fb9ba54b3efa99fa8ee", "step": 1, "style": "IPY_MODEL_201fac89f02349b6b2acbe72998b42f9" } }, "dcc23361e2784464be49d97bb3ee806b": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "HBoxModel", "state": { "children": [ "IPY_MODEL_41c7ad774d874e70b37f06501d27dd38", "IPY_MODEL_6456d941dc8344c7ab9195230062a321", "IPY_MODEL_e4192dcb139a46119d7fdcdda161667e" ], "layout": "IPY_MODEL_f033b5c20e3f4a6e938b3af9e208c808" } }, "e4192dcb139a46119d7fdcdda161667e": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "ButtonModel", "state": { "description": "Adapt", "layout": "IPY_MODEL_6a6fb9398bfe4fb9ba54b3efa99fa8ee", "style": "IPY_MODEL_add02d8fbcb048c2b5a308656443a7e4" } }, "e493111b36fc474d94c2b19f7f41fe53": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "HTMLModel", "state": { "layout": "IPY_MODEL_31df01b8363343cd91505b11db0d70b5", "style": "IPY_MODEL_0be288d2ef3e4737ad6a0922bef57946", "value": "

GatewayCluster

" } }, "e76b8fa4c53542c48aba66ff1b629d6e": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "HBoxModel", "state": { "children": [ "IPY_MODEL_c9016a30b7894e929c8b7d4288d8babb", "IPY_MODEL_eadc91dd0a80484f90d9e1ed2410ee78" ], "layout": "IPY_MODEL_ba49ce3fecc644d190ac054103004326" } }, "eaaa4c8f1cb2409e8e4572cf6ac27853": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "DescriptionStyleModel", "state": { "description_width": "" } }, "eadc91dd0a80484f90d9e1ed2410ee78": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "ButtonModel", "state": { "description": "Scale", "layout": "IPY_MODEL_6a6fb9398bfe4fb9ba54b3efa99fa8ee", "style": "IPY_MODEL_249a49c2bf2745d19b876e40acd05da3" } }, "f033b5c20e3f4a6e938b3af9e208c808": { "model_module": "@jupyter-widgets/base", "model_module_version": "1.2.0", "model_name": "LayoutModel", "state": {} }, "f94143594e094e8294f5cbdc6f590b8a": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "HTMLModel", "state": { "layout": "IPY_MODEL_c06c9ed068c446d7afcd6c5b9463f70f", "style": "IPY_MODEL_fcbb071b5217439b864323987d4fd483", "value": "

Name: dev.af59dda21ced4ffba572390fe11a4780

" } }, "fcbb071b5217439b864323987d4fd483": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "DescriptionStyleModel", "state": { "description_width": "" } } }, "version_major": 2, "version_minor": 0 } } }, "nbformat": 4, "nbformat_minor": 5 }