models.hint.ipynb

{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "#| default_exp models.hint"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# HINT"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Hierarchical Mixture Networks (HINT) are a modular framework that combines state-of-the-art neural forecast architectures with a task-specialized mixture probability and hierarchical reconciliation strategies. This combination allows HINT to produce accurate and coherent probabilistic forecasts.\n",
    "\n",
    "HINT incorporates a `TemporalNorm` module into any neural forecast architecture. The module normalizes inputs into the operating range of the network's non-linearities and recomposes the scale of its outputs through a global skip connection, improving accuracy and training robustness. HINT ensures forecast coherence via bootstrap sample reconciliation, which restores the aggregation constraints in its base samples.\n",
    "\n",
    "**References**<br>\n",
    "- [Kin G. Olivares, David Luo, Cristian Challu, Stefania La Vattiata, Max Mergenthaler, Artur Dubrawski (2023). \"HINT: Hierarchical Mixture Networks For Coherent Probabilistic Forecasting\". Neural Information Processing Systems, submitted. Working Paper version available at arxiv.](https://arxiv.org/abs/2305.07089)<br>\n",
    "- [Kin G. Olivares, O. Nganba Meetei, Ruijun Ma, Rohan Reddy, Mengfei Cao, Lee Dicker (2022).\"Probabilistic Hierarchical Forecasting with Deep Poisson Mixtures\". International Journal Forecasting, accepted paper available at arxiv.](https://arxiv.org/pdf/2110.13179.pdf)<br>\n",
    "- [Kin G. Olivares, Federico Garza, David Luo, Cristian Challu, Max Mergenthaler, Souhaib Ben Taieb, Shanika Wickramasuriya, and Artur Dubrawski (2022). \"HierarchicalForecast: A reference framework for hierarchical forecasting in python\". Journal of Machine Learning Research, submitted, abs/2207.03517, 2022b.](https://arxiv.org/abs/2207.03517)"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "![Figure 1. Hierarchical Mixture Networks (HINT).](imgs_models/hint.png)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "#| hide\n",
    "from nbdev.showdoc import show_doc\n",
    "from neuralforecast.losses.pytorch import GMM\n",
    "from neuralforecast import NeuralForecast\n",
    "from neuralforecast.models import NHITS\n",
    "import pandas as pd"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "#| export\n",
    "from typing import Optional\n",
    "\n",
    "import numpy as np\n",
    "import torch"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Reconciliation Methods"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "#| export\n",
    "def get_bottomup_P(S: np.ndarray):\n",
    "    \"\"\"BottomUp Reconciliation Matrix.\n",
    "\n",
    "    Creates BottomUp hierarchical \\\"projection\\\" matrix is defined as:\n",
    "    $$\\mathbf{P}_{\\\\text{BU}} = [\\mathbf{0}_{\\mathrm{[b],[a]}}\\;|\\;\\mathbf{I}_{\\mathrm{[b][b]}}]$$    \n",
    "\n",
    "    **Parameters:**<br>\n",
    "    `S`: Summing matrix of size (`base`, `bottom`).<br>\n",
    "\n",
    "    **Returns:**<br>\n",
    "    `P`: Reconciliation matrix of size (`bottom`, `base`).<br>\n",
    "\n",
    "    **References:**<br>\n",
    "    - [Orcutt, G.H., Watts, H.W., & Edwards, J.B.(1968). \\\"Data aggregation and information loss\\\". The American \n",
    "    Economic Review, 58 , 773(787)](http://www.jstor.org/stable/1815532).    \n",
    "    \"\"\"\n",
    "    n_series = len(S)\n",
    "    n_agg = n_series-S.shape[1]\n",
    "    P = np.zeros_like(S)\n",
    "    P[n_agg:,:] = S[n_agg:,:]\n",
    "    P = P.T\n",
    "    return P\n",
    "\n",
    "def get_mintrace_ols_P(S: np.ndarray):\n",
    "    \"\"\"MinTraceOLS Reconciliation Matrix.\n",
    "\n",
    "    Creates MinTraceOLS reconciliation matrix as proposed by Wickramasuriya et al.\n",
    "\n",
    "    $$\\mathbf{P}_{\\\\text{MinTraceOLS}}=\\\\left(\\mathbf{S}^{\\intercal}\\mathbf{S}\\\\right)^{-1}\\mathbf{S}^{\\intercal}$$\n",
    "\n",
    "    **Parameters:**<br>\n",
    "    `S`: Summing matrix of size (`base`, `bottom`).<br>\n",
    "      \n",
    "    **Returns:**<br>\n",
    "    `P`: Reconciliation matrix of size (`bottom`, `base`).<br>\n",
    "\n",
    "    **References:**<br>\n",
    "    - [Wickramasuriya, S.L., Turlach, B.A. & Hyndman, R.J. (2020). \\\"Optimal non-negative\n",
    "    forecast reconciliation\". Stat Comput 30, 1167–1182,\n",
    "    https://doi.org/10.1007/s11222-020-09930-0](https://robjhyndman.com/publications/nnmint/).\n",
    "    \"\"\"\n",
    "    n_hiers, n_bottom = S.shape\n",
    "    n_agg = n_hiers - n_bottom\n",
    "\n",
    "    W = np.eye(n_hiers)\n",
    "\n",
    "    # We compute reconciliation matrix with\n",
    "    # Equation 10 from https://robjhyndman.com/papers/MinT.pdf\n",
    "    A = S[:n_agg,:]\n",
    "    U = np.hstack((np.eye(n_agg), -A)).T\n",
    "    J = np.hstack((np.zeros((n_bottom,n_agg)), np.eye(n_bottom)))\n",
    "    P = J - (J @ W @ U) @ np.linalg.pinv(U.T @ W @ U) @ U.T\n",
    "    return P\n",
    "\n",
    "def get_mintrace_wls_P(S: np.ndarray):\n",
    "    \"\"\"MinTraceOLS Reconciliation Matrix.\n",
    "\n",
    "    Creates MinTraceOLS reconciliation matrix as proposed by Wickramasuriya et al.\n",
    "    Depending on a weighted GLS estimator and an estimator of the covariance matrix of the coherency errors $\\mathbf{W}_{h}$.\n",
    "\n",
    "    $$ \\mathbf{W}_{h} = \\mathrm{Diag}(\\mathbf{S} \\mathbb{1}_{[b]})$$\n",
    "\n",
    "    $$\\mathbf{P}_{\\\\text{MinTraceWLS}}=\\\\left(\\mathbf{S}^{\\intercal}\\mathbf{W}_{h}\\mathbf{S}\\\\right)^{-1}\n",
    "    \\mathbf{S}^{\\intercal}\\mathbf{W}^{-1}_{h}$$    \n",
    "\n",
    "    **Parameters:**<br>\n",
    "    `S`: Summing matrix of size (`base`, `bottom`).<br>\n",
    "      \n",
    "    **Returns:**<br>\n",
    "    `P`: Reconciliation matrix of size (`bottom`, `base`).<br>\n",
    "\n",
    "    **References:**<br>\n",
    "    - [Wickramasuriya, S.L., Turlach, B.A. & Hyndman, R.J. (2020). \\\"Optimal non-negative\n",
    "    forecast reconciliation\". Stat Comput 30, 1167–1182,\n",
    "    https://doi.org/10.1007/s11222-020-09930-0](https://robjhyndman.com/publications/nnmint/).\n",
    "    \"\"\"\n",
    "    n_hiers, n_bottom = S.shape\n",
    "    n_agg = n_hiers - n_bottom\n",
    "    \n",
    "    W = np.diag(S @ np.ones((n_bottom,)))\n",
    "\n",
    "    # We compute reconciliation matrix with\n",
    "    # Equation 10 from https://robjhyndman.com/papers/MinT.pdf\n",
    "    A = S[:n_agg,:]\n",
    "    U = np.hstack((np.eye(n_agg), -A)).T\n",
    "    J = np.hstack((np.zeros((n_bottom,n_agg)), np.eye(n_bottom)))\n",
    "    P = J - (J @ W @ U) @ np.linalg.pinv(U.T @ W @ U) @ U.T\n",
    "    return P\n",
    "\n",
    "def get_identity_P(S: np.ndarray):\n",
    "    # Placeholder function for identity P (no reconciliation).\n",
    "    pass"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "show_doc(get_bottomup_P, title_level=3)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "show_doc(get_mintrace_ols_P, title_level=3)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "show_doc(get_mintrace_wls_P, title_level=3)"
   ]
  },
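  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The reconciliation matrices above can be checked on a toy hierarchy. The sketch below is a minimal, NumPy-only illustration and not part of the library: the summing matrix, the base forecasts `y_hat`, and the helper `is_coherent` are made up for the example. It builds the BottomUp matrix from its definition and the MinTraceOLS matrix from its closed form, then checks that `S @ P` maps incoherent base forecasts to coherent ones.\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "\n",
    "# Toy hierarchy (made up): Top = Mid1 + Mid2, Mid1 = B1 + B2, Mid2 = B3 + B4\n",
    "S = np.array([[1., 1., 1., 1.],\n",
    "              [1., 1., 0., 0.],\n",
    "              [0., 0., 1., 1.],\n",
    "              [1., 0., 0., 0.],\n",
    "              [0., 1., 0., 0.],\n",
    "              [0., 0., 1., 0.],\n",
    "              [0., 0., 0., 1.]])\n",
    "n_series, n_bottom = S.shape\n",
    "n_agg = n_series - n_bottom\n",
    "\n",
    "# BottomUp: P = [0 | I] keeps only the bottom-level base forecasts\n",
    "P_bu = np.hstack([np.zeros((n_bottom, n_agg)), np.eye(n_bottom)])\n",
    "\n",
    "# MinTraceOLS closed form: P = (S'S)^{-1} S'\n",
    "P_ols = np.linalg.solve(S.T @ S, S.T)\n",
    "\n",
    "def is_coherent(y):\n",
    "    # Every aggregate must equal the sum of its bottom-level descendants\n",
    "    return np.allclose(y[:n_agg], S[:n_agg] @ y[n_agg:])\n",
    "\n",
    "# Incoherent base forecasts for the 7 series (illustrative numbers)\n",
    "y_hat = np.array([10., 5., 4., 2., 2., 1., 2.])\n",
    "y_bu = S @ P_bu @ y_hat\n",
    "y_ols = S @ P_ols @ y_hat\n",
    "\n",
    "print(is_coherent(y_hat), is_coherent(y_bu), is_coherent(y_ols))\n",
    "```\n",
    "\n",
    "Both projections satisfy `P @ S = I`, so forecasts that are already coherent are left unchanged. They differ in how they treat the aggregate-level base forecasts: BottomUp ignores them, while MinTraceOLS uses them."
   ]
  },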
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## HINT"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "#| export\n",
    "class HINT:\n",
    "    \"\"\" HINT\n",
    "\n",
    "    The Hierarchical Mixture Networks (HINT) are a highly modular framework that \n",
    "    combines SoTA neural forecast architectures with a task-specialized mixture \n",
    "    probability and advanced hierarchical reconciliation strategies. This powerful \n",
    "    combination allows HINT to produce accurate and coherent probabilistic forecasts.\n",
    "\n",
    "    HINT's incorporates a `TemporalNorm` module into any neural forecast architecture, \n",
    "    the module normalizes inputs into the network's non-linearities operating range \n",
    "    and recomposes its output's scales through a global skip connection, improving \n",
    "    accuracy and training robustness. HINT ensures the forecast coherence via bootstrap \n",
    "    sample reconciliation that restores the aggregation constraints into its base samples.\n",
    "\n",
    "    Available reconciliations:<br>\n",
    "    - BottomUp<br>\n",
    "    - MinTraceOLS<br>\n",
    "    - MinTraceWLS<br>\n",
    "    - Identity\n",
    "\n",
    "    **Parameters:**<br>\n",
    "    `h`: int, Forecast horizon. <br>\n",
    "    `model`: NeuralForecast model, instantiated model class from [architecture collection](https://nixtla.github.io/neuralforecast/models.pytorch.html).<br>\n",
    "    `S`: np.ndarray, dumming matrix of size (`base`, `bottom`) see HierarchicalForecast's [aggregate method](https://nixtla.github.io/hierarchicalforecast/utils.html#aggregate).<br>\n",
    "    `reconciliation`: str, HINT's reconciliation method from ['BottomUp', 'MinTraceOLS', 'MinTraceWLS'].<br>\n",
    "    `alias`: str, optional,  Custom name of the model.<br>\n",
    "    \"\"\"\n",
    "    def __init__(self,\n",
    "                 h: int,\n",
    "                 S: np.ndarray,\n",
    "                 model,\n",
    "                 reconciliation: str,\n",
    "                 alias: Optional[str] = None):\n",
    "        \n",
    "        if model.h != h:\n",
    "            raise Exception(f\"Model h {model.h} does not match HINT h {h}\")\n",
    "        \n",
    "        if not model.loss.is_distribution_output:\n",
    "            raise Exception(f\"The NeuralForecast model's loss {model.loss} is not a probabilistic objective\")\n",
    "        \n",
    "        self.h = h\n",
    "        self.model = model\n",
    "        self.early_stop_patience_steps = model.early_stop_patience_steps\n",
    "        self.S = S\n",
    "        self.reconciliation = reconciliation\n",
    "        self.loss = model.loss\n",
    "\n",
    "        available_reconciliations = dict(\n",
    "                                BottomUp=get_bottomup_P,\n",
    "                                MinTraceOLS=get_mintrace_ols_P,\n",
    "                                MinTraceWLS=get_mintrace_wls_P,\n",
    "                                Identity=get_identity_P,\n",
    "                                )\n",
    "\n",
    "        if reconciliation not in available_reconciliations:\n",
    "            raise Exception(f\"Reconciliation {reconciliation} not available\")\n",
    "\n",
    "        # Get SP matrix\n",
    "        self.reconciliation = reconciliation\n",
    "        if reconciliation== 'Identity':\n",
    "            self.SP = None\n",
    "        else:\n",
    "            P = available_reconciliations[reconciliation](S=S)\n",
    "            self.SP = S @ P\n",
    "\n",
    "        qs = torch.Tensor((np.arange(self.loss.num_samples)/self.loss.num_samples))\n",
    "        self.sample_quantiles = torch.nn.Parameter(qs, requires_grad=False)\n",
    "        self.alias = alias\n",
    "    \n",
    "    def __repr__(self):\n",
    "        return type(self).__name__ if self.alias is None else self.alias\n",
    "\n",
    "\n",
    "    def fit(self, dataset, val_size=0, test_size=0, random_seed=None, distributed_config=None):\n",
    "        \"\"\" HINT.fit\n",
    "\n",
    "        HINT trains on the entire hierarchical dataset, by minimizing a composite log likelihood objective.\n",
    "        HINT framework integrates `TemporalNorm` into the neural forecast architecture for a scale-decoupled \n",
    "        optimization that robustifies cross-learning the hierachy's series scales.\n",
    "\n",
    "        **Parameters:**<br>\n",
    "        `dataset`: NeuralForecast's `TimeSeriesDataset` see details [here](https://nixtla.github.io/neuralforecast/tsdataset.html)<br>\n",
    "        `val_size`: int, size of the validation set, (default 0).<br>\n",
    "        `test_size`: int, size of the test set, (default 0).<br>\n",
    "        `random_seed`: int, random seed for the prediction.<br>\n",
    "\n",
    "        **Returns:**<br>\n",
    "        `self`: A fitted base `NeuralForecast` model.<br>\n",
    "        \"\"\"\n",
    "        model = self.model.fit(dataset=dataset,\n",
    "                       val_size=val_size,\n",
    "                       test_size=test_size,\n",
    "                       random_seed=random_seed,\n",
    "                       distributed_config=distributed_config)\n",
    "\n",
    "        # Added attributes for compatibility with NeuralForecast core\n",
    "        self.futr_exog_list = self.model.futr_exog_list\n",
    "        self.hist_exog_list = self.model.hist_exog_list\n",
    "        self.stat_exog_list = self.model.stat_exog_list\n",
    "        return model\n",
    "\n",
    "    def predict(self, dataset, step_size=1, random_seed=None, **data_module_kwargs):\n",
    "        \"\"\" HINT.predict\n",
    "\n",
    "        After fitting a base model on the entire hierarchical dataset.\n",
    "        HINT restores the hierarchical aggregation constraints using \n",
    "        bootstrapped sample reconciliation.\n",
    "\n",
    "        **Parameters:**<br>\n",
    "        `dataset`: NeuralForecast's `TimeSeriesDataset` see details [here](https://nixtla.github.io/neuralforecast/tsdataset.html)<br>\n",
    "        `step_size`: int, steps between sequential predictions, (default 1).<br>\n",
    "        `random_seed`: int, random seed for the prediction.<br>\n",
    "        `**data_kwarg`: additional parameters for the dataset module.<br>\n",
    "\n",
    "        **Returns:**<br>\n",
    "        `y_hat`: numpy predictions of the `NeuralForecast` model.<br>\n",
    "        \"\"\"\n",
    "        # Non-reconciled predictions\n",
    "        if self.reconciliation=='Identity':\n",
    "            forecasts = self.model.predict(dataset=dataset, \n",
    "                                        step_size=step_size,\n",
    "                                        random_seed=random_seed,\n",
    "                                        **data_module_kwargs)\n",
    "            return forecasts\n",
    "\n",
    "        num_samples = self.model.loss.num_samples\n",
    "\n",
    "        # Hack to get samples by simulating quantiles (samples will be ordered)\n",
    "        # Mysterious parsing associated to default [mean,quantiles] output\n",
    "        quantiles_old = self.model.loss.quantiles\n",
    "        names_old = self.model.loss.output_names\n",
    "        self.model.loss.quantiles = self.sample_quantiles\n",
    "        self.model.loss.output_names = ['1'] * (1 + num_samples)\n",
    "        samples = self.model.predict(dataset=dataset, \n",
    "                                     step_size=step_size,\n",
    "                                     random_seed=random_seed,\n",
    "                                     **data_module_kwargs)\n",
    "        samples = samples[:,1:] # Eliminate mean from quantiles\n",
    "        self.model.loss.quantiles = quantiles_old\n",
    "        self.model.loss.output_names = names_old\n",
    "\n",
    "        # Hack requires to break quantiles correlations between samples\n",
    "        idxs = np.random.choice(num_samples, size=samples.shape, replace=True)\n",
    "        aux_col_idx = np.arange(len(samples))[:,None] * num_samples\n",
    "        idxs = idxs + aux_col_idx\n",
    "        samples = samples.flatten()[idxs]\n",
    "        samples = samples.reshape(dataset.n_groups, -1, self.h, num_samples)\n",
    "        \n",
    "        # Bootstrap Sample Reconciliation\n",
    "        # Default output [mean, quantiles]\n",
    "        samples = np.einsum('ij, jwhp -> iwhp', self.SP, samples)\n",
    "\n",
    "        sample_mean = np.mean(samples, axis=-1, keepdims=True)\n",
    "        sample_mean = sample_mean.reshape(-1, 1)\n",
    "\n",
    "        forecasts = np.quantile(samples, self.model.loss.quantiles, axis=-1)\n",
    "        forecasts = forecasts.transpose(1,2,3,0) # [...,samples]\n",
    "        forecasts = forecasts.reshape(-1, len(self.model.loss.quantiles))\n",
    "\n",
    "        forecasts = np.concatenate([sample_mean, forecasts], axis=-1)\n",
    "        return forecasts\n",
    "\n",
    "    def set_test_size(self, test_size):\n",
    "        self.model.test_size = test_size\n",
    "\n",
    "    def get_test_size(self):\n",
    "        return self.model.test_size\n",
    "\n",
    "    def save(self, path):\n",
    "        \"\"\" HINT.save\n",
    "\n",
    "        Save the HINT fitted model to disk.\n",
    "\n",
    "        **Parameters:**<br>\n",
    "        `path`: str, path to save the model.<br>\n",
    "        \"\"\"\n",
    "        self.model.save(path)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "show_doc(HINT, title_level=3)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "show_doc(HINT.fit, title_level=3)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "show_doc(HINT.predict, title_level=3)"
   ]
  },
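  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The sample reconciliation step in `predict` can be illustrated in isolation. The following is a rough sketch under stated assumptions, not the library's implementation: random normal draws stand in for the network's sorted quantile outputs, the hierarchy is a hypothetical `Top = B1 + B2`, and `np.take_along_axis` is used in place of the flattened indexing in `predict`.\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "\n",
    "rng = np.random.default_rng(0)\n",
    "n_series, n_windows, h, num_samples = 3, 1, 2, 100\n",
    "\n",
    "# Hypothetical hierarchy Top = B1 + B2 with BottomUp P = [0 | I]\n",
    "S = np.array([[1., 1.], [1., 0.], [0., 1.]])\n",
    "P = np.hstack([np.zeros((2, 1)), np.eye(2)])\n",
    "SP = S @ P\n",
    "\n",
    "# Stand-in for the model output: one row of sorted quantile 'samples'\n",
    "# per (series, window, horizon step)\n",
    "samples = np.sort(rng.normal(size=(n_series * n_windows * h, num_samples)), axis=1)\n",
    "\n",
    "# Resample each row independently, with replacement, so that the\n",
    "# sample index no longer carries the same quantile rank across rows\n",
    "idxs = rng.integers(num_samples, size=samples.shape)\n",
    "samples = np.take_along_axis(samples, idxs, axis=1)\n",
    "samples = samples.reshape(n_series, n_windows, h, num_samples)\n",
    "\n",
    "# Reconcile every sample path: [series, window, horizon, sample]\n",
    "coherent = np.einsum('ij,jwhp->iwhp', SP, samples)\n",
    "\n",
    "# Summaries computed after reconciliation\n",
    "mean = coherent.mean(axis=-1)\n",
    "q = np.quantile(coherent, [0.1, 0.5, 0.9], axis=-1)  # [quantile, series, window, horizon]\n",
    "```\n",
    "\n",
    "Because each sample path is reconciled, the sample mean is coherent as well. Quantiles computed afterwards are in general not additive across levels."
   ]
  },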
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# | hide\n",
    "# Unit test to check hierarchical coherence\n",
    "# Probabilistic coherent => Sample coherent => Mean coherence\n",
    "\n",
    "def sort_df_hier(Y_df, S_df):\n",
    "    # NeuralForecast core, sorts unique_id lexicographically\n",
    "    # by default, this class matches S_df and Y_hat_df order.    \n",
    "    Y_df.unique_id = Y_df.unique_id.astype('category')\n",
    "    Y_df.unique_id = Y_df.unique_id.cat.set_categories(S_df.index)\n",
    "    Y_df = Y_df.sort_values(by=['unique_id', 'ds'])\n",
    "    return Y_df\n",
    "\n",
    "# -----Create synthetic dataset-----\n",
    "np.random.seed(123)\n",
    "train_steps = 20\n",
    "num_levels = 7\n",
    "level = np.arange(0, 100, 0.1)\n",
    "qs = [[50-lv/2, 50+lv/2] for lv in level]\n",
    "quantiles = np.sort(np.concatenate(qs)/100)\n",
    "\n",
    "levels = ['Top', 'Mid1', 'Mid2', 'Bottom1', 'Bottom2', 'Bottom3', 'Bottom4']\n",
    "unique_ids = np.repeat(levels, train_steps)\n",
    "\n",
    "S = np.array([[1., 1., 1., 1.],\n",
    "              [1., 1., 0., 0.],\n",
    "              [0., 0., 1., 1.],\n",
    "              [1., 0., 0., 0.],\n",
    "              [0., 1., 0., 0.],\n",
    "              [0., 0., 1., 0.],\n",
    "              [0., 0., 0., 1.]])\n",
    "\n",
    "S_dict = {col: S[:, i] for i, col in enumerate(levels[3:])}\n",
    "S_df = pd.DataFrame(S_dict, index=levels)\n",
    "\n",
    "ds = pd.date_range(start='2018-03-31', periods=train_steps, freq='Q').tolist() * num_levels\n",
    "# Create Y_df\n",
    "y_lists = [S @ np.random.uniform(low=100, high=500, size=4) for i in range(train_steps)]\n",
    "y = [elem for tup in zip(*y_lists) for elem in tup]\n",
    "Y_df = pd.DataFrame({'unique_id': unique_ids, 'ds': ds, 'y': y})\n",
    "Y_df = sort_df_hier(Y_df, S_df)\n",
    "\n",
    "# ------Fit/Predict HINT Model------\n",
    "# Model + Distribution + Reconciliation\n",
    "nhits = NHITS(h=4,\n",
    "              input_size=4,\n",
    "              loss=GMM(n_components=2, quantiles=quantiles, num_samples=len(quantiles)),\n",
    "              max_steps=5,\n",
    "              early_stop_patience_steps=2,\n",
    "              val_check_steps=1,\n",
    "              scaler_type='robust',\n",
    "              learning_rate=1e-3)\n",
    "model = HINT(h=4, model=nhits, S=S, reconciliation='BottomUp')\n",
    "\n",
    "# Fit and Predict\n",
    "nf = NeuralForecast(models=[model], freq='Q')\n",
    "forecasts = nf.cross_validation(df=Y_df, val_size=4, n_windows=1)\n",
    "\n",
    "# ---Check Hierarchical Coherence---\n",
    "parent_children_dict = {0: [1, 2], 1: [3, 4], 2: [5, 6]}\n",
    "# check coherence for each horizon time step\n",
    "for _, df in forecasts.groupby('ds'):\n",
    "    hint_mean = df['HINT'].values\n",
    "    for parent_idx, children_list in parent_children_dict.items():\n",
    "        parent_value = hint_mean[parent_idx]\n",
    "        children_sum = hint_mean[children_list].sum()\n",
    "        np.testing.assert_allclose(children_sum, parent_value)"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Usage Example"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In this example we will use HINT for the hierarchical forecast task, a multivariate regression problem with aggregation constraints. The aggregation constraints can be compactly represented by the summing matrix $\\mathbf{S}_{[i][b]}$; the figure below shows an example.\n",
    "\n",
    "We will make coherent predictions for the TourismL dataset.\n",
    "\n",
    "Outline<br>\n",
    "1. Import packages<br>\n",
    "2. Load hierarchical dataset<br>\n",
    "3. Fit and Predict HINT<br>\n",
    "4. Forecast Plot"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "![](imgs_models/hint_notation.png)"
   ]
  },
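  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a small illustration of how a summing matrix encodes the aggregation constraints, the sketch below builds one for a hypothetical seven-series hierarchy. The series names and the `aggregates` dictionary are assumptions made for the example; real datasets such as TourismL ship their own `S_df`.\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "import pandas as pd\n",
    "\n",
    "bottom = ['Bottom1', 'Bottom2', 'Bottom3', 'Bottom4']\n",
    "# Hypothetical hierarchy: each aggregate lists the bottom series it sums over\n",
    "aggregates = {'Top': bottom, 'Mid1': bottom[:2], 'Mid2': bottom[2:]}\n",
    "\n",
    "# One row per series, one column per bottom series; aggregates come first\n",
    "rows = {name: [float(b in members) for b in bottom]\n",
    "        for name, members in aggregates.items()}\n",
    "rows.update({b: [float(b == c) for c in bottom] for b in bottom})\n",
    "S_df = pd.DataFrame.from_dict(rows, orient='index', columns=bottom)\n",
    "\n",
    "# Multiplying by S turns bottom-level values into the full coherent vector\n",
    "y_bottom = np.array([1., 2., 3., 4.])\n",
    "y_all = S_df.values @ y_bottom\n",
    "print(pd.Series(y_all, index=S_df.index))\n",
    "```\n",
    "\n",
    "The row order of `S_df` is the order HINT expects for the series, which is why the usage example below sorts `Y_df` to match `S_df.index`."
   ]
  },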
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "#| eval: false\n",
    "import numpy as np\n",
    "import matplotlib.pyplot as plt\n",
    "\n",
    "from neuralforecast.losses.pytorch import GMM, sCRPS\n",
    "from datasetsforecast.hierarchical import HierarchicalData\n",
    "\n",
    "# Auxiliary sorting\n",
    "def sort_df_hier(Y_df, S_df):\n",
    "    # NeuralForecast core, sorts unique_id lexicographically\n",
    "    # by default, this class matches S_df and Y_hat_df order.    \n",
    "    Y_df.unique_id = Y_df.unique_id.astype('category')\n",
    "    Y_df.unique_id = Y_df.unique_id.cat.set_categories(S_df.index)\n",
    "    Y_df = Y_df.sort_values(by=['unique_id', 'ds'])\n",
    "    return Y_df\n",
    "\n",
    "# Load TourismSmall dataset\n",
    "horizon = 12\n",
    "Y_df, S_df, tags = HierarchicalData.load('./data', 'TourismLarge')\n",
    "Y_df['ds'] = pd.to_datetime(Y_df['ds'])\n",
    "Y_df = sort_df_hier(Y_df, S_df)\n",
    "level = [80,90]\n",
    "\n",
    "# Instantiate HINT\n",
    "# BaseNetwork + Distribution + Reconciliation\n",
    "nhits = NHITS(h=horizon,\n",
    "              input_size=24,\n",
    "              loss=GMM(n_components=10, level=level),\n",
    "              max_steps=2000,\n",
    "              early_stop_patience_steps=10,\n",
    "              val_check_steps=50,\n",
    "              scaler_type='robust',\n",
    "              learning_rate=1e-3,\n",
    "              valid_loss=sCRPS(level=level))\n",
    "\n",
    "model = HINT(h=horizon, S=S_df.values,\n",
    "             model=nhits,  reconciliation='BottomUp')\n",
    "\n",
    "# Fit and Predict\n",
    "nf = NeuralForecast(models=[model], freq='MS')\n",
    "Y_hat_df = nf.cross_validation(df=Y_df, val_size=12, n_windows=1)\n",
    "Y_hat_df = Y_hat_df.reset_index()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "#| eval: false\n",
    "# Plot coherent probabilistic forecast\n",
    "unique_id = 'TotalAll'\n",
    "Y_plot_df = Y_df[Y_df.unique_id==unique_id]\n",
    "plot_df = Y_hat_df[Y_hat_df.unique_id==unique_id]\n",
    "plot_df = Y_plot_df.merge(plot_df, on=['ds', 'unique_id'], how='left')\n",
    "n_years = 5\n",
    "\n",
    "plt.plot(plot_df['ds'][-12*n_years:], plot_df['y_x'][-12*n_years:], c='black', label='True')\n",
    "plt.plot(plot_df['ds'][-12*n_years:], plot_df['HINT'][-12*n_years:], c='purple', label='mean')\n",
    "plt.plot(plot_df['ds'][-12*n_years:], plot_df['HINT-median'][-12*n_years:], c='blue', label='median')\n",
    "plt.fill_between(x=plot_df['ds'][-12*n_years:],\n",
    "                 y1=plot_df['HINT-lo-90'][-12*n_years:].values,\n",
    "                 y2=plot_df['HINT-hi-90'][-12*n_years:].values,\n",
    "                 alpha=0.4, label='level 90')\n",
    "plt.legend()\n",
    "plt.grid()\n",
    "plt.plot()"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "python3",
   "language": "python",
   "name": "python3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}