{
"cells": [
{
"cell_type": "code",
"execution_count": null,
"id": "524620c1",
"metadata": {},
"outputs": [],
"source": [
"#| default_exp common._base_auto"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "15392f6f",
"metadata": {},
"outputs": [],
"source": [
"#| hide\n",
"%load_ext autoreload\n",
"%autoreload 2"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "12fa25a4",
"metadata": {},
"source": [
"# Hyperparameter Optimization\n",
"\n",
"> Machine Learning forecasting methods are defined by many hyperparameters that control their behavior, with effects ranging from their speed and memory requirements to their predictive performance. For a long time, manual hyperparameter tuning prevailed. This approach is time-consuming, **automated hyperparameter optimization** methods have been introduced, proving more efficient than manual tuning, grid search, and random search.
The `BaseAuto` class offers shared API connections to hyperparameter optimization algorithms like [Optuna](https://docs.ray.io/en/latest/tune/examples/bayesopt_example.html), [HyperOpt](https://docs.ray.io/en/latest/tune/examples/hyperopt_example.html), [Dragonfly](https://docs.ray.io/en/latest/tune/examples/dragonfly_example.html) among others through `ray`, which gives you access to grid search, bayesian optimization and other state-of-the-art tools like hyperband.
Comprehending the impacts of hyperparameters is still a precious skill, as it can help guide the design of informed hyperparameter spaces that are faster to explore automatically."
]
},
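{
"cell_type": "markdown",
"id": "b4c7a1d2",
"metadata": {},
"source": [
 "The same search space can be written for either backend (a minimal sketch; the parameter names below are illustrative, not required by the API). With `backend='ray'` the `config` is a dict mixing constants and `tune` samplers, while with `backend='optuna'` it is a function that takes an optuna trial and returns a dict:\n",
 "\n",
 "```python\n",
 "from ray import tune\n",
 "\n",
 "# ray backend: a plain dict of constants and tune samplers\n",
 "ray_config = {\n",
 "    'input_size': 12,                        # constant\n",
 "    'hidden_size': tune.choice([256, 512]),  # categorical\n",
 "    'learning_rate': tune.loguniform(1e-4, 1e-1),\n",
 "}\n",
 "\n",
 "# optuna backend: a callable taking a trial and returning a dict\n",
 "def optuna_config(trial):\n",
 "    return {\n",
 "        'input_size': 12,\n",
 "        'hidden_size': trial.suggest_categorical('hidden_size', [256, 512]),\n",
 "        'learning_rate': trial.suggest_float('learning_rate', 1e-4, 1e-1, log=True),\n",
 "    }\n",
 "```"
]
},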
{
"cell_type": "code",
"execution_count": null,
"id": "2508f7a9-1433-4ad8-8f2f-0078c6ed6c3c",
"metadata": {},
"outputs": [],
"source": [
"#| hide\n",
"from fastcore.test import test_eq\n",
"from nbdev.showdoc import show_doc"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "44065066-e72a-431f-938f-1528adef9fe8",
"metadata": {},
"outputs": [],
"source": [
"#| export\n",
"from copy import deepcopy\n",
"from os import cpu_count\n",
"\n",
"import torch\n",
"import pytorch_lightning as pl\n",
"\n",
"from ray import air, tune\n",
"from ray.tune.integration.pytorch_lightning import TuneReportCallback\n",
"from ray.tune.search.basic_variant import BasicVariantGenerator"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "45cecbda-68c8-4426-a186-9a2a94dcc54e",
"metadata": {},
"outputs": [],
"source": [
"#| exporti\n",
"class MockTrial:\n",
" def suggest_int(*args, **kwargs):\n",
" return 'int'\n",
" def suggest_categorical(*args, **kwargs):\n",
" return 'categorical'\n",
" def suggest_uniform(*args, **kwargs):\n",
" return 'uniform'\n",
" def suggest_loguniform(*args, **kwargs):\n",
" return 'loguniform'\n",
" def suggest_float(*args, **kwargs):\n",
" if 'log' in kwargs:\n",
" return 'quantized_log'\n",
" elif 'step' in kwargs:\n",
" return 'quantized_loguniform'\n",
" return 'float'"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "4c253583-8239-4abe-8a04-0c0ba635d8a5",
"metadata": {},
"outputs": [],
"source": [
"#| export\n",
"class BaseAuto(pl.LightningModule):\n",
" \"\"\"\n",
" Class for Automatic Hyperparameter Optimization, it builds on top of `ray` to \n",
" give access to a wide variety of hyperparameter optimization tools ranging \n",
" from classic grid search, to Bayesian optimization and HyperBand algorithm.\n",
"\n",
" The validation loss to be optimized is defined by the `config['loss']` dictionary\n",
" value, the config also contains the rest of the hyperparameter search space.\n",
"\n",
" It is important to note that the success of this hyperparameter optimization\n",
" heavily relies on a strong correlation between the validation and test periods.\n",
"\n",
" Parameters\n",
" ----------\n",
" cls_model : PyTorch/PyTorchLightning model\n",
" See `neuralforecast.models` [collection here](https://nixtla.github.io/neuralforecast/models.html).\n",
" h : int\n",
" Forecast horizon\n",
" loss : PyTorch module\n",
" Instantiated train loss class from [losses collection](https://nixtla.github.io/neuralforecast/losses.pytorch.html).\n",
" valid_loss : PyTorch module\n",
" Instantiated valid loss class from [losses collection](https://nixtla.github.io/neuralforecast/losses.pytorch.html).\n",
" config : dict or callable\n",
" Dictionary with ray.tune defined search space or function that takes an optuna trial and returns a configuration dict.\n",
" search_alg : ray.tune.search variant or optuna.sampler\n",
" For ray see https://docs.ray.io/en/latest/tune/api_docs/suggestion.html\n",
" For optuna see https://optuna.readthedocs.io/en/stable/reference/samplers/index.html.\n",
" num_samples : int\n",
" Number of hyperparameter optimization steps/samples.\n",
" cpus : int (default=os.cpu_count())\n",
" Number of cpus to use during optimization. Only used with ray tune.\n",
" gpus : int (default=torch.cuda.device_count())\n",
" Number of gpus to use during optimization, default all available. Only used with ray tune.\n",
" refit_with_val : bool\n",
" Refit of best model should preserve val_size.\n",
" verbose : bool\n",
" Track progress.\n",
" alias : str, optional (default=None)\n",
" Custom name of the model.\n",
" backend : str (default='ray')\n",
" Backend to use for searching the hyperparameter space, can be either 'ray' or 'optuna'.\n",
" callbacks : list of callable, optional (default=None)\n",
" List of functions to call during the optimization process.\n",
" ray reference: https://docs.ray.io/en/latest/tune/tutorials/tune-metrics.html\n",
" optuna reference: https://optuna.readthedocs.io/en/stable/tutorial/20_recipes/007_optuna_callback.html\n",
" \"\"\"\n",
" def __init__(self, \n",
" cls_model,\n",
" h,\n",
" loss,\n",
" valid_loss,\n",
" config, \n",
" search_alg=BasicVariantGenerator(random_state=1),\n",
" num_samples=10,\n",
" cpus=cpu_count(),\n",
" gpus=torch.cuda.device_count(),\n",
" refit_with_val=False,\n",
" verbose=False,\n",
" alias=None,\n",
" backend='ray',\n",
" callbacks=None,\n",
" ):\n",
" super(BaseAuto, self).__init__()\n",
" self.save_hyperparameters() # Allows instantiation from a checkpoint from class\n",
"\n",
" if backend == 'ray':\n",
" if not isinstance(config, dict):\n",
" raise ValueError(\n",
" \"You have to provide a dict as `config` when using `backend='ray'`\"\n",
" )\n",
" config_base = deepcopy(config)\n",
" elif backend == 'optuna':\n",
" if not callable(config):\n",
" raise ValueError(\n",
" \"You have to provide a function that takes a trial and returns a dict as `config` when using `backend='optuna'`\"\n",
" )\n",
" # extract constant values from the config fn for validations\n",
" config_base = config(MockTrial())\n",
" else:\n",
" raise ValueError(f\"Unknown backend {backend}. The supported backends are 'ray' and 'optuna'.\")\n",
" if config_base.get('h', None) is not None:\n",
" raise Exception(\"Please use `h` init argument instead of `config['h']`.\")\n",
" if config_base.get('loss', None) is not None:\n",
" raise Exception(\"Please use `loss` init argument instead of `config['loss']`.\")\n",
" if config_base.get('valid_loss', None) is not None:\n",
" raise Exception(\"Please use `valid_loss` init argument instead of `config['valid_loss']`.\")\n",
" # This attribute helps to protect \n",
" # model and datasets interactions protections\n",
" if 'early_stop_patience_steps' in config_base.keys():\n",
" self.early_stop_patience_steps = 1\n",
" else:\n",
" self.early_stop_patience_steps = -1\n",
"\n",
" if callable(config):\n",
" # reset config_base here to save params to override in the config fn\n",
" config_base = {}\n",
"\n",
" # Add losses to config and protect valid_loss default\n",
" config_base['h'] = h\n",
" config_base['loss'] = loss\n",
" if valid_loss is None:\n",
" valid_loss = loss\n",
" config_base['valid_loss'] = valid_loss\n",
"\n",
" if isinstance(config, dict):\n",
" self.config = config_base \n",
" else:\n",
" def config_f(trial):\n",
" return {**config(trial), **config_base}\n",
" self.config = config_f \n",
" \n",
" self.h = h\n",
" self.cls_model = cls_model\n",
" self.loss = loss\n",
" self.valid_loss = valid_loss\n",
"\n",
" self.num_samples = num_samples\n",
" self.search_alg = search_alg\n",
" self.cpus = cpus\n",
" self.gpus = gpus\n",
" self.refit_with_val = refit_with_val\n",
" self.verbose = verbose\n",
" self.alias = alias\n",
" self.backend = backend\n",
" self.callbacks = callbacks\n",
"\n",
" # Base Class attributes\n",
" self.SAMPLING_TYPE = cls_model.SAMPLING_TYPE\n",
"\n",
" def __repr__(self):\n",
" return type(self).__name__ if self.alias is None else self.alias\n",
" \n",
" def _train_tune(self, config_step, cls_model, dataset, val_size, test_size):\n",
" \"\"\" BaseAuto._train_tune\n",
"\n",
" Internal function that instantiates a NF class model, then automatically\n",
" explores the validation loss (ptl/val_loss) on which the hyperparameter \n",
" exploration is based.\n",
"\n",
" **Parameters:**
\n",
" `config_step`: Dict, initialization parameters of a NF model.
\n",
" `cls_model`: NeuralForecast model class, yet to be instantiated.
\n",
" `dataset`: NeuralForecast dataset, to fit the model.
\n",
" `val_size`: int, validation size for temporal cross-validation.
\n",
" `test_size`: int, test size for temporal cross-validation.
\n",
" \"\"\"\n",
" metrics = {\"loss\": \"ptl/val_loss\", \"train_loss\": \"train_loss\"}\n",
" callbacks = [TuneReportCallback(metrics, on=\"validation_end\")]\n",
" if 'callbacks' in config_step.keys():\n",
" callbacks.extend(config_step['callbacks'])\n",
" config_step = {**config_step, **{'callbacks': callbacks}}\n",
"\n",
" # Protect dtypes from tune samplers\n",
" if 'batch_size' in config_step.keys():\n",
" config_step['batch_size'] = int(config_step['batch_size'])\n",
" if 'windows_batch_size' in config_step.keys():\n",
" config_step['windows_batch_size'] = int(config_step['windows_batch_size'])\n",
"\n",
" # Tune session receives validation signal\n",
" # from the specialized PL TuneReportCallback\n",
" _ = self._fit_model(cls_model=cls_model,\n",
" config=config_step,\n",
" dataset=dataset,\n",
" val_size=val_size,\n",
" test_size=test_size)\n",
"\n",
" def _tune_model(self, cls_model, dataset, val_size, test_size,\n",
" cpus, gpus, verbose, num_samples, search_alg, config):\n",
" train_fn_with_parameters = tune.with_parameters(\n",
" self._train_tune,\n",
" cls_model=cls_model,\n",
" dataset=dataset,\n",
" val_size=val_size,\n",
" test_size=test_size,\n",
" )\n",
"\n",
" # Device\n",
" if gpus > 0:\n",
" device_dict = {'gpu':gpus}\n",
" else:\n",
" device_dict = {'cpu':cpus}\n",
"\n",
" # on Windows, prevent long trial directory names\n",
" import platform\n",
" trial_dirname_creator=(lambda trial: f\"{trial.trainable_name}_{trial.trial_id}\") if platform.system() == 'Windows' else None\n",
"\n",
" tuner = tune.Tuner(\n",
" tune.with_resources(train_fn_with_parameters, device_dict),\n",
" run_config=air.RunConfig(callbacks=self.callbacks, verbose=verbose),\n",
" tune_config=tune.TuneConfig(\n",
" metric=\"loss\",\n",
" mode=\"min\",\n",
" num_samples=num_samples, \n",
" search_alg=search_alg,\n",
" trial_dirname_creator=trial_dirname_creator,\n",
" ),\n",
" param_space=config,\n",
" )\n",
" results = tuner.fit()\n",
" return results\n",
"\n",
" @staticmethod\n",
" def _ray_config_to_optuna(ray_config):\n",
" def optuna_config(trial):\n",
" out = {}\n",
" for k, v in ray_config.items():\n",
" if hasattr(v, 'sampler'):\n",
" sampler = v.sampler\n",
" if isinstance(sampler, tune.search.sample.Integer.default_sampler_cls):\n",
" v = trial.suggest_int(k, v.lower, v.upper)\n",
" elif isinstance(sampler, tune.search.sample.Categorical.default_sampler_cls):\n",
" v = trial.suggest_categorical(k, v.categories) \n",
" elif isinstance(sampler, tune.search.sample.Uniform):\n",
" v = trial.suggest_uniform(k, v.lower, v.upper)\n",
" elif isinstance(sampler, tune.search.sample.LogUniform):\n",
" v = trial.suggest_loguniform(k, v.lower, v.upper)\n",
" elif isinstance(sampler, tune.search.sample.Quantized):\n",
" if isinstance(sampler.get_sampler(), tune.search.sample.Float._LogUniform):\n",
" v = trial.suggest_float(k, v.lower, v.upper, log=True)\n",
" elif isinstance(sampler.get_sampler(), tune.search.sample.Float._Uniform):\n",
" v = trial.suggest_float(k, v.lower, v.upper, step=sampler.q)\n",
" else:\n",
" raise ValueError(f\"Couldn't translate {type(v)} to optuna.\")\n",
" out[k] = v\n",
" return out\n",
" return optuna_config\n",
"\n",
" def _optuna_tune_model(\n",
" self,\n",
" cls_model,\n",
" dataset,\n",
" val_size,\n",
" test_size,\n",
" verbose,\n",
" num_samples,\n",
" search_alg,\n",
" config,\n",
" distributed_config,\n",
" ):\n",
" import optuna\n",
"\n",
" def objective(trial):\n",
" user_cfg = config(trial)\n",
" cfg = deepcopy(user_cfg)\n",
" model = self._fit_model(\n",
" cls_model=cls_model,\n",
" config=cfg,\n",
" dataset=dataset,\n",
" val_size=val_size,\n",
" test_size=test_size,\n",
" distributed_config=distributed_config,\n",
" )\n",
" trial.set_user_attr('ALL_PARAMS', user_cfg)\n",
" metrics = model.metrics\n",
" trial.set_user_attr('METRICS', {\n",
" \"loss\": metrics[\"ptl/val_loss\"],\n",
" \"train_loss\": metrics[\"train_loss\"],\n",
" })\n",
" return trial.user_attrs['METRICS']['loss']\n",
"\n",
" if isinstance(search_alg, optuna.samplers.BaseSampler):\n",
" sampler = search_alg\n",
" else:\n",
" sampler = None\n",
"\n",
" study = optuna.create_study(sampler=sampler, direction='minimize')\n",
" study.optimize(\n",
" objective,\n",
" n_trials=num_samples,\n",
" show_progress_bar=verbose,\n",
" callbacks=self.callbacks,\n",
" )\n",
" return study\n",
"\n",
" def _fit_model(self, cls_model, config,\n",
" dataset, val_size, test_size, distributed_config=None):\n",
" model = cls_model(**config)\n",
" model = model.fit(\n",
" dataset,\n",
" val_size=val_size, \n",
" test_size=test_size,\n",
" distributed_config=distributed_config,\n",
" )\n",
" return model\n",
"\n",
" def fit(self, dataset, val_size=0, test_size=0, random_seed=None, distributed_config=None):\n",
" \"\"\" BaseAuto.fit\n",
"\n",
" Perform the hyperparameter optimization as specified by the BaseAuto configuration \n",
" dictionary `config`.\n",
"\n",
" The optimization is performed on the `TimeSeriesDataset` using temporal cross validation with \n",
" the validation set that sequentially precedes the test set.\n",
"\n",
" **Parameters:**
\n",
" `dataset`: NeuralForecast's `TimeSeriesDataset` see details [here](https://nixtla.github.io/neuralforecast/tsdataset.html)
\n",
" `val_size`: int, size of temporal validation set (needs to be bigger than 0).
\n",
" `test_size`: int, size of temporal test set (default 0).
\n",
" `random_seed`: int=None, random_seed for hyperparameter exploration algorithms, not yet implemented.
\n",
" **Returns:**
\n",
" `self`: fitted instance of `BaseAuto` with best hyperparameters and results
.\n",
" \"\"\"\n",
" #we need val_size > 0 to perform\n",
" #hyperparameter selection.\n",
" search_alg = deepcopy(self.search_alg)\n",
" val_size = val_size if val_size > 0 else self.h\n",
" if self.backend == 'ray':\n",
" if distributed_config is not None:\n",
" raise ValueError('distributed training is not supported for the ray backend.')\n",
" results = self._tune_model(\n",
" cls_model=self.cls_model,\n",
" dataset=dataset,\n",
" val_size=val_size,\n",
" test_size=test_size, \n",
" cpus=self.cpus,\n",
" gpus=self.gpus,\n",
" verbose=self.verbose,\n",
" num_samples=self.num_samples, \n",
" search_alg=search_alg, \n",
" config=self.config,\n",
" ) \n",
" best_config = results.get_best_result().config \n",
" else:\n",
" results = self._optuna_tune_model(\n",
" cls_model=self.cls_model,\n",
" dataset=dataset,\n",
" val_size=val_size, \n",
" test_size=test_size, \n",
" verbose=self.verbose,\n",
" num_samples=self.num_samples, \n",
" search_alg=search_alg, \n",
" config=self.config,\n",
" distributed_config=distributed_config,\n",
" )\n",
" best_config = results.best_trial.user_attrs['ALL_PARAMS']\n",
" self.model = self._fit_model(\n",
" cls_model=self.cls_model,\n",
" config=best_config,\n",
" dataset=dataset,\n",
" val_size=val_size * (1 - self.refit_with_val),\n",
" test_size=test_size,\n",
" distributed_config=distributed_config,\n",
" )\n",
" self.results = results\n",
"\n",
" # Added attributes for compatibility with NeuralForecast core\n",
" self.futr_exog_list = self.model.futr_exog_list\n",
" self.hist_exog_list = self.model.hist_exog_list\n",
" self.stat_exog_list = self.model.stat_exog_list\n",
" return self\n",
"\n",
" def predict(self, dataset, step_size=1, **data_kwargs):\n",
" \"\"\" BaseAuto.predict\n",
"\n",
" Predictions of the best performing model on validation.\n",
"\n",
" **Parameters:**
\n",
" `dataset`: NeuralForecast's `TimeSeriesDataset` see details [here](https://nixtla.github.io/neuralforecast/tsdataset.html)
\n",
" `step_size`: int, steps between sequential predictions, (default 1).
\n",
" `**data_kwarg`: additional parameters for the dataset module.
\n",
" `random_seed`: int=None, random_seed for hyperparameter exploration algorithms (not implemented).
\n",
" **Returns:**
\n",
" `y_hat`: numpy predictions of the `NeuralForecast` model.
\n",
" \"\"\"\n",
" return self.model.predict(dataset=dataset, \n",
" step_size=step_size, **data_kwargs)\n",
"\n",
" def set_test_size(self, test_size):\n",
" self.model.set_test_size(test_size)\n",
"\n",
" def get_test_size(self):\n",
" return self.model.test_size\n",
" \n",
" def save(self, path):\n",
" \"\"\" BaseAuto.save\n",
"\n",
" Save the fitted model to disk.\n",
"\n",
" **Parameters:**
\n",
" `path`: str, path to save the model.
\n",
" \"\"\"\n",
" self.model.save(path)"
]
},
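{
"cell_type": "markdown",
"id": "c9e3f5a8",
"metadata": {},
"source": [
 "As a sketch of the translation performed by `BaseAuto._ray_config_to_optuna` (the parameter names here are illustrative, not part of the API):\n",
 "\n",
 "```python\n",
 "from ray import tune\n",
 "\n",
 "ray_config = {\n",
 "    'num_layers': tune.choice([2, 3]),\n",
 "    'learning_rate': tune.loguniform(1e-4, 1e-1),\n",
 "}\n",
 "optuna_config = BaseAuto._ray_config_to_optuna(ray_config)\n",
 "# optuna_config is a function of an optuna trial: each entry is sampled\n",
 "# through the matching trial.suggest_* call (suggest_categorical and\n",
 "# suggest_loguniform here), so one dict definition can serve both backends.\n",
 "```"
]
},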
{
"cell_type": "code",
"execution_count": null,
"id": "2376ed06",
"metadata": {},
"outputs": [],
"source": [
"show_doc(BaseAuto, title_level=3)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "623ebb06",
"metadata": {},
"outputs": [],
"source": [
"show_doc(BaseAuto.fit, title_level=3)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "69d3c1ae",
"metadata": {},
"outputs": [],
"source": [
"show_doc(BaseAuto.predict, title_level=3)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "bbfd4e8f-2565-4f85-b615-7329a1ae3f43",
"metadata": {},
"outputs": [],
"source": [
"#| hide\n",
"import logging\n",
"import warnings\n",
"\n",
"import pytorch_lightning as pl"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "421db156-4ee6-420f-ac9e-f0ddc9781841",
"metadata": {},
"outputs": [],
"source": [
"#| hide\n",
"logging.getLogger(\"pytorch_lightning\").setLevel(logging.ERROR)\n",
"warnings.filterwarnings(\"ignore\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d1e776fb-fa7e-49c6-afd2-b30891c83a73",
"metadata": {},
"outputs": [],
"source": [
"#| hide\n",
"import optuna\n",
"import pandas as pd\n",
"from neuralforecast.models.mlp import MLP\n",
"from neuralforecast.utils import AirPassengersDF as Y_df\n",
"from neuralforecast.tsdataset import TimeSeriesDataset\n",
"from neuralforecast.losses.numpy import mae\n",
"from neuralforecast.losses.pytorch import MAE, MSE"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "8c26739d-c405-4700-a833-79c3a0fec497",
"metadata": {},
"outputs": [],
"source": [
"#| hide\n",
"Y_train_df = Y_df[Y_df.ds<='1959-12-31'] # 132 train\n",
"Y_test_df = Y_df[Y_df.ds>'1959-12-31'] # 12 test\n",
"\n",
"dataset, *_ = TimeSeriesDataset.from_df(Y_train_df)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "88148bbe-b4c1-41c3-8ce1-4f7695161d99",
"metadata": {},
"outputs": [],
"source": [
"class RayLogLossesCallback(tune.Callback):\n",
" def on_trial_complete(self, iteration, trials, trial, **info):\n",
" result = trial.last_result\n",
" print(40 * '-' + 'Trial finished' + 40 * '-')\n",
" print(f'Train loss: {result[\"train_loss\"]:.2f}. Valid loss: {result[\"loss\"]:.2f}')\n",
" print(80 * '-')"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "ae8912d7-9128-42ab-a581-5f63b6ea34eb",
"metadata": {},
"outputs": [],
"source": [
"config = {\n",
" \"hidden_size\": tune.choice([512]),\n",
" \"num_layers\": tune.choice([3, 4]),\n",
" \"input_size\": 12,\n",
" \"max_steps\": 10,\n",
" \"val_check_steps\": 5\n",
"}\n",
"auto = BaseAuto(h=12, loss=MAE(), valid_loss=MSE(), cls_model=MLP, config=config, num_samples=2, cpus=1, gpus=0, callbacks=[RayLogLossesCallback()])\n",
"auto.fit(dataset=dataset)\n",
"y_hat = auto.predict(dataset=dataset)\n",
"assert mae(Y_test_df['y'].values, y_hat[:, 0]) < 200"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "63d46d13-f0d0-4bc0-aba2-bd094a9a78c4",
"metadata": {},
"outputs": [],
"source": [
"def config_f(trial):\n",
" return {\n",
" \"hidden_size\": trial.suggest_categorical('hidden_size', [512]),\n",
" \"num_layers\": trial.suggest_categorical('num_layers', [3, 4]),\n",
" \"input_size\": 12,\n",
" \"max_steps\": 10,\n",
" \"val_check_steps\": 5\n",
" }\n",
"\n",
"class OptunaLogLossesCallback:\n",
" def __call__(self, study, trial):\n",
" metrics = trial.user_attrs['METRICS']\n",
" print(40 * '-' + 'Trial finished' + 40 * '-')\n",
" print(f'Train loss: {metrics[\"train_loss\"]:.2f}. Valid loss: {metrics[\"loss\"]:.2f}')\n",
" print(80 * '-')"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d979d9df-3a8d-4aab-aaa9-5b66067aef26",
"metadata": {},
"outputs": [],
"source": [
"auto2 = BaseAuto(h=12, loss=MAE(), valid_loss=MSE(), cls_model=MLP, config=config_f, search_alg=optuna.samplers.RandomSampler(), num_samples=2, backend='optuna', callbacks=[OptunaLogLossesCallback()])\n",
"auto2.fit(dataset=dataset)\n",
"assert isinstance(auto2.results, optuna.Study)\n",
"y_hat2 = auto2.predict(dataset=dataset)\n",
"assert mae(Y_test_df['y'].values, y_hat2[:, 0]) < 200"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "66ad2eec-dd93-4bc4-ae19-5df4199577be",
"metadata": {},
"outputs": [],
"source": [
"#| hide\n",
"Y_test_df['AutoMLP'] = y_hat\n",
"\n",
"pd.concat([Y_train_df, Y_test_df]).drop('unique_id', axis=1).set_index('ds').plot()"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "463d4dc0-b25a-4ce6-9172-5690dc979f0b",
"metadata": {},
"outputs": [],
"source": [
"#| hide\n",
"# Unit tests to guarantee that losses are correctly instantiated\n",
"import pandas as pd\n",
"from neuralforecast.models.mlp import MLP\n",
"from neuralforecast.utils import AirPassengersDF as Y_df\n",
"from neuralforecast.tsdataset import TimeSeriesDataset\n",
"from neuralforecast.losses.pytorch import MAE, MSE"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "882c8331-440a-4758-a56c-07a78c0b1603",
"metadata": {},
"outputs": [],
"source": [
"#| hide\n",
"# Unit tests to guarantee that losses are correctly instantiated\n",
"Y_train_df = Y_df[Y_df.ds<='1959-12-31'] # 132 train\n",
"Y_test_df = Y_df[Y_df.ds>'1959-12-31'] # 12 test\n",
"\n",
"dataset, *_ = TimeSeriesDataset.from_df(Y_train_df)\n",
"config = {\n",
" \"hidden_size\": tune.choice([512]),\n",
" \"num_layers\": tune.choice([3, 4]),\n",
" \"input_size\": 12,\n",
" \"max_steps\": 1,\n",
" \"val_check_steps\": 1\n",
"}\n",
"\n",
"# Test instantiation\n",
"auto = BaseAuto(h=12, loss=MAE(), valid_loss=MSE(), \n",
" cls_model=MLP, config=config, num_samples=2, cpus=1, gpus=0)\n",
"test_eq(str(type(auto.loss)), \"\")\n",
"test_eq(str(type(auto.valid_loss)), \"\")\n",
"\n",
"# Test validation default\n",
"auto = BaseAuto(h=12, loss=MSE(), valid_loss=None,\n",
" cls_model=MLP, config=config, num_samples=2, cpus=1, gpus=0)\n",
"test_eq(str(type(auto.loss)), \"\")\n",
"test_eq(str(type(auto.valid_loss)), \"\")"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "3c8e2d46",
"metadata": {},
"source": [
"### References\n",
"- [James Bergstra, Remi Bardenet, Yoshua Bengio, and Balazs Kegl (2011). \"Algorithms for Hyper-Parameter Optimization\". In: Advances in Neural Information Processing Systems. url: https://proceedings.neurips.cc/paper/2011/file/86e8f7ab32cfd12577bc2619bc635690-Paper.pdf](https://proceedings.neurips.cc/paper/2011/file/86e8f7ab32cfd12577bc2619bc635690-Paper.pdf)\n",
"- [Kirthevasan Kandasamy, Karun Raju Vysyaraju, Willie Neiswanger, Biswajit Paria, Christopher R. Collins, Jeff Schneider, Barnabas Poczos, Eric P. Xing (2019). \"Tuning Hyperparameters without Grad Students: Scalable and Robust Bayesian Optimisation with Dragonfly\". Journal of Machine Learning Research. url: https://arxiv.org/abs/1903.06694](https://arxiv.org/abs/1903.06694)\n",
"- [Lisha Li, Kevin Jamieson, Giulia DeSalvo, Afshin Rostamizadeh, Ameet Talwalkar (2016). \"Hyperband: A Novel Bandit-Based Approach to Hyperparameter Optimization\". Journal of Machine Learning Research. url: https://arxiv.org/abs/1603.06560](https://arxiv.org/abs/1603.06560)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "python3",
"language": "python",
"name": "python3"
}
},
"nbformat": 4,
"nbformat_minor": 5
}