description: "Master time series cross-validation with TimeGPT. Complete Python tutorial for model validation, rolling-window techniques, and prediction intervals with code examples."
icon: "check"
---
## What is Cross-validation?
Time series cross-validation is essential for validating machine learning models and ensuring accurate forecasts. Unlike traditional k-fold cross-validation, time series validation requires specialized rolling-window techniques that respect temporal order. This comprehensive tutorial shows you how to perform cross-validation in Python using TimeGPT, including prediction intervals, exogenous variables, and model performance evaluation.
One of the primary challenges in time series forecasting is the inherent uncertainty and variability over time, making it crucial to validate the accuracy and reliability of the models employed. Cross-validation, a robust model validation technique, is particularly adapted for this task, as it provides insights into the expected performance of a model on unseen data, ensuring the forecasts are reliable and resilient before being deployed in real-world scenarios.
TimeGPT incorporates the `cross_validation` method, designed to streamline the validation process for [time series forecasting models](/forecasting/timegpt_quickstart). This functionality enables practitioners to rigorously test their forecasting models against historical data, with support for [prediction intervals](/forecasting/probabilistic/prediction_intervals) and [exogenous variables](/forecasting/exogenous-variables/numeric_features). This tutorial will guide you through the nuanced process of conducting cross-validation within the `NixtlaClient` class, ensuring your time series forecasting models are not just well-constructed, but also validated for trustworthiness and precision.
### Why Use Cross-Validation for Time Series?
Cross-validation provides several critical benefits for time series forecasting:
- **Prevent overfitting**: Test model performance across multiple time periods
- **Validate generalization**: Ensure forecasts work on unseen data
- **Quantify uncertainty**: Generate prediction intervals for risk assessment
- **Compare models**: Evaluate different forecasting approaches systematically
- **Optimize hyperparameters**: Fine-tune model parameters with confidence
## How to Perform Cross-validation with TimeGPT
<Info>
**Quick Summary**: Learn time series cross-validation with TimeGPT in Python. This tutorial covers rolling-window validation, prediction intervals, model performance metrics, and advanced techniques with real-world examples using the Peyton Manning dataset.
</Info>
[Open in Colab](https://colab.research.google.com/github/Nixtla/nixtla/blob/main/nbs/docs/tutorials/08_cross_validation.ipynb)
### Step 1: Import Packages and Initialize NixtlaClient
First, we install and import the required packages and initialize an instance of `NixtlaClient`.
```python
import pandas as pd
from nixtla import NixtlaClient
from IPython.display import display
# Initialize TimeGPT client for cross-validation
nixtla_client = NixtlaClient(
    api_key='my_api_key_provided_by_nixtla'
)
```
### Step 2: Load Example Data
Use the Peyton Manning dataset as an example. The dataset can be loaded directly from Nixtla's S3 bucket:
<Info>If you are using your own data, ensure your data is properly formatted: you must have a time column (e.g., `ds`), a target column (e.g., `y`), and, if necessary, an identifier column (e.g., `unique_id`) for multiple time series.</Info>
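As a minimal sketch (the file location below is the one commonly used in Nixtla's examples and is an assumption; substitute your own source if needed), the dataset can be read straight into a pandas DataFrame:
```python
# Load the Peyton Manning dataset (URL assumed; replace with your own data source if needed)
pm_df = pd.read_csv(
    'https://datasets-nixtla.s3.amazonaws.com/peyton-manning.csv',
    parse_dates=['ds'],
)
pm_df.head()
```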
The `cross_validation` method of the `NixtlaClient` class performs systematic validation of time series forecasting models. It requires a DataFrame of time-ordered data and uses a rolling-window scheme to evaluate the model's performance across different time periods, ensuring reliability and stability over time. The animation below illustrates how TimeGPT performs cross-validation.
<Frame caption="Rolling-window cross-validation conceptually splits your dataset into multiple training and validation sets over time.">
</Frame>
The method accepts the following key parameters:
- `freq`: Frequency of your data (e.g., `'D'` for daily). If not specified, it will be inferred.
- `id_col`, `time_col`, `target_col`: Columns representing series ID, timestamps, and target values.
- `n_windows`: Number of separate validation windows.
- `step_size`: Step size between each validation window.
- `h`: Forecast horizon (e.g., the number of days ahead to predict).
In execution, `cross_validation` assesses the model's forecasting accuracy in each window, providing a robust view of the model's performance variability over time and potential overfitting. This detailed evaluation ensures the forecasts generated are not only accurate but also consistent across diverse temporal contexts.
<Info>
**Key Concepts**: Rolling-window cross-validation splits your dataset into multiple training and testing sets over time. Each window moves forward chronologically, training on historical data and validating on future periods. This approach mimics real-world forecasting scenarios where you predict forward in time.
</Info>
Use `cross_validation` on the Peyton Manning dataset:
```python
# Perform cross-validation with 5 windows and 7-day forecast horizon
timegpt_cv_df = nixtla_client.cross_validation(
    pm_df,
    h=7,          # Forecast 7 days ahead
    n_windows=5,  # Test across 5 different time periods
    freq='D'      # Daily frequency
)
timegpt_cv_df.head()
```
The logs below indicate successful cross-validation calls and data preprocessing.
```bash
INFO:nixtla.nixtla_client:Validating inputs...
INFO:nixtla.nixtla_client:Querying model metadata...
```
Visualize forecast performance for each cutoff period. Here's an example plotting the last 100 rows of actual data along with cross-validation forecasts for each cutoff.
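A minimal plotting sketch, assuming the default output columns (`ds`, `cutoff`, and `TimeGPT`) and that matplotlib is installed:
```python
import matplotlib.pyplot as plt

# Overlay each cutoff's forecasts on the last 100 observations of the series
fig, ax = plt.subplots(figsize=(10, 4))
pm_df.tail(100).plot(x='ds', y='y', ax=ax, color='black', label='actual')
for cutoff, window in timegpt_cv_df.groupby('cutoff'):
    window.plot(x='ds', y='TimeGPT', ax=ax, label=f'forecast (cutoff={cutoff.date()})')
plt.show()
```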
<Frame caption="An example visualization of predicted vs. actual values in the Peyton Manning dataset with prediction intervals.">
</Frame>
### Step 6: Enhance Forecasts with Exogenous Variables
#### Time Features
It is possible to include exogenous variables when performing cross-validation. Here we use the `date_features` parameter to create labels for each month. These features are then used by the model to make predictions during cross-validation.
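A possible call, reusing the Peyton Manning DataFrame from the previous steps (`date_features=['month']` asks TimeGPT to build a month indicator for each window):
```python
# Cross-validation with calendar features derived from the timestamps
timegpt_cv_df_date_features = nixtla_client.cross_validation(
    pm_df,
    h=7,
    n_windows=5,
    freq='D',
    date_features=['month'],  # add month labels as exogenous features
)
timegpt_cv_df_date_features.head()
```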
<Frame caption="An example visualization of predicted vs. actual values in the Peyton Manning dataset with time features.">
</Frame>
#### Dynamic Features
Additionally, you can pass dynamic exogenous variables to better inform TimeGPT about the data. Simply add the exogenous regressors after the target column.
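For example, a sketch with a hypothetical `electricity_df` whose columns are `unique_id`, `ds`, `y`, followed by the exogenous regressors (the horizon and frequency below are illustrative):
```python
# The exogenous columns placed after `y` are picked up automatically
timegpt_cv_exog_df = nixtla_client.cross_validation(
    electricity_df,  # DataFrame with y plus exogenous columns (assumed)
    h=24,
    n_windows=5,
    freq='H',
)
timegpt_cv_exog_df.head()
```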
<Frame caption="An example visualization of predicted vs. actual values in the electricity dataset with dynamic exogenous variables.">
</Frame>
### Step 7: Long-Horizon Forecasting with TimeGPT
You can also run cross-validation with different TimeGPT models using the `model` argument. Here we use the base model and the model for long-horizon forecasting.
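An illustrative comparison, reusing the hypothetical `electricity_df` from above (run the same call with `model='timegpt-1'` for the base model):
```python
# Cross-validation with the long-horizon variant of TimeGPT
timegpt_cv_long_df = nixtla_client.cross_validation(
    electricity_df,
    h=24,
    n_windows=5,
    freq='H',
    model='timegpt-1-long-horizon',  # or 'timegpt-1' for the base model
)
timegpt_cv_long_df.head()
```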
<Frame caption="An example visualization of predicted vs. actual values in the electricity dataset with dynamic exogenous variables and long horizon forecasting.">
</Frame>
## Frequently Asked Questions
**What is time series cross-validation?**
Time series cross-validation is a model validation technique that uses rolling windows to evaluate forecasting accuracy while preserving temporal order, ensuring reliable predictions on unseen data.
**How is time series cross-validation different from k-fold cross-validation?**
Unlike k-fold cross-validation which randomly shuffles data, time series cross-validation maintains temporal order using techniques like walk-forward validation and expanding windows to prevent data leakage.
**What are the key parameters for cross-validation in TimeGPT?**
Key parameters include `h` (forecast horizon), `n_windows` (number of validation windows), `step_size` (window increment), and `level` (prediction interval confidence levels).
**How do you evaluate cross-validation results?**
Evaluate results by comparing forecasted values against actual values across multiple time windows, analyzing prediction intervals, and calculating metrics like MAE, RMSE, and MAPE.
## Conclusion
You've mastered time series cross-validation with TimeGPT, including rolling-window validation, prediction intervals, exogenous variables, and long-horizon forecasting. These model validation techniques ensure your forecasts are accurate, reliable, and production-ready.
### Next Steps in Model Validation
- Explore [evaluation metrics](/forecasting/evaluation/evaluation_metrics) to quantify forecast accuracy
- Learn about [fine-tuning TimeGPT](/forecasting/fine-tuning/steps) for domain-specific data
- Apply cross-validation to [multiple time series](/data_requirements/multiple_series)
Ready to validate your forecasts at scale? [Start your TimeGPT trial](https://dashboard.nixtla.io/) and implement robust cross-validation today.
description: "Learn to select the right evaluation metrics to measure the performance of TimeGPT."
icon: "vial"
---
Selecting the right evaluation metric is crucial, as it guides the selection of the best settings for TimeGPT to ensure the model is making accurate forecasts.
## Overview of Common Evaluation Metrics
The following table summarizes the common evaluation metrics used in forecasting depending on the type of forecasts. It also indicates when to use and when to avoid a particular metric.
| Metric | Types of forecast | Properties | When to avoid |
| --- | --- | --- | --- |
| MAE | Point forecast | <ul><li>robust to outliers</li><li>easy to interpret</li><li>same units as the data</li></ul> | When averaging over series of different scales |
| MSE | Point forecast | <ul><li>penalizes large errors</li><li>not the same units as the data</li><li>sensitive to outliers</li></ul> | There are unrepresentative outliers in the data |
| RMSE | Point forecast | <ul><li>penalizes large errors</li><li>same units as the data</li><li>sensitive to outliers</li></ul> | There are unrepresentative outliers in the data |
| MAPE | Point forecast | <ul><li>expressed as a percentage</li><li>easy to interpret</li><li>favors under-forecasts</li></ul> | When data has zero values |
| sMAPE | Point forecast | <ul><li>robust to over- and under-forecasts</li><li>expressed as a percentage</li><li>easy to interpret</li></ul> | When data has zero values |
| MASE | Point forecast | <ul><li>like the MAE, but scaled by the naive forecast</li><li>inherently compares to a simple benchmark</li><li>requires technical knowledge to interpret</li></ul> | There is only one series to evaluate |
| CRPS | Probabilistic forecast | <ul><li>generalized MAE for probabilistic forecasts</li><li>requires technical knowledge to interpret</li></ul> | When evaluating point forecasts |
In the following sections, we dive deeper into each metric. Note that all of these metrics can be used to evaluate the forecasts of TimeGPT using the _utilsforecast_ library. For more information, read our tutorial on [evaluating TimeGPT with utilsforecast](/forecasting/evaluation/evaluation_utilsforecast).
## Mean Absolute Error (MAE)
The mean absolute error simply averages the absolute distance between the forecasts and the actual values.
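In symbols, with actual values $y_t$, forecasts $\hat{y}_t$, and $n$ forecasted points:
$$
\mathrm{MAE} = \frac{1}{n}\sum_{t=1}^{n} \left| y_t - \hat{y}_t \right|
$$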
It is a good evaluation metric that works in the vast majority of forecasting tasks. It is robust to outliers, meaning that it will not magnify large errors, and it is expressed in the same units as the data, making it easy to interpret.
Simply be careful when averaging the MAE over multiple series of different scales, since a series with smaller values will pull the MAE down, while a series with larger values will pull it up.
## Mean Squared Error (MSE)
The mean squared error squares the forecast errors before averaging them, which heavily penalizes large errors while giving less weight to small ones.
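Formally:
$$
\mathrm{MSE} = \frac{1}{n}\sum_{t=1}^{n} \left( y_t - \hat{y}_t \right)^2
$$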
As such, it is not robust to outliers since a single large error can dramatically inflate the MSE value. Additionally, the units are squared (e.g., dollars²), making it difficult to interpret in practical terms.
Avoid MSE when your data contains outliers or when you need an easily interpretable metric. It's best used in optimization contexts where you specifically want to penalize large errors more severely.
## Root Mean Squared Error (RMSE)
The root mean squared error is simply the square root of the MSE, bringing the metric back to the original units of the data while preserving MSE's property of penalizing large errors.
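Formally:
$$
\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{t=1}^{n} \left( y_t - \hat{y}_t \right)^2}
$$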
RMSE is more interpretable than MSE since it's expressed in the same units as your data.
You should avoid RMSE when outliers are present or when you want equal treatment of all errors.
## Mean Absolute Percentage Error (MAPE)
The mean absolute percentage error expresses forecast errors as percentages of the actual values, making it scale-independent and easy to interpret.
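Formally:
$$
\mathrm{MAPE} = \frac{100}{n}\sum_{t=1}^{n} \left| \frac{y_t - \hat{y}_t}{y_t} \right|
$$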
MAPE is excellent for comparing forecast accuracy across different time series with varying scales. It's intuitive and easily understood in business contexts.
Avoid MAPE when your data contains zero or near-zero values (causes division by zero) or when you have intermittent demand patterns.
Note that it's also asymmetric, penalizing over-forecasts more heavily than under-forecasts.
## Symmetric Mean Absolute Percentage Error (sMAPE)
The symmetric mean absolute percentage error attempts to address MAPE's asymmetry by using the average of actual and forecast values in the denominator, making it more balanced between over- and under-forecasts.
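Formally:
$$
\mathrm{sMAPE} = \frac{100}{n}\sum_{t=1}^{n} \frac{\left| y_t - \hat{y}_t \right|}{\left( \left| y_t \right| + \left| \hat{y}_t \right| \right) / 2}
$$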
sMAPE is more stable than MAPE and less prone to extreme values. It's still scale-independent and relatively easy to interpret, though not as intuitive as MAPE.
Avoid sMAPE when dealing with zero values or when the sum of actual and forecast values approaches zero. While more symmetric than MAPE, it's still not perfectly symmetric and can behave unexpectedly in edge cases.
## Mean Absolute Scaled Error (MASE)
The mean absolute scaled error scales forecast errors relative to the average error of a naive seasonal forecast, providing a scale-independent measure that's robust and interpretable.
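Formally, with seasonal period $m$ and $T$ training observations:
$$
\mathrm{MASE} = \frac{\frac{1}{n}\sum_{t=1}^{n} \left| y_t - \hat{y}_t \right|}{\frac{1}{T - m}\sum_{t=m+1}^{T} \left| y_t - y_{t-m} \right|}
$$
The denominator is the in-sample MAE of the seasonal naive forecast.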
MASE is excellent for comparing forecasts across different time series and scales. A MASE value less than 1 indicates your forecast is better than the naive benchmark, while values greater than 1 indicate worse performance.
It's robust to outliers and handles zero values well.
While it is a good metric to compare across multiple series, it might not make sense for you to compare against naive forecasts, and it does require some technical knowledge to interpret correctly.
## Continuous Ranked Probability Score (CRPS)
The continuous ranked probability score measures the distance between the entire forecast distribution and the observed value, making it ideal for evaluating probabilistic forecasts.
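Formally, with forecast cumulative distribution function $F$ and observed value $y$:
$$
\mathrm{CRPS}(F, y) = \int_{-\infty}^{\infty} \left( F(x) - \mathbf{1}\{x \ge y\} \right)^2 \, dx
$$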
CRPS is a proper scoring rule that reduces to MAE when dealing with deterministic forecasts, making it a natural extension for probabilistic forecasting. It's expressed in the same units as the original data and provides a comprehensive evaluation of forecast distributions, rewarding both accuracy and good uncertainty quantification.
CRPS is specifically designed for probabilistic forecasts, so avoid it when you only have point forecasts. It's also more computationally intensive to calculate than simpler metrics and may be less intuitive for stakeholders unfamiliar with probabilistic forecasting concepts.
## Evaluating TimeGPT
To learn how to use any of the metrics outlined above to evaluate the forecasts of TimeGPT, read our tutorial on [evaluating TimeGPT with utilsforecast](/forecasting/evaluation/evaluation_utilsforecast).
description: "Learn how to evaluate TimeGPT model performance using tools in utilsforecast"
icon: "square-root-variable"
---
## Overview
After generating forecasts with TimeGPT, the next step is to evaluate how accurate those forecasts are. The evaluate function from the utilsforecast library provides a fast and flexible way to assess model performance using a wide range of metrics. This pipeline works seamlessly with TimeGPT and other forecasting models.
With the evaluation pipeline, you can easily select models and define metrics like MAE, MSE, or MAPE to benchmark forecasting performance.
## Step-by-Step Guide
### Step 1. Import Required Packages
Start by importing the necessary libraries and initializing the `NixtlaClient` with your API key.
```python
import pandas as pd
from nixtla import NixtlaClient
from functools import partial
from utilsforecast.evaluation import evaluate
from utilsforecast.losses import mae, mse, rmse, mape, smape, mase, scaled_crps
```
### Step 2. Load and Split the Data
For this example, we use the Air Passenger dataset, which records monthly totals of international airline passengers. We'll load the dataset, format the timestamps, and split the data into a training set and a test set. The last 12 months are used for testing.
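A minimal sketch of this step, assuming the air passengers CSV used in other Nixtla examples (columns `timestamp` and `value`; the URL is an assumption) and a `NixtlaClient` initialized as in Step 1:
```python
nixtla_client = NixtlaClient(api_key='my_api_key_provided_by_nixtla')

# Load the dataset and add a series identifier (URL and column names assumed)
df = pd.read_csv(
    'https://raw.githubusercontent.com/Nixtla/transfer-learning-time-series/main/datasets/air_passengers.csv',
    parse_dates=['timestamp'],
)
df['unique_id'] = 'AirPassengers'

# Hold out the last 12 months as the test set
df_train = df.iloc[:-12]
df_test = df.iloc[-12:]
```
### Step 3. Generate Forecasts with TimeGPT
Forecast the held-out year with TimeGPT, requesting prediction intervals so that probabilistic metrics such as the scaled CRPS can be computed later (the horizon and levels below are illustrative):
```python
fcst_timegpt = nixtla_client.forecast(
    df=df_train,
    h=12,
    freq='MS',
    time_col='timestamp',
    target_col='value',
    level=[80, 90],  # needed for probabilistic metrics
)
```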
Merge the TimeGPT forecasts with the test set so that predicted and actual values can be compared directly:
```python
fcst_timegpt = fcst_timegpt.merge(df_test, on=['timestamp', 'unique_id'])
```
### Step 4. Define Models and Metrics for Evaluation
Next, we define the models to evaluate and the metrics to use. For more information about supported metrics, refer to the [evaluation metrics tutorial](/forecasting/evaluation/evaluation_metrics).
```python
models = ['TimeGPT']
metrics = [
mae,
mse,
rmse,
mape,
smape,
partial(mase, seasonality=12),
scaled_crps
]
```
### Step 5. Run the Evaluation
Finally, call the evaluate function with your merged forecast results. Include `train_df` for metrics that need the training data and `level` if using probabilistic metrics.
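A sketch of the call, following the variables defined above (the interval levels must match those requested from TimeGPT):
```python
evaluation = evaluate(
    fcst_timegpt,
    metrics=metrics,
    models=models,
    train_df=df_train,   # required by MASE (seasonal naive baseline)
    level=[80, 90],      # required by probabilistic metrics such as scaled CRPS
    time_col='timestamp',
    target_col='value',
)
evaluation
```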
description: "Learn how to incorporate external categorical variables in your TimeGPT forecasts to improve accuracy."
icon: "input-text"
---
## What Are Categorical Variables?
Categorical variables are external factors that take on a limited range of discrete values, grouping observations by categories. For example, "Sporting" or "Cultural" events in a dataset describing product demand.
By capturing unique external conditions, categorical variables enhance the predictive power of your model and can reduce forecasting error. They are easy to incorporate by merging each time series data point with its corresponding categorical data.
This tutorial demonstrates how to incorporate categorical (discrete) variables into TimeGPT forecasts.
## How to Use Categorical Variables in TimeGPT
[Open in Colab](https://colab.research.google.com/github/Nixtla/nixtla/blob/main/nbs/docs/tutorials/03_categorical_variables.ipynb)
### Step 1: Import Packages and Initialize the Nixtla Client
Make sure you have the necessary libraries installed: pandas, nixtla, and datasetsforecast.
```python
import pandas as pd
import os
from nixtla import NixtlaClient
from datasetsforecast.m5 import M5
# Initialize the Nixtla Client
nixtla_client = NixtlaClient(
api_key='my_api_key_provided_by_nixtla'
)
```
### Step 2: Load M5 Data
We use the **M5 dataset** — a collection of daily product sales demands across 10 US stores — to showcase how categorical variables can improve forecasts.
Start by loading the M5 dataset and converting the date columns to datetime objects.
```python
Y_df, X_df, _ = M5.load(directory=os.getcwd())
Y_df['ds'] = pd.to_datetime(Y_df['ds'])
X_df['ds'] = pd.to_datetime(X_df['ds'])
Y_df.head(10)
```
| unique_id | ds | y |
|-----------|----|---|
| FOODS_1_001_CA_1 | 2011-01-29 | 3.0 |
| FOODS_1_001_CA_1 | 2011-01-30 | 0.0 |
| FOODS_1_001_CA_1 | 2011-01-31 | 0.0 |
| FOODS_1_001_CA_1 | 2011-02-01 | 1.0 |
| FOODS_1_001_CA_1 | 2011-02-02 | 4.0 |
| FOODS_1_001_CA_1 | 2011-02-03 | 2.0 |
| FOODS_1_001_CA_1 | 2011-02-04 | 0.0 |
| FOODS_1_001_CA_1 | 2011-02-05 | 2.0 |
| FOODS_1_001_CA_1 | 2011-02-06 | 0.0 |
| FOODS_1_001_CA_1 | 2011-02-07 | 0.0 |
Extract the categorical columns from the X_df dataframe.
```python
X_df = X_df[['unique_id', 'ds', 'event_type_1']]
X_df.head(10)
```
| unique_id | ds | event_type_1 |
|-----------|----|--------------|
| FOODS_1_001_CA_1 | 2011-01-29 | nan |
| FOODS_1_001_CA_1 | 2011-01-30 | nan |
| FOODS_1_001_CA_1 | 2011-01-31 | nan |
| FOODS_1_001_CA_1 | 2011-02-01 | nan |
| FOODS_1_001_CA_1 | 2011-02-02 | nan |
| FOODS_1_001_CA_1 | 2011-02-03 | nan |
| FOODS_1_001_CA_1 | 2011-02-04 | nan |
| FOODS_1_001_CA_1 | 2011-02-05 | nan |
| FOODS_1_001_CA_1 | 2011-02-06 | Sporting |
| FOODS_1_001_CA_1 | 2011-02-07 | nan |
Notice that there is a Sporting event on February 6, 2011, listed under `event_type_1`.
### Step 3: Prepare Data for Forecasting
We'll select a specific product to demonstrate how to incorporate categorical features into TimeGPT forecasts.
#### Select a High-Selling Product and Merge Data
Start by selecting a high-selling product and merging the data.
Encode categorical variables using one-hot encoding. One-hot encoding transforms each category into a separate column containing binary indicators (0 or 1).
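A possible encoding and merge, where `product_df` stands for the single product selected above (names are illustrative):
```python
# One-hot encode the event type and merge the indicators back by series and date
event_dummies = pd.get_dummies(X_df['event_type_1'], dtype=int)
X_encoded = pd.concat([X_df[['unique_id', 'ds']], event_dummies], axis=1)
product_df = product_df.merge(X_encoded, on=['unique_id', 'ds'], how='left')
product_df.head()
```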
<Frame caption="Forecast with categorical variables">

</Frame>
TimeGPT already provides a reasonable forecast, but it seems to somewhat underforecast the peak on the 6th of February 2016 - the day before the Super Bowl.
<Frame caption="Forecast with categorical variables">

</Frame>
### Step 5: Evaluate Forecast Accuracy
Finally, we calculate the **Mean Absolute Error (MAE)** for the forecasts with and without categorical variables.
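An illustrative computation, where `test_df` holds the held-out actuals and `fcst_df` / `fcst_df_exog` the forecasts without and with categorical variables (all names hypothetical):
```python
# Compare the mean absolute error of both forecasts on the same test window
mae_base = abs(test_df['y'].values - fcst_df['TimeGPT'].values).mean()
mae_exog = abs(test_df['y'].values - fcst_df_exog['TimeGPT'].values).mean()
print(f"MAE without categorical variables: {mae_base:.2f}")
print(f"MAE with categorical variables:    {mae_exog:.2f}")
```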
Including categorical variables noticeably improves forecast accuracy, reducing MAE by about 20%.
## Conclusion
Categorical variables are powerful additions to TimeGPT forecasts, helping capture valuable external factors. By properly encoding these variables and merging them with your time series, you can significantly enhance predictive performance.
Continue exploring more advanced techniques or different datasets to further improve your TimeGPT forecasting models.
description: "Learn how to incorporate date/time features into your forecasts to improve performance."
icon: "clock"
---
## Why incorporate Date/Time Features in your Forecasts
Many time series display patterns that repeat based on the calendar, like demand
increasing on weekends, sales peaking at the end of the month, or traffic
varying by hour of the day. Recognizing and capturing these time-based patterns
can be a powerful way to improve forecasting accuracy.
While you can forecast a time series based solely on its historical values,
adding additional date/time related features, such as the day of the
week, month, quarter, or hour, can often enhance the model's performance. These
features can be especially useful when your dataset lacks exogenous variables,
but they can also complement external regressors when available.
In this tutorial, we'll walk through how to incorporate these date/time features
into TimeGPT to boost the accuracy of your forecasts.
## How to incorporate Date/Time Features in your Forecasts
[Open in Colab](https://colab.research.google.com/github/Nixtla/nixtla/blob/main/nbs/docs/tutorials/date_features.ipynb)
### Step 1: Import Packages
Import the necessary libraries and initialize the Nixtla client.
```python
import numpy as np
import pandas as pd
from nixtla import NixtlaClient
# For forecast evaluation
from utilsforecast.evaluation import evaluate
from utilsforecast.losses import mae, rmse
```
You can instantiate the `NixtlaClient` class providing your authentication API key.
```python
nixtla_client = NixtlaClient(
# defaults to os.environ.get("NIXTLA_API_KEY")
api_key='my_api_key_provided_by_nixtla'
)
```
### Step 2: Load Data
In this notebook, we use hourly electricity prices as our example dataset, which
consists of 5 time series, each with approximately 1700 data points. For
demonstration purposes, we focus on the German electricity price series. The
time series is split, with the last 240 steps (10 days) set aside as the test set.
For simplicity, we will also demonstrate this tutorial without the use of any
additional exogenous variables, but you could extend this same technique to datasets that include them.
description: "Guide to using holiday calendar variables and special dates to improve forecast accuracy in time series."
icon: "calendar"
---
## What Are Holiday Variables and Special Dates?
Special dates, such as holidays, promotions, or significant events, often cause notable deviations from normal patterns in your time series. By incorporating these special dates into your forecasting model, you can better capture these expected variations and improve prediction accuracy.
## How to Add Holiday Variables and Special Dates
[Open in Colab](https://colab.research.google.com/github/Nixtla/nixtla/blob/main/nbs/docs/tutorials/02_holidays.ipynb)
### Step 1: Import Packages
Import the required libraries and initialize the Nixtla client.
```python
import pandas as pd
from nixtla import NixtlaClient
```
```python
nixtla_client = NixtlaClient(
api_key='my_api_key_provided_by_nixtla'
)
```
### Step 2: Load Data
We use a Google Trends dataset on "chocolate" with monthly frequency:
### Step 3: Create a Future DataFrame
When adding exogenous variables (like holidays) to time series forecasting, we need a future DataFrame because:
- Historical data already exists: Our training data contains past values of both the target variable and exogenous features
- Future exogenous features are known: Unlike the target variable, we can determine future values of exogenous features (like holidays) in advance
For example, we know that Christmas will occur on December 25th next year, so we can include this information in our future DataFrame to help the model understand seasonal patterns during the forecast period.
Start by creating a future DataFrame with 14 months of dates starting from May 2024.
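For example (the month-start dates below assume the historical series uses month-start timestamps; adjust the frequency to match your data):
```python
# 14 future monthly dates starting in May 2024
future_df = pd.DataFrame(
    {'month': pd.date_range(start='2024-05-01', periods=14, freq='MS')}
)
future_df.head()
```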
### Step 4: Forecast with Holidays and Special Dates
TimeGPT automatically generates standard date-based features (like month, day of week, etc.) during forecasting. For more specialized temporal patterns, you can manually add holiday indicators to both your historical and future datasets.
#### Create a Function to Add Date Features
To make it easier to add date features to a DataFrame, we'll create the `add_date_features_to_DataFrame` function (a sketch follows the list below) that takes:
- A pandas DataFrame
- A date extractor function, which can be `CountryHolidays` or `SpecialDates`
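A possible sketch of the helper (the exact tutorial implementation may differ). It assumes the extractor, e.g. `CountryHolidays(countries=['US'])`, returns one indicator column per holiday when called on a sequence of dates, and that `df` and `future_df` are the historical and future DataFrames from the previous steps:
```python
from nixtla.date_features import CountryHolidays

def add_date_features_to_DataFrame(data, date_extractor, time_col='month'):
    # Compute the date features and align them with the original rows
    dates = pd.DatetimeIndex(data[time_col])
    features = date_extractor(dates).reset_index(drop=True)
    features[time_col] = data[time_col].values
    return data.merge(features, on=time_col, how='left')

us_holidays = CountryHolidays(countries=['US'])
df_with_holidays = add_date_features_to_DataFrame(df, us_holidays)
future_df_holidays = add_date_features_to_DataFrame(future_df, us_holidays)
```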
The resulting future DataFrame contains the `month` column plus one indicator column per US holiday (for example `US_New Year's Day`, `US_Memorial Day`, `US_Thanksgiving Day`, and `US_Christmas Day`), while the historical DataFrame additionally keeps the `chocolate` column alongside the same holiday indicators, including their observed variants.
Now, your historical DataFrame also contains holiday flags for each month.
Finally, forecast with the holiday features.
```python
fcst_df_holidays = nixtla_client.forecast(
df=df_with_holidays,
h=14,
freq="ME",
time_col="month",
target_col="chocolate",
X_df=future_df_holidays,
model="timegpt-1-long-horizon",
hist_exog_list=[
"US_New Year's Day (observed)",
"US_Independence Day (observed)",
"US_Christmas Day (observed)",
"US_Veterans Day (observed)",
"US_Juneteenth National Independence Day (observed)",
],
feature_contributions=True, # for shapley values
)
fcst_df_holidays.head()
```
Plot the forecast with holiday effects.
```python
nixtla_client.plot(
df_with_holidays,
fcst_df_holidays,
time_col='month',
target_col='chocolate',
)
```

We can then plot the weights of each holiday to see which are more important in forecasting the interest in chocolate. We will use the [SHAP library](https://shap.readthedocs.io/en/latest/) to plot the weights.
> For more details on how to use the shap library, see our [tutorial on model interpretability](/forecasting/exogenous-variables/interpretability_with_shap).
The SHAP values reveal that Christmas, Independence Day, and Labor Day have the strongest influence on chocolate interest forecasting. These holidays show the highest feature importance weights, indicating they significantly impact consumer behavior patterns. This aligns with expectations since these are major US holidays associated with gift-giving, celebrations, and seasonal consumption patterns that drive chocolate sales.
#### Add Special Dates
Beyond country holidays, you can create custom special dates with `SpecialDates`. These can represent unique one-time events or recurring patterns on specific dates of your choice.
Assume we already have a future DataFrame with monthly dates. We'll create Valentine's Day and Halloween as custom special dates and add them to the future DataFrame.
```python
from nixtla.date_features import SpecialDates
# Generate special dates programmatically for the full data range (2004-2025)
valentine_dates = [f"{year}-02-14" for year in range(2004, 2026)]
halloween_dates = [f"{year}-10-31" for year in range(2004, 2026)]
# Define custom special dates - chocolate-related seasonal events
# (constructor usage below is assumed; adjust to your needs)
special_dates = SpecialDates(
    special_dates={
        'Valentines Day': valentine_dates,
        'Halloween': halloween_dates,
    }
)
```
Examine the feature importance of the special dates.
```python
plot_shap_values(ds_column="month", title="SHAP values for special dates")
```

The SHAP values reveal that Valentine's Day has the strongest positive impact on chocolate sales forecasts. This aligns with consumer behavior patterns, as chocolate is a popular gift choice during Valentine's Day celebrations.
<Check>
Congratulations! You have successfully integrated holiday and special date features into your time series forecasts. Use these steps as a starting point for further experimentation with advanced date features.
</Check>
description: "Learn how to interpret model predictions using SHAP values to understand the impact of exogenous variables."
icon: "square-root-variable"
---
## What Are SHAP Values?
SHAP (SHapley Additive exPlanations) values use game theory concepts to explain how each feature influences machine learning forecasts. They're particularly useful when working with exogenous (external) variables, letting you understand contributions both at individual prediction steps and across entire forecast horizons.
SHAP values can be seamlessly combined with visualization methods from the [SHAP](https://shap.readthedocs.io/en/latest/) Python package for powerful plots and insights. Before proceeding, make sure you understand forecasting with exogenous features. For reference, see our [tutorial on exogenous variables](/forecasting/exogenous-variables/numeric_features).
## How to Use SHAP Values for TimeGPT
[Open in Colab](https://colab.research.google.com/github/Nixtla/nixtla/blob/main/nbs/docs/tutorials/21_shap_values.ipynb)
## Install SHAP
Install the SHAP library.
```bash
pip install shap
```
For more details, visit the [official SHAP documentation](https://shap.readthedocs.io/en/latest/).
### Step 1: Import Packages
Import the necessary libraries and initialize the Nixtla client.
```python
import pandas as pd
from nixtla import NixtlaClient
nixtla_client = NixtlaClient(
api_key='my_api_key_provided_by_nixtla' # Or use os.environ.get("NIXTLA_API_KEY")
)
```
### Step 2: Load Data
We'll use exogenous variables (covariates) to enhance electricity market forecasting accuracy. The widely known EPF dataset is available at
[this link](https://zenodo.org/records/4624805). It contains hourly prices and relevant exogenous factors for five different electricity markets.
For this tutorial, we'll focus on the Belgian electricity market (BE). The data includes:
- Hourly prices (y)
- Forecasts for load (Exogenous1) and generation (Exogenous2)
- Day-of-week indicators (one-hot encoded)
If your data relies on factors such as weather, holiday calendars, marketing, or other elements, ensure they're similarly structured.
To make forecasts with exogenous variables, you must have future data for these variables available at the time of prediction.
Before generating forecasts, ensure you have (or can generate) future exogenous values. Below, we load future exogenous features to obtain 24-step-ahead predictions:
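A sketch of this step (the file name is hypothetical; the future exogenous values must cover the full 24-hour horizon):
```python
# Load future values of Exogenous1, Exogenous2, and the day-of-week dummies
future_ex_vars_df = pd.read_csv('future_exogenous_be.csv', parse_dates=['ds'])

fcst_df = nixtla_client.forecast(
    df=df,                       # historical prices plus exogenous columns
    X_df=future_ex_vars_df,      # future exogenous values
    h=24,
    feature_contributions=True,  # return per-feature contributions for SHAP plots
)
```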
description: "Learn how to incorporate external numeric variables to improve your forecasting accuracy."
icon: "binary"
---
## What Are Exogenous Variables?
Exogenous variables or external factors are crucial in time series forecasting
as they provide additional information that might influence the prediction.
These variables could include holiday markers, marketing spending, weather data,
or any other external data that correlate with the time series data you are
forecasting.
For example, if you're forecasting ice cream sales, temperature data could serve
as a useful exogenous variable. On hotter days, ice cream sales may increase.
## How to Use Exogenous Variables
[Open in Colab](https://colab.research.google.com/github/Nixtla/nixtla/blob/main/nbs/docs/tutorials/01_exogenous_variables_reworked.ipynb)
To incorporate exogenous variables in TimeGPT, you'll need to pair each point
in your time series data with the corresponding external data.
### Step 1: Import Packages
Import the required libraries and initialize the Nixtla client.
```python
import pandas as pd
from nixtla import NixtlaClient
nixtla_client = NixtlaClient(
# defaults to os.environ.get("NIXTLA_API_KEY")
api_key="my_api_key_provided_by_nixtla"
)
```
### Step 2: Load Dataset
In this tutorial, we'll predict day-ahead electricity prices. The dataset contains:
- Hourly electricity prices (`y`) from various markets (identified by `unique_id`)
title: "Fine-tuning with a Specific Loss Function"
description: "Learn how to fine-tune a model using specific loss functions, configure the Nixtla client, and evaluate performance improvements."
icon: "gear"
---
## Fine-tuning with a Specific Loss Function
When you fine-tune, the model trains on your dataset to tailor predictions to
your specific dataset. You can specify the loss function to be used during
fine-tuning using the `finetune_loss` argument. Below are the available loss
functions:
* `"default"`: A proprietary function robust to outliers.
* `"mae"`: Mean Absolute Error
* `"mse"`: Mean Squared Error
* `"rmse"`: Root Mean Squared Error
* `"mape"`: Mean Absolute Percentage Error
* `"smape"`: Symmetric Mean Absolute Percentage Error
## How to Fine-tune with a Specific Loss Function
[Open in Colab](https://colab.research.google.com/github/Nixtla/nixtla/blob/main/nbs/docs/tutorials/07_loss_function_finetuning.ipynb)
### Step 1: Import Packages and Initialize Client
First, we import the required packages and initialize the Nixtla client.
```python
import pandas as pd
from nixtla import NixtlaClient
from utilsforecast.losses import mae, mse, rmse, mape, smape
```
```python
nixtla_client = NixtlaClient(
# defaults to os.environ.get("NIXTLA_API_KEY")
api_key='my_api_key_provided_by_nixtla'
)
```
### Step 2: Load Data
Load your data and prepare it for fine-tuning. Here, we will demonstrate fine-tuning with the mean absolute error (MAE) by passing `finetune_loss='mae'`. The call below is a reconstruction: the horizon, number of fine-tuning steps, and dataset (a series with `timestamp` and `value` columns) are illustrative.
```python
# Illustrative values; adjust the horizon and fine-tuning steps to your data
timegpt_fcst_finetune_mae_df = nixtla_client.forecast(
    df=df,
    h=12,
    finetune_steps=10,
    finetune_loss='mae',  # Select desired loss function
    time_col='timestamp',
    target_col='value',
)
```
After training completes, you can visualize the forecast:
```python
nixtla_client.plot(
df,
timegpt_fcst_finetune_mae_df,
time_col='timestamp',
target_col='value',
)
```
<Frame>

</Frame>
## Explanation of Loss Functions
Now, depending on your data, you will use a specific error metric to accurately
evaluate your forecasting model's performance.
Below is a non-exhaustive guide on which metric to use depending on your use case.
[Open in Colab](https://colab.research.google.com/github/Nixtla/nixtla/blob/main/nbs/docs/tutorials/23_finetune_depth_finetuning.ipynb)
description: "Learn how to save, fine-tune, list, and delete TimeGPT models to optimize forecasting."
icon: "gear"
---
## Re-using Fine-tuned Models
Reusing previously fine-tuned TimeGPT models can help reduce computation time
and costs while maintaining or improving forecast accuracy. This guide walks you
through the steps to save, fine-tune, list, and delete your TimeGPT models effectively.
## How to Re-use Fine-tuned Models
[Open in Colab](https://colab.research.google.com/github/Nixtla/nixtla/blob/main/nbs/docs/tutorials/061_reusing_finetuned_models.ipynb)
### Step 1: Import Packages
First, we import the required packages and initialize the Nixtla client.
```python
import pandas as pd
from nixtla import NixtlaClient
from utilsforecast.losses import rmse
from utilsforecast.evaluation import evaluate
```
```python
nixtla_client = NixtlaClient(
# defaults to os.environ["NIXTLA_API_KEY"]
api_key='my_api_key_provided_by_nixtla'
)
```
### Step 2: Load Data
Load the forecasting dataset and prepare the train/validation split.
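A minimal sketch, assuming a long-format CSV with `unique_id`, `ds`, and `y` columns (the path and validation horizon are illustrative):
```python
# Load the data and hold out the last 24 steps of each series for validation
df = pd.read_csv('your_time_series_data.csv', parse_dates=['ds'])

valid_horizon = 24
train_df = df.groupby('unique_id').head(-valid_horizon).reset_index(drop=True)
valid_df = df.groupby('unique_id').tail(valid_horizon).reset_index(drop=True)
```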
description: "Adapt TimeGPT to your specific datasets for more accurate forecasts"
icon: "sliders"
---
Fine-tuning is a powerful process for utilizing TimeGPT more effectively.
Foundation models such as TimeGPT are pre-trained on vast amounts of data,
capturing wide-ranging features and patterns. These models can then be
specialized for specific contexts or domains. With fine-tuning, the model's
parameters are refined to forecast a new task, allowing it to tailor its vast
pre-existing knowledge towards the requirements of the new data. Fine-tuning
thus serves as a crucial bridge, linking TimeGPT's broad capabilities to the
specific requirements of your tasks.
Concretely, the process of fine-tuning consists of performing a certain number
of training iterations on your input data to minimize the forecasting error.
The forecasts will then be produced with the updated model. To control the
number of iterations, use the `finetune_steps` argument of the `forecast` method.
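For example (illustrative values; `df` and `nixtla_client` are assumed to be set up as in the tutorial below):
```python
# Run 10 fine-tuning iterations on the input data before producing a 12-step forecast
fcst_finetuned_df = nixtla_client.forecast(
    df=df,
    h=12,
    finetune_steps=10,
)
```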
## Tutorial
[Open in Colab](https://colab.research.google.com/github/Nixtla/nixtla/blob/main/nbs/docs/tutorials/06_finetuning.ipynb)
### Step 1: Import Packages and Initialize Client
First, we import the required packages and initialize the Nixtla client.
```python
import pandas as pd
from nixtla import NixtlaClient
from utilsforecast.losses import mae, mse
from utilsforecast.evaluation import evaluate
```
Next, initialize the NixtlaClient instance, providing your API key (or rely on
environment variables):
```python initialize-client
nixtla_client = NixtlaClient(
    api_key='my_api_key_provided_by_nixtla'  # Defaults to os.environ.get("NIXTLA_API_KEY")
)
```
title: "Distributed Forecasting with Spark, Dask & Ray"
description: "Scale your time series forecasting with TimeGPT using Spark, Dask, or Ray. Learn distributed computing for millions of time series with Python code examples and best practices."
icon: "microchip"
---
## Distributed Computing for Large-Scale Forecasting
Handling large datasets is a common challenge in time series forecasting. For example, when working with retail data, you may need to forecast sales for 100,000+ products across hundreds of stores—generating millions of forecasts daily. Similarly, when dealing with electricity consumption data, you may need to predict consumption for millions of smart meters across multiple regions in real-time.
### Why Distributed Computing for Forecasting?
Distributed computing offers significant advantages for time series forecasting:
- **Speed**: Reduce computation time by 10-100x compared to single-machine processing
- **Scalability**: Handle datasets that don't fit in memory on a single machine
- **Cost-efficiency**: Process more forecasts in less time, optimizing resource utilization
- **Reliability**: Fault-tolerant processing ensures forecasts complete even if individual nodes fail
Nixtla's **TimeGPT** enables you to efficiently handle expansive datasets by integrating distributed computing frameworks (**[Spark](https://spark.apache.org/)**, **[Dask](https://www.dask.org/)**, and **[Ray](https://www.ray.io/)** through **Fugue**) that parallelize forecasts across multiple time series and drastically reduce computation times.
## Getting Started
Before getting started, ensure you have your TimeGPT API key. Upon [registration](https://dashboard.nixtla.io/), you'll receive an email prompting you to confirm your signup. Once confirmed, access your dashboard and navigate to the **API Keys** section to retrieve your key. For detailed setup instructions, see the [Setting Up Your Authentication Key tutorial](/setup/setting_up_your_api_key).
## How to Use TimeGPT with Distributed Computing Frameworks
Using TimeGPT with distributed computing frameworks is straightforward. The process only slightly differs from non-distributed usage.
### Step 1: Instantiate a NixtlaClient class
```python
from nixtla import NixtlaClient
# Replace 'YOUR_API_KEY' with the key obtained from your Nixtla dashboard
client = NixtlaClient(api_key="YOUR_API_KEY")
```
### Step 2: Load your data into a pandas DataFrame
Make sure your data is properly formatted, with each time series uniquely identified (e.g., by store or product).
```python
import pandas as pd
data = pd.read_csv("your_time_series_data.csv")
```
### Step 3: Initialize a distributed computing framework
Follow the links above for examples on setting up each framework.
### Step 4: Use NixtlaClient methods to forecast at scale
Once your framework is initialized and your data is loaded, you can apply the forecasting methods:
```python
# Example function call within the distributed environment
forecast_results = client.forecast(
    df=data,
    h=14  # horizon (e.g., 14 days)
)
```
### Step 5: Stop the distributed computing framework
When you're finished, you may need to terminate your Spark, Dask, or Ray session. This depends on your environment and setup.
Parallelization in these frameworks operates across multiple time series within your dataset. Ensure each series is uniquely identified so the parallelization can be fully leveraged.
## Real-World Use Cases
Distributed forecasting with TimeGPT is essential for:
- **Retail & E-commerce**: Forecast demand for 100,000+ SKUs across multiple locations simultaneously
- **Energy & Utilities**: Predict consumption patterns for millions of smart meters in real-time
- **Finance**: Generate forecasts for thousands of stocks, currencies, or commodities
- **IoT & Manufacturing**: Process sensor data from thousands of devices for predictive maintenance
- **Telecommunications**: Forecast network traffic across thousands of cell towers
The distributed approach reduces forecast generation time from hours to minutes, enabling faster decision-making at scale.
## Important Considerations
### When to Use a Distributed Computing Framework
Consider a distributed framework if your dataset:
- Contains millions of observations across multiple time series
- Cannot fit into memory on a single machine
- Requires extensive processing time that is impractical on a single machine
### Choosing the Right Framework
When selecting Spark, Dask, or Ray, weigh your existing infrastructure and your team's expertise. Minimal code changes allow TimeGPT to work with each of these frameworks seamlessly. Pick the framework that aligns with your organization's tools and resources for the most efficient large-scale forecasting efforts.
### Framework Comparison
| Framework | Best For | Ideal Dataset Size | Learning Curve |
| --- | --- | --- | --- |
| Spark | Enterprise environments with existing Hadoop infrastructure | 100M+ observations | Steeper (JVM-based clusters) |
| Dask | Scaling existing pandas workflows | 10M-100M observations | Lowest for pandas users |
| Ray | Complex ML pipelines and task orchestration | 10M+ observations | Moderate |
Each framework integrates seamlessly with TimeGPT through Fugue, requiring minimal code changes to scale from single-machine to distributed forecasting.
### Best Practices
To maximize the benefits of distributed forecasting:
- **Distribute workloads efficiently**: Spread your forecasts across multiple compute nodes to handle huge datasets without exhausting memory or overwhelming single-machine resources.
- **Use proper identifiers**: Ensure your data has distinct identifiers for each series. Correct labeling is crucial for successful multi-series parallel forecasts.
## Frequently Asked Questions
**Q: Which distributed framework should I choose for TimeGPT?**
Choose **Spark** if you have existing Hadoop infrastructure, **Dask** if you're already using Python/pandas and want the easiest transition, or **Ray** if you're building complex ML pipelines.
**Q: How much faster is distributed forecasting compared to single-machine?**
Speed improvements typically range from 10-100x depending on your dataset size, number of time series, and cluster configuration. Datasets with more independent time series see greater parallelization benefits.
**Q: Do I need to change my TimeGPT code to use distributed computing?**
Minimal changes are required. After initializing your chosen framework (Spark/Dask/Ray), TimeGPT automatically detects and uses distributed processing. The API calls remain the same.
**Q: Can I use distributed computing with fine-tuning and cross-validation?**
Yes, TimeGPT supports distributed [fine-tuning](/forecasting/fine-tuning/steps) and [cross-validation](/forecasting/evaluation/cross_validation) across all supported frameworks.
## Related Resources
Explore more TimeGPT capabilities:
- [Spark Integration Guide](/forecasting/forecasting-at-scale/spark) - Detailed Spark setup and examples
- [Dask Integration Guide](/forecasting/forecasting-at-scale/dask) - Dask configuration for TimeGPT
- [Ray Integration Guide](/forecasting/forecasting-at-scale/ray) - Ray distributed forecasting tutorial
- [Fine-tuning TimeGPT](/forecasting/fine-tuning/steps) - Improve accuracy at scale
description: "Scale pandas workflows with Dask and TimeGPT for distributed time series forecasting. Learn to process 10M+ time series in Python with minimal code changes."
icon: "server"
---
## Overview
[Dask](https://www.dask.org/) is an open-source parallel computing library for Python that scales pandas workflows seamlessly. This guide explains how to use TimeGPT from Nixtla with Dask for distributed forecasting tasks.
Dask is ideal when you're already using pandas and need to scale beyond single-machine memory limits—typically for datasets with 10-100 million observations across multiple time series. Unlike Spark, Dask requires minimal code changes from your existing pandas workflow.
## Why Use Dask for Time Series Forecasting?
Dask offers unique advantages for scaling time series forecasting:
- **Pandas-like API**: Minimal code changes from your existing pandas workflows
- **Easy scaling**: Convert pandas DataFrames to Dask with a single line of code
- **Python-native**: Pure Python implementation, no JVM required (unlike Spark)
- **Flexible deployment**: Run on your laptop or scale to a cluster
- **Memory efficiency**: Process datasets larger than RAM through intelligent chunking
Choose Dask when you need to scale from 10 million to 100 million observations and want the smoothest transition from pandas.
**What you'll learn:**
- Simplify distributed computing with Fugue
- Run TimeGPT at scale on a Dask cluster
- Seamlessly convert pandas DataFrames to Dask
## Prerequisites
Before proceeding, make sure you have an [API key from Nixtla](/setup/setting_up_your_api_key).
## How to Use TimeGPT with Dask
[Open in Colab](https://colab.research.google.com/github/Nixtla/nixtla/blob/main/nbs/docs/tutorials/17_computing_at_scale_dask_distributed.ipynb)
### Step 1: Install Fugue and Dask
Fugue provides an easy-to-use interface for distributed computing over frameworks like Dask.
You can install Fugue with:
```bash
pip install fugue[dask]
```
If running on a distributed Dask cluster, ensure the `nixtla` library is installed on all worker nodes.
### Step 2: Load Your Data
You can start by loading data into a pandas DataFrame. In this example, we use hourly electricity prices from multiple markets:
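### Step 3: Convert to Dask and Forecast
A minimal sketch, assuming the hourly electricity dataset used in other Nixtla examples (the URL is an assumption) and that the forecast comes back as a Dask DataFrame:
```python
import pandas as pd
import dask.dataframe as dd
from nixtla import NixtlaClient

# Load with pandas, then convert to a Dask DataFrame (2 partitions chosen arbitrarily)
df = pd.read_csv(
    'https://raw.githubusercontent.com/Nixtla/transfer-learning-time-series/main/datasets/electricity-short.csv',
    parse_dates=['ds'],
)
dask_df = dd.from_pandas(df, npartitions=2)

nixtla_client = NixtlaClient(api_key='my_api_key_provided_by_nixtla')
fcst_df = nixtla_client.forecast(dask_df, h=12, freq='H')
fcst_df.compute().head()  # materialize a small sample of the distributed result
```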
## Working with Exogenous Variables
TimeGPT with Dask also supports exogenous variables. Refer to the [Exogenous Variables Tutorial](/forecasting/exogenous-variables/numeric_features) for details. Simply substitute pandas DataFrames with Dask DataFrames—the API remains identical.
## Related Resources
Explore more distributed forecasting options:
- [Distributed Computing Overview](/forecasting/forecasting-at-scale/computing_at_scale) - Compare Spark, Dask, and Ray
- [Spark Integration](/forecasting/forecasting-at-scale/spark) - For datasets with 100M+ observations
- [Ray Integration](/forecasting/forecasting-at-scale/ray) - For ML pipeline integration
- [Fine-tuning TimeGPT](/forecasting/fine-tuning/steps) - Improve accuracy at scale
description: "Scale machine learning pipelines with Ray and TimeGPT for distributed time series forecasting. Learn to integrate TimeGPT with Ray for complex ML workflows in Python."
icon: "server"
---
## Overview
[Ray](https://www.ray.io/) is an open-source unified compute framework that helps scale Python workloads for distributed computing. This guide demonstrates how to distribute TimeGPT forecasting jobs on top of Ray.
Ray is ideal for machine learning pipelines with complex task dependencies and datasets with 10+ million observations. Its unified framework excels at orchestrating distributed ML workflows, making it perfect for integrating TimeGPT into broader AI applications.
## Why Use Ray for Time Series Forecasting?
Ray offers unique advantages for ML-focused time series forecasting:
- **ML pipeline integration**: Seamlessly integrate TimeGPT into complex ML workflows with Ray Tune and Ray Serve
- **Task parallelism**: Handle complex task dependencies beyond data parallelism
- **Python-native**: Pure Python with minimal boilerplate code
- **Flexible architecture**: Scale from laptop to cluster with the same code
- **Actor model**: Stateful computations for advanced forecasting scenarios
Choose Ray when you're building ML pipelines, need complex task orchestration, or want to integrate TimeGPT with other ML frameworks like PyTorch or TensorFlow.
**What you'll learn:**
- Install Fugue with Ray support for distributed computing
- Initialize Ray clusters for distributed forecasting
- Run TimeGPT forecasting and cross-validation on Ray
## Prerequisites
Before proceeding, make sure you have an [API key from Nixtla](/setup/setting_up_your_api_key).
When executing on a distributed Ray cluster, ensure the `nixtla` library is installed on all workers.
## How to Use TimeGPT with Ray
[Open in Colab](https://colab.research.google.com/github/Nixtla/nixtla/blob/main/nbs/docs/tutorials/19_computing_at_scale_ray_distributed.ipynb)
### Step 1: Install Fugue and Ray
Fugue provides an easy-to-use interface for distributed computation across frameworks like Ray.
Install Fugue with Ray support:
```bash
pip install fugue[ray]
```
### Step 2: Load Your Data
Load your dataset into a pandas DataFrame. This tutorial uses hourly electricity prices from various markets:
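### Step 3: Convert to a Ray Dataset
A minimal sketch, where `df` is the pandas DataFrame loaded above:
```python
import ray

ray.init()
ray_df = ray.data.from_pandas(df)  # Ray Dataset consumed by the Nixtla client below
```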
### Step 4: Use TimeGPT on Ray
To use TimeGPT with Ray, provide a Ray Dataset to Nixtla's client methods instead of a pandas DataFrame. The API remains the same as local usage.
Instantiate the `NixtlaClient` class to interact with Nixtla's API:
```python
from nixtla import NixtlaClient
nixtla_client = NixtlaClient(
api_key='my_api_key_provided_by_nixtla'
)
```
You can use any method from the `NixtlaClient`, such as `forecast` or `cross_validation`.
<Tabs>
<Tab title="Forecast Example">
```python
fcst_df = nixtla_client.forecast(ray_df, h=12)
fcst_df.to_pandas().tail()
```
Public API models supported include `timegpt-1` (default) and `timegpt-1-long-horizon`. For long horizon forecasting, see the [long-horizon model tutorial](/forecasting/model-version/longhorizon_model).
</Tab>
<Tab title="Cross-validation Example">
```python
cv_df = nixtla_client.cross_validation(
ray_df,
h=12,
freq='H',
n_windows=5,
step_size=2
)
cv_df.to_pandas().tail()
```
</Tab>
</Tabs>
### Step 5: Shutdown Ray
Always shut down Ray after you finish your tasks to free up resources:
```python
ray.shutdown()
```
## Working with Exogenous Variables
TimeGPT with Ray also supports exogenous variables. Refer to the [Exogenous Variables Tutorial](/forecasting/exogenous-variables/numeric_features) for details. Simply substitute pandas DataFrames with Ray Datasets—the API remains identical.
## Related Resources
Explore more distributed forecasting options:
- [Distributed Computing Overview](/forecasting/forecasting-at-scale/computing_at_scale) - Compare Spark, Dask, and Ray
- [Spark Integration](/forecasting/forecasting-at-scale/spark) - For datasets with 100M+ observations
- [Dask Integration](/forecasting/forecasting-at-scale/dask) - For datasets with 10M-100M observations
- [Fine-tuning TimeGPT](/forecasting/fine-tuning/steps) - Improve accuracy at scale
description: "Scale enterprise time series forecasting with Spark and TimeGPT. Learn to process 100M+ observations across distributed clusters with Python and PySpark."
icon: "server"
---
## Overview
[Spark](https://spark.apache.org/) is an open-source distributed compute framework designed for large-scale data processing. This guide demonstrates how to use TimeGPT with Spark to perform forecasting and cross-validation across distributed clusters.
Spark is ideal for enterprise environments with existing Hadoop infrastructure and datasets exceeding 100 million observations. Its robust distributed architecture handles massive-scale time series forecasting with fault tolerance and high performance.
## Why Use Spark for Time Series Forecasting?
Spark offers unique advantages for enterprise-scale time series forecasting:
- **Enterprise-grade scalability**: Handle datasets with 100M+ observations across distributed clusters
- **Hadoop integration**: Seamlessly integrate with existing HDFS and Hadoop ecosystems
- **Mature ecosystem**: Leverage Spark's rich ecosystem of tools and libraries
- **Multi-language support**: Work with Python (PySpark), Scala, or Java
Choose Spark when you have enterprise infrastructure, datasets exceeding 100 million observations, or need robust fault tolerance for mission-critical forecasting.
**What you'll learn:**
- Install Fugue with Spark support for distributed computing
- Convert pandas DataFrames to Spark DataFrames
- Run TimeGPT forecasting and cross-validation on Spark clusters
## Prerequisites
Before proceeding, make sure you have an [API key from Nixtla](/setup/setting_up_your_api_key).
If executing on a distributed Spark cluster, ensure the `nixtla` library is installed on all worker nodes for consistent execution.
## How to Use TimeGPT with Spark
[Open in Colab](https://colab.research.google.com/github/Nixtla/nixtla/blob/main/nbs/docs/tutorials/16_computing_at_scale_spark_distributed.ipynb)
### Step 1: Install Fugue and Spark
Fugue provides a convenient interface to distribute Python code across frameworks like Spark.
Install Fugue with Spark support:
```bash
pip install fugue[spark]
```
To work with TimeGPT, make sure you have the `nixtla` library installed as well.
### Step 2: Load Your Data
Load the dataset into a pandas DataFrame. In this example, we use hourly electricity price data from different markets:
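A minimal sketch (the URL points to the hourly electricity dataset used in other Nixtla examples and is an assumption; substitute your own data):
```python
import pandas as pd

df = pd.read_csv(
    'https://raw.githubusercontent.com/Nixtla/transfer-learning-time-series/main/datasets/electricity-short.csv',
    parse_dates=['ds'],
)
df.head()
```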
### Step 3: Convert to a Spark DataFrame
Create a Spark session and convert your pandas DataFrame to a Spark DataFrame:
```python
from pyspark.sql import SparkSession
spark = SparkSession.builder.getOrCreate()
spark_df = spark.createDataFrame(df)
spark_df.show(5)
```
### Step 4: Use TimeGPT on Spark
To use TimeGPT with Spark, provide a Spark DataFrame to Nixtla's client methods instead of a pandas DataFrame. The main difference from local usage is working with Spark DataFrames instead of pandas DataFrames.
Instantiate the `NixtlaClient` class to interact with Nixtla's API:
```python
from nixtla import NixtlaClient
nixtla_client = NixtlaClient(
api_key='my_api_key_provided_by_nixtla'
)
```
You can use any method from the `NixtlaClient`, such as `forecast` or `cross_validation`.
<Tabs>
<Tab title="Forecast Example">
```python
fcst_df = nixtla_client.forecast(spark_df, h=12)
fcst_df.show(5)
```
When using Azure AI endpoints, specify `model="azureai"`:
```python
nixtla_client.forecast(
spark_df,
h=12,
model="azureai"
)
```
The public API supports two models: `timegpt-1` (default) and `timegpt-1-long-horizon`. For long horizon forecasting, see the [long-horizon model tutorial](/forecasting/model-version/longhorizon_model).
</Tab>
<Tab title="Cross-validation Example">
```python
cv_df = nixtla_client.cross_validation(
spark_df,
h=12,
n_windows=5,
step_size=2
)
cv_df.show(5)
```
</Tab>
</Tabs>
### Step 5: Stop Spark
After completing your tasks, stop the Spark session to free resources:
```python
spark.stop()
```
## Working with Exogenous Variables
TimeGPT with Spark also supports exogenous variables. Refer to the [Exogenous Variables Tutorial](/forecasting/exogenous-variables/numeric_features) for details. Simply substitute pandas DataFrames with Spark DataFrames—the API remains identical.
## Related Resources
Explore more distributed forecasting options:
- [Distributed Computing Overview](/forecasting/forecasting-at-scale/computing_at_scale) - Compare Spark, Dask, and Ray
- [Dask Integration](/forecasting/forecasting-at-scale/dask) - For datasets with 10M-100M observations
- [Ray Integration](/forecasting/forecasting-at-scale/ray) - For ML pipeline integration
- [Fine-tuning TimeGPT](/forecasting/fine-tuning/steps) - Improve accuracy at scale
description: "Advanced techniques to enhance TimeGPT forecast accuracy for energy and electricity."
icon: "bullseye"
---
[Open in Colab](https://colab.research.google.com/github/Nixtla/nixtla/blob/main/nbs/docs/tutorials/22_how_to_improve_forecast_accuracy.ipynb)
<Frame caption="Hourly electricity price for Germany (training period highlighted).">
</Frame>
description: "Master long-horizon time series forecasting in Python using TimeGPT. Learn to predict 2+ seasonal periods ahead with confidence intervals and uncertainty quantification."
icon: "clock"
---
## What is Long-Horizon Forecasting?
Long-horizon forecasting refers to predictions far into the future, typically exceeding two seasonal periods. For example, forecasting electricity demand 3 months ahead for hourly data, or predicting sales 2 years ahead for monthly data. The exact threshold depends on data frequency. The further you forecast, the more uncertainty you face.
The key challenge with long-horizon forecasting is that these predictions extend so far into the future that they may be influenced by unforeseen factors not present in the initial dataset. This means long-horizon forecasts generally involve greater risk and uncertainty compared to short-term predictions.
To address these unique challenges, Nixtla provides the specialized `timegpt-1-long-horizon` model in TimeGPT. You can access this model by simply specifying `model="timegpt-1-long-horizon"` when calling `nixtla_client.forecast`.
Use the `timegpt-1-long-horizon` model when your forecast horizon exceeds two complete seasonal cycles in your data.
## How to Use the Long-Horizon Model
[Open in Colab](https://colab.research.google.com/github/Nixtla/nixtla/blob/main/nbs/docs/tutorials/04_longhorizon.ipynb)
### Step 1: Import Packages
Start by installing and importing the required packages, then initialize the Nixtla client:
```python
from nixtla import NixtlaClient
from datasetsforecast.long_horizon import LongHorizon
from utilsforecast.losses import mae
nixtla_client = NixtlaClient(
api_key='my_api_key_provided_by_nixtla' # defaults to os.environ.get("NIXTLA_API_KEY")
)
```
### Step 2: Load the Data
We'll demonstrate long-horizon forecasting using the ETTh1 dataset, which measures oil temperatures and load variations on an electricity transformer in China. Here, we only forecast oil temperatures (`y`):
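A minimal sketch of the data preparation (the split and variable names mirror the forecast call below; the exact tutorial code may differ):
```python
import pandas as pd

# Load ETTh1 and keep the oil-temperature series ('OT')
Y_df, *_ = LongHorizon.load(directory='./', group='ETTh1')
Y_df['ds'] = pd.to_datetime(Y_df['ds'])
ot_df = Y_df[Y_df['unique_id'] == 'OT']

# Hold out the last 96 hourly steps as the test window
input_seq = ot_df.iloc[:-96]
test_seq = ot_df.iloc[-96:]
```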
### Step 3: Forecasting with the Long-Horizon Model
TimeGPT's `timegpt-1-long-horizon` model is optimized for predictions far into the future. Specify it like so:
```python
fcst_df = nixtla_client.forecast(
df=input_seq,
h=96,
level=[90],
finetune_steps=10,
finetune_loss='mae',
model='timegpt-1-long-horizon',
time_col='ds',
target_col='y'
)
```
Next, plot the forecast along with 90% confidence intervals:
```python
nixtla_client.plot(
Y_df[-168:],
fcst_df,
models=['TimeGPT'],
level=[90],
time_col='ds',
target_col='y'
)
```
<Frame caption="TimeGPT Long-Horizon Forecast with 90% Confidence Intervals">

</Frame>
### Step 4: Evaluation
Finally, assess forecast performance using Mean Absolute Error (MAE):
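A sketch using the names from the earlier steps (merge the forecast with the held-out actuals, then score with `utilsforecast`):
```python
# Align forecasts and actuals on series and timestamp, then compute the MAE
res = test_seq.merge(fcst_df, on=['unique_id', 'ds'])
print(mae(res, models=['TimeGPT']))
```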
The model achieves a MAE of approximately 0.146, indicating strong performance for these longer-range forecasts.
## Frequently Asked Questions
**Q: What's the difference between timegpt-1 and timegpt-1-long-horizon?**
The `timegpt-1-long-horizon` model is specifically trained for extended forecast horizons (2+ seasonal periods), providing better accuracy for long-range predictions.
**Q: How far ahead can I forecast with the long-horizon model?**
The optimal horizon depends on your data frequency and patterns. Generally, the model performs well up to 4-6 seasonal cycles ahead.
**Q: Can I use exogenous variables with long-horizon forecasting?**
Yes, TimeGPT supports exogenous variables for improved long-horizon accuracy. See our [exogenous variables guide](/forecasting/exogenous-variables/numeric_features) for details.
## Related Resources
Learn more about TimeGPT capabilities:
- [Fine-tuning TimeGPT](/forecasting/fine-tuning/steps) - Improve accuracy for your specific dataset
description: "Learn how to generate quantile forecasts and prediction intervals to capture uncertainty in your forecasts."
icon: "question"
---
In time series forecasting, it is important to consider the full probability distribution of the predictions rather than a single point estimate. This provides a more accurate representation of the uncertainty around the forecasts and allows better decision-making.
**TimeGPT** supports uncertainty quantification through quantile forecasts and prediction intervals.
## Why Consider the Full Probability Distribution?
When you focus on a single point prediction, you lose valuable information about the range of possible outcomes. By quantifying uncertainty, you can:
- Identify best-case and worst-case scenarios
- Improve risk management and contingency planning
- Gain confidence in decisions that rely on forecast accuracy
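As an illustrative example (the DataFrame, horizon, and levels are assumptions), both prediction intervals and quantiles can be requested directly from the `forecast` method:
```python
# 80% and 95% prediction intervals around the point forecast
fcst_df = nixtla_client.forecast(df=df, h=12, level=[80, 95])

# Alternatively, specific quantiles of the forecast distribution
fcst_q_df = nixtla_client.forecast(df=df, h=12, quantiles=[0.1, 0.5, 0.9])
```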