Commit f42429f6 authored by bailuo

readme

---
title: "Prediction Intervals"
description: "Learn how to create prediction intervals with TimeGPT"
icon: "chart-area"
---
## What Are Prediction Intervals?
A prediction interval provides a range where a future observation of a time series is expected to fall, with a specific level of probability.
For example, a 95% prediction interval means that the true future value is expected to lie within this range 95 times out of 100.
Wider intervals reflect greater uncertainty, while narrower intervals indicate higher confidence in the forecast.
With TimeGPT, you can easily generate prediction intervals for any confidence level between 0% and 100%.
These intervals are constructed using **[conformal prediction](https://en.wikipedia.org/wiki/Conformal_prediction)**, a distribution-free framework for uncertainty quantification.
Prediction intervals differ from confidence intervals:
- **Prediction Intervals**: Capture the uncertainty in future observations.
- **Confidence Intervals**: Quantify the uncertainty in the estimated model parameters (e.g., the mean).
As a result, prediction intervals are typically wider, as they account for both model and data variability.
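
The idea behind conformal prediction can be sketched in a few lines. The following is a simplified illustration, not TimeGPT's actual implementation: it takes hypothetical residuals from held-out forecasts and uses a percentile of their absolute values as the interval half-width. The function name `conformal_interval` and the residual values are assumptions made for this example, and the finite-sample correction used in formal conformal prediction is omitted.

```python
import numpy as np

def conformal_interval(point_forecast, calibration_residuals, level):
    """Symmetric interval built from absolute calibration residuals (illustrative only)."""
    abs_res = np.abs(np.asarray(calibration_residuals, dtype=float))
    # The level-th percentile of past absolute errors becomes the half-width.
    half_width = np.quantile(abs_res, level / 100)
    return point_forecast - half_width, point_forecast + half_width

# Hypothetical residuals (actual minus forecast) from held-out windows.
residuals = np.array([-12.0, 5.0, -3.0, 8.0, 15.0, -7.0, 2.0, -9.0, 4.0, 11.0])
lo_80, hi_80 = conformal_interval(440.0, residuals, level=80)
lo_99, hi_99 = conformal_interval(440.0, residuals, level=99)
print(f'80%: [{lo_80:.2f}, {hi_80:.2f}]  99%: [{lo_99:.2f}, {hi_99:.2f}]')
```

No distributional assumption is needed: the interval width comes directly from how large past errors were, and a higher level selects a larger error percentile.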
## How to Generate Prediction Intervals
[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/Nixtla/nixtla/blob/main/nbs/docs/capabilities/forecast/10_prediction_intervals.ipynb)
### Step 1: Import Packages
Import the required packages and initialize the Nixtla client.
```python
import pandas as pd
from nixtla import NixtlaClient
nixtla_client = NixtlaClient(
api_key='my_api_key_provided_by_nixtla' # defaults to os.environ.get("NIXTLA_API_KEY")
)
```
### Step 2: Load Data
In this tutorial, we will use the Air Passengers dataset.
```python
df = pd.read_csv('https://raw.githubusercontent.com/Nixtla/transfer-learning-time-series/main/datasets/air_passengers.csv')
df.head()
```
| | timestamp | value |
| ----- | ------------ | ------- |
| 0 | 1949-01-01 | 112 |
| 1 | 1949-02-01 | 118 |
| 2 | 1949-03-01 | 132 |
| 3 | 1949-04-01 | 129 |
| 4 | 1949-05-01 | 121 |
### Step 3: Forecast with Prediction Intervals
To generate prediction intervals with TimeGPT, provide a list of desired confidence levels using the `level` argument.
Note that accepted values are between 0 and 100.
- Higher confidence levels provide more certainty that the true value will be captured, but result in wider, less precise intervals.
- Lower confidence levels provide less certainty that the true value will be captured, but result in narrower, more precise intervals.
```python
timegpt_fcst_pred_int_df = nixtla_client.forecast(
df=df,
h=12,
level=[80, 90, 99],
time_col='timestamp',
target_col='value',
)
timegpt_fcst_pred_int_df.head()
```
| timestamp | TimeGPT | TimeGPT-hi-80 | TimeGPT-hi-90 | TimeGPT-hi-99 | TimeGPT-lo-80 | TimeGPT-lo-90 | TimeGPT-lo-99 |
|-------------|---------|----------------|----------------|----------------|----------------|----------------|----------------|
| 1961-01-01 | 437.84 | 443.69 | 451.89 | 459.28 | 431.99 | 423.78 | 416.40 |
| 1961-02-01 | 426.06 | 439.42 | 444.43 | 448.94 | 412.70 | 407.70 | 403.19 |
| 1961-03-01 | 463.12 | 488.83 | 495.92 | 502.31 | 437.41 | 430.31 | 423.93 |
| 1961-04-01 | 478.24 | 507.77 | 509.72 | 511.47 | 448.72 | 446.77 | 445.02 |
| 1961-05-01 | 505.65 | 532.89 | 539.32 | 545.12 | 478.41 | 471.97 | 466.18 |
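
As a quick sanity check on the output format, the sketch below copies the first forecast row from the table above and confirms that the intervals widen as the level increases. The helper `interval_width` is a hypothetical name introduced for this example; the column-name pattern is taken from the table.

```python
import pandas as pd

# First forecast row (1961-01-01) copied from the table above.
row = pd.Series({
    'TimeGPT': 437.84,
    'TimeGPT-lo-80': 431.99, 'TimeGPT-hi-80': 443.69,
    'TimeGPT-lo-90': 423.78, 'TimeGPT-hi-90': 451.89,
    'TimeGPT-lo-99': 416.40, 'TimeGPT-hi-99': 459.28,
})

def interval_width(row, level, model='TimeGPT'):
    """Width of the interval stored in the '<model>-lo-<level>' and '<model>-hi-<level>' columns."""
    return row[f'{model}-hi-{level}'] - row[f'{model}-lo-{level}']

widths = {level: round(interval_width(row, level), 2) for level in (80, 90, 99)}
print(widths)
```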
You can visualize the prediction intervals using the `plot` method. To do so, specify the confidence levels to display using the `level` argument.
```python
nixtla_client.plot(
df,
timegpt_fcst_pred_int_df,
time_col='timestamp',
target_col='value',
level=[80, 90, 99]
)
```
<img src="/images/docs/tutorials-uncertainty/prediction_intervals_fc.png"/>
### Step 4: Historical Forecast
You can also generate prediction intervals for historical forecasts by setting `add_history=True`.
```python
timegpt_fcst_pred_int_historical_df = nixtla_client.forecast(
df=df,
h=12,
level=[80, 90],
time_col='timestamp',
target_col='value',
add_history=True,
)
timegpt_fcst_pred_int_historical_df.head()
```
Plot the prediction intervals for the historical forecasts.
```python
nixtla_client.plot(
df,
timegpt_fcst_pred_int_historical_df,
time_col='timestamp',
target_col='value',
level=[80, 90]
)
```
<img src="/images/docs/tutorials-uncertainty/prediction_intervals_historical.png"/>
### Step 5: Cross-Validation
You can use the `cross_validation` method to generate prediction intervals for each time window.
```python
cv_df = nixtla_client.cross_validation(
df=df,
h=12,
n_windows=4,
level=[80, 90, 99],
time_col='timestamp',
target_col='value'
)
cv_df.head()
```
After computing the forecasts, you can visualize the results for each cross-validation cutoff to better understand model performance over time.
```python
from IPython.display import display

cutoffs = cv_df['cutoff'].unique()
for cutoff in cutoffs:
    fig = nixtla_client.plot(
        df.tail(100),
        cv_df.query('cutoff == @cutoff').drop(columns=['cutoff', 'value']),
        level=[80, 90, 99],
        time_col='timestamp',
        target_col='value',
    )
    display(fig)
```
<img src="/images/docs/tutorials-uncertainty/prediction_intervals_cv1.png"/>
<img src="/images/docs/tutorials-uncertainty/prediction_intervals_cv2.png"/>
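
Beyond visual inspection, one way to judge the intervals is their empirical coverage, that is, the share of actual values that fall inside them. The sketch below uses a toy DataFrame with made-up numbers shaped like the cross-validation output; the helper `empirical_coverage` is a hypothetical name, and the column names are assumed to follow the `TimeGPT-lo-<level>` and `TimeGPT-hi-<level>` pattern shown earlier.

```python
import pandas as pd

def empirical_coverage(cv_df, level, model='TimeGPT', target_col='value'):
    """Share of actual values that fall inside the `level`% interval."""
    inside = cv_df[target_col].between(
        cv_df[f'{model}-lo-{level}'], cv_df[f'{model}-hi-{level}']
    )
    return inside.mean()

# Toy frame with made-up numbers, shaped like the cross-validation output.
toy_cv = pd.DataFrame({
    'value':         [100, 110, 120, 130, 140],
    'TimeGPT-lo-80': [95, 112, 110, 125, 120],
    'TimeGPT-hi-80': [105, 118, 125, 135, 135],
})
coverage_80 = empirical_coverage(toy_cv, 80)
print(f'Empirical coverage of the 80% interval: {coverage_80:.0%}')
```

With enough windows, a well-calibrated 80% interval should have coverage close to 80%; a large gap in either direction suggests the intervals are too narrow or too wide.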
<Check>
Congratulations! You have successfully generated prediction intervals using TimeGPT.
You also visualized historical forecasts with intervals and generated intervals across multiple time windows using cross-validation.
</Check>
---
title: "Quantile Forecasts"
description: "Learn how to generate quantile forecasts with TimeGPT"
icon: "ruler-vertical"
---
## What Are Quantile Forecasts?
Quantile forecasts correspond to specific percentiles of the forecast distribution and provide a more complete representation of the range of possible outcomes.
- The 0.5 quantile (or 50th percentile) is the median forecast: there is a 50% chance that the actual value falls below it and a 50% chance that it falls above.
- The 0.1 quantile (or 10th percentile) forecast represents a value that the actual observation is expected to fall below 10% of the time.
- The 0.9 quantile (or 90th percentile) forecast represents a value that the actual observation is expected to fall below 90% of the time.
TimeGPT supports quantile forecasts. In this tutorial, we will show you how to generate them.
## Why Use Quantile Forecasts
- Quantile forecasts can provide information about best and worst-case scenarios, allowing you to make better decisions under uncertainty.
- In many real-world scenarios, being wrong in one direction is more costly than being wrong in the other. Quantile forecasts allow you to focus on the specific percentiles that matter most for your particular use case.
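
The asymmetric cost described above is what the quantile (pinball) loss measures. The sketch below is a generic NumPy implementation for illustration; the function name `pinball_loss` and the numbers are assumptions made for this example, and nothing here is specific to TimeGPT.

```python
import numpy as np

def pinball_loss(y_true, y_pred, q):
    """Average quantile (pinball) loss for a quantile level q in (0, 1)."""
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    # Under-forecasts (diff > 0) cost q per unit; over-forecasts cost (1 - q) per unit.
    return float(np.mean(np.maximum(q * diff, (q - 1) * diff)))

actual = [100.0]
under = pinball_loss(actual, [90.0], q=0.9)   # forecast 10 units too low
over = pinball_loss(actual, [110.0], q=0.9)   # forecast 10 units too high
print(under, over)
```

For `q=0.9`, forecasting too low is penalized far more than forecasting too high, which is why the 0.9 quantile forecast sits above most plausible outcomes.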
[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/Nixtla/nixtla/blob/main/nbs/docs/tutorials/10_uncertainty_quantification_with_quantile_forecasts.ipynb)
## How to Generate Quantile Forecasts
### Step 1: Import Packages
Import the required packages and initialize a Nixtla client to connect with TimeGPT.
```python
import pandas as pd
from nixtla import NixtlaClient
from IPython.display import display
nixtla_client = NixtlaClient(
api_key='my_api_key_provided_by_nixtla' # Defaults to os.environ.get("NIXTLA_API_KEY")
)
```
### Step 2: Load Data
In this tutorial, we will use the Air Passengers dataset.
```python
df = pd.read_csv(
'https://raw.githubusercontent.com/Nixtla/transfer-learning-time-series/main/datasets/air_passengers.csv'
)
df.head()
```
| | timestamp | value |
| ----- | ------------ | ------- |
| 0 | 1949-01-01 | 112 |
| 1 | 1949-02-01 | 118 |
| 2 | 1949-03-01 | 132 |
| 3 | 1949-04-01 | 129 |
| 4 | 1949-05-01 | 121 |
### Step 3: Forecast with Quantiles
To specify the desired quantiles, you need to pass a list of quantiles to the `quantiles` parameter. Choose quantiles between 0 and 1 based on your uncertainty analysis needs.
```python
quantiles = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
timegpt_quantile_fcst_df = nixtla_client.forecast(
df=df,
h=12,
quantiles=quantiles,
time_col='timestamp',
target_col='value'
)
timegpt_quantile_fcst_df.head()
```
| timestamp | TimeGPT | TimeGPT-q-10 | TimeGPT-q-20 | TimeGPT-q-30 | TimeGPT-q-40 | TimeGPT-q-50 | TimeGPT-q-60 | TimeGPT-q-70 | TimeGPT-q-80 | TimeGPT-q-90 |
|-------------|---------|--------------|--------------|--------------|--------------|--------------|--------------|--------------|--------------|--------------|
| 1961-01-01 | 437.84 | 431.99 | 435.04 | 435.38 | 436.40 | 437.84 | 439.27 | 440.29 | 440.63 | 443.69 |
| 1961-02-01 | 426.06 | 412.70 | 414.83 | 416.04 | 421.72 | 426.06 | 430.41 | 436.08 | 437.29 | 439.42 |
| 1961-03-01 | 463.12 | 437.41 | 444.23 | 446.42 | 450.71 | 463.12 | 475.53 | 479.81 | 482.00 | 488.82 |
| 1961-04-01 | 478.24 | 448.72 | 455.43 | 465.57 | 469.88 | 478.24 | 486.61 | 490.92 | 501.06 | 507.76 |
| 1961-05-01 | 505.65 | 478.41 | 493.16 | 497.99 | 499.14 | 505.65 | 512.15 | 513.30 | 518.14 | 532.89 |
TimeGPT returns multiple columns in the forecast output:
- Each requested quantile gets its own column named in the format `TimeGPT-q-...`
- The `TimeGPT` column shows the mean forecast
- The mean forecast (`TimeGPT`) is identical to the 0.5 quantile (`TimeGPT-q-50`)
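
Quantiles and prediction-interval levels are two views of the same forecast distribution: a central 80% interval is bounded by the 0.1 and 0.9 quantiles. The sketch below shows the conversion. The helper names `level_to_quantiles` and `quantile_column` are hypothetical, and the column-name pattern is inferred from the table above rather than from a documented guarantee.

```python
def level_to_quantiles(level):
    """Return the (lower, upper) quantiles that bound a central `level`% interval."""
    alpha = 1 - level / 100
    # Rounding removes floating-point noise such as 0.09999999999999998.
    return round(alpha / 2, 4), round(1 - alpha / 2, 4)

def quantile_column(q, model='TimeGPT'):
    """Column name for quantile q, following the 'TimeGPT-q-10' pattern in the table."""
    return f'{model}-q-{int(round(q * 100))}'

lower, upper = level_to_quantiles(80)
print(lower, upper, quantile_column(lower), quantile_column(upper))
```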
### Step 4: Plot the Quantile Forecasts
To plot the quantile forecasts, you can use the `plot` method.
```python
nixtla_client.plot(
df,
timegpt_quantile_fcst_df,
time_col='timestamp',
target_col='value'
)
```
<img src="/images/docs/tutorials-uncertainty/quantiles_fc.png"/>
The plot displays:
- The actual time series data in blue.
- Multiple forecast intervals represented by different quantiles:
  - The 0.5 quantile (50th percentile) represents the median forecast.
  - The 0.1 and 0.9 quantiles (10th and 90th percentiles) show the outer bounds of the forecast.
  - Additional quantiles (0.2, 0.3, 0.4, 0.6, 0.7, 0.8) are shown in between, creating a gradient of uncertainty.
This type of visualization is particularly useful because it:
- Shows the full distribution of possible outcomes rather than just a single point forecast.
- Helps identify best and worst-case scenarios.
- Allows decision-makers to understand the range of uncertainty in the predictions.
### Step 5: Historical Forecast
You can also generate quantile forecasts for historical data by setting the `add_history` parameter to `True`.
```python
timegpt_quantile_fcst_df = nixtla_client.forecast(
df=df,
h=12,
quantiles=quantiles,
time_col='timestamp',
target_col='value',
add_history=True, # Add historical data to the forecast
)
nixtla_client.plot(
df,
timegpt_quantile_fcst_df,
time_col='timestamp',
target_col='value'
)
```
<img src="/images/docs/tutorials-uncertainty/quantiles_historical.png"/>
The plot now includes quantile forecasts for the historical data. This allows you to evaluate how well the quantile forecasts capture the true variability and identify any systematic bias.
### Step 6: Cross-Validation
To evaluate the performance of the quantile forecasts across multiple time windows, you can use the `cross_validation` method.
```python
cv_df = nixtla_client.cross_validation(
df=df,
h=12,
n_windows=4,
quantiles=quantiles,
time_col='timestamp',
target_col='value'
)
```
After computing the forecasts, you can visualize the results for each cross-validation cutoff to better understand model performance over time.
```python
cutoffs = cv_df['cutoff'].unique()
for cutoff in cutoffs:
    fig = nixtla_client.plot(
        df.tail(100),
        cv_df.query('cutoff == @cutoff').drop(columns=['cutoff', 'value']),
        time_col='timestamp',
        target_col='value'
    )
    display(fig)
```
<img src="/images/docs/tutorials-uncertainty/quantiles_cv1.png"/>
<img src="/images/docs/tutorials-uncertainty/quantiles_cv2.png"/>
Each plot shows a different cross-validation window (or cutoff) for the time series. This allows you to evaluate how well the predicted quantiles capture the true values across several forecast windows.
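
A simple numerical complement to the plots is a calibration check: for each quantile, compute the share of actual values at or below the forecast. The sketch below uses a toy DataFrame with made-up numbers; the helper `fraction_below` is a hypothetical name, and the column names are assumed to follow the `TimeGPT-q-<percentile>` pattern shown earlier.

```python
import pandas as pd

def fraction_below(cv_df, q, model='TimeGPT', target_col='value'):
    """Share of actual values at or below the forecast for quantile q."""
    col = f'{model}-q-{int(round(q * 100))}'
    return (cv_df[target_col] <= cv_df[col]).mean()

# Toy frame with made-up numbers, shaped like the cross-validation output.
toy_cv = pd.DataFrame({
    'value':        [100, 105, 98, 120, 111],
    'TimeGPT-q-10': [90, 106, 95, 100, 101],
    'TimeGPT-q-90': [110, 115, 99, 118, 121],
})
calibration = {q: fraction_below(toy_cv, q) for q in (0.1, 0.9)}
print(calibration)
```

For a well-calibrated forecast, the share for quantile `q` should be close to `q` itself once enough observations are available.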
<Check>
Congratulations! You have successfully generated quantile forecasts using TimeGPT. You also visualized historical quantile predictions and evaluated their performance through cross-validation.
</Check>
---
title: "Bounded Forecasts"
description: "Learn how to generate forecasts with upper and lower bounds to match your business constraints."
icon: "arrows-up-down"
---
## Why Generate Bounded Forecasts?
In forecasting, we often want to make sure the predictions stay within a certain
range. For example, for predicting the sales of a product, we may require all
forecasts to be positive. Thus, the forecasts may need to be bounded.
This tutorial shows how to generate bounded forecasts with TimeGPT by
transforming data prior to forecasting.
## Tutorial
[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/Nixtla/nixtla/blob/main/nbs/docs/tutorials/13_bounded_forecasts.ipynb)
### Step 1: Import Packages
First, we import the required packages.
```python
import pandas as pd
import numpy as np
from nixtla import NixtlaClient
```
Next, initialize your Nixtla client with the API key:
```python
nixtla_client = NixtlaClient(
# defaults to os.environ.get("NIXTLA_API_KEY")
api_key='my_api_key_provided_by_nixtla'
)
```
### Step 2: Load Data
We use the [annual egg prices](https://github.com/robjhyndman/fpp3package/tree/master/data)
dataset from [Forecasting: Principles and Practice](https://otexts.com/fpp3/).
We expect egg prices to be strictly positive, so we want to bound our forecasts
to be positive.
> NOTE: If you do not have `pyreadr`, you can install it with `pip`:
```shell
pip install pyreadr
```
```python
import pyreadr
from pathlib import Path
url = 'https://github.com/robjhyndman/fpp3package/raw/master/data/prices.rda'
dst_path = str(Path.cwd().joinpath('prices.rda'))
result = pyreadr.read_r(pyreadr.download_file(url, dst_path), dst_path)
df = result['prices'][['year', 'eggs']]
df = df.dropna().reset_index(drop=True)
df = df.rename(columns={'year': 'ds', 'eggs': 'y'})
df['ds'] = pd.to_datetime(df['ds'], format='%Y')
df['unique_id'] = 'eggs'
df.tail(10)
```
| | **ds** | **y** | **unique_id** |
| ----- | ------------ | -------- | ------------- |
| 84 | 1984-01-01 | 100.58 | eggs |
| 85 | 1985-01-01 | 76.84 | eggs |
| 86 | 1986-01-01 | 81.10 | eggs |
| 87 | 1987-01-01 | 69.60 | eggs |
| 88 | 1988-01-01 | 64.55 | eggs |
| 89 | 1989-01-01 | 80.36 | eggs |
| 90 | 1990-01-01 | 79.79 | eggs |
| 91 | 1991-01-01 | 74.79 | eggs |
| 92 | 1992-01-01 | 64.86 | eggs |
| 93 | 1993-01-01 | 62.27 | eggs |
Plotting the series shows how prices evolved over the 20th century:
the overall trend is downward.
```python
nixtla_client.plot(df)
```
<Frame caption="Figure 1: Annual Egg Prices Trend from 1900s to 1990s">
![Annual Egg Prices Trend](https://raw.githubusercontent.com/Nixtla/nixtla/readme_docs/nbs/_docs/docs/tutorials/13_bounded_forecasts_files/figure-markdown_strict/cell-12-output-1.png)
</Frame>
### Step 3: Generate Bounded Forecasts with TimeGPT
First, we transform the target data. In this case, we log-transform the
data prior to forecasting, so that the back-transformed forecasts are always positive.
```python
df_transformed = df.copy()
df_transformed['y'] = np.log(df_transformed['y'])
```
We will create forecasts for the next 10 years and request 80%, 90%, and
99.5% prediction intervals.
```python
timegpt_fcst_with_transform = nixtla_client.forecast(
df=df_transformed,
h=10,
freq='Y',
level=[80, 90, 99.5]
)
```
After creating the forecasts, we need to invert the transformation that
we applied earlier. With a log-transformation, this simply means we need to
exponentiate the forecasts:
```python
cols_to_transform = [
    col for col in timegpt_fcst_with_transform if col not in ['unique_id', 'ds']
]
for col in cols_to_transform:
    timegpt_fcst_with_transform[col] = np.exp(timegpt_fcst_with_transform[col])
```
Now, we can plot the forecasts. We include the 80%, 90%, and 99.5%
prediction intervals.
```python
nixtla_client.plot(
df,
timegpt_fcst_with_transform,
level=[80, 90, 99.5],
max_insample_length=20
)
```
<Frame caption="Figure 2: Bounded Forecasts with TimeGPT Using Log Transformation">
![Bounded Forecasts with Log Transformation](https://raw.githubusercontent.com/Nixtla/nixtla/readme_docs/nbs/_docs/docs/tutorials/13_bounded_forecasts_files/figure-markdown_strict/cell-16-output-1.png)
</Frame>
The forecast and the prediction intervals look reasonable, and every bound is positive because it is the exponential of a real number.
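
The log transformation only enforces a lower bound of zero. When both a lower and an upper bound are required, a scaled logit transformation is a common alternative (it is described in *Forecasting: Principles and Practice*). The sketch below is not part of the original tutorial: the function names and the bounds of 0 and 400 are assumptions chosen for illustration.

```python
import numpy as np

def scaled_logit(y, lower, upper):
    """Map values in (lower, upper) onto the whole real line."""
    y = np.asarray(y, dtype=float)
    return np.log((y - lower) / (upper - y))

def inverse_scaled_logit(z, lower, upper):
    """Map real-valued forecasts back into (lower, upper)."""
    z = np.asarray(z, dtype=float)
    return (upper - lower) * np.exp(z) / (1 + np.exp(z)) + lower

# Hypothetical bounds: prices assumed to stay between 0 and 400.
y = np.array([62.27, 100.58, 350.0])
z = scaled_logit(y, lower=0.0, upper=400.0)
# Even extreme values on the transformed scale map back inside the bounds.
extreme = inverse_scaled_logit(np.array([-20.0, 20.0]), 0.0, 400.0)
print(inverse_scaled_logit(z, 0.0, 400.0), extreme)
```

The workflow would mirror the log case: transform the target, forecast on the transformed scale, then apply the inverse to the point forecast and to every interval column.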
### Step 4: Compare with Unbounded Forecast
Let's compare these forecasts to the situation where we don't apply a
transformation. In this case, it may be possible to forecast a negative price.
```python
timegpt_fcst_without_transform = nixtla_client.forecast(
df=df,
h=10,
freq='Y',
level=[80, 90, 99.5]
)
```
Indeed, we now observe prediction intervals that become negative:
```python
nixtla_client.plot(
df,
timegpt_fcst_without_transform,
level=[80, 90, 99.5],
max_insample_length=20
)
```
<Frame caption="Figure 3: Unbounded Forecast with Possible Negative Intervals">
![Unbounded Forecast with Possible Negative Intervals](https://raw.githubusercontent.com/Nixtla/nixtla/readme_docs/nbs/_docs/docs/tutorials/13_bounded_forecasts_files/figure-markdown_strict/cell-18-output-1.png)
</Frame>
For example, in 1995 the lower bounds of the 90% and 99.5% intervals are negative:
```python
timegpt_fcst_without_transform
```
| | unique_id | ds | TimeGPT | TimeGPT-lo-99.5 | TimeGPT-lo-90 | TimeGPT-lo-80 | TimeGPT-hi-80 | TimeGPT-hi-90 | TimeGPT-hi-99.5 |
|--:|----------:|-----------:|----------:|----------------:|--------------:|--------------:|--------------:|--------------:|----------------:|
| 0 | eggs | 1994-01-01 | 66.859756 | 43.103240 | 46.131448 | 49.319034 | 84.400479 | 87.588065 | 90.616273 |
| 1 | eggs | 1995-01-01 | 64.993477 | -20.924112 | -4.750041 | 12.275298 | 117.711656 | 134.736995 | 150.911066 |
| 2 | eggs | 1996-01-01 | 66.695808 | 6.499170 | 8.291150 | 10.177444 | 123.214173 | 125.100467 | 126.892446 |
| 3 | eggs | 1997-01-01 | 66.103325 | 17.304282 | 24.966939 | 33.032894 | 99.173756 | 107.239711 | 114.902368 |
| 4 | eggs | 1998-01-01 | 67.906517 | 4.995371 | 12.349648 | 20.090992 | 115.722042 | 123.463386 | 130.817663 |
| 5 | eggs | 1999-01-01 | 66.147575 | 29.162207 | 31.804460 | 34.585779 | 97.709372 | 100.490691 | 103.132943 |
| 6 | eggs | 2000-01-01 | 66.062637 | 14.671932 | 19.305822 | 24.183601 | 107.941673 | 112.819453 | 117.453343 |
| 7 | eggs | 2001-01-01 | 68.045769 | 3.915282 | 13.188964 | 22.950736 | 113.140802 | 122.902573 | 132.176256 |
| 8 | eggs | 2002-01-01 | 66.718903 | -42.212631 | -30.583703 | -18.342726 | 151.780531 | 164.021508 | 175.650436 |
| 9 | eggs | 2003-01-01 | 67.344078 | -86.239911 | -44.959745 | -1.506939 | 136.195095 | 179.647901 | 220.928067 |
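
To list the affected years programmatically, the sketch below copies the `TimeGPT-lo-99.5` column from the table above (rounded to two decimals) and filters for negative lower bounds. On the real output, the same filter could be applied directly to `timegpt_fcst_without_transform`.

```python
import pandas as pd

# `ds` and `TimeGPT-lo-99.5` copied from the table above (rounded to two decimals).
unbounded = pd.DataFrame({
    'ds': pd.date_range('1994-01-01', periods=10, freq='YS'),
    'TimeGPT-lo-99.5': [43.10, -20.92, 6.50, 17.30, 5.00,
                        29.16, 14.67, 3.92, -42.21, -86.24],
})
is_negative = unbounded['TimeGPT-lo-99.5'] < 0
negative_years = unbounded.loc[is_negative, 'ds'].dt.year.tolist()
print(negative_years)
```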
## Conclusion
Log-transformations are a simple and effective way to enforce strictly positive
predictions. This tutorial showed how to combine TimeGPT with a log transformation
and its inverse so that both forecasts and prediction intervals stay positive.
## References
- [Hyndman, Rob J., and George Athanasopoulos (2021). Forecasting: Principles and Practice (3rd Ed)](https://otexts.com/fpp3/)
---
title: "Hierarchical Forecasting"
description: "Learn how to use TimeGPT for hierarchical forecasting across multiple levels."
icon: "diagram-project"
---
## What is Hierarchical Forecasting?
Hierarchical forecasting involves generating forecasts for multiple time series that share a hierarchical structure (e.g., product demand by category, department, or region). The goal is to ensure that forecasts are coherent across each level of the hierarchy.
Hierarchical forecasting can be particularly important when you need to generate forecasts at different granularities (e.g., country, state, region) and ensure they align with each other and aggregate correctly at higher levels.
Using TimeGPT, you can create forecasts for multiple related time series and then apply hierarchical forecasting methods from [HierarchicalForecast](https://nixtlaverse.nixtla.io/hierarchicalforecast/index.html) to reconcile those forecasts across your specified hierarchy.
## Why use Hierarchical Forecasting?
- Ensures consistency: Forecasts at lower levels add up to higher-level forecasts.
- Improves accuracy: Reconciliation methods often yield more robust predictions.
- Facilitates deeper insights: Understand how smaller segments contribute to overall trends.
## Tutorial
[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/Nixtla/nixtla/blob/main/nbs/docs/tutorials/14_hierarchical_forecasting.ipynb)
### Step 1: Install, Import and Initialize
Start by installing the required packages.
```shell
pip install nixtla
pip install hierarchicalforecast
```
Next, import the required packages and initialize the `NixtlaClient`.
```python
import pandas as pd
import numpy as np
from nixtla import NixtlaClient
nixtla_client = NixtlaClient(
api_key='my_api_key_provided_by_nixtla'
)
```
### Step 2: Load and Prepare Data
This tutorial uses the Australian Tourism dataset from
[Forecasting: Principles and Practice](https://otexts.com/fpp3/). The dataset
contains different levels of hierarchical data, from the entire country of
Australia down to individual regions.
<Frame caption="Examples of Australia's Tourism Hierarchy and Map">
<img
src="https://github.com/Nixtla/nixtla/blob/main/nbs/img/australia_tourism.png?raw=true"
alt="Map of Australia color coded by state."
width="700"
/>
<img
src="https://github.com/Nixtla/nixtla/blob/main/nbs/img/australia_hierarchy.png?raw=true"
alt="Australia hierarchical structure."
width="700"
/>
</Frame>
The dataset provides only the lowest-level series, so higher-level series need
to be aggregated explicitly. Let's load and preprocess the dataset.
```python
Y_df = pd.read_csv(
'https://raw.githubusercontent.com/Nixtla/transfer-learning-time-series/main/datasets/tourism.csv'
)
Y_df = Y_df.rename({'Trips': 'y', 'Quarter': 'ds'}, axis=1)
Y_df.insert(0, 'Country', 'Australia')
Y_df = Y_df[['Country', 'Region', 'State', 'Purpose', 'ds', 'y']]
Y_df['ds'] = Y_df['ds'].str.replace(r'(\d+) (Q\d)', r'\1-\2', regex=True)
Y_df['ds'] = pd.to_datetime(Y_df['ds'])
Y_df.head(10)
```
| Country | Region | State | Purpose | ds | y |
| --------- | -------- | --------------- | -------- | ---------- | ---------- |
| Australia | Adelaide | South Australia | Business | 1998-01-01 | 135.077690 |
| Australia | Adelaide | South Australia | Business | 1998-04-01 | 109.987316 |
| Australia | Adelaide | South Australia | Business | 1998-07-01 | 166.034687 |
| Australia | Adelaide | South Australia | Business | 1998-10-01 | 127.160464 |
| Australia | Adelaide | South Australia | Business | 1999-01-01 | 137.448533 |
| Australia | Adelaide | South Australia | Business | 1999-04-01 | 199.912586 |
| Australia | Adelaide | South Australia | Business | 1999-07-01 | 169.355090 |
| Australia | Adelaide | South Australia | Business | 1999-10-01 | 134.357937 |
| Australia | Adelaide | South Australia | Business | 2000-01-01 | 154.034398 |
| Australia | Adelaide | South Australia | Business | 2000-04-01 | 168.776364 |
We define the dataset hierarchies explicitly. Each level in the list describes
one view of the hierarchy:
```python
spec = [
['Country'],
['Country', 'State'],
['Country', 'Purpose'],
['Country', 'State', 'Region'],
['Country', 'State', 'Purpose'],
['Country', 'State', 'Region', 'Purpose']
]
```
Then, use `aggregate` from `HierarchicalForecast` to generate the aggregated series:
```python
from hierarchicalforecast.utils import aggregate
Y_df, S_df, tags = aggregate(Y_df, spec)
Y_df.head(10)
```
| unique_id | ds | y |
| --------- | ---------- | ------------ |
| Australia | 1998-01-01 | 23182.197269 |
| Australia | 1998-04-01 | 20323.380067 |
| Australia | 1998-07-01 | 19826.640511 |
| Australia | 1998-10-01 | 20830.129891 |
| Australia | 1999-01-01 | 22087.353380 |
| Australia | 1999-04-01 | 21458.373285 |
| Australia | 1999-07-01 | 19914.192508 |
| Australia | 1999-10-01 | 20027.925640 |
| Australia | 2000-01-01 | 22339.294779 |
| Australia | 2000-04-01 | 19941.063482 |
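
To see what the aggregation step does conceptually, the sketch below reproduces it with plain pandas on a toy dataset. It is a simplified illustration, not the `HierarchicalForecast` implementation: it does not build the summing matrix or the tags, and the helper `aggregate_level` and all values are made up for this example.

```python
import pandas as pd

# Toy bottom-level data with made-up values; column names follow the tutorial.
toy = pd.DataFrame({
    'Country': ['Australia'] * 4,
    'State': ['Queensland', 'Queensland', 'Victoria', 'Victoria'],
    'Region': ['Brisbane', 'Cairns', 'Melbourne', 'Geelong'],
    'ds': pd.to_datetime(['1998-01-01'] * 4),
    'y': [10.0, 5.0, 20.0, 2.0],
})

def aggregate_level(df, cols):
    """Sum `y` over one hierarchy level and build a '/'-joined unique_id."""
    out = df.groupby(cols + ['ds'], as_index=False)['y'].sum()
    out['unique_id'] = out[cols].agg('/'.join, axis=1)
    return out[['unique_id', 'ds', 'y']]

toy_spec = [['Country'], ['Country', 'State'], ['Country', 'State', 'Region']]
toy_agg = pd.concat(
    [aggregate_level(toy, cols) for cols in toy_spec], ignore_index=True
)
print(toy_agg)
```

Each entry in the spec produces one set of series, and the `unique_id` encodes the path through the hierarchy, for example `Australia/Queensland/Brisbane`.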
Next, create the train/test splits. Here, we use the last two years (eight
quarters) of data for testing:
```python
Y_test_df = Y_df.groupby('unique_id').tail(8)
Y_train_df = Y_df.drop(Y_test_df.index)
```
### Step 3: Hierarchical Forecasting Using TimeGPT
Now we'll generate base forecasts across all series using TimeGPT and then apply
hierarchical reconciliation to ensure the forecasts align across each level.
#### Generate Base Forecasts with TimeGPT
Obtain forecasts with TimeGPT for all series in your training data.
```python
timegpt_fcst = nixtla_client.forecast(
df=Y_train_df,
h=8,
freq='QS',
add_history=True
)
```
Next, separate the generated forecasts into in-sample (historical) and
out-of-sample (forecasted) periods:
```python
timegpt_fcst_insample = timegpt_fcst.query("ds < '2016-01-01'")
timegpt_fcst_outsample = timegpt_fcst.query("ds >= '2016-01-01'")
```
#### Visualize TimeGPT Forecasts
Quickly visualize the forecasts for different hierarchy levels. Here, we look at
the entire country, the state of Queensland, the Brisbane region, and holidays
in Brisbane:
```python
nixtla_client.plot(
Y_df,
timegpt_fcst_outsample,
max_insample_length=4 * 12,
unique_ids=[
'Australia',
'Australia/Queensland',
'Australia/Queensland/Brisbane',
'Australia/Queensland/Brisbane/Holiday'
]
)
```
![hier_plot1](/images/docs/hier1.png)
#### Apply Hierarchical Reconciliation
We use `MinTrace` methods to reconcile forecasts across all levels of the hierarchy.
```python
from hierarchicalforecast.methods import MinTrace
from hierarchicalforecast.core import HierarchicalReconciliation
reconcilers = [
MinTrace(method='ols'),
MinTrace(method='mint_shrink')
]
hrec = HierarchicalReconciliation(reconcilers=reconcilers)
Y_df_with_insample_fcsts = timegpt_fcst_insample.merge(Y_df.copy())
Y_rec_df = hrec.reconcile(
Y_hat_df=timegpt_fcst_outsample,
Y_df=Y_df_with_insample_fcsts,
S=S_df,
tags=tags
)
```
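
For intuition about what reconciliation does, the sketch below applies the textbook OLS reconciliation formula, `y_tilde = S (S'S)^-1 S' y_hat`, to a toy hierarchy with NumPy. This is an illustration under simplified assumptions, not the `HierarchicalForecast` code path: the summing matrix and the base forecasts are made up.

```python
import numpy as np

# Summing matrix for a toy hierarchy: total = A + B (rows: total, A, B).
S = np.array([[1.0, 1.0],
              [1.0, 0.0],
              [0.0, 1.0]])

# Hypothetical base forecasts that are not coherent: 100 != 40 + 50.
y_hat = np.array([100.0, 40.0, 50.0])

# OLS reconciliation: project the base forecasts onto the coherent subspace.
bottom = np.linalg.solve(S.T @ S, S.T @ y_hat)
y_tilde = S @ bottom
print(y_tilde)
```

After reconciliation, the first entry equals the sum of the other two, so the adjustment is spread across all levels instead of trusting only the top or only the bottom series.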
Now, let's plot the reconciled forecasts to ensure they make sense across the
full country → state → region → purpose hierarchy:
```python
nixtla_client.plot(
Y_df,
Y_rec_df,
max_insample_length=4 * 12,
unique_ids=[
'Australia',
'Australia/Queensland',
'Australia/Queensland/Brisbane',
'Australia/Queensland/Brisbane/Holiday'
]
)
```
![hier_plot2](/images/docs/hier2.png)
### Step 4: Evaluate Forecast Accuracy
Finally, evaluate your forecast performance using RMSE for different levels of
the hierarchy, from total (country) to bottom-level (region/purpose).
```python
from hierarchicalforecast.evaluation import evaluate
from utilsforecast.losses import rmse
eval_tags = {
'Total': tags['Country'],
'Purpose': tags['Country/Purpose'],
'State': tags['Country/State'],
'Regions': tags['Country/State/Region'],
'Bottom': tags['Country/State/Region/Purpose']
}
evaluation = evaluate(
df=Y_rec_df.merge(Y_test_df, on=['unique_id', 'ds']),
tags=eval_tags,
train_df=Y_train_df,
metrics=[rmse]
)
evaluation[evaluation.select_dtypes(np.number).columns] = evaluation.select_dtypes(np.number).map('{:.2f}'.format)
evaluation
```
| | level | metric | TimeGPT | TimeGPT/MinTrace_method-ols | TimeGPT/MinTrace_method-mint_shrink |
| --- | ------- | ------ | ------- | --------------------------- | ----------------------------------- |
| 0 | Total | rmse | 1433.07 | 1436.07 | 1627.43 |
| 1 | Purpose | rmse | 482.09 | 475.64 | 507.50 |
| 2 | State | rmse | 275.85 | 278.39 | 294.28 |
| 3 | Regions | rmse | 49.40 | 47.91 | 47.99 |
| 4 | Bottom | rmse | 19.32 | 19.11 | 18.86 |
| 5 | Overall | rmse | 38.66 | 38.21 | 39.16 |
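
To make the comparison easier to read, the sketch below copies the RMSE values from the table above and expresses each reconciled result as a percentage improvement over the base TimeGPT forecast. The short column names `ols` and `mint_shrink` are abbreviations chosen for this example.

```python
import pandas as pd

# RMSE values copied from the evaluation table above.
rmse_table = pd.DataFrame(
    {
        'TimeGPT': [1433.07, 482.09, 275.85, 49.40, 19.32, 38.66],
        'ols': [1436.07, 475.64, 278.39, 47.91, 19.11, 38.21],
        'mint_shrink': [1627.43, 507.50, 294.28, 47.99, 18.86, 39.16],
    },
    index=['Total', 'Purpose', 'State', 'Regions', 'Bottom', 'Overall'],
)

base = rmse_table['TimeGPT']
# Positive numbers mean the reconciled forecast has a lower RMSE than the base forecast.
improvement_pct = (
    rmse_table[['ols', 'mint_shrink']]
    .apply(lambda col: (base - col) / base * 100)
    .round(2)
)
print(improvement_pct)
```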
## Conclusion
Reconciling the forecasts with `MinTrace(ols)` slightly improved the overall RMSE
(38.21 versus 38.66 for the base forecasts), while `MinTrace(mint_shrink)` made it
slightly worse (39.16), indicating that the base forecasts were relatively strong already.
More importantly, the reconciled forecasts are now coherent: in addition to the (small)
accuracy improvement from `MinTrace(ols)`, forecasts at lower levels add up to the
forecasts at higher levels of the hierarchy.
## References
- [Hyndman, Rob J., and George Athanasopoulos (2021). Forecasting: Principles and Practice](https://otexts.com/fpp3/).
---
title: "Irregular Timestamps"
description: "Learn how to work with both regular and irregular timestamps in TimeGPT for accurate forecasting."
icon: "clock"
---
## Why Handle Irregular Timestamps?
When working with time series data, it is important to specify its frequency
correctly, as this can significantly impact forecasting results. For commonly
used frequencies, such as hourly, daily, or monthly, TimeGPT infers the
frequency automatically, so no additional input is required.
However, for irregular frequencies, where observations are not recorded at
consistent or regular intervals, such as the days the U.S. stock market is open,
it is necessary to specify the frequency directly.
In this tutorial, we will show you how to handle irregular and custom
frequencies in TimeGPT.
> NOTE: TimeGPT requires that your data does not contain missing values, as this is not
currently supported. In other words, the irregularity of the data should stem
from the nature of the recorded phenomenon, not from missing observations.
If your data contains missing values, please refer to our
[tutorial on missing dates](/data_requirements/missing_values).
## Tutorial
[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/Nixtla/nixtla/blob/main/nbs/docs/capabilities/forecast/11_irregular_timestamps.ipynb)
### Step 1: Import Packages
First, we import the required packages.
```python
import pandas as pd
import pandas_market_calendars as mcal
from nixtla import NixtlaClient
```
Initialize NixtlaClient with your API key:
```python
nixtla_client = NixtlaClient(
api_key='my_api_key_provided_by_nixtla'
)
```
### Step 2: Handling Regular Frequencies
As discussed in the introduction, for time series data with regular frequencies,
where observations are recorded at consistent intervals, TimeGPT can automatically
infer the frequency of your timestamps if the input data is a **pandas DataFrame**.
If you prefer not to rely on TimeGPT's automatic inference, you can set the
`freq` parameter to a valid
[pandas frequency string](https://pandas.pydata.org/docs/user_guide/timeseries.html#offset-aliases),
such as `MS` for month-start frequency or `min` for minutely frequency.
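
To check which frequency string fits your pandas data before passing `freq`, you can inspect the timestamps locally. The sketch below uses `pd.infer_freq` purely as an illustration on synthetic month-start dates; whether TimeGPT relies on the same function internally is not stated here.

```python
import pandas as pd

# Month-start timestamps like those in the Air Passengers dataset.
monthly = pd.date_range('1949-01-01', periods=24, freq='MS')
inferred = pd.infer_freq(monthly)
print(inferred)
```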
When working with **Polars DataFrames**, you must specify the frequency explicitly
by using a valid [polars offset](https://docs.pola.rs/api/python/stable/reference/expressions/api/polars.Expr.dt.offset_by.html),
such as `1d` for daily frequency or `1h` for hourly frequency.
Below is an example of how to specify the frequency for a Polars DataFrame.
```python
import polars as pl
url = 'https://raw.githubusercontent.com/Nixtla/transfer-learning-time-series/main/datasets/air_passengers.csv'
polars_df = pl.read_csv(url, try_parse_dates=True)
fcst_df = nixtla_client.forecast(
df=polars_df,
h=12,
freq='1mo',
time_col='timestamp',
target_col='value',
level=[80, 95]
)
```
Plot the forecast DataFrame:
```python
nixtla_client.plot(
polars_df,
fcst_df,
time_col='timestamp',
target_col='value',
level=[80, 95]
)
```
<Frame caption="Air Passengers Forecast">
![Air Passengers Forecast](https://raw.githubusercontent.com/Nixtla/nixtla/readme_docs/nbs/_docs/docs/capabilities/forecast/11_irregular_timestamps_files/figure-markdown_strict/cell-11-output-1.png)
</Frame>
### Step 3: Handling Irregular Frequencies
In this section, we will discuss cases where observations are not recorded at
consistent intervals.
#### Load data
We will use the daily stock prices of Palantir Technologies (PLTR) from 2020 to 2023.
The dataset includes data up to 2023-09-22, but for this tutorial, we will keep only
the data before 2023-08-28. This allows us to show how a custom frequency can
handle days when the stock market is closed, such as Labor Day in the U.S.
> IMPORTANT NOTE: We use TimeGPT to predict a stock price in this tutorial
only to demonstrate forecasting with irregular timestamps. **Stock prices are
[random walks](https://otexts.com/fpppy/nbs/09-arima.html#random-walk-model) and as
such cannot be predicted using traditional time series forecasting methods
(including TimeGPT)**. Predictions for a random walk are a flat-line forecast
in which tomorrow's price is predicted to equal today's price,
which is not a useful model.
```python
url = 'https://raw.githubusercontent.com/Nixtla/transfer-learning-time-series/main/datasets/openbb/pltr.csv'
pltr_df = pd.read_csv(url, parse_dates=['date'])
pltr_df = pltr_df.query('date < "2023-08-28"')
pltr_df.head()
```
| | date | Open | High | Low | Close | Adj Close | Volume | Dividends | Stock Splits |
|--:|-----------:|------:|------:|-----:|------:|----------:|----------:|----------:|-------------:|
| 0 | 2020-09-30 | 10.00 | 11.41 | 9.11 | 9.50 | 9.50 | 338584400 | 0.0 | 0.0 |
| 1 | 2020-10-01 | 9.69 | 10.10 | 9.23 | 9.46 | 9.46 | 124297600 | 0.0 | 0.0 |
| 2 | 2020-10-02 | 9.06 | 9.28 | 8.94 | 9.20 | 9.20 | 55018300 | 0.0 | 0.0 |
| 3 | 2020-10-05 | 9.43 | 9.49 | 8.92 | 9.03 | 9.03 | 36316900 | 0.0 | 0.0 |
| 4 | 2020-10-06 | 9.04 | 10.18 | 8.90 | 9.90 | 9.90 | 90864000 | 0.0 | 0.0 |
We will forecast the **adjusted closing price**, which represents the stock's
closing price adjusted for corporate actions such as stock splits, dividends,
and rights offerings. Hence, we will exclude the other columns from the dataset.
```python
pltr_df = pltr_df[['date', 'Adj Close']]
nixtla_client.plot(
    pltr_df,
    time_col="date",
    target_col="Adj Close"
)
```
<Frame caption="PLTR Adjusted Close Prices">
![PLTR Adjusted Close Prices](https://raw.githubusercontent.com/Nixtla/nixtla/readme_docs/nbs/_docs/docs/capabilities/forecast/11_irregular_timestamps_files/figure-markdown_strict/cell-13-output-1.png)
</Frame>
#### Define the Frequency
To define a custom frequency, we will first extract and sort the dates from the
input data, ensuring they are in the correct datetime format. Next, we will use
the [`pandas_market_calendars`](https://pypi.org/project/pandas-market-calendars/) package,
specifically its `get_calendar` function, to obtain the New York Stock Exchange
(NYSE) calendar. Using this calendar, we can create a custom frequency that
includes only the days the stock market is open.
```python
dates = pd.DatetimeIndex(sorted(pltr_df['date'].unique()))
nyse = mcal.get_calendar('NYSE')
```
Note that the days the stock market is open need to include all the dates in the
input data plus the forecast horizon. In this example, we will forecast 7 days
ahead, so we need to make sure our trading days include the last date in the
input data as well as the next 7 valid trading days.
To avoid dealing with holidays or weekends during the forecast horizon, we will
specify an end date well beyond the forecast horizon. For this example, we will
use January 1, 2024, as a safe cutoff.
```python
trading_days = nyse.valid_days(
    start_date=dates.min(),
    end_date="2024-01-01"
).tz_localize(None)
```
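The requirement that the calendar covers the whole forecast horizon can be checked with a few lines of pandas. This is a minimal sketch: `pd.bdate_range` is a stand-in for the NYSE calendar (it ignores holidays), and `last_input_date` is an assumed value for the last date kept in the input data.

```python
import pandas as pd

h = 7  # forecast horizon used in this tutorial
last_input_date = pd.Timestamp("2023-08-25")  # assumed last date in the input

# Stand-in for the exchange calendar: plain business days, holidays ignored.
candidate_days = pd.bdate_range("2023-08-01", "2024-01-01")

# Days available after the last observation must cover the horizon.
future_days = candidate_days[candidate_days > last_input_date]
has_enough_days = len(future_days) >= h
print(has_enough_days)  # True
```

With a real calendar, the same comparison can be run on the index returned by `valid_days`.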
Now, with the list of trading days, we can identify the days the stock market is
closed. These are all weekdays (Monday to Friday) within the range that are not
trading days. Using this information, we can define a custom frequency that skips
the stock market's closed days.
```python
all_weekdays = pd.date_range(
    start=dates.min(),
    end="2024-01-01",
    freq='B'
)
closed_days = all_weekdays.difference(trading_days)
custom_bday = pd.offsets.CustomBusinessDay(
    holidays=closed_days
)
```
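To see the effect of such an offset in isolation, here is a small pandas-only sketch. It assumes that Labor Day (2023-09-04) is the only closed weekday in the window, so it does not need the market calendar package.

```python
import pandas as pd

# Assumption: Labor Day 2023 is the only closed weekday in this short window.
closed_days = [pd.Timestamp("2023-09-04")]
custom_bday = pd.offsets.CustomBusinessDay(holidays=closed_days)

# Seven consecutive steps of the custom frequency, starting on a trading day.
future_dates = pd.date_range("2023-08-28", periods=7, freq=custom_bday)
future_labels = [d.strftime("%Y-%m-%d") for d in future_dates]
print(future_labels)
```

The generated dates skip both the weekend and the holiday, which is the behavior the forecast timestamps should follow.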
#### Forecast with TimeGPT
With the custom frequency defined, we can now use the `forecast` method,
passing `custom_bday` as the `freq` argument. This will make the
forecast respect the trading schedule of the stock market.
```python
fcst_pltr_df = nixtla_client.forecast(
    df=pltr_df,
    h=7,
    freq=custom_bday,
    time_col='date',
    target_col='Adj Close',
    level=[80, 95]
)
```
Finally, plot the forecast results:
```python
nixtla_client.plot(
    pltr_df,
    fcst_pltr_df,
    time_col="date",
    target_col="Adj Close",
    level=[80, 95],
    max_insample_length=180
)
```
<Frame caption="PLTR Forecast (Custom Frequency)">
![PLTR Forecast (Custom Frequency)](https://raw.githubusercontent.com/Nixtla/nixtla/readme_docs/nbs/_docs/docs/capabilities/forecast/11_irregular_timestamps_files/figure-markdown_strict/cell-18-output-1.png)
</Frame>
```python
fcst_pltr_df[['date']].head(7)
```
| | date |
|--:|------------|
| 0 | 2023-08-28 |
| 1 | 2023-08-29 |
| 2 | 2023-08-30 |
| 3 | 2023-08-31 |
| 4 | 2023-09-01 |
| 5 | 2023-09-05 |
| 6 | 2023-09-06 |
Note that the forecast excludes 2023-09-04, which was a Monday when the stock
market was closed for Labor Day in the United States.
## Conclusion
Below are the key takeaways of this tutorial:
- TimeGPT can reliably infer regular frequencies, but you can override this by
setting the `freq` parameter to the corresponding pandas alias.
- When working with polars data frames, you must always specify the frequency
using the correct polars offset.
- TimeGPT supports irregular frequencies and allows you to define a custom
frequency, generating forecasts exclusively for the specified dates.
---
title: "Temporal Hierarchical Forecasting with TimeGPT"
description: "Learn how to combine forecasts at different time frequencies to improve accuracy."
icon: "sitemap"
---
## What is Temporal Hierarchical Forecasting?
Temporal hierarchical forecasting is a technique that improves prediction accuracy by leveraging the structure of time series data across multiple temporal resolutions such as hourly, daily, weekly, and monthly.
Rather than modeling just one time scale, it generates forecasts at each level of the temporal hierarchy and then reconciles them to ensure consistency (e.g., the sum of hourly forecasts aligns with the daily total).
This approach captures both high-frequency variations and long-term trends, allowing for coherent forecasts across time scales.
It is particularly effective in domains like energy demand, retail sales, and transportation planning, where decisions depend on both granular and aggregated time-based insights.
## Tutorial
[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/Nixtla/nixtla/blob/main/nbs/docs/tutorials/23_temporalhierarchical.ipynb)
In this tutorial, we demonstrate how to use TimeGPT for temporal hierarchical forecasting. We will use a dataset with an hourly frequency and create forecasts with TimeGPT at both the hourly and the 2-hourly level. The 2-hourly level is the time series obtained by aggregating the hourly data over 2-hour windows. Subsequently, we use temporal reconciliation techniques to improve the forecasting performance of TimeGPT.
### Step 1: Import and Initialize
Let's import the `NixtlaClient` and initialize it with an API key.
```python
import numpy as np
import pandas as pd
from utilsforecast.evaluation import evaluate
from utilsforecast.plotting import plot_series
from utilsforecast.losses import mae, rmse
from nixtla import NixtlaClient
# Initialize NixtlaClient
nixtla_client = NixtlaClient(
    # api_key = 'my_api_key_provided_by_nixtla'
)
```
### Step 2: Load and Prepare Data
First, let's read and process the dataset.
```python
df = pd.read_csv(
    'https://raw.githubusercontent.com/Nixtla/transfer-learning-time-series/main/datasets/electricity-short-with-ex-vars.csv'
)
df['ds'] = pd.to_datetime(df['ds'])
df_sub = df.query('unique_id == "DE"')
```
Next, let's create the train-test splits:
```python
df_train = df_sub.query('ds < "2017-12-29"')
df_test = df_sub.query('ds >= "2017-12-29"')
df_train.shape, df_test.shape
```
```bash
((1632, 12), (48, 12))
```
Let's visualize the train and test splits to make sure that they are as expected.
```python
plot_series(
    df_train[['unique_id', 'ds', 'y']][-200:],
    forecasts_df=df_test[['unique_id', 'ds', 'y']].rename(columns={'y': 'test'})
)
```
<Frame>
![Training and Testing Data](https://raw.githubusercontent.com/Nixtla/nixtla/readme_docs/nbs/_docs/docs/tutorials/23_temporalhierarchical_files/figure-markdown_strict/cell-11-output-1.png)
</Frame>
### Step 3: Temporal Hierarchical Forecasting
#### Temporal Aggregation
We are interested in generating forecasts for the hourly and 2-hourly
windows. We can generate these forecasts using TimeGPT. After generating
these forecasts, we make use of hierarchical forecasting techniques to
improve the accuracy of each forecast.
We first define the temporal aggregation spec. The spec is a dictionary in
which each key is the name of an aggregation and each value is the number
of bottom-level timesteps that are summed in that aggregation.
In this example, we choose a temporal aggregation of a 2-hour period and a
1-hour period (the bottom level).
```python
spec_temporal = { "2-hour-period": 2, "1-hour-period": 1 }
```
We next compute the temporally aggregated train and test sets using the
`aggregate_temporal` function from `hierarchicalforecast`. Note that we have
different aggregation matrices `S` for the train and test sets, as the test
set contains temporal hierarchies that are not included in the train set.
```python
from hierarchicalforecast.utils import aggregate_temporal
Y_train, S_train, tags_train = aggregate_temporal(
    df=df_train[['unique_id', 'ds', 'y']], spec=spec_temporal
)
Y_test, S_test, tags_test = aggregate_temporal(
    df=df_test[['unique_id', 'ds', 'y']], spec=spec_temporal
)
```
`Y_train` contains our training data, for both 1-hour and 2-hour periods.
For example, if we look at the first two timestamps of the training data,
we have a 2-hour period ending at 2017-10-22 01:00, and two 1-hour periods,
the first ending at 2017-10-22 00:00, and the second at 2017-10-22 01:00,
the latter corresponding to when the first 2-hour period ends.
Also, the ground truth value `y` of the first 2-hour period is 38.13, which
is equal to the sum of the first two 1-hour periods (19.10 + 19.03). This
showcases how the higher frequency `1-hour-period` has been aggregated into
the `2-hour-period` frequency.
```python
Y_train.query("ds <= '2017-10-22 01:00:00'")
```
| | temporal_id | unique_id | ds | y |
| ----- | ----------------- | ----------- | --------------------- | ------- |
| 0 | 2-hour-period-1 | DE | 2017-10-22 01:00:00 | 38.13 |
| 816 | 1-hour-period-1 | DE | 2017-10-22 00:00:00 | 19.10 |
| 817 | 1-hour-period-2 | DE | 2017-10-22 01:00:00 | 19.03 |
The aggregation matrices `S_train` and `S_test` detail how the lowest temporal
granularity (hour) can be aggregated into the 2-hour periods. For example,
the first 2-hour period, named `2-hour-period-1`, can be constructed by
summing the first two hour-periods, `1-hour-period-1` and `1-hour-period-2`,
which we also verified above in our inspection of `Y_train`.
```python
S_train.iloc[:5, :5]
```
| | temporal_id | 1-hour-period-1 | 1-hour-period-2 | 1-hour-period-3 | 1-hour-period-4 |
| ----- | ----------------- | ----------------- | ----------------- | ----------------- | ----------------- |
| 0 | 2-hour-period-1 | 1.0 | 1.0 | 0.0 | 0.0 |
| 1 | 2-hour-period-2 | 0.0 | 0.0 | 1.0 | 1.0 |
| 2 | 2-hour-period-3 | 0.0 | 0.0 | 0.0 | 0.0 |
| 3 | 2-hour-period-4 | 0.0 | 0.0 | 0.0 | 0.0 |
| 4 | 2-hour-period-5 | 0.0 | 0.0 | 0.0 | 0.0 |
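The structure of these rows can be reproduced by hand with NumPy. The sketch below builds only the 2-hour rows for four hourly periods and is not how `aggregate_temporal` is implemented. The first two hourly values are taken from the `Y_train` table above, and the last two are made-up placeholders.

```python
import numpy as np

n_bottom = 4  # four hourly periods, enough for two 2-hour periods
k = 2         # bottom-level steps per aggregated period

# Each row sums k consecutive hourly periods.
S_two_hour = np.kron(np.eye(n_bottom // k), np.ones((1, k)))

# First two values come from the Y_train table; the last two are placeholders.
y_hourly = np.array([19.10, 19.03, 20.0, 21.0])
y_two_hour = S_two_hour @ y_hourly

print(S_two_hour)
print(y_two_hour)
```

The first aggregated value is 19.10 + 19.03 = 38.13, matching the `2-hour-period-1` row of `Y_train`.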
#### Computing Base Forecasts with TimeGPT
Now, we need to compute base forecasts for each temporal aggregation. The
following cell computes the **base forecasts** for each temporal aggregation
in `Y_train` using TimeGPT.
Note that both frequency and horizon are different for each temporal
aggregation. In this example, the lowest level has an hourly frequency and a
horizon of `48`. The `2-hour-period` aggregation thus has a 2-hourly
frequency with a horizon of `24`.
```python
Y_hats = []
id_cols = ["unique_id", "temporal_id", "ds", "y"]

# Create one set of base forecasts per temporal level
for level, temporal_ids_train in tags_train.items():
    # Filter the train and test data for this level
    Y_level_train = Y_train.query("temporal_id in @temporal_ids_train")
    temporal_ids_test = tags_test[level]
    Y_level_test = Y_test.query("temporal_id in @temporal_ids_test")
    # Each level has its own frequency and horizon. freq_level is computed
    # for inspection only; the forecast call below relies on automatic
    # frequency inference.
    freq_level = pd.infer_freq(Y_level_train["ds"].unique())
    horizon_level = Y_level_test["ds"].nunique()
    # Forecast this level with TimeGPT
    Y_hat_level = nixtla_client.forecast(
        df=Y_level_train[["ds", "unique_id", "y"]],
        h=horizon_level
    )
    # Attach the test values and put the id columns first for readability
    Y_hat_level = Y_hat_level.merge(Y_level_test, on=["ds", "unique_id"], how="left")
    Y_hat_cols = id_cols + [col for col in Y_hat_level.columns if col not in id_cols]
    Y_hat_level = Y_hat_level[Y_hat_cols]
    Y_hats.append(Y_hat_level)

Y_hat = pd.concat(Y_hats, ignore_index=True)
```
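The loop above reads each horizon from the test set. As a cross-check, the same numbers follow directly from the aggregation spec, as in this small sketch. It assumes the 48 hourly test rows reported earlier, and `n_test_hours` is a name introduced only for this example.

```python
spec_temporal = {"2-hour-period": 2, "1-hour-period": 1}
n_test_hours = 48  # rows in the hourly test set

# Each aggregated level has one forecast step per block of bottom-level steps.
horizons = {name: n_test_hours // steps for name, steps in spec_temporal.items()}
print(horizons)  # {'2-hour-period': 24, '1-hour-period': 48}
```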
Observe that `Y_hat` contains all the forecasts but they are not coherent
with each other. For example, consider the forecasts for the first time
period of both frequencies.
| | unique_id | temporal_id | ds | y | TimeGPT |
|---:|------------:|----------------:|--------------------:|--------:|-----------|
| 0 | DE | 2-hour-period-1 | 2017-12-29 01:00:00 | 10.45 | 16.949455 |
| 24 | DE | 1-hour-period-1 | 2017-12-29 00:00:00 | 9.73 | -0.241482 |
| 25 | DE | 1-hour-period-2 | 2017-12-29 01:00:00 | 0.72 | -3.456478 |
The ground truth value `y` for the first 2-hour period is 10.45, and the sum
of the ground truth values for the first two 1-hour periods is (9.73 + 0.72)
= 10.45. Hence, these values are coherent with each other.
However, the forecast for the first 2-hour period is 16.95, but the sum of
the forecasts for the first two 1-hour periods is -3.70. Hence, these
forecasts are clearly not coherent with each other.
We will use reconciliation techniques to make these forecasts
coherent with each other and improve their accuracy.
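The size of the mismatch can be computed directly from the base forecasts in the table above. This short sketch only repeats that arithmetic.

```python
# Base forecasts copied from the table above.
two_hour_forecast = 16.949455
one_hour_forecasts = [-0.241482, -3.456478]

# A coherent set of forecasts would make this gap zero.
one_hour_sum = sum(one_hour_forecasts)
gap = two_hour_forecast - one_hour_sum

print(round(one_hour_sum, 2))  # -3.7
print(round(gap, 2))           # 20.65
```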
#### Forecast Reconciliation
We can use the `HierarchicalReconciliation` class to reconcile the forecasts.
In this example we use `MinTrace`. Note that we have to set `temporal=True`
in the `reconcile` function.
```python
from hierarchicalforecast.methods import MinTrace
from hierarchicalforecast.core import HierarchicalReconciliation
reconcilers = [MinTrace(method="wls_struct")]
hrec = HierarchicalReconciliation(reconcilers=reconcilers)
Y_rec = hrec.reconcile(Y_hat_df=Y_hat, S=S_test, tags=tags_test, temporal=True)
```
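To give some intuition for what the reconciler does, here is a NumPy sketch of the structural weighted least squares projection applied to a single 2-hour block. It is a hand-written illustration under the assumption that each node is weighted by the number of hourly periods it contains. It is not the library's implementation, and the variable names are chosen for this example.

```python
import numpy as np

# Rows: one 2-hour period, then its two 1-hour periods.
# Columns: the two 1-hour (bottom-level) periods.
S = np.array([[1.0, 1.0],
              [1.0, 0.0],
              [0.0, 1.0]])

# Base forecasts for the first block, in the same row order as S.
y_hat = np.array([16.949455, -0.241482, -3.456478])

# Structural weights: each node is weighted by how many hourly periods it spans.
W_inv = np.diag(1.0 / S.sum(axis=1))

# Projection P = (S' W^-1 S)^-1 S' W^-1, then map back to all levels with S.
P = np.linalg.solve(S.T @ W_inv @ S, S.T @ W_inv)
y_rec = S @ P @ y_hat
print(y_rec.round(6))
```

For this block the result is roughly 6.63 for the 2-hour period and 4.92 and 1.71 for the two hourly periods, so the first entry equals the sum of the other two.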
### Step 4: Evaluation
The `HierarchicalForecast` package includes the `evaluate` function to
evaluate the different hierarchies.
We evaluate the temporally aggregated forecasts across **all temporal aggregations**.
```python
import hierarchicalforecast.evaluation as hfe
evaluation = hfe.evaluate(
    df=Y_rec.drop(columns='unique_id'),
    tags=tags_test,
    metrics=[mae],
    id_col='temporal_id'
)
numeric_cols = evaluation.select_dtypes('number').columns
evaluation[numeric_cols] = evaluation[numeric_cols].map('{:.3}'.format).astype(float)
evaluation
```
| | level | metric | TimeGPT | TimeGPT/MinTrace_method-wls_struct |
|--:|--------------:|-------:|--------:|-----------------------------------:|
| 0 | 2-hour-period | mae | 25.2 | 12.00 |
| 1 | 1-hour-period | mae | 18.5 | 6.16 |
| 2 | Overall | mae | 20.8 | 8.12 |
As we can see, we improved performance of TimeGPT's predictions both for the
2-hour period and for the 1-hour period, as both levels see a significant
reduction in MAE.
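The relative size of the improvement can be worked out from the table. The sketch below copies the rounded MAE values shown above, so the percentages are approximate.

```python
# MAE values copied from the evaluation table above.
mae_base = {"2-hour-period": 25.2, "1-hour-period": 18.5, "Overall": 20.8}
mae_reconciled = {"2-hour-period": 12.00, "1-hour-period": 6.16, "Overall": 8.12}

# Percentage reduction in MAE after reconciliation, per level.
reduction_pct = {
    level: round(100 * (mae_base[level] - mae_reconciled[level]) / mae_base[level], 1)
    for level in mae_base
}
print(reduction_pct)
# {'2-hour-period': 52.4, '1-hour-period': 66.7, 'Overall': 61.0}
```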
Visually, we can also verify the forecast is better after using reconciliation
techniques.
For the 1-hour-period forecasts:
```python
plot_series(
    Y_train.query(
        "temporal_id in @tags_train['1-hour-period']"
    )[["y", "ds", "unique_id"]].iloc[-100:],
    forecasts_df=Y_rec.query("temporal_id in @tags_test['1-hour-period']").drop(columns=["temporal_id"])
)
```
![hier_plot-1hour](/images/docs/1-hour-fcst.png)
and for the 2-hour period forecasts:
```python
plot_series(
    Y_train.query(
        "temporal_id in @tags_train['2-hour-period']"
    )[["y", "ds", "unique_id"]].iloc[-50:],
    forecasts_df=Y_rec.query("temporal_id in @tags_test['2-hour-period']").drop(columns=["temporal_id"])
)
```
![hier_plot-2hour](/images/docs/2-hour-fcst.png)
Also, we can now verify that the forecasts are coherent with each other.
For the first 2-hour period, our forecast after reconciliation is 6.63, and
the sum of the forecasts for the first two 1-hour periods is 4.92 + 1.71 =
6.63. Hence, we now have more accurate and coherent forecasts across frequencies.
```python
Y_rec.query(
    "temporal_id in ['2-hour-period-1', '1-hour-period-1', '1-hour-period-2']"
)
```
| | unique_id | temporal_id | ds | y | TimeGPT | TimeGPT/MinTrace_method-wls_struct |
|---:|----------:|----------------:|--------------------:|--------:|-----------:|-----------------------------------:|
| 0 | DE | 2-hour-period-1 | 2017-12-29 01:00:00 | 10.45 | 16.949455 | 6.625748 |
| 24 | DE | 1-hour-period-1 | 2017-12-29 00:00:00 | 9.73 | -0.241482 | 4.920372 |
| 25 | DE | 1-hour-period-2 | 2017-12-29 01:00:00 | 0.72 | -3.456478 | 1.705376 |
## Conclusion
In this tutorial we have shown:
- How to create forecasts for multiple frequencies for the same dataset with TimeGPT
- How to improve the accuracy of these forecasts using temporal reconciliation techniques
Note that even though we created forecasts for two different frequencies, you
are not required to use the 2-hour-period forecasts. The technique can also be
used simply to improve the 1-hour-period forecasts.
---
title: "Quickstart Guide"
description: "Learn how to use TimeGPT for accurate time series forecasting"
icon: "chart-line"
---
## TimeGPT - Foundation Model for Time Series Forecasting
TimeGPT is a production-ready generative pretrained transformer for time series forecasting and predictions. It delivers accurate forecasts for retail sales, electricity demand, financial markets, and IoT sensor data with just a few lines of Python code. This quickstart guide will have you making your first forecast in under 5 minutes!
## Set Up TimeGPT for Python Time Series Forecasting
### Step 1: Get an API Key
- Visit [dashboard.nixtla.io](https://dashboard.nixtla.io) to activate your free trial and create an account.
- Sign in using Google, GitHub, or your email.
- Navigate to **API Keys** in the menu and select **Create New API Key**.
- Your new API key will appear on the screen. Copy this key using the button on the right.
<Frame caption="TimeGPT dashboard showing API key management interface for Python forecasting">
![Dashboard for TimeGPT API keys](https://github.com/Nixtla/nixtla/blob/main/nbs/img/dashboard.png?raw=true)
</Frame>
### Step 2: Install Nixtla
Install the Nixtla library in your preferred Python environment:
```bash
pip install nixtla
```
### Step 3: Import the Nixtla TimeGPT client
Import the Nixtla client and instantiate it with your API key:
```python
from nixtla import NixtlaClient
nixtla_client = NixtlaClient(
    api_key='my_api_key_provided_by_nixtla'
)
```
### Step 4: Validate your API key
Verify the status and validity of your API key:
```python
nixtla_client.validate_api_key()
```
```bash
True
```
<Info>
For enhanced security practices, see our guide on
[Setting Up your API Key](/setup/setting_up_your_api_key).
</Info>
## Make Your First Time Series Forecast
We'll demonstrate TimeGPT's forecasting capabilities using the classic `AirPassengers` dataset, a monthly time series showing international airline passengers from 1949 to 1960.
```python
import pandas as pd
df = pd.read_csv('https://raw.githubusercontent.com/Nixtla/transfer-learning-time-series/main/datasets/air_passengers.csv')
df.head()
```
| | timestamp | value |
| --- | ---------- | ----- |
| 0 | 1949-01-01 | 112 |
| 1 | 1949-02-01 | 118 |
| 2 | 1949-03-01 | 132 |
| 3 | 1949-04-01 | 129 |
| 4 | 1949-05-01 | 121 |
<Info>
If you are using your own data, here are the data requirements:
- The target variable must not contain missing or non-numeric values.
- The timestamp column must not contain missing values.
- Date stamps must form a continuous sequence without gaps for the selected frequency.
- pandas must be able to parse the timestamp column as datetime objects (see the [pandas documentation](https://pandas.pydata.org/docs/reference/api/pandas.to_datetime.html)).
For more details, visit [Data Requirements](/data_requirements/data_requirements).
</Info>
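The requirements above can be checked before calling the API. The function below is a minimal sketch for a single series. `check_series` is a hypothetical helper written for this example and is not part of the `nixtla` package. The sample rows are the first five AirPassengers observations shown earlier.

```python
import pandas as pd

def check_series(df, time_col, target_col, freq):
    """Return a list of problems found; an empty list means the checks passed."""
    problems = []
    # Timestamps must be present and parseable as datetimes.
    ts = pd.to_datetime(df[time_col], errors="coerce")
    if ts.isna().any():
        problems.append("timestamp column has missing or unparseable values")
    # The target must be present and numeric.
    y = pd.to_numeric(df[target_col], errors="coerce")
    if y.isna().any():
        problems.append("target column has missing or non-numeric values")
    # Timestamps must cover every step of the chosen frequency.
    if not ts.isna().any():
        expected = pd.date_range(ts.min(), ts.max(), freq=freq)
        if len(expected.difference(pd.DatetimeIndex(ts))) > 0:
            problems.append("timestamps have gaps for the selected frequency")
    return problems

sample = pd.DataFrame({
    "timestamp": ["1949-01-01", "1949-02-01", "1949-03-01", "1949-04-01", "1949-05-01"],
    "value": [112, 118, 132, 129, 121],
})
complete_result = check_series(sample, "timestamp", "value", "MS")
gap_result = check_series(sample.drop(index=2), "timestamp", "value", "MS")
print(complete_result)  # []
print(gap_result)       # ['timestamps have gaps for the selected frequency']
```

For a DataFrame with several series, the same checks would need to be applied per series, for example with a `groupby` on the id column.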
Plot the dataset:
```python
nixtla_client.plot(df, time_col='timestamp', target_col='value')
```
<Frame caption="Historical AirPassengers time series data visualization showing monthly passenger trends from 1949 to 1960">
![Time Series Plot](https://raw.githubusercontent.com/Nixtla/nixtla/readme_docs/nbs/_docs/docs/getting-started/2_quickstart_files/figure-markdown_strict/cell-13-output-1.png)
</Frame>
The `plot` method automatically displays figures in notebook environments. To save a plot locally:
```python
fig = nixtla_client.plot(df, time_col='timestamp', target_col='value')
fig.savefig('plot.png', bbox_inches='tight')
```
## Real-World Forecasting Applications
TimeGPT excels at:
- **Retail forecasting**: Predict product demand and inventory needs
- **Energy forecasting**: Forecast electricity consumption and renewable energy production
- **Financial forecasting**: Project revenue, sales, and market trends
- **IoT predictions**: Anticipate sensor readings and equipment metrics
## Short and Long-Term Forecasting Examples
### Generate a longer-term forecast
Forecast the next 12 months using the SDK's `forecast` method:
```python
timegpt_fcst_df = nixtla_client.forecast(
    df=df,
    h=12,
    freq='MS',
    time_col='timestamp',
    target_col='value'
)
timegpt_fcst_df.head()
```
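To make the roles of `h` and `freq` concrete, the pandas-only sketch below builds the timestamps that a 12-step month-start horizon covers. It assumes the last observation is 1960-12-01 and does not call the API.

```python
import pandas as pd

last_timestamp = pd.Timestamp("1960-12-01")  # assumed last month of the dataset
h = 12

# The horizon starts one frequency step after the last observation.
future_index = pd.date_range(
    start=last_timestamp + pd.tseries.frequencies.to_offset("MS"),
    periods=h,
    freq="MS",
)
print(future_index[0].date(), future_index[-1].date())  # 1961-01-01 1961-12-01
```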
Plot the forecast:
```python
nixtla_client.plot(df, timegpt_fcst_df, time_col='timestamp', target_col='value')
```
<Frame caption="TimeGPT 12-month forecast visualization with confidence intervals for AirPassengers dataset">
![Forecasted Plot](https://raw.githubusercontent.com/Nixtla/nixtla/readme_docs/nbs/_docs/docs/getting-started/2_quickstart_files/figure-markdown_strict/cell-15-output-1.png)
</Frame>
You may also generate forecasts for longer horizons with the `timegpt-1-long-horizon` model. For example, 36 months ahead:
```python
timegpt_fcst_df = nixtla_client.forecast(
    df=df,
    h=36,
    freq='MS',
    time_col='timestamp',
    target_col='value',
    model='timegpt-1-long-horizon'
)
timegpt_fcst_df.head()
```
Plot the forecast:
```python
nixtla_client.plot(df, timegpt_fcst_df, time_col='timestamp', target_col='value')
```
<Frame caption="TimeGPT long-horizon model 36-month forecast with extended prediction intervals">
![Longer Forecast Plot](https://raw.githubusercontent.com/Nixtla/nixtla/readme_docs/nbs/_docs/docs/getting-started/2_quickstart_files/figure-markdown_strict/cell-17-output-1.png)
</Frame>
### Generate a shorter-term forecast
Forecast the next 6 months with a single command:
```python
timegpt_fcst_df = nixtla_client.forecast(
    df=df,
    h=6,
    freq='MS',
    time_col='timestamp',
    target_col='value'
)
```
Plot the forecast:
```python
nixtla_client.plot(df, timegpt_fcst_df, time_col='timestamp', target_col='value')
```
<Frame caption="TimeGPT 6-month short-term forecast for AirPassengers with prediction confidence bands">
![Shorter Forecast Plot](https://raw.githubusercontent.com/Nixtla/nixtla/readme_docs/nbs/_docs/docs/getting-started/2_quickstart_files/figure-markdown_strict/cell-18-output-2.png)
</Frame>
## Frequently Asked Questions
### How accurate is TimeGPT for forecasting?
TimeGPT achieves state-of-the-art accuracy across multiple domains including retail, finance, and electricity forecasting with zero-shot learning.
### Can I use TimeGPT with my own time series data?
Yes, TimeGPT works with any time series data in pandas DataFrame format with a timestamp and target value column.
### How long does it take to generate forecasts?
TimeGPT typically generates forecasts in seconds, making it suitable for production environments.
## Next Steps
Now that you've made your first forecast, explore these tutorials to unlock TimeGPT's full capabilities:
- [Improve Accuracy](/forecasting/improve_accuracy) - Advanced techniques to enhance forecast accuracy
- [Fine-Tuning](/forecasting/fine-tuning/steps) - Customize TimeGPT for your specific data
- [Exogenous Variables](/forecasting/exogenous-variables/numeric_features) - Include external variables in forecasts
- [Uncertainty Quantification](/forecasting/probabilistic/introduction) - Generate prediction intervals and quantile forecasts
- [Cross-Validation](/forecasting/evaluation/cross_validation) - Assess forecast performance
- [Forecasting at Scale](/forecasting/forecasting-at-scale/computing_at_scale) - Process thousands of time series
- [Anomaly Detection](/anomaly_detection/historical_anomaly_detection) - Identify outliers in your data