## Introduction
This repository contains the documentation for Nixtla TimeGPT. It is the third iteration of the docs:
- The first version was hosted on readme.com and is still visible there for reference at https://timegpt.readme.io/. The [migration to Mintlify](#migration-from-readmeio-to-mintlify) section below describes the migration in detail.
- The second version has the same content but lives in the same repo as the website, which isn't ideal for collaborators. It is currently visible at https://nixtla.io/docs/ and hosted on Mintlify.
- This iteration will also be hosted on Mintlify.
## Contributing
### Option 1: Using Mintlify Online Editor
The easiest and safest way to make changes to the documentation is the [online Mintlify editor](https://dashboard.mintlify.com/nixtla/nixtla-docs/editor/main).
The Mintlify editor opens a PR on GitHub so the team can review and merge the changes, the same process as any other contribution to the repo.
### Option 2: Using GitHub Codespaces
If you prefer to work locally, this repository is configured with GitHub Codespaces to easily view and develop documentation with Mintlify.
1. Click the "Code" button at the top of this repository
2. Select the "Codespaces" tab
3. Click "Create codespace on [branch]"
4. Once the Codespace is ready, open a terminal and run:
```bash
mintlify dev
```
5. Click on the "Ports" tab and open port 3000 in your browser to view the documentation
For more information about the Codespace setup, see the `.devcontainer` directory.
Be aware that linters and formatters fail most of the time because the docs follow a very specific structure that default linters don't support. Please turn them off when working locally to prevent noisy changes in your PRs, such as added or removed whitespace in code blocks.
## Release process
When a PR is merged to the main branch, the docs are deployed to production automatically.
## Pending improvements and wishlist
### Technical
- Add support for .ipynb files, so that we can run the notebooks and output the results in the docs.
- Implement Nixtla branding on the docs.
- Organize the docs in subfolders.
### Content
- Rethink how we welcome non-technical users into the docs. Currently the intro starts with "Welcome to TimeGPT - The foundational model for time series forecasting and anomaly detection", which can be overwhelming for a non-technical user. It would be nice to add content for people who are just starting out with time series forecasting and anomaly detection.
- ????
## Migration from Readme.io to Mintlify
#### HTML to Markdown Link Conversion
We've standardized the documentation by converting all HTML links (`<a href="URL">text</a>`) to Markdown format (`[text](URL)`). This makes the documentation more consistent and easier to maintain.
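As a rough illustration of the transformation (the real script is Node.js; this hypothetical Python sketch assumes plain anchors with a single `href` attribute and no nested tags):

```python
import re

# Hypothetical sketch of the link rewrite, not the actual script:
# assumes plain anchors with a single href attribute and no nested tags.
html = 'See <a href="https://docs.nixtla.io">the docs</a> for details.'
markdown = re.sub(r'<a href="([^"]+)">([^<]+)</a>', r'[\2](\1)', html)
print(markdown)  # See [the docs](https://docs.nixtla.io) for details.
```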
#### Conversion Script
The conversion was done using a Node.js script located at `/docs/utils/convert-links.js`. To run the script:
1. Install dependencies (if not already installed):
```bash
npm install glob
```
2. Change to the docs directory:
```bash
cd docs
```
3. Run the script:
```bash
node utils/convert-links.js
```
For testing a single file without making changes:
```bash
node utils/convert-links.js --test path/to/file.mdx
```
For a dry run (simulate changes without writing to files):
```bash
node utils/convert-links.js --dry-run
```
### HTML to Markdown General Conversion
We've enhanced our documentation by converting various HTML elements to their Markdown equivalents for improved readability and maintainability. This comprehensive conversion captures elements beyond just links.
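For a feel of what this conversion does, here is a rough Python analogue using the `markdownify` package (the real script is custom Node.js; this is only an illustration):

```python
from markdownify import markdownify as md

# Illustration only: basic HTML formatting converted to Markdown.
html = "<p>TimeGPT supports <b>forecasting</b> and <em>anomaly detection</em>.</p>"
print(md(html))  # TimeGPT supports **forecasting** and *anomaly detection*.
```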
#### Conversion Script
The conversion was performed using a more advanced Node.js script at `/docs/utils/html-to-markdown.js`. This script:
1. Converts HTML formatting to Markdown equivalents
2. Preserves component structure and attributes
3. Makes the content more consistent with Markdown best practices
To run the script:
1. Change to the docs directory: `cd docs`
2. Run the script:
```bash
node utils/html-to-markdown.js
```
For testing a single file without making changes:
```bash
node utils/html-to-markdown.js --test path/to/file.mdx
```
For a dry run (simulate changes without writing to files):
```bash
node utils/html-to-markdown.js --dry-run
```
### Frame Component Image Standardization
We've standardized how images are used inside Frame components by ensuring all images use Markdown syntax (`![alt](url)`) rather than HTML `img` tags or the deprecated `src` attribute.
#### Frame Image Fix Script
The standardization was done using a Node.js script located at `/docs/utils/fix-frame-images.js`. This script fixes two issues:
1. Converts `<Frame src="URL" alt="ALT"></Frame>` to `<Frame>![ALT](URL)</Frame>`
2. Converts `<Frame><img src="URL" alt="ALT" /></Frame>` to `<Frame>![ALT](URL)</Frame>`
To run the script:
1. Change to the docs directory:
```bash
cd docs
```
2. Run the script:
```bash
node utils/fix-frame-images.js
```
For testing a single file without making changes:
```bash
node utils/fix-frame-images.js --test path/to/file.mdx
```
For a dry run (simulate changes without writing to files):
```bash
node utils/fix-frame-images.js --dry-run
```
### Card Title Standardization
We've standardized how titles are defined in Card components by ensuring they use the `title` prop rather than bold text (`**Text**`) inside the card body.
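As a hypothetical sketch (not the actual script logic), the fix amounts to hoisting the leading bold text into the `title` prop:

```python
import re

# Hypothetical sketch, not the real check-card-titles.js: assumes the
# bold title is the first thing inside the Card body.
card = "<Card>\n**Time Series**\nA sequence of data points.\n</Card>"
fixed = re.sub(r"<Card>\s*\*\*(.+?)\*\*", r'<Card title="\1">', card)
print(fixed)  # <Card title="Time Series"> ...body... </Card>
```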
#### Card Title Check Script
To identify and fix cards with incorrectly defined titles, use the script at `/docs/utils/check-card-titles.js`. This script:
1. Identifies Card components where the title is defined using bold text instead of the `title` prop
2. Reports which files have this issue and shows what needs to be fixed
3. Can automatically fix the issues with the `--fix` flag
To run the script:
1. Change to the docs directory:
```bash
cd docs
```
2. Check for issues without making changes:
```bash
node utils/check-card-titles.js
```
3. Automatically fix issues:
```bash
node utils/check-card-titles.js --fix
```
For testing a single file:
```bash
node utils/check-card-titles.js --test path/to/file.mdx
```
For a dry run (simulate fixes without making changes):
```bash
node utils/check-card-titles.js --fix --dry-run
```
### Table Conversion to Markdown
We've standardized how tables are presented in documentation by converting all `<Table>` components to native Markdown table syntax. This makes the documentation more consistent and easier to maintain.
#### Table Conversion Script
To convert tables from HTML to Markdown format, use the script at `/docs/utils/convert-tables.js`. This script:
1. Identifies `<Table>` components in MDX files
2. Converts them to Markdown table syntax (`| Header | Header |` format)
3. Preserves all table data while simplifying the format
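As an illustration of the resulting format (using the `markdownify` package as a stand-in for the custom script):

```python
from markdownify import markdownify as md

# Illustration only: a simple HTML table rendered as Markdown pipe syntax.
html = "<table><tr><th>Model</th><th>MAE</th></tr><tr><td>TimeGPT</td><td>0.12</td></tr></table>"
print(md(html))
```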
To run the script:
1. Change to the docs directory:
```bash
cd docs
```
2. Check for tables without making changes:
```bash
node utils/convert-tables.js
```
3. Automatically convert all tables:
```bash
node utils/convert-tables.js --fix
```
For testing a single file:
```bash
node utils/convert-tables.js --test path/to/file.mdx
```
For a dry run (simulate conversions without making changes):
```bash
node utils/convert-tables.js --fix --dry-run
```
### Publishing Changes
Install our GitHub App to automatically propagate changes from your repo to your deployment. Changes will be deployed to production after pushing to the default branch. Find the link to install on your dashboard.
#### Troubleshooting
- `mintlify dev` isn't running: run `mintlify install` to re-install dependencies.
- Page loads as a 404: make sure you are running in a folder that contains `docs.json`.
---
title: "Key Concepts"
description: "Understanding the foundations of time series forecasting with TimeGPT"
icon: "lightbulb"
---
<Info>
These key concepts cover the foundations of time series data, how forecasts
are generated, and the role of TimeGPT in predicting future values and
detecting anomalies.
</Info>
<Check>
Use these concepts as a reference to better understand how TimeGPT simplifies
tasks such as demand forecasting, anomaly detection, and multi-series
forecasting.
</Check>
<CardGroup>
<Card title="Time Series">
A sequence of numerical data points arranged in chronological order.
</Card>
<Card title="Forecasting">
Predicting future values by analyzing historical data and patterns.
</Card>
<Card title="Anomaly Detection">
Identifying unusual or unexpected events that deviate from typical behavior.
</Card>
<Card title="Multiple Series">
Managing and forecasting multiple time series data at once.
</Card>
<Card title="TimeGPT">
Nixtla's generative pre-trained model for time series forecasting.
</Card>
<Card title="Inputs (Tokens)">
Segments of historical data that inform TimeGPT's forecasting process.
</Card>
</CardGroup>
<AccordionGroup>
<Accordion title="Time Series">
## Time Series
A time series is a sequence of numerical data points arranged in chronological order. In the context of TimeGPT, each data point in the series serves as input to the model. The model learns from patterns in the data and uses this understanding to forecast future values. Time series data appear in various domains, such as stock prices, weather recordings, and sales figures.
</Accordion>
<Accordion title="Forecasting">
## Forecasting
Forecasting is a method used in many fields—such as business and environmental studies—to predict future outcomes based on historical information. It involves analyzing past data to detect patterns, trends, or recurring behaviors and extending these insights into the future.
<Info>
One significant advancement in forecasting is the application of modern
machine-learning methods, including deep learning. Models like TimeGPT can
handle large datasets and identify complex patterns with enhanced prediction
accuracy.
</Info>
For example, a retailer might analyze past sales to forecast product demand, while an economist uses historical data to anticipate future economic conditions. TimeGPT makes these advanced capabilities accessible even to users without in-depth machine-learning expertise.
<Frame caption="TimeGPT output">
![TimeGPT output](https://files.readme.io/f0402e5-image.png)
</Frame>
</Accordion>
<Accordion title="Anomaly Detection">
## Anomaly Detection
Analyzing sequential data often requires identifying anomalies or unexpected events that deviate from standard patterns. TimeGPT supports anomaly detection by monitoring data sequences (such as daily temperatures) for unusual fluctuations.
<Warning>
Detecting anomalies is crucial for timely responses. Sudden changes in market
behavior, unusual network activity, or abnormal sensor readings can all
indicate a need for prompt investigation.
</Warning>
For example, in finance, TimeGPT can highlight abrupt market changes; in cybersecurity, it helps uncover suspicious network activity. Anomaly detection enhances forecasting by flagging significant outliers, improving overall data insights.
<Frame caption="Anomaly detection">
![Anomaly detection](https://files.readme.io/9655290-slice4.png)
</Frame>
</Accordion>
<Accordion title="Multiple Series">
## Multiple Series
TimeGPT provides robust support for multi-series forecasting, allowing simultaneous analysis of multiple time series. Users can train the model on many related series, improving accuracy and enabling more flexible customization for specific forecasting requirements.
<Frame caption="Multiple series forecasting">
![Multiple series forecasting](https://files.readme.io/8b9b818-slice8.png)
</Frame>
</Accordion>
<Accordion title="TimeGPT">
## TimeGPT
TimeGPT by Nixtla is a generative pre-trained model specifically designed for time series forecasting. It reviews historical series values (and optional exogenous variables) to generate predictions. Beyond forecasting, TimeGPT enables tasks like anomaly detection and financial forecasts.
<Info>
TimeGPT scans time series data similarly to how a person might read text:
sequentially, from left to right. It can interpret historical windows (tokens)
and leverage temporal patterns learned from billions of data points.
</Info>
With the TimeGPT API, you can access these forecasting capabilities for various potential use cases—from scenario planning to anomaly detection and beyond.
<Frame caption="TimeGPT API">
![TimeGPT API](https://files.readme.io/6f59c1b-Screenshot_2023-08-09_at_2.49.05_PM.png)
</Frame>
</Accordion>
</AccordionGroup>
## Get Started with TimeGPT
Now that you understand the key concepts, you're ready to start using TimeGPT for your forecasting needs.
<CardGroup>
<Card
title="Introduction"
icon="book-open"
href="/introduction/introduction"
>
Learn more about TimeGPT and how it can transform your time series analysis.
</Card>
<Card
title="Quickstart"
icon="rocket"
href="/forecasting/timegpt_quickstart"
>
Get up and running with TimeGPT in minutes with our step-by-step guide.
</Card>
</CardGroup>
---
title: "Privacy Notice"
description: "Details on how Nixtla collects, uses, and protects your personal information."
icon: "lock"
---
We at Nixtla Inc. (together with our affiliates, “**Nixtla**”, “**we**”, “**our**” or “**us**”) respect your privacy and are strongly committed to keeping secure any information we obtain from you or about you. This Privacy Policy describes our practices with respect to Personal Information we collect from or about you when you use our website, applications, and services (collectively, “**Services**”). This Privacy Policy does not apply to content that we process on behalf of customers of our business offerings, such as our API. Our use of that data is governed by our customer agreements covering access to and use of those offerings.
# 1. Personal Information we collect
We collect personal information relating to you (“**Personal Information**”) as follows:
Personal Information You Provide: We collect Personal Information if you create an account to use our Services or communicate with us as follows:
**Account Information**: When you create an account with us, we will collect information associated with your account, including your name, contact information, account credentials, payment card information, and transaction history (collectively, “**Account Information**”).
**User Content**: When you use our Services, we collect Personal Information that is included in the input, file uploads, or feedback that you provide to our Services (“**Content**”).
**Communication Information**: If you communicate with us, we collect your name, contact information, and the contents of any messages you send (“**Communication Information**”).
**Social Media Information**: We have pages on social media sites like Medium, Twitter, YouTube, and LinkedIn. When you interact with our social media pages, we will collect Personal Information that you elect to provide to us, such as your contact details (collectively, “**Social Information**”). In addition, the companies that host our social media pages may provide us with aggregate information and analytics about our social media activity.
**Personal Information We Receive Automatically From Your Use of the Services**: When you visit, use, or interact with the Services, we receive the following information about your visit, use, or interactions (“**Technical Information**”):
**Log Data**: Information that your browser automatically sends when you use our Services. Log data includes your Internet Protocol address, browser type and settings, the date and time of your request, and how you interact with our website.
**Usage Data**: We may automatically collect information about your use of the Services, such as the types of content that you view or engage with, the features you use, and the actions you take, as well as your time zone, country, the dates and times of access, user agent and version, type of computer or mobile device, and your computer connection.
**Device Information**: Includes name of the device, operating system, device identifiers, and browser you are using. Information collected may depend on the type of device you use and its settings.
**Cookies**: We use cookies to operate and administer our Services, and improve your experience.
A “cookie” is a piece of information sent to your browser by a website you visit. You can set your browser to accept all cookies, to reject all cookies, or to notify you whenever a cookie is offered so that you can decide each time whether to accept it. However, refusing a cookie may in some cases preclude you from using, or negatively affect the display or function of, a website or certain areas or features of a website. For more details on cookies, please visit All About Cookies.
**Analytics**: We may use a variety of online analytics products that use cookies to help us analyze how users use our Services and enhance your experience when you use the Services.
# 2. How we use Personal Information
We may use Personal Information for the following purposes:
1. To provide, administer, maintain, and/or analyze the Services;
2. To improve our Services and conduct research;
3. To communicate with you;
4. To develop new programs and services;
5. To prevent fraud, criminal activity, or misuses of our Services, and to protect the security of our IT systems, architecture, and networks;
6. To carry out business transfers; and
7. To comply with legal obligations and legal processes and to protect our rights, privacy, safety, or property, and/or that of our affiliates, you, or other third parties.
**Aggregated or De-Identified Information**. We may aggregate or de-identify Personal Information so that it may no longer be used to identify you and use such information to analyze the effectiveness of our Services, to improve and add features to our Services, to conduct research and for other similar purposes. In addition, from time to time, we may analyze the general behavior and characteristics of users of our Services and share aggregated information like general user statistics with third parties, publish such aggregated information or make such aggregated information generally available. We may collect aggregated information through the Services, through cookies, and through other means described in this Privacy Policy. We will maintain and use de-identified information in anonymous or de-identified form and we will not attempt to reidentify the information, unless required by law.
As noted above, we may use Content you provide us to improve our Services, for example to train the models that power TimeGPT. Fill [this form](https://forms.gle/rvF58qkNCt2uNjSX8) to opt out of our use of your Content to train our models.
# 3. Disclosure of personal information
In certain circumstances we may provide your Personal Information to third parties without further notice to you, unless required by the law:
**Vendors and Service Providers**. To assist us in meeting business operations needs and to perform certain services and functions, we may provide Personal Information to vendors and service providers, including providers of hosting services, cloud services, and other information technology services providers, email communication software, and web analytics services, among others. Pursuant to our instructions, these parties will access, process, or store Personal Information only in the course of performing their duties to us.
**Business Transfers**. If we are involved in strategic transactions, reorganization, bankruptcy, receivership, or transition of service to another provider (collectively, a “**Transaction**”), your Personal Information and other information may be disclosed in the diligence process with counterparties and others assisting with the Transaction and transferred to a successor or affiliate as part of that Transaction along with other assets.
**Legal Requirements**. We may share your Personal Information, including information about your interaction with our Services, with government authorities, industry peers, or other third parties (i) if required to do so by law or in the good faith belief that such action is necessary to comply with a legal obligation, (ii) to protect and defend our rights or property, (iii) if we determine, in our sole discretion, that there is a violation of our terms, policies, or the law; (iv) to detect or prevent fraud or other illegal activity; (v) to protect the safety, security, and integrity of our products, employees, or users, or the public, or (vi) to protect against legal liability.
**Affiliates**. We may disclose Personal Information to our affiliates, meaning an entity that controls, is controlled by, or is under common control with Nixtla. Our affiliates may use the Personal Information we share in a manner consistent with this Privacy Policy.
# 4. Your choices and controls
Depending on where you live, you may have the right to exercise certain controls and choices regarding our collection, use, and sharing of your Personal Information. To opt out of marketing communications, please email us at [ops@nixtla.io](mailto:ops@nixtla.io) or follow the instructions included in the email or text correspondence.
Please note that, even if you unsubscribe from certain correspondence, we may still need to contact you with important transactional or administrative information, as permitted by law. Additionally, if you choose not to provide certain Personal Information, we may be unable to provide some or all of our Services to you.
# 5. Children
Our Services are not directed to children under the age of 13. Nixtla does not knowingly collect Personal Information from children under the age of 13. If you have reason to believe that a child under the age of 13 has provided Personal Information to Nixtla through the Services, please email us at [ops@nixtla.io](mailto:ops@nixtla.io).
We will investigate any notification and if appropriate, delete the Personal Information from our systems. If you are 13 or older, but under 18, you must have consent from your parent or guardian to use our Services.
# 6. Links to other websites
The Services may contain links to other websites not operated or controlled by Nixtla, including social media services (“**Third Party Sites**”). The information that you share with Third Party Sites will be governed by the specific privacy policies and terms of service of the Third Party Sites and not by this Privacy Policy. By providing these links we do not imply that we endorse or have reviewed these sites. Please contact the Third Party Sites directly for information on their privacy practices and policies.
# 7. Security and Retention
We implement commercially reasonable technical, administrative, and organizational measures to protect Personal Information both online and offline from loss, misuse, and unauthorized access, disclosure, alteration, or destruction. However, no Internet or email transmission is ever fully secure or error-free. In particular, emails sent to or from us may not be secure. Therefore, you should take special care in deciding what information you send to us via the Services or email. In addition, we are not responsible for circumvention of any privacy settings or security measures contained on the Services, or third-party websites.
We’ll retain your Personal Information for only as long as we need in order to provide our Services to you, or for other legitimate business purposes such as resolving disputes, safety and security reasons, or complying with our legal obligations. How long we retain Personal Information will depend on a number of factors, such as the amount, nature, and sensitivity of the information, the potential risk of harm from unauthorized use or disclosure, our purpose for processing the information, and any legal requirements.
# 8. Changes to the privacy policy
We may update this Privacy Policy from time to time. All changes will be effective immediately upon posting to this page. Material changes will be conspicuously posted on this page or otherwise communicated to you as required by law.
# 9. How to contact us
Please contact us at [ops@nixtla.io](mailto:ops@nixtla.io) if you have any questions or concerns not already addressed in this Privacy Policy.
---
title: "Nixtla"
description: "About us"
icon: "user"
---
# Nixtla
<CardGroup>
<Card title="Who We Are">
Nixtla is to numbers what Anthropic or OpenAI are to language and images. We are the creators of TimeGPT—a pre-trained model that allows enterprises to upload their data and receive predictions within minutes. This approach saves significant money, development time, and maintenance effort.
</Card>
<Card title="Our Impact">
TimeGPT was trained on the largest collection of time series data in history—over 100 billion rows across financial, weather, energy, and web data. Nixtla has also built the most comprehensive time series ecosystem, with over 5 million downloads worldwide. Our software is trusted and used in production by leading companies such as Amazon, Walmart, and Lyft.
</Card>
<Card title="Our Philosophy">
We are a group of hackers driven by curiosity and a profound desire to make a meaningful impact. With backgrounds ranging from research and development to philosophy, we have united to revolutionize the time series field. We embrace diversity, champion inclusivity, and believe that the future belongs to everyone.
</Card>
</CardGroup>
<Info>
We stand by our roots in Latin America. We are queer, we are different, and we take pride in it. Our shared passion for understanding the world guides us in pushing the boundaries of what’s possible with time series analysis.
</Info>
## Our Open Source Initiatives
TimeGPT is only one part of our story. Before its creation, Nixtla developed an open-source time series ecosystem that quickly flourished, garnering millions of downloads.
<Frame>
![Nixtla Open Source](https://files.readme.io/d1318f2-Screenshot_2023-08-04_at_3.13.41_PM.png)
</Frame>
<Check>
Our thriving open-source community is a testament to the power of collaboration. Join us in building innovative tools for time series analysis.
</Check>
## Our Origin Story
<AccordionGroup>
<Accordion title="A Simple Start">
Nixtla began as a side project. We built tools for an old company we worked for, and then everyone took different paths—some pursued academic careers, others founded companies, and some focused on shipping products.
</Accordion>
<Accordion title="Growth and Collaboration">
We eventually reunited to turn what started as a modest open-source library into the most comprehensive time-series ecosystem. By challenging the status quo and giants like Facebook, Amazon, and Google, we proved how a dedicated group of passionate individuals, powered by open-source software, can successfully compete with major players.
</Accordion>
<Accordion title="A Global Community">
As Nixtla’s usage soared, our community grew, fueling our development. Today, Nixtla is the most impactful time series ecosystem worldwide, relied upon by innovators in both industry and academia.
</Accordion>
<Accordion title="Envisioning the Future">
Recognizing this was only the beginning, we set our sights on a new challenge—pioneering foundation models for time series. This breakthrough helps us share the future of data science with everyone.
</Accordion>
</AccordionGroup>
## Follow Us
<Steps>
<Step title="Join Our Slack">
Connect with fellow developers, researchers, and enthusiasts in our [Slack Channel](https://join.slack.com/t/nixtlacommunity/shared_invite/zt-1pmhan9j5-F54XR20edHk0UtYAPcW4KQ).
</Step>
<Step title="Follow Us on Twitter">
Stay up-to-date with the latest Nixtla news and community highlights on [Twitter](https://twitter.com/nixtlainc).
</Step>
<Step title="Contribute on GitHub">
Be part of our open-source evolution by contributing to Nixtla’s core projects on [GitHub](https://github.com/Nixtla).
</Step>
</Steps>
<Frame caption="Join Nixtla Community">
![Join Nixtla Community](https://files.readme.io/3f17d41-Screenshot_2024-05-04_at_3.24.35_PM.png)
</Frame>
<Check>
Together, we are not just shaping Nixtla; we are defining the future of data science.
</Check>
---
title: "Terms and Conditions"
description: "Terms and conditions for using Nixtla Services."
icon: "book"
---
Thank you for using Nixtla's TimeGPT and/or TimeGEN!
These Terms of Use apply when you use the services of Nixtla, Inc. or our affiliates, including our application programming interface, software, tools, developer services, data, documentation, and websites ("**Services**"). The Terms include other terms and conditions, documentation, guidelines, or policies we may provide in writing. By using our Services, you agree to these Terms. Our [Privacy Notice](/about/privacy-notice) explains how we collect and use personal information.
# 1. Registration and Access
You must be at least 13 years old to use the Services. If you are under 18 you must have your parent or legal guardian's permission to use the Services. If you use the Services on behalf of another person or entity, you must have the authority to accept the Terms on their behalf. You must provide accurate and complete information to register for an account. You may not make your access credentials or account available to others outside your organization, and you are responsible for all activities that occur using your credentials.
# 2. Usage Requirements
**(a) Use of Services**. You may access, and we grant you a non-exclusive right to use, the Services in accordance with these Terms. You will comply with these Terms and all applicable laws when using the Services. We and our affiliates own all rights, title, and interest in and to the Services.
**(b) Feedback**. We appreciate feedback, comments, ideas, proposals and suggestions for improvements. If you provide any of these things, we may use it without restriction or compensation to you.
**(c) Restrictions**. You may not (i) use the Services in a way that infringes, misappropriates or violates any person's rights; (ii) reverse assemble, reverse compile, decompile, translate or otherwise attempt to discover the source code or underlying components of models, algorithms, and systems of the Services (except to the extent such restrictions are contrary to applicable law); (iii) use output from the Services to develop models that compete with Nixtla; (iv) except as permitted through the API, use any automated or programmatic method to extract data or output from the Services, including scraping, web harvesting, or web data extraction; (v) represent that output from the Services was human-generated when it is not or otherwise violate our policies; (vi) buy, sell, or transfer API keys without our prior consent; or (vii) send us any personal information of children under 13 or the applicable age of consent. You will comply with any rate limits and other requirements in our documentation. You may use Services only in geographies currently supported by Nixtla.
**(d) Third Party Services**. Any third party software, services, or other products you use in connection with the Services are subject to their own terms, and we are not responsible for third party products.
# 3. Content
**(a) Your Content**. You may provide input to the Services ("**Input**"), and receive output generated and returned by the Services based on the Input ("**Output**"). Input and Output are collectively ("**Content**"). As between the parties and to the extent permitted by applicable law, you own all Input. Subject to your compliance with these Terms, Nixtla hereby assigns to you all its rights, title, and interest in and to Output. This means you can use Content for any purpose, including commercial purposes such as sale or publication, if you comply with these Terms. Nixtla may use Content to provide and maintain the Services, comply with applicable law, and enforce our policies. You are responsible for Content, including for ensuring that it does not violate any applicable law or these Terms.
**(b) Use of Content to Improve Services**. In order to improve our Services, we may use Content that you provide to or receive from our API ("**API Content**") to develop or improve our Services. We may use Content from Services other than our API ("**Non-API Content**") to help develop and improve our Services.
Nixtla may use aggregated, de-identified data to enhance and operate the Services and for other business activities, including creating industry benchmarks and best practice guides for users.
If you do not want your Content used to improve Services, you can opt-out by filling out [this form](https://forms.gle/rvF58qkNCt2uNjSX8). In case you opt-out, we will not use the Content you provide after opt-out to train our machine-learning models or otherwise use your Content in any way to improve our Services. Please note that in some cases this may limit the ability of our Services to better address your specific use case.
**(c) Accuracy**. Artificial intelligence and machine learning are rapidly evolving fields of study. We are constantly working to improve our Services to make them more accurate, reliable, safe, and beneficial. Given the probabilistic nature of machine learning, the use of our Services may in some situations result in incorrect Output. You should always evaluate the accuracy of any Output as appropriate for your use case, including by using human review of the Output.
# 4. Fees and Payments
**(a) Fees and Billing**. You will pay all fees charged to your account ("**Fees**") according to the prices and terms on the applicable pricing page, or as otherwise agreed between us in writing. We have the right to correct pricing errors or mistakes even if we have already issued an invoice or received payment. You will provide complete and accurate billing information including a valid and authorized payment method. We will charge your payment method on an agreed-upon periodic basis, but may reasonably change the date on which the charge is posted. You authorize Nixtla and its affiliates, and our third-party payment processor(s), to charge your payment method for the Fees.
If your payment cannot be completed, we will provide you written notice and may suspend access to the Services until payment is received. Fees are payable in U.S. dollars and are due upon invoice issuance. Payments are nonrefundable except as provided in this Agreement.
**(b) Taxes**. Unless otherwise stated, Fees do not include federal, state, local, and foreign taxes, duties, and other similar assessments ("**Taxes**"). You are responsible for all Taxes associated with your purchase, excluding Taxes based on our net income, and we may invoice you for such Taxes. You agree to timely pay such Taxes and provide us with documentation showing the payment, or additional evidence that we may reasonably require. Nixtla uses the name and address in your account registration as the place of supply for tax purposes, so you must keep this information accurate and up-to-date.
**(c) Price Changes**. We may change our prices by posting notice to your account and/or to our website. Price increases will be effective 14 days after they are posted, except for increases made for legal reasons or increases made to Beta Services, which will be effective immediately. Any price changes will apply to the Fees charged to your account immediately after the effective date of the changes.
**(d) Disputes and Late Payments**. If you want to dispute any Fees or Taxes, please contact [ops@nixtla.io](mailto:ops@nixtla.io) within thirty (30) days of the date of the disputed invoice. Undisputed amounts past due may be subject to a finance charge of 1.5% of the unpaid balance per month. If any amount of your Fees are past due, we may suspend your access to the Services after we provide you written notice of late payment.
**(e) Free Tier**. You may not create more than one account to benefit from credits provided in the free tier of the Services. If we believe you are not using the free tier in good faith, we may charge you standard fees or stop providing access to the Services.
# 5. Confidentiality, Security and Data Protection
**(a) Confidentiality**. You may be given access to Confidential Information of Nixtla, its affiliates and other third parties. You may use Confidential Information only as needed to use the Services as permitted under these Terms.
You may not disclose Confidential Information to any third party, and you will protect Confidential Information in the same manner that you protect your own confidential information of a similar nature, using at least reasonable care. Confidential Information means nonpublic information that Nixtla or its affiliates or third parties designate as confidential or should reasonably be considered confidential under the circumstances, including software, specifications, and other nonpublic business information.
Confidential Information does not include information that: (i) is or becomes generally available to the public through no fault of yours; (ii) you already possess without any confidentiality obligations when you received it under these Terms; (iii) is rightfully disclosed to you by a third party without any confidentiality obligations; or (iv) you independently developed without using Confidential Information. You may disclose Confidential Information when required by law or the valid order of a court or other governmental authority if you give reasonable prior written notice to Nixtla and use reasonable efforts to limit the scope of disclosure, including assisting us with challenging the disclosure requirement, in each case where possible.
**(b) Security**. You must implement reasonable and appropriate measures designed to help secure your access to and use of the Services. If you discover any vulnerabilities or breaches related to your use of the Services, you must promptly contact Nixtla and provide details of the vulnerability or breach.
**(c) Processing of Personal Data**. If you use the Services to process personal data, you must provide legally adequate privacy notices and obtain necessary consents for the processing of such data, and you represent to us that you are processing such data in accordance with applicable law.
# 6. Term and Termination
**(a) Termination; Suspension**. These Terms take effect when you first use the Services and remain in effect until terminated. You may terminate these Terms at any time for any reason by discontinuing the use of the Services and Content.
We may terminate these Terms for any reason by providing you at least 30 days' advance notice. We may terminate these Terms immediately upon notice to you if you materially breach Sections 2 (Usage Requirements), 5 (Confidentiality, Security and Data Protection), 8 (Dispute Resolution) or 9 (General Terms), if there are changes in relationships with third-party technology providers outside of our control, or to comply with law or government requests. We may suspend your access to the Services if you do not comply with these Terms, if your use poses a security risk to us or any third party, or if we suspect that your use is fraudulent or could subject us or any third party to liability.
**(b) Effect on Termination**. Upon termination, you will stop using the Services and you will promptly return or, if instructed by us, destroy any Confidential Information. The sections of these Terms which by their nature should survive termination or expiration should survive, including but not limited to Sections 3 and 5-9.
# 7. Indemnification; Disclaimer of Warranties; Limitations on Liability
**(a) Indemnity**. You will defend, indemnify, and hold harmless us, our affiliates, and our personnel, from and against any claims, losses, and expenses (including attorneys' fees) arising from or relating to your use of the Services, including your Content, products or services you develop or offer in connection with the Services, and your breach of these Terms or violation of applicable law.
**(b) Disclaimer**. THE SERVICES ARE PROVIDED "AS IS." EXCEPT TO THE EXTENT PROHIBITED BY LAW, WE AND OUR AFFILIATES AND LICENSORS MAKE NO WARRANTIES (EXPRESS, IMPLIED, STATUTORY OR OTHERWISE) WITH RESPECT TO THE SERVICES, AND DISCLAIM ALL WARRANTIES INCLUDING BUT NOT LIMITED TO WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, SATISFACTORY QUALITY, NON-INFRINGEMENT, AND QUIET ENJOYMENT, AND ANY WARRANTIES ARISING OUT OF ANY COURSE OF DEALING OR TRADE USAGE. WE DO NOT WARRANT THAT THE SERVICES WILL BE UNINTERRUPTED, ACCURATE OR ERROR FREE, OR THAT ANY CONTENT WILL BE SECURE OR NOT LOST OR ALTERED.
**(c) Limitations of Liability**. NEITHER WE NOR ANY OF OUR AFFILIATES OR LICENSORS WILL BE LIABLE FOR ANY INDIRECT, INCIDENTAL, SPECIAL, CONSEQUENTIAL OR EXEMPLARY DAMAGES, INCLUDING DAMAGES FOR LOSS OF PROFITS, GOODWILL, USE, OR DATA OR OTHER LOSSES, EVEN IF WE HAVE BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES. OUR AGGREGATE LIABILITY UNDER THESE TERMS SHALL NOT EXCEED THE GREATER OF THE AMOUNT YOU PAID FOR THE SERVICE THAT GAVE RISE TO THE CLAIM DURING THE 12 MONTHS BEFORE THE LIABILITY AROSE OR ONE HUNDRED DOLLARS ($100). THE LIMITATIONS IN THIS SECTION APPLY ONLY TO THE MAXIMUM EXTENT PERMITTED BY APPLICABLE LAW.
# 8. Dispute Resolution
YOU AGREE TO THE FOLLOWING MANDATORY ARBITRATION AND CLASS ACTION WAIVER PROVISIONS:
**(a) MANDATORY ARBITRATION**. You and Nixtla agree to resolve any past or present claims relating to these Terms or our Services through final and binding arbitration, except that you have the right to opt out of these arbitration terms, and future changes to these arbitration terms, by emailing [ops@nixtla.io](mailto:ops@nixtla.io) within 30 days of agreeing to these arbitration terms or the relevant changes.
**(b) Informal Dispute Resolution**. We would like to understand and try to address your concerns prior to formal legal action. Before filing a claim against Nixtla, you agree to try to resolve the dispute informally by sending us notice at [ops@nixtla.io](mailto:ops@nixtla.io) of your name, a description of the dispute, and the relief you seek. If we are unable to resolve a dispute within 60 days, you may bring a formal proceeding. Any statute of limitations will be tolled during the 60-day resolution process. If you reside in the EU, the European Commission provides for an online dispute resolution platform, which you can access at [https://ec.europa.eu/consumers/odr](https://ec.europa.eu/consumers/odr).
**(c) Arbitration Forum**. Either party may commence binding arbitration through ADR Services, an alternative dispute resolution provider. The parties will pay equal shares of the arbitration fees. If the arbitrator finds that you cannot afford to pay the arbitration fees and cannot obtain a waiver, Nixtla will pay them for you. Nixtla will not seek its attorneys' fees and costs in arbitration unless the arbitrator determines that your claim is frivolous.
**(d) Arbitration Procedures**. The arbitration will be conducted by telephone, based on written submissions, via video conference, or in person in San Francisco, California, or at another mutually agreed location. The arbitration will be conducted by a sole arbitrator by ADR Services under its then-prevailing rules. All issues are for the arbitrator to decide, except a California court has the authority to determine (i) the scope, enforceability, and arbitrability of this Section 8, including the mass filing procedures below, and (ii) whether you have complied with the pre-arbitration requirements in this section. The amount of any settlement offer will not be disclosed to the arbitrator by either party until after the arbitrator determines the final award, if any.
**(e) Exceptions**. This arbitration section does not require arbitration of the following claims: (i) individual claims brought in small claims court; and (ii) injunctive or other equitable relief to stop unauthorized use or abuse of the Services or intellectual property infringement.
**(f) NO CLASS ACTIONS**. Disputes must be brought on an individual basis only, and may not be brought as a plaintiff or class member in any purported class, consolidated, or representative proceeding. Class arbitrations, class actions, private attorney general actions, and consolidation with other arbitrations are not allowed. If for any reason a dispute proceeds in court rather than through arbitration, each party knowingly and irrevocably waives any right to trial by jury in any action, proceeding, or counterclaim. This does not prevent either party from participating in a class-wide settlement of claims.
**(g) Mass Filings**. If, at any time, 30 or more similar demands for arbitration are asserted against Nixtla or related parties by the same or coordinated counsel or entities ("**Mass Filing**"), ADR Services will randomly assign sequential numbers to each of the Mass Filings. Claims numbered 1-10 will be the "Initial Test Cases" and will proceed to arbitration first. The arbitrators will render a final award for the Initial Test Cases within 120 days of the initial pre-hearing conference, unless the claims are resolved in advance or the parties agree to extend the deadline. The parties will then have 90 days (the "**Mediation Period**") to resolve the remaining cases in mediation based on the awards from the Initial Test Cases. If the parties are unable to resolve the outstanding claims during this time, the parties may choose to opt out of the arbitration process and proceed in court by providing written notice to the other party within 60 days after the Mediation Period. Otherwise, the remaining cases will be arbitrated in their assigned order. Any statute of limitations will be tolled from the time the Initial Test Cases are chosen until your case is chosen as described above.
**(h) Severability**. If any part of this Section 8 is found to be illegal or unenforceable, the remainder will remain in effect, except that if a finding of partial illegality or unenforceability would allow Mass Filing or class or representative arbitration, this Section 8 will be unenforceable in its entirety. Nothing in this section will be deemed to waive or otherwise limit the right to seek public injunctive relief or any other non-waivable right, pending a ruling on the substance of such claim from the arbitrator.
# 9. General Terms
**(a) Relationship of the Parties**. These Terms do not create a partnership, joint venture, or agency relationship between you and Nixtla or any of Nixtla's affiliates. Nixtla and you are independent contractors and neither party will have the power to bind the other or to incur obligations on the other's behalf without the other party's prior written consent.
**(b) Use of Brands**. You may not use Nixtla's or any of its affiliates' names, logos, or trademarks, without our prior written consent.
**(c) U.S. Federal Agency Entities**. The Services were developed solely at private expense and are commercial computer software and related documentation within the meaning of the applicable U.S. Federal Acquisition Regulation and agency supplements thereto.
**(d) Copyright Complaints**. If you believe that your intellectual property rights have been infringed, please send notice to the address below or fill out [this form](https://forms.gle/N3xmuZss1Y7rrb889). We may delete or disable content alleged to be infringing and may terminate accounts of repeat infringers.
Written claims concerning copyright infringement must include the following information:
1. A physical or electronic signature of the person authorized to act on behalf of the owner of the copyright interest;
2. A description of the copyrighted work that you claim has been infringed upon;
3. A description of where the material that you claim is infringing is located on the site;
4. Your address, telephone number, and e-mail address;
5. A statement by you that you have a good-faith belief that the disputed use is not authorized by the copyright owner, its agent, or the law; and
6. A statement by you, made under penalty of perjury, that the above information in your notice is accurate and that you are the copyright owner or authorized to act on the copyright owner's behalf.
**(e) Assignment and Delegation**. You may not assign or delegate any rights or obligations under these Terms, including in connection with a change of control. Any purported assignment and delegation shall be null and void. We may assign these Terms in connection with a merger, acquisition, or sale of all or substantially all of our assets, or to any affiliate or as part of a corporate reorganization.
**(f) Modifications**. We may amend these Terms from time to time by posting a revised version on the website, or if an update materially adversely affects your rights or obligations under these Terms we will provide notice to you either by emailing the email associated with your account or providing an in-product notification. Those changes will become effective no sooner than 30 days after we notify you. All other changes will be effective immediately. Your continued use of the Services after any change means you agree to such change.
**(g) Notices**. All notices will be in writing. We may notify you using the registration information you provided or the email address associated with your use of the Services. Service will be deemed given on the date of receipt if delivered by email or on the date sent via courier if delivered by post. Nixtla accepts service of process at this address:
Nixtla, Inc.
166 Geary Str 15th FL #1056
San Francisco, CA 94108
United States.
Attn: Nixtla, Inc. - [ops@nixtla.io](mailto:ops@nixtla.io)
**(h) Waiver and Severability**. If you do not comply with these Terms, and Nixtla does not take action right away, this does not mean Nixtla is giving up any of our rights. Except as provided in Section 8, if any part of these Terms is determined to be invalid or unenforceable by a court of competent jurisdiction, that term will be enforced to the maximum extent permissible and it will not affect the enforceability of any other terms.
**(i) Export Controls**. The Services may not be used in or for the benefit of, exported, or re-exported (a) into any U.S. embargoed countries (collectively, the "**Embargoed Countries**") or (b) to anyone on the U.S. Treasury Department's list of Specially Designated Nationals, any other restricted party lists (existing now or in the future) identified by the Office of Foreign Asset Control, or the U.S. Department of Commerce Denied Persons List or Entity List, or any other restricted party lists (collectively, "**Restricted Party Lists**").
You represent and warrant that you are not located in any Embargoed Countries and not on any such restricted party lists. You must comply with all applicable laws related to Embargoed Countries or Restricted Party Lists, including any requirements or obligations to know your end users directly.
**(j) Equitable Remedies**. You acknowledge that if you violate or breach these Terms, it may cause irreparable harm to Nixtla and its affiliates, and Nixtla shall have the right to seek injunctive relief against you in addition to any other legal remedies.
**(k) Entire Agreement**. These Terms and any policies incorporated in these Terms contain the entire agreement between you and Nixtla regarding the use of the Services and, other than any Service specific terms of use or any applicable enterprise agreements, supersedes any prior or contemporaneous agreements, communications, or understandings between you and Nixtla on that subject.
**(l) Jurisdiction, Venue and Choice of Law**. These Terms will be governed by the laws of the State of California, excluding California's conflicts of law rules or principles. Except as provided in the "Dispute Resolution" section, all claims arising out of or relating to these Terms will be brought exclusively in the federal or state courts of San Francisco County, California, USA.
---
title: "Add Exogenous Variables"
description: "Learn how to improve anomaly detection by incorporating external factors."
icon: "input-text"
---
## Why Use Exogenous Variables?
Including relevant exogenous variables can greatly improve anomaly detection, especially for time series influenced by external factors such as weather or
market indicators.
Key benefits of using exogenous variables:
- Improve anomaly detection accuracy
- Enhance model interpretability
- Provide additional context for anomaly detection
## How to Use Exogenous Variables
[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/Nixtla/nixtla/blob/main/nbs/docs/capabilities/historical-anomaly-detection/02_anomaly_exogenous.ipynb)
### Step 1: Set Up Data and Client
Follow the steps in the [historical anomaly detection tutorial](/anomaly_detection/historical_anomaly_detection) to set up the data and client.
### Step 2: Detect Anomalies with Exogenous Features
Use the `detect_anomalies` method to identify anomalies. The method will automatically detect and utilize any exogenous features present in your DataFrame:
```python
anomalies_df = nixtla_client.detect_anomalies(
    df=df,  # df should already contain any exogenous feature columns alongside ds and y
    time_col='ds',
    target_col='y'
)
```
### Step 3: Add Date Features (Optional)
Adding date features is a powerful way to enrich datasets for historical anomaly detection—especially when external exogenous variables are unavailable. By passing date components like `['month', 'year']` and enabling `date_features_to_one_hot=True`, TimeGPT automatically encodes these as one-hot vectors. This allows the model to better detect seasonal patterns, calendar effects, and periodic anomalies.
```python
anomalies_df = nixtla_client.detect_anomalies(
df=df,
time_col='ds',
target_col='y',
date_features=['month', 'year'],
date_features_to_one_hot=True
)
```
### Step 4: Visualize Anomalies
Use the `plot` method to visualize the detected anomalies in the time series data.
```python
nixtla_client.plot(df, anomalies_df)
```
<Frame caption="Detected anomalies in time series with exogenous variables">
![Detected anomalies in time series with exogenous variables](https://raw.githubusercontent.com/Nixtla/nixtla/readme_docs/nbs/_docs/docs/capabilities/historical-anomaly-detection/02_anomaly_exogenous_files/figure-markdown_strict/cell-11-output-2.png)
</Frame>
The plot shows the time series with detected anomalies marked in red. The blue line represents the actual values, while the shaded area indicates the confidence interval. Points that fall outside this interval are flagged as anomalies.
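To work with the flagged points directly, filter the returned DataFrame. This sketch assumes it includes the boolean `anomaly` column shown in the quickstart output:

```python
# Keep only the rows flagged as anomalies (assumes the boolean `anomaly`
# column returned by detect_anomalies, as shown in the quickstart).
flagged = anomalies_df[anomalies_df['anomaly']]
print(flagged[['ds', 'y']].head())
```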
### Step 5: Inspect Model Weights (Optional)
Use the `weights_x` attribute to view the relative weights of the exogenous features and understand their impact:
```python
nixtla_client.weights_x.plot.barh(
x='features',
y='weights'
)
```
<Frame caption="Weights of exogenous date features">
![Weights of exogenous date features](https://raw.githubusercontent.com/Nixtla/nixtla/readme_docs/nbs/_docs/docs/capabilities/historical-anomaly-detection/02_anomaly_exogenous_files/figure-markdown_strict/cell-12-output-1.png)
</Frame>
The horizontal bar plot shows the relative importance of each exogenous feature in the anomaly detection model. Features with larger weights have a stronger influence on the model's predictions. This visualization helps identify which external factors are most significant in determining anomalies in your time series.
---
title: "Quickstart"
description: "Get started with TimeGPT's historical anomaly detection capabilities."
icon: "bug"
---
<CardGroup cols={2}>
<Card title="What you'll learn" icon="circle-info">
    - How TimeGPT detects anomalies in historical time series.
    - How to set up and detect anomalies with TimeGPT.
    - How to plot and interpret identified anomalies.
</Card>
<Card title="Key benefits" icon="circle-check">
- Quickly identify outliers in large time series.
- Improve decision-making by focusing on unusual data points.
- Automate anomaly alerts to save time and resources.
</Card>
</CardGroup>
## What Is Historical Anomaly Detection?
Historical anomaly detection is a technique that identifies data points that significantly deviate from expected patterns in a time series. This technique is useful for uncovering potential fraud, security breaches, or other unusual events.
## Overview of TimeGPT's Historical Anomaly Detection
TimeGPT's historical anomaly detection works by:
1. Generating predictions for future values (or reconstructing missing values) within your historical time series.
2. Constructing a confidence interval based on the model's predictions.
3. Flagging any historical data point that falls outside your chosen confidence interval as an anomaly.
## Quickstart Example
This quickstart shows how historical anomaly detection works, using an example that analyzes daily visits to the Wikipedia page of Peyton Manning.
[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/Nixtla/nixtla/blob/main/nbs/docs/capabilities/historical-anomaly-detection/01_quickstart.ipynb)
### Step 1: Import Packages and Create a NixtlaClient Instance
We'll start by importing required packages and setting up our API key.
```python
import pandas as pd
from nixtla import NixtlaClient
nixtla_client = NixtlaClient(
api_key='my_api_key_provided_by_nixtla' # Defaults to os.environ.get("NIXTLA_API_KEY")
)
```
### Step 2: Load the Dataset
This dataset tracks the daily visits to the Wikipedia page of Peyton Manning.
```python
df = pd.read_csv('https://datasets-nixtla.s3.amazonaws.com/peyton-manning.csv')
df.head()
```
| | unique_id | ds | y |
|---|-----------|------------|----------|
| 0 | 0 | 2007-12-10 | 9.590761 |
| 1 | 0 | 2007-12-11 | 8.519590 |
| 2 | 0 | 2007-12-12 | 8.183677 |
| 3 | 0 | 2007-12-13 | 8.072467 |
| 4 | 0 | 2007-12-14 | 7.893572 |
### Step 3: Visualize the Data
You can visualize the time series with the following command:
```python
nixtla_client.plot(df, max_insample_length=365)
```
<Frame caption="Figure 1. Peyton Manning Wikipedia page visits over time." >
![Data plot](https://raw.githubusercontent.com/Nixtla/nixtla/readme_docs/nbs/_docs/docs/tutorials/20_anomaly_detection_files/figure-markdown_strict/cell-11-output-1.png)
</Frame>
### Step 4: Perform Anomaly Detection
By default, TimeGPT uses a 99% confidence interval. Points outside this interval are flagged as anomalies.
```python
anomalies_df = nixtla_client.detect_anomalies(df, freq='D')
anomalies_df.head()
```
| | unique_id | ds | y | TimeGPT | TimeGPT-hi-99 | TimeGPT-lo-99 | anomaly |
|---|-----------|------------|----------|----------|---------------|---------------|---------|
| 0 | 0 | 2008-01-10 | 8.281724 | 8.224187 | 9.503586 | 6.944788 | False |
| 1 | 0 | 2008-01-11 | 8.292799 | 8.151533 | 9.430932 | 6.872135 | False |
| 2 | 0 | 2008-01-12 | 8.199189 | 8.127243 | 9.406642 | 6.847845 | False |
| 3 | 0 | 2008-01-13 | 9.996522 | 8.917259 | 10.196658 | 7.637861 | False |
| 4 | 0 | 2008-01-14 | 10.127071| 9.002326 | 10.281725 | 7.722928 | False |
A `False` anomaly value indicates a normal data point; `True` identifies an outlier.
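For a quick summary, you can count the flagged points directly; since the `anomaly` column is boolean, `sum()` counts the `True` values:
```python
# Number of points flagged as anomalous
n_anomalies = anomalies_df['anomaly'].sum()
print(f'{n_anomalies} anomalies out of {len(anomalies_df)} points')
```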
### Step 5: Review Anomalies
```python
nixtla_client.plot(df, anomalies_df)
```
<Frame caption="Figure 2. Anomalies detected in the Peyton Manning dataset.">
![Anomalies plot](https://raw.githubusercontent.com/Nixtla/nixtla/readme_docs/nbs/_docs/docs/tutorials/20_anomaly_detection_files/figure-markdown_strict/cell-13-output-1.png)
</Frame>
### Step 6: Inspect and Iterate
Inspect the anomalies flagged by the model. These points are potential indicators of significant deviations in your data. If you find that the model is overly sensitive or missing critical outliers, adjust the confidence interval or include additional features (e.g., exogenous data, date features) to improve detection accuracy.
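For instance, widening or narrowing the interval is a one-parameter change. A minimal sketch reusing the `df` from the steps above (the `level` argument controls the confidence interval; 99 is the default):
```python
# A lower level narrows the interval and flags more points as anomalous;
# a higher level flags fewer.
anomalies_df = nixtla_client.detect_anomalies(df, freq='D', level=95)
```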
<Check>
Congratulations! You've successfully performed anomaly detection using TimeGPT. You can now start experimenting with this example or apply it to your own data. For advanced tips on improving detection performance, explore the following sections on using exogenous variables and adjusting confidence intervals.
</Check>
---
title: "Controlling the Anomaly Detection Process"
description: "Learn how to tune TimeGPT's anomaly detection parameters for optimal accuracy. Step-by-step guide to adjusting detection_size, level, confidence intervals, and fine-tuning strategies with Python code examples."
icon: "brain"
---
## Overview
Fine-tuning anomaly detection parameters is essential for reducing false positives and improving detection accuracy in time series data. This guide shows you how to optimize TimeGPT's `detect_anomalies_online` method by adjusting key parameters like detection sensitivity, window sizes, and model fine-tuning options.
For an introduction to real-time anomaly detection, see our [Real-Time Anomaly Detection guide](/anomaly_detection/real-time/introduction). To understand local vs global detection strategies, check out [Local vs Global Anomaly Detection](/anomaly_detection/real-time/univariate_multivariate).
## Why Parameter Tuning Matters
TimeGPT leverages forecast errors to identify anomalies in your time-series data. By optimizing parameters, you can detect subtle deviations, reduce false positives, and customize results for specific use cases.
## Key Parameters for Anomaly Detection
TimeGPT's anomaly detection can be controlled through three primary parameters:
- **detection_size**: Controls the data window size for threshold calculation, determining how much historical context is used
- **level**: Sets confidence intervals for anomaly thresholds (e.g., 80%, 95%, 99%), controlling detection sensitivity
- **freq**: Aligns detection with data frequency (e.g., 'D' for daily, 'H' for hourly, 'min' for minute-level data)
## Common Use Cases
Adjusting anomaly detection parameters is crucial for:
- **Reducing false positives** in noisy time series data
- **Increasing sensitivity** to detect subtle anomalies
- **Optimizing detection** for different data frequencies (hourly, daily, weekly)
- **Improving accuracy** through model fine-tuning with custom loss functions
## How to Adjust the Anomaly Detection Process
[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/Nixtla/nixtla/blob/main/nbs/docs/capabilities/online-anomaly-detection/02_adjusting_detection_process.ipynb)
### Step 1: Install and Import Dependencies
In your environment, install and import the necessary libraries:
```python
import pandas as pd
from nixtla import NixtlaClient
import matplotlib.pyplot as plt
```
### Step 2: Initialize the Nixtla Client
Create an instance of NixtlaClient with your API key:
```python
nixtla_client = NixtlaClient(api_key='my_api_key_provided_by_nixtla')
```
### Step 3: Conduct a Baseline Detection
We use a portion of the Peyton Manning Wikipedia page views dataset to illustrate the default anomaly detection process: real-world data with natural anomalies and trends.
```python
df = pd.read_csv(
'https://datasets-nixtla.s3.amazonaws.com/peyton-manning.csv',
parse_dates=['ds']
).tail(200)
df.head()
```
|      | unique_id | ds         | y        |
| ---- | --------- | ---------- | -------- |
| 2764 | 0         | 2015-07-05 | 6.499787 |
| 2765 | 0         | 2015-07-06 | 6.859615 |
| 2766 | 0         | 2015-07-07 | 6.881411 |
| 2767 | 0         | 2015-07-08 | 6.997596 |
| 2768 | 0         | 2015-07-09 | 7.152269 |
Set a baseline by running the method without any fine-tuning parameters.
```python
anomaly_df = nixtla_client.detect_anomalies_online(
df,
freq='D',
h=14,
level=80,
detection_size=150
)
```
```bash Baseline Detection Log Output
INFO:nixtla.nixtla_client:Validating inputs...
INFO:nixtla.nixtla_client:Preprocessing dataframes...
WARNING:nixtla.nixtla_client:Detection size is large. Using the entire series to compute the anomaly threshold...
INFO:nixtla.nixtla_client:Calling Online Anomaly Detector Endpoint...
```
<Frame caption="Baseline Anomaly Detection Visualization">
![Baseline Anomaly Detection Visualization](https://raw.githubusercontent.com/Nixtla/nixtla/readme_docs/nbs/_docs/docs/capabilities/online-anomaly-detection/02_adjusting_detection_process_files/figure-markdown_strict/cell-13-output-1.png)
</Frame>
### Step 4: Fine-Tuned Detection
TimeGPT detects anomalies based on forecast errors. By improving your model's forecasts, you can strengthen anomaly detection performance. The following parameters can be fine-tuned:
- **finetune_steps**: Number of additional training iterations
- **finetune_depth**: Depth level for refining the model
- **finetune_loss**: Loss function used during fine-tuning
```python
anomaly_online_ft = nixtla_client.detect_anomalies_online(
df,
freq='D',
h=14,
level=80,
detection_size=150,
finetune_steps=10,
finetune_depth=2,
finetune_loss='mae'
)
```
```bash Fine-tuned Detection Log Output
INFO:nixtla.nixtla_client:Validating inputs...
INFO:nixtla.nixtla_client:Preprocessing dataframes...
WARNING:nixtla.nixtla_client:Detection size is large. Using the entire series to compute the anomaly threshold...
INFO:nixtla.nixtla_client:Calling Online Anomaly Detector Endpoint...
```
<Frame caption="Fine-tuned TimeGPT Anomaly Detection">
![Fine-tuned TimeGPT Anomaly Detection](https://raw.githubusercontent.com/Nixtla/nixtla/readme_docs/nbs/_docs/docs/capabilities/online-anomaly-detection/02_adjusting_detection_process_files/figure-markdown_strict/cell-15-output-1.png)
</Frame>
From the plot above, we can see that fewer anomalies were detected by the model, since the fine-tuning process helps TimeGPT better forecast the series.
### Step 5: Adjusting Forecast Horizon and Step Size
Similar to cross-validation, the anomaly detection method generates forecasts for historical data by splitting the time series into multiple windows. The way these windows are defined can impact the anomaly detection results. Two key parameters control this process:
* `h`: Specifies how many steps into the future the forecast is made for each window.
* `step_size`: Determines the interval between the starting points of consecutive windows.
Note that when `step_size` is smaller than `h`, the windows overlap. This can make the detection process more robust, as TimeGPT sees the same time step more than once. However, it comes at a computational cost, since the same time step is predicted more than once.
```python
anomaly_df_horizon = nixtla_client.detect_anomalies_online(
df,
time_col='ds',
target_col='y',
freq='D',
h=2,
step_size=1,
level=80,
detection_size=150
)
```
<Frame caption="Adjusted Horizon and Step Size Visualization">
![Adjusted Horizon and Step Size Visualization](https://raw.githubusercontent.com/Nixtla/nixtla/readme_docs/nbs/_docs/docs/capabilities/online-anomaly-detection/02_adjusting_detection_process_files/figure-markdown_strict/cell-17-output-1.png)
</Frame>
**Choosing `h` and `step_size`** depends on the nature of your data:
- Frequent or short anomalies: Use smaller `h` and `step_size`
- Smooth or longer trends: Choose larger `h` and `step_size`
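As a rough illustration of the two regimes, here is a sketch reusing the `df` from the steps above; the specific values are illustrative, not recommendations:
```python
# Short, frequent anomalies: small windows, maximal overlap
anomaly_short = nixtla_client.detect_anomalies_online(
    df, freq='D', h=2, step_size=1, level=80, detection_size=150
)

# Smooth, longer-lived deviations: larger windows, little or no overlap
anomaly_long = nixtla_client.detect_anomalies_online(
    df, freq='D', h=14, step_size=14, level=80, detection_size=150
)
```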
## Summary
You've learned how to control TimeGPT's anomaly detection process through:
1. **Baseline detection** using default parameters
2. **Fine-tuning** with custom training iterations and loss functions
3. **Window adjustment** using forecast horizon and step size parameters
Experiment with these parameters to optimize detection for your specific use case and data patterns.
## Frequently Asked Questions
**How do I reduce false positives in anomaly detection?**
Increase the `level` parameter (e.g., from 80 to 95 or 99) to make detection stricter, or use fine-tuning parameters like `finetune_steps` to improve forecast accuracy.
**What's the difference between detection_size and step_size?**
`detection_size` determines how many data points to analyze, while `step_size` controls the interval between detection windows when using overlapping windows.
**When should I use fine-tuning for anomaly detection?**
Use fine-tuning when you have domain-specific patterns or when baseline detection produces too many false positives. Fine-tuning helps TimeGPT better understand your specific time series characteristics.
**How does overlapping windows improve detection?**
When `step_size` < `h`, TimeGPT analyzes the same time steps multiple times from different perspectives, making detection more robust but requiring more computation.
---
title: "Online (Real-Time) Anomaly Detection"
description: "Learn how to detect anomalies in real-time streaming data using TimeGPT's detect_anomalies_online method. Complete Python tutorial with code examples for monitoring server logs, IoT sensors, and live data streams."
icon: "bolt"
---
## Overview
Real-time anomaly detection enables you to identify unusual patterns in streaming time series data instantly—essential for monitoring server performance, detecting fraud, identifying system failures, and tracking IoT sensor anomalies. TimeGPT's `detect_anomalies_online` method provides:
- **Flexible Control**: Fine-tune detection sensitivity and confidence levels
- **Local & Global Detection**: Analyze individual series or detect system-wide anomalies across multiple correlated metrics
- **Stream Processing**: Monitor live data feeds with rolling window analysis
## Common Use Cases
- **Server Monitoring**: Detect CPU spikes, memory leaks, and downtime
- **IoT Sensors**: Identify equipment failures and sensor malfunctions
- **Fraud Detection**: Flag suspicious transactions in real-time
- **Application Performance**: Monitor API response times and error rates
## Quick Start
[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/Nixtla/nixtla/blob/main/nbs/docs/capabilities/online-anomaly-detection/01_quickstart.ipynb)
### Step 1: Set up your environment
Initialize your Python environment by importing the required libraries:
```python
import pandas as pd
from nixtla import NixtlaClient
import matplotlib.pyplot as plt
```
### Step 2: Configure your NixtlaClient
Provide your API key (and optionally a custom base URL).
```python
nixtla_client = NixtlaClient(
# defaults to os.environ.get("NIXTLA_API_KEY")
api_key='my_api_key_provided_by_nixtla'
)
```
### Step 3: Load your dataset
We use a minute-level time series dataset that monitors server usage. This dataset is ideal for showcasing streaming data scenarios, where the task is to detect server failures or downtime in real time.
```python
df = pd.read_csv(
'https://datasets-nixtla.s3.us-east-1.amazonaws.com/machine-1-1.csv',
parse_dates=['ts']
)
```
We observe that the time series remains stable during the initial period; however, a spike occurs in the last 20 steps, indicating anomalous behavior. Our goal is to capture this abnormal jump as soon as it appears.
<Frame caption="Server Data with Spike Anomaly">
![Server Data with Spike Anomaly](https://raw.githubusercontent.com/Nixtla/nixtla/readme_docs/nbs/_docs/docs/capabilities/online-anomaly-detection/01_quickstart_files/figure-markdown_strict/cell-11-output-1.png)
</Frame>
### Step 4: Detect anomalies in real time
The `detect_anomalies_online` method detects anomalies in a time series by leveraging TimeGPT's forecasting power. Because it uses the forecast error to decide which steps are anomalous, you can specify and tune the same parameters as in the `forecast` method. The method returns a DataFrame containing anomaly flags and anomaly scores (whose absolute value quantifies how abnormal each value is).
To perform real-time anomaly detection, set the following parameters:
- `df`: A pandas DataFrame containing the time series data.
- `time_col`: The column that identifies the datestamp.
- `target_col`: The variable to forecast.
- `h`: The forecast horizon, i.e., the number of steps ahead to forecast.
- `freq`: The frequency of the time series in pandas format.
- `level`: The percentile of the score distribution at which the threshold is set, controlling how strictly anomalies are flagged. Defaults to 99.
- `detection_size`: The number of steps at the end of the time series to analyze for anomalies.
```python
anomaly_online = nixtla_client.detect_anomalies_online(
df,
time_col='ts',
target_col='y',
freq='min', # Specify the frequency of the data
h=10, # Specify the forecast horizon
level=99, # Set the confidence level for anomaly detection
detection_size=100 # Number of steps to analyze for anomalies
)
anomaly_online.tail()
```
```bash Log Output
INFO:nixtla.nixtla_client:Validating inputs...
INFO:nixtla.nixtla_client:Preprocessing dataframes...
INFO:nixtla.nixtla_client:Calling Online Anomaly Detector Endpoint...
```
View last 5 anomaly detections:
| unique_id | ts | y | TimeGPT | anomaly | anomaly_score | TimeGPT-hi-99 | TimeGPT-lo-99 |
| ------------------ | --------------------- | ---------- | ---------- | --------- | --------------- | --------------- | --------------- |
| machine-1-1_y_29 | 2020-02-01 22:11:00 | 0.606017 | 0.544625 | True | 18.463266 | 0.553161 | 0.536090 |
| machine-1-1_y_29 | 2020-02-01 22:12:00 | 0.044413 | 0.570869 | True | -158.933850 | 0.579404 | 0.562333 |
| machine-1-1_y_29 | 2020-02-01 22:13:00 | 0.038682 | 0.560303 | True | -157.474880 | 0.568839 | 0.551767 |
| machine-1-1_y_29 | 2020-02-01 22:14:00 | 0.024355 | 0.521797 | True | -150.178240 | 0.530333 | 0.513261 |
| machine-1-1_y_29 | 2020-02-01 22:15:00 | 0.044413 | 0.467860 | True | -127.848560 | 0.476396 | 0.459325 |
<Frame caption="Identified Anomalies">
![Identified Anomalies](https://raw.githubusercontent.com/Nixtla/nixtla/readme_docs/nbs/_docs/docs/capabilities/online-anomaly-detection/01_quickstart_files/figure-markdown_strict/cell-13-output-1.png)
</Frame>
From the plot, we observe that the anomalous period is promptly detected.
<Check>
Here we use a detection size of 100 to illustrate the anomaly detection process. In production, running detections more frequently with smaller detection sizes can help identify anomalies as soon as they occur.
</Check>
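In a production setting you would typically wrap the detection call in a scheduler or polling loop. The following is a minimal sketch, not a production pattern: `fetch_latest_window()` is a hypothetical placeholder for however you pull the most recent observations from your data store.
```python
import time

while True:
    window_df = fetch_latest_window()  # hypothetical helper returning recent rows
    flags = nixtla_client.detect_anomalies_online(
        window_df,
        time_col='ts',
        target_col='y',
        freq='min',
        h=10,
        level=99,
        detection_size=10,  # inspect only the newest steps
    )
    if flags['anomaly'].any():
        # Surface the flagged timestamps, values, and scores
        print(flags.loc[flags['anomaly'], ['ts', 'y', 'anomaly_score']])
    time.sleep(60)  # wait one minute before the next check
```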
## Frequently Asked Questions
**What's the difference between online and historical anomaly detection?**
Online detection analyzes recent data windows for immediate alerting, while historical detection analyzes complete datasets for pattern discovery.
**Can I adjust detection sensitivity?**
Yes, tune the `level` parameter (confidence threshold) and `detection_size` (analysis window) to control false positive rates.
## Next Steps
Now that you've detected your first anomalies in real-time, explore these guides to optimize your detection:
- [Controlling the Anomaly Detection Process](/anomaly_detection/real-time/adjusting_detection) - Learn how to fine-tune key parameters for more accurate detection
- [Local vs Global Anomaly Detection](/anomaly_detection/real-time/univariate_multivariate) - Choose the right detection strategy for single vs multiple correlated time series
---
title: "Local vs Global Anomaly Detection"
description: "Compare local vs global anomaly detection methods for time series. Learn when to use univariate detection for independent metrics vs multivariate detection for correlated server data with Python examples."
icon: "chart-mixed"
---
## Overview
When monitoring multiple time series simultaneously, such as server metrics (CPU, memory, disk I/O), you need to choose between local and global anomaly detection strategies. This guide demonstrates:
- **Local (Univariate) Detection**: Analyzing each time series independently for isolated metric anomalies
- **Global (Multivariate) Detection**: Analyzing all time series collectively to detect system-wide failures
Both methods use TimeGPT's `detect_anomalies_online` with the `threshold_method` parameter. The main difference is whether anomalies are identified individually per series (local) or collectively across multiple correlated series (global).
For an introduction to real-time anomaly detection, see our [Real-Time Anomaly Detection guide](/anomaly_detection/real-time/introduction). To learn about parameter tuning, check out [Controlling the Anomaly Detection Process](/anomaly_detection/real-time/adjusting_detection).
## When to Use Each Method
### Use Local Detection When:
- Monitoring independent, uncorrelated metrics
- Each metric has distinct baseline behavior
- You need low computational overhead
- False positives in individual series are acceptable
### Use Global Detection When:
- Monitoring correlated server or system metrics
- System-wide failures affect multiple metrics simultaneously
- You need to detect coordinated anomalies (e.g., CPU spike + memory spike + network spike)
- Reducing false positives by considering metric relationships
## How to Detect Anomalies Across Multiple Time Series
[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/Nixtla/nixtla/blob/main/nbs/docs/capabilities/online-anomaly-detection/03_univariate_vs_multivariate_anomaly_detection.ipynb)
### Step 1: Set Up Your Environment
Import dependencies that you will use in the tutorial.
```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from nixtla import NixtlaClient
```
Create a NixtlaClient instance. Replace 'my_api_key_provided_by_nixtla' with your actual API key.
```python
nixtla_client = NixtlaClient(
api_key='my_api_key_provided_by_nixtla'
)
```
### Step 2: Load the Dataset
This tutorial uses the SMD (Server Machine Dataset), a benchmark dataset for anomaly detection across multiple time series. SMD monitors abnormal patterns in server machine data.
We analyze monitoring data from a single server (machine-1-1) containing 38 time series. Each series represents a different server metric: CPU usage, memory usage, disk I/O, network throughput, and other system performance indicators.
```python
df = pd.read_csv(
'https://datasets-nixtla.s3.us-east-1.amazonaws.com/SMD_test.csv',
parse_dates=['ts']
)
df.unique_id.nunique()
```
Output:
```bash
38
```
### Step 3: Local and Global Anomaly Detection Methods
#### Method Comparison
| Aspect | Local (Univariate) | Global (Multivariate) |
|--------|-------------------|----------------------|
| **Analysis Scope** | Individual series | All series collectively |
| **Best For** | Independent metrics | Correlated metrics |
| **Computational Cost** | Low | Higher |
| **System-wide Anomalies** | May miss | Detects effectively |
| **Parameter** | `threshold_method='univariate'` | `threshold_method='multivariate'` |
#### Step 3.1: Local Method
Local anomaly detection analyzes each time series in isolation, flagging anomalies based on each series' individual deviation from its expected behavior. This approach is efficient for individual metrics or when correlations between metrics are not relevant. However, it may miss large-scale, system-wide anomalies that are only apparent when multiple series deviate simultaneously.
Example usage:
```python
anomaly_online = nixtla_client.detect_anomalies_online(
df[['ts', 'y', 'unique_id']],
time_col='ts',
target_col='y',
freq='h',
h=24,
level=95,
detection_size=475,
threshold_method='univariate' # local anomaly detection
)
```
Log output:
```bash
INFO:nixtla.nixtla_client:Validating inputs...
INFO:nixtla.nixtla_client:Preprocessing dataframes...
WARNING:nixtla.nixtla_client:Detection size is large. Using the entire series to compute the anomaly threshold...
INFO:nixtla.nixtla_client:Calling Online Anomaly Detector Endpoint...
```
Visualize the anomalies:
```python
# Utility function to plot anomalies
def plot_anomalies(df, unique_ids, rows, cols):
fig, axes = plt.subplots(rows, cols, figsize=(12, rows * 2))
    for ax, uid in zip(axes.flatten(), unique_ids):
filtered_df = df[df['unique_id'] == uid]
ax.plot(filtered_df['ts'], filtered_df['y'], color='navy', alpha=0.8, label='y')
ax.plot(filtered_df['ts'], filtered_df['TimeGPT'], color='orchid', alpha=0.7, label='TimeGPT')
ax.scatter(
filtered_df.loc[filtered_df['anomaly'] == 1, 'ts'],
filtered_df.loc[filtered_df['anomaly'] == 1, 'y'],
color='orchid', label='Anomalies Detected'
)
ax.set_title(f"Unique_id: {uid}", fontsize=8)
ax.tick_params(axis='x', labelsize=6)
fig.legend(loc='upper center', ncol=3, fontsize=8, labels=['y', 'TimeGPT', 'Anomaly'])
plt.tight_layout(rect=[0, 0, 1, 0.95])
plt.show()
display_ids = ['machine-1-1_y_0', 'machine-1-1_y_1', 'machine-1-1_y_6', 'machine-1-1_y_29']
plot_anomalies(anomaly_online, display_ids, rows=2, cols=2)
```
<Frame caption="Local Anomaly Detection Results">
![Local Anomaly Detection Results](https://raw.githubusercontent.com/Nixtla/nixtla/readme_docs/nbs/_docs/docs/capabilities/online-anomaly-detection/03_univariate_vs_multivariate_anomaly_detection_files/figure-markdown_strict/cell-13-output-1.png)
</Frame>
*This figure highlights anomalies detected in four selected metrics. Each metric is analyzed independently, so anomalies reflect unusual behavior within that series alone.*
#### Step 3.2: Global Method
Global anomaly detection considers all time series collectively, flagging a time step as anomalous if the aggregate deviation across all series at that time exceeds a threshold. This approach captures systemic or correlated anomalies that might be missed when analyzing each series in isolation. However, it comes with slightly higher complexity and computational overhead, and may require careful threshold tuning.
Example usage:
```python
anomaly_online_multi = nixtla_client.detect_anomalies_online(
df[['ts', 'y', 'unique_id']],
time_col='ts',
target_col='y',
freq='h',
h=24,
level=95,
detection_size=475,
threshold_method='multivariate' # global anomaly detection
)
```
Log output:
```bash
INFO:nixtla.nixtla_client:Validating inputs...
INFO:nixtla.nixtla_client:Preprocessing dataframes...
WARNING:nixtla.nixtla_client:Detection size is large. Using the entire series to compute the anomaly threshold...
INFO:nixtla.nixtla_client:Calling Online Anomaly Detector Endpoint...
```
Visualize the anomalies:
```python
plot_anomalies(anomaly_online_multi, display_ids, rows=2, cols=2)
```
<Frame caption="Global Anomaly Detection Results">
![Global Anomaly Detection Results](https://raw.githubusercontent.com/Nixtla/nixtla/readme_docs/nbs/_docs/docs/capabilities/online-anomaly-detection/03_univariate_vs_multivariate_anomaly_detection_files/figure-markdown_strict/cell-15-output-1.png)
</Frame>
*In global mode, an anomaly is flagged when the combined deviation across these series reaches a threshold. This can reveal system-wide anomalies.*
In global anomaly detection, anomaly scores from all series at each time step are aggregated. A step is anomalous if the combined score exceeds the threshold. This reveals systemic anomalies that may go unnoticed if each series is considered alone.
## Real-World Use Cases
### Local Detection Examples:
- **Independent application metrics**: Response time, error rates, request counts for different microservices
- **IoT sensor networks**: Temperature sensors at different locations with no correlation
- **Business metrics**: Sales figures across different product categories
### Global Detection Examples:
- **Server monitoring**: CPU, memory, disk I/O, and network metrics from the same server
- **Distributed system health**: Correlated metrics across multiple nodes indicating cluster-wide issues
- **Manufacturing equipment**: Multiple sensor readings from a single machine indicating equipment failure
## Summary
- **Local:** Best for detecting anomalies in a single metric or uncorrelated metrics. Low computational overhead, but may overlook cross-series patterns.
- **Global:** Considers correlations across metrics, capturing system-wide issues. More complex and computationally intensive than local methods.
Both detection approaches use Nixtla's online anomaly detection method. Choose the strategy that best fits your use case and data characteristics.
## Frequently Asked Questions
**What's the difference between univariate and multivariate anomaly detection?**
Univariate (local) detection analyzes each time series independently using the `threshold_method='univariate'` parameter, while multivariate (global) detection analyzes all series together using `threshold_method='multivariate'`, considering correlations between metrics.
**When should I use global detection instead of local?**
Use global detection when your time series are correlated and system-wide failures affect multiple metrics simultaneously, such as monitoring CPU, memory, and network metrics from the same server.
**Does global detection increase computational cost?**
Yes, global detection requires analyzing relationships across all time series, making it more computationally intensive. However, it can reduce overall false positives by considering metric correlations.
**Can I run both local and global detection?**
Yes, you can run both methods and compare results. Local detection may catch metric-specific anomalies while global detection identifies system-wide issues.
---
title: "Audit and Clean Data"
description: "Learn how to audit and clean your data with TimeGPT."
icon: "table"
---
The `audit_data` and `clean_data` methods from TimeGPT can help you identify and fix potential issues in your data.
The `audit_data` method checks for common problems such as duplicates, missing dates, categorical columns, negative values, and leading zeros. While not all issues will result in errors, addressing them can improve the quality of the forecasts, depending on your specific use case.
Once identified, `clean_data` can be used to automatically fix these issues.
[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/Nixtla/nixtla/blob/main/nbs/docs/tutorials/24_audit_data.ipynb)
## How to Use the Audit and Clean Methods
### Step 1: Import Packages
To use the `audit_data` and `clean_data` methods, you first need to import and instantiate the `NixtlaClient` class.
```python
import pandas as pd
from nixtla import NixtlaClient
nixtla_client = NixtlaClient(
api_key='my_api_key_provided_by_nixtla' # defaults to os.environ.get("NIXTLA_API_KEY")
)
```
### Step 2: Create Minimal Example
The `audit_data` method performs a series of checks to identify issues in your data. These checks fall into two categories:
<table>
<thead>
<tr>
<th><strong>Check Type</strong></th>
<th><strong>Description</strong></th>
<th><strong>Checks Performed</strong></th>
</tr>
</thead>
<tbody>
<tr>
<td><strong>Fail</strong></td>
<td>Issues that will cause errors when you run TimeGPT</td>
<td>
Duplicate rows (D001)<br />
Missing dates (D002)<br />
Categorical feature columns (F001)
</td>
</tr>
<tr>
<td><strong>Case-specific</strong></td>
<td>Issues that may not cause errors but could negatively affect your results</td>
<td>
Negative values (V001)<br />
Leading zeros (V002)
</td>
</tr>
</tbody>
</table>
To show how the `audit_data` method works, we will create a sample dataset with missing dates, negative values and leading zeros.
```python
df = pd.DataFrame({
'unique_id': ['id1', 'id1', 'id1', 'id2', 'id2', 'id2', 'id2', 'id3', 'id3', 'id3', 'id3'],
'ds': ['2023-01-01', '2023-01-03', '2023-01-04', '2023-01-01', '2023-01-02', '2023-01-03', '2023-01-04', '2023-01-01', '2023-01-02', '2023-01-03', '2023-01-04'],
'y': [1, 1, 1, 0, 0, 1, 2, -1, 0, 1, -2]
})
df
```
| unique_id | ds | y |
|-----------|------------|----|
| id1 | 2023-01-01 | 1 |
| id1 | 2023-01-03 | 1 |
| id1 | 2023-01-04 | 1 |
| id2 | 2023-01-01 | 0 |
| id2 | 2023-01-02 | 0 |
| id2 | 2023-01-03 | 1 |
| id2 | 2023-01-04 | 2 |
| id3 | 2023-01-01 | -1 |
| id3 | 2023-01-02 | 0 |
| id3 | 2023-01-03 | 1 |
| id3 | 2023-01-04 | -2 |
### Step 3: Audit Data
The `audit_data` method requires the following parameters:
- `df` *(required)*: A pandas DataFrame with your input data.
- `freq` *(required)*: The frequency of your time series data (e.g., `D` for daily, `M` for monthly).
- `id_col`: Column name identifying each unique series. Default is `unique_id`.
- `time_col`: Column name containing timestamps. Default is `ds`.
- `target_col`: Column name containing the target variable. Default is `y`.
Additionally, you can use the following optional parameters to specify how missing dates are identified:
- `start`: The initial timestamp for the series.
- `end`: The final timestamp for the series.
Both `start` and `end` can take the following options:
- `per_serie`: Uses the first or last timestamp of each individual series.
- `global`: Uses the earliest or latest timestamp from the entire dataset.
- A specific timestamp or integer (e.g., `2025-01-01`, `2025`, or `datetime(2025, 1, 1)`).
```python
all_pass, fail_dfs, case_specific_dfs = nixtla_client.audit_data(
df = df,
freq = 'D',
start = 'per_serie',
end = 'per_serie'
)
```
The `audit_data` method returns three values:
- **all_pass** (bool): True if every check passed, otherwise False.
- **fail_dfs** (dict): Any failed tests (D001, D002 or F001), each paired with the rows that failed.
- **case_specific_dfs** (dict): Any case-specific tests (V001 or V002), each paired with the rows flagged.
In the example above, the `audit_data` method should find missing dates (D002), negative values (V001), and leading zeros (V002).
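You can confirm which checks fired by inspecting the returned dictionaries; the exact keys depend on which checks were flagged in your data. A minimal sketch using the outputs of the call above:
```python
print(all_pass)                   # False, since some checks were flagged
print(fail_dfs.keys())            # expected here: dict_keys(['D002'])
print(case_specific_dfs.keys())   # expected here: dict_keys(['V001', 'V002'])
```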
### Step 4: Clean Data
The `clean_data` method fixes the issues identified by the `audit_data` method. It requires the output of `audit_data`, so it must always be run after it. The `clean_data` method takes the following parameters:
- `df` *(required)*: A pandas DataFrame with your input data.
- `fail_dict` *(required)*: A dictionary with failed checks, as returned by the `audit_data` method.
- `case_specific_dict` *(required)*: A dictionary with case-specific checks, also returned by the `audit_data` method.
- `freq` *(required)*: The frequency of your time series data (e.g., `D` for daily, `M` for monthly). Can be a string, integer, or pandas offset.
- `clean_case_specific`: Whether to clean case-specific issues (e.g., negative values, leading zeros). Default is `False`.
- `id_col`: Column name identifying each unique series. Default is `unique_id`.
- `time_col`: Column name containing timestamps or integer steps. Default is `ds`.
- `target_col`: Column name containing the target variable. Default is `y`.
```python
clean_df, all_pass, fail_dfs, case_specific_dfs = nixtla_client.clean_data(
df = df,
fail_dict = fail_dfs,
case_specific_dict = case_specific_dfs,
clean_case_specific = True,
freq = 'D'
)
clean_df
```
| unique_id | ds | y |
|-----------|------------|-----|
| id1 | 2023-01-01 | 1.0 |
| id1 | 2023-01-03 | 1.0 |
| id1 | 2023-01-04 | 1.0 |
| id1 | 2023-01-02 | NaN |
| id2 | 2023-01-03 | 1.0 |
| id2 | 2023-01-04 | 2.0 |
| id3 | 2023-01-01 | 0.0 |
| id3 | 2023-01-02 | 0.0 |
| id3 | 2023-01-03 | 1.0 |
| id3 | 2023-01-04 | 0.0 |
In this example, `clean_data` added the missing date in `id1`, removed the leading zeros in `id2`, and replaced the negative values in `id3`.
However, replacing negative values with zeros introduced new leading zeros in `id3`, so a second run of `clean_data` is required.
```python
clean_df2, all_pass, fail_dfs, case_specific_dfs = nixtla_client.clean_data(
df = clean_df,
fail_dict = fail_dfs,
case_specific_dict = case_specific_dfs,
clean_case_specific = True, # if False, the case-specific tests will be ignored
freq = 'D'
)
clean_df2
```
| unique_id | ds | y |
|-----------|------------|-----|
| id1 | 2023-01-01 | 1.0 |
| id1 | 2023-01-03 | 1.0 |
| id1 | 2023-01-04 | 1.0 |
| id1 | 2023-01-02 | NaN |
| id2 | 2023-01-03 | 1.0 |
| id2 | 2023-01-04 | 2.0 |
| id3 | 2023-01-03 | 1.0 |
| id3 | 2023-01-04 | 0.0 |
After the second run of `clean_data`, the leading zeros in `id3` have been removed.
The only remaining step is to fill the missing value created when the missing date was added in `id1`, and to sort the DataFrame by `unique_id` and `ds`.
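One way to finish, shown here as a sketch that fills the gap with zero to match the table below (choose a fill value appropriate to your data):
```python
# Fill the remaining NaN and sort by series and date
clean_df2['y'] = clean_df2['y'].fillna(0)
clean_df2 = clean_df2.sort_values(['unique_id', 'ds']).reset_index(drop=True)
clean_df2
```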
| unique_id | ds | y |
|-----------|------------|-----|
| id1 | 2023-01-01 | 1.0 |
| id1 | 2023-01-02 | 0.0 |
| id1 | 2023-01-03 | 1.0 |
| id1 | 2023-01-04 | 1.0 |
| id2 | 2023-01-03 | 1.0 |
| id2 | 2023-01-04 | 2.0 |
| id3 | 2023-01-03 | 1.0 |
| id3 | 2023-01-04 | 0.0 |
## Conclusion
The `audit_data` method helps you identify issues that may prevent TimeGPT from running properly.
These include fail tests (duplicate rows, missing dates, and categorical feature columns), which will always result in errors if not addressed.
It also flags case-specific issues (negative values and leading zeros), which may not cause errors but can affect the quality of your forecasts depending on your use case.
The `clean_data` method can automatically fix the issues identified by `audit_data`.
Be cautious when removing negative values or leading zeros, as they may contain important information about your data.
Above all, when auditing and cleaning your data, make decisions based on the needs and context of your specific use case.
---
title: "Data Requirements"
description: "Overview of the data format and requirements for TimeGPT forecasting."
icon: "table"
---
<Info>
TimeGPT accepts **pandas** and **polars** dataframes in [long format](https://www.theanalysisfactor.com/wide-and-long-data/#comments). The minimum required columns are:
</Info>
<CardGroup cols={2}>
<Card title="Required Columns">
- **unique_id**: String or numerical value to label each series.
    - **ds** (timestamp): String or datetime in `YYYY-MM-DD` or `YYYY-MM-DD HH:MM:SS` format.
    - **y** (numeric): Numerical target variable to forecast.
</Card>
<Card title="Optional Index">
If a DataFrame lacks the `ds` column but uses a **DatetimeIndex**, that is also supported.
</Card>
</CardGroup>
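For illustration, a hypothetical DataFrame like the following, which carries its timestamps on a `DatetimeIndex` instead of an explicit `ds` column, is also accepted:
```python
import pandas as pd

# Timestamps live on the index rather than in a `ds` column
df_indexed = pd.DataFrame(
    {'unique_id': 'series1', 'y': range(120)},
    index=pd.date_range('2015-01-01', periods=120, freq='MS'),
)
```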
<Check>
TimeGPT also supports distributed dataframe libraries such as **dask**, **spark**, and **ray**.
</Check>
[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/Nixtla/nixtla/blob/main/nbs/docs/getting-started/5_data_requirements.ipynb)
<Info>
You can include additional exogenous features in the same DataFrame. See the [Exogenous Variables tutorial](/forecasting/exogenous-variables/numeric_features) for details.
</Info>
---
## Example DataFrame
Below is a sample of a valid input DataFrame for TimeGPT (with columns named `timestamp` and `value` instead of `ds` and `y`):
```python Sample Data Loading
import pandas as pd
df = pd.read_csv('https://raw.githubusercontent.com/Nixtla/transfer-learning-time-series/main/datasets/air_passengers.csv')
df["unique_id"] = "series1"
df.head()
```
<Card title="Data Preview">
**Sample Data Preview**
| **unique_id** | **timestamp** | **value** |
| ----- | ------------ | ------- |
| series1 | 1949-01-01 | 112 |
| series1 | 1949-02-01 | 118 |
| series1 | 1949-03-01 | 132 |
| series1 | 1949-04-01 | 129 |
| series1 | 1949-05-01 | 121 |
</Card>
In this example:
- `unique_id` identifies the series.
- `timestamp` corresponds to `ds`.
- `value` corresponds to `y`.
---
## Matching Columns to TimeGPT
You can choose how to align your DataFrame columns with TimeGPT’s expected structure:
<Tabs>
<Tab title="Rename Columns">
Rename `timestamp` to `ds` and `value` to `y`:
```python Rename Columns Example
df = df.rename(columns={'timestamp': 'ds', 'value': 'y'})
```
Now your DataFrame has the explicitly required columns:
```python Show Head of DataFrame
print(df.head())
```
</Tab>
<Tab title="Use time_col & target_col">
Specify column names directly when calling `NixtlaClient`:
```python NixtlaClient Forecast Example
from nixtla import NixtlaClient
nixtla_client = NixtlaClient(api_key='my_api_key_provided_by_nixtla')
fcst = nixtla_client.forecast(
df=df,
h=12,
time_col='timestamp',
target_col='value'
)
fcst.head()
```
This way, you don’t need to rename your DataFrame columns, as TimeGPT will know which ones to treat as `ds` and `y`.
</Tab>
</Tabs>
---
## Example Forecast
When you run the forecast method:
```python Forecast Example
fcst = nixtla_client.forecast(
df=df,
h=12,
time_col='timestamp',
target_col='value'
)
fcst.head()
```
<Accordion title="Forecast Logs">
```bash Forecast Logs
INFO:nixtla.nixtla_client:Validating inputs...
INFO:nixtla.nixtla_client:Inferred freq: MS
INFO:nixtla.nixtla_client:Preprocessing dataframes...
INFO:nixtla.nixtla_client:Querying model metadata...
INFO:nixtla.nixtla_client:Restricting input...
INFO:nixtla.nixtla_client:Calling Forecast Endpoint...
```
</Accordion>
<Frame caption="Forecast Output Preview">
| unique_id | timestamp | TimeGPT |
| ----- | ------------ | ----------- |
| series1 | 1961-01-01 | 437.83792 |
| series1 | 1961-02-01 | 426.06270 |
| series1 | 1961-03-01 | 463.11655 |
| series1 | 1961-04-01 | 478.24450 |
| series1 | 1961-05-01 | 505.64648 |
</Frame>
<Info>
TimeGPT attempts to automatically infer your data’s frequency (`freq`). You can override this by specifying the **freq** parameter (e.g., `freq='MS'`).
</Info>
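For example, to skip inference and force a monthly-start frequency, a sketch reusing the DataFrame above:
```python
fcst = nixtla_client.forecast(
    df=df,
    h=12,
    freq='MS',  # override the inferred frequency
    time_col='timestamp',
    target_col='value',
)
```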
For more information, see the [TimeGPT Quickstart](/forecasting/timegpt_quickstart).
---
## Important Considerations
<Warning>
Data passed to TimeGPT must not contain missing values or time gaps.
</Warning>
To handle missing data, see [Dealing with Missing Values in TimeGPT](/data_requirements/missing_values).
---
### Minimum Data Requirements (Azure AI)
<Info>
These are the minimum data sizes required for each frequency when using Azure AI:
</Info>
| Frequency | Minimum Size |
| ---------------------------------- | -------------- |
| Hourly and subhourly (e.g., "H") | 1008 |
| Daily ("D") | 300 |
| Weekly (e.g., "W-MON") | 64 |
| Monthly and others | 48 |
When preparing your data, also consider:
<Steps>
<Step title="Forecast horizon (h)">
Number of future periods you want to predict.
</Step>
<Step title="Number of validation windows (n_windows)">
How many times to test the model's performance.
</Step>
<Step title="Gaps (step_size)">
Periodic offset between validation windows during cross-validation.
</Step>
</Steps>
This ensures you have enough data for both training and evaluation.
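These three quantities come together in cross-validation. A minimal sketch with illustrative values, reusing the DataFrame from this page:
```python
cv_df = nixtla_client.cross_validation(
    df=df,
    h=12,          # forecast horizon
    n_windows=3,   # number of validation windows
    step_size=12,  # offset between consecutive window starts
    time_col='timestamp',
    target_col='value',
)
```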
---
title: "Missing Values"
description: "Learn how to handle missing values in time series data for accurate forecasting with TimeGPT."
icon: "table"
---
## Missing Values in Time Series
TimeGPT requires time series data without missing values. While you may have
multiple series starting and ending on different dates, each one must maintain
a continuous data sequence.
This tutorial shows you how to handle missing values for use with TimeGPT. For
reference, this tutorial is based on the skforecast tutorial:
[Forecasting Time Series with Missing Values](https://cienciadedatos.net/documentos/py46-forecasting-time-series-missing-values).
<Tip>
Managing missing values ensures your forecasts with TimeGPT are accurate and reliable.
When dates or values are missing, fill or interpolate them according to the nature of your dataset.
</Tip>
## Tutorial
[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/Nixtla/nixtla/blob/main/nbs/docs/tutorials/15_missing_values.ipynb)
### Step 1: Load Data
Load the daily bike rental counts dataset using pandas. Note that the original column names are in Spanish; you will rename them to match `ds` and `y`.
```python
import pandas as pd
df = pd.read_csv('https://raw.githubusercontent.com/JoaquinAmatRodrigo/Estadistica-machine-learning-python/master/data/usuarios_diarios_bicimad.csv')
df = df[['fecha', 'Usos bicis total día']]
df.rename(columns={'fecha': 'ds', 'Usos bicis total día': 'y'}, inplace=True)
df.head()
```
| | ds | y |
| ----- | ------------ | ----- |
| 0 | 2014-06-23 | 99 |
| 1 | 2014-06-24 | 72 |
| 2 | 2014-06-25 | 119 |
| 3 | 2014-06-26 | 135 |
| 4 | 2014-06-27 | 149 |
Next, convert your dates to timestamps and assign a unique identifier (`unique_id`) to handle multiple series if needed:
```python
df['ds'] = pd.to_datetime(df['ds'])
df['unique_id'] = 'id1'
df = df[['unique_id', 'ds', 'y']]
```
Reserve the last 93 days for testing:
```python
train_df = df[:-93]
test_df = df[-93:]
```
To simulate missing data, remove specific date ranges from the training dataset:
```python
mask = ~((train_df['ds'] >= '2020-09-01') & (train_df['ds'] <= '2020-10-10')) & \
~((train_df['ds'] >= '2020-11-08') & (train_df['ds'] <= '2020-12-15'))
train_df_gaps = train_df[mask]
```
### Step 2: Initialize TimeGPT
Initialize a `NixtlaClient` object with your Nixtla API key:
```python
from nixtla import NixtlaClient
nixtla_client = NixtlaClient(api_key='my_api_key_provided_by_nixtla')
```
### Step 3: Visualize Data
Plot your dataset and examine the gaps introduced above:
```python
nixtla_client.plot(train_df_gaps)
```
<Frame>
![Chart Image](https://raw.githubusercontent.com/Nixtla/nixtla/readme_docs/nbs/_docs/docs/tutorials/15_missing_values_files/figure-markdown_strict/cell-14-output-1.png)
</Frame>
Note that there are two gaps in the data: from September 1, 2020, to October 10,
2020, and from November 8, 2020, to December 15, 2020. To better visualize these
gaps, you can use the `max_insample_length` argument of the `plot` method or you
can simply zoom in on the plot.
```python
nixtla_client.plot(train_df_gaps, max_insample_length=800)
```
<Frame>
![Chart Image](https://raw.githubusercontent.com/Nixtla/nixtla/readme_docs/nbs/_docs/docs/tutorials/15_missing_values_files/figure-markdown_strict/cell-15-output-1.png)
</Frame>
Additionally, notice a period from March 16, 2020, to April 21, 2020, where the
data shows zero rentals. These are not missing values, but actual zeros
corresponding to the COVID-19 lockdown in the city.
### Step 4: Fill Missing Values
Before using TimeGPT, we need to ensure that:
1. All timestamps from the start date to the end date are present in the data.
2. The target column contains no missing values.
To address the first issue, we will use the `fill_gaps` function from `utilsforecast`,
a Python package from Nixtla that provides essential utilities for time series
forecasting, such as functions for data preprocessing, plotting, and evaluation.
The `fill_gaps` function will fill in the missing dates in the data. To do this,
it requires the following arguments:
- `df`: The DataFrame containing the time series data.
- `freq` (str or int): The frequency of the data.
```python
from utilsforecast.preprocessing import fill_gaps
print('Number of rows before filling gaps:', len(train_df_gaps))
train_df_complete = fill_gaps(train_df_gaps, freq='D')
print('Number of rows after filling gaps:', len(train_df_complete))
```
```bash
Number of rows before filling gaps: 2851
Number of rows after filling gaps: 2929
```
<Info>
In this tutorial, the data contains only one time series. However, TimeGPT supports passing multiple series to the model. In that case, no series can have missing values between its own earliest and latest timestamps. If a series does have missing values, you must decide how to fill those gaps. The `fill_gaps` function provides a couple of additional arguments to assist with this, namely `start` and `end` (refer to the documentation for complete details).
</Info>
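For instance, a hypothetical call aligning every series to the dataset-wide date range would use the `start` and `end` options:
```python
# Pad each series back to the earliest timestamp and forward to the
# latest timestamp observed anywhere in the dataset
train_df_complete = fill_gaps(train_df_gaps, freq='D', start='global', end='global')
```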
Now we need to decide how to fill the missing values in the target column. In
this tutorial, we will use interpolation, but it is important to consider the
specific context of your data when selecting a filling strategy. For example,
if you are dealing with daily retail data, a missing value most likely indicates
that there were no sales on that day, and you can fill it with zero. Conversely,
if you are working with hourly temperature data, a missing value probably means
that the sensor was not functioning, and you might prefer to use interpolation
to fill the missing values.
Here, we fill the newly inserted missing values using linear interpolation:
```python
train_df_complete['y'] = train_df_complete['y'].interpolate(
method='linear', limit_direction='both'
)
train_df_complete.isna().sum()
```
```bash
unique_id 0
ds 0
y 0
dtype: int64
```
### Step 5: Forecast with TimeGPT
Typically, a horizon of more than twice the typical seasonality is considered long. Here, the data has a weekly (7-day) seasonality and the forecast horizon is 93 days, so we will use the `timegpt-1-long-horizon` model.
```python
fcst = nixtla_client.forecast(
train_df_complete,
h=len(test_df),
model='timegpt-1-long-horizon'
)
```
Visualize the forecasts against the actual test data:
```python
nixtla_client.plot(test_df, fcst)
```
<Frame caption="Forecast comparison between the test dataset and TimeGPT predictions">
![Forecast with Missing Data Filled](https://raw.githubusercontent.com/Nixtla/nixtla/readme_docs/nbs/_docs/docs/tutorials/15_missing_values_files/figure-markdown_strict/cell-21-output-1.png)
</Frame>
Evaluate performance using `utilsforecast`. We will use Mean Absolute Error (MAE)
as the evaluation metric, but you can choose others like MSE, RMSE, etc.:
```python
from utilsforecast.evaluation import evaluate
from utilsforecast.losses import mae
fcst['ds'] = pd.to_datetime(fcst['ds'])
result = test_df.merge(fcst, on=['ds', 'unique_id'], how='left')
evaluate(result, metrics=[mae])
```
| | unique_id | metric | TimeGPT |
| ----- | ----------- | -------- | ------------- |
| 0 | id1 | mae | 1824.693059 |
### Step 6: Conclusion
- Always ensure that your data is free of missing dates and values before forecasting with TimeGPT.
- Select a gap-filling strategy based on your domain knowledge (linear interpolation, constant filling, etc.).
## References
- [Exclude COVID Impact in Time Series Forecasting](https://www.cienciadedatos.net/documentos/py45-weighted-time-series-forecasting.html)
- [Forecasting Time Series with Missing Values](https://cienciadedatos.net/documentos/py46-forecasting-time-series-missing-values.html)
---
title: "Multiple Time Series"
description: "Learn how to handle missing values in time series data for accurate forecasting with TimeGPT."
icon: "table"
---
You can pass multiple time series within the same dataset to TimeGPT. We can then make forecasts or detect anomalies on all series simultaneously.
To include multiple series, simply add a unique identifier column. By default, this column is expected to be called `unique_id`. The identifier assigns a distinct value to each series so that we can tell them apart.
## Load Data with Multiple Series
Here is an example of loading a dataset with multiple series inside.
```python
import pandas as pd

df = pd.read_csv('https://raw.githubusercontent.com/Nixtla/transfer-learning-time-series/main/datasets/electricity-short-with-ex-vars.csv')
df['ds'] = pd.to_datetime(df['ds'])
df = df[["unique_id", "ds", "y"]]
df.groupby('unique_id').head(1)
```
<Frame caption="Multiple-Series Data Preview">
| unique_id | ds | y |
| --------- | ---------- | ------ |
| BE | 2016-10-22 | 70.00 |
| DE | 2017-10-22 | 19.10 |
| FR | 2016-10-22 | 54.70 |
| NP | 2018-10-15 | 2.17 |
</Frame>
Above, we can see that we have four unique series in the dataset, as there are four different values in `unique_id`. Note that each series can start at different dates.
To forecast multiple series, we can simply call:
```python Multiple Series Forecast Example
fcst = nixtla_client.forecast(df=df, h=24)
fcst.head()
```
TimeGPT will produce forecasts for all unique IDs in your DataFrame simultaneously.
### Specifying the series identifier column
If the series identifier is not stored in a column called `unique_id`, you can specify the column name when calling TimeGPT:
```python Specify the name of the column for the series identifier
fcst = nixtla_client.forecast(df=df, h=24, id_col="your_column_name")
fcst.head()
```
---
## Exogenous Variables
TimeGPT supports the use of exogenous features: variables that are not part of the series you are trying to forecast but that carry information relevant to it.
For example, suppose you are forecasting electricity consumption, which is affected by the outside temperature. In this case, temperature is an exogenous feature: you want to use the information it provides to forecast electricity consumption.
Exogenous features can be included as additional columns in the dataset. Any column beyond the standard `unique_id`, `ds`, `y` format is considered an exogenous feature.
Here is an example of loading a dataset with multiple series inside and exogenous features.
```python Multiple Series Data Loading
df = pd.read_csv('https://raw.githubusercontent.com/Nixtla/transfer-learning-time-series/main/datasets/electricity-short-with-ex-vars.csv')
df['ds'] = pd.to_datetime(df['ds'])
df.groupby('unique_id').head(1)
```
<Frame caption="Multiple-Series with Exogenous Features Preview">
| unique_id | ds | y | Exogenous1 | Exogenous2 | day_0 | day_1 | day_2 | day_3 | day_4 | day_5 | day_6 |
|-----------|----|----|------------|------------|-------|-------|-------|-------|-------|-------|-------|
| BE | 2016-10-22 | 70.00 | 57253.00 | 49593 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 |
| DE | 2017-10-22 | 19.10 | 16972.75 | 15779 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 |
| FR | 2016-10-22 | 54.70 | 57253.00 | 49593 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 |
| NP | 2018-10-15 | 2.17 | 34078.00 | 1791 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
</Frame>
Above, we can see that the columns from `Exogenous1` to `day_6` will be treated as exogenous features when forecasting with TimeGPT.
For more information on forecasting with exogenous features, see the [Exogenous Variables tutorial](/forecasting/exogenous-variables/numeric_features).
---
{
"$schema": "https://mintlify.com/docs.json",
"banner": {
"content": "🚀 **TimeGPT-2** is here, the next generation of foundation models for time series forecasting! [Read the full announcement](https://www.nixtla.io/blog/timegpt-2-announcement)",
"dismissible": true
},
"theme": "mint",
"name": "TimeGPT Foundational model for time series forecasting and anomaly detection",
"colors": {
"primary": "#161616",
"light": "#FFF",
"dark": "#161616"
},
"favicon": "/favicon.ico",
"ignore": [
"utils/**",
"depracted-docs/**",
"pending_content/**"
],
"navigation": {
"groups": [
{
"group": "INTRODUCTION",
"pages": [
"/introduction/introduction",
"/introduction/why_timegpt",
"/introduction/about_timegpt",
"/introduction/timegpt_subscription_plans",
"/introduction/faq"
]
},
{
"group": "SETUP",
"pages": [
"/setup/setting_up_your_api_key",
"/setup/python_wheel",
"/setup/docker",
"/setup/azureai"
]
},
{
"group": "DATA REQUIREMENTS",
"pages": [
"/data_requirements/data_requirements",
"/data_requirements/multiple_series",
"/data_requirements/missing_values",
"/data_requirements/audit_clean"
]
},
{
"group": "FORECASTING",
"pages": [
"/forecasting/timegpt_quickstart",
{
"group": "Exogenous variables",
"pages": [
"/forecasting/exogenous-variables/numeric_features",
"/forecasting/exogenous-variables/categorical_features",
"/forecasting/exogenous-variables/holiday_and_special_dates",
"/forecasting/exogenous-variables/date_features",
"/forecasting/exogenous-variables/interpretability_with_shap"
]
},
{
"group": "Fine-tuning",
"pages": [
"/forecasting/fine-tuning/steps",
"/forecasting/fine-tuning/depth",
"/forecasting/fine-tuning/custom_loss",
"/forecasting/fine-tuning/save_reuse_delete_finetuned_models"
]
},
{
"group": "Probabilistic forecasting",
"pages": [
"/forecasting/probabilistic/introduction",
"/forecasting/probabilistic/prediction_intervals",
"/forecasting/probabilistic/quantiles"
]
},
{
"group": "Model versions",
"pages": [
"/forecasting/model-version/longhorizon_model"
]
},
{
"group": "Evaluation",
"pages": [
"/forecasting/evaluation/cross_validation",
"/forecasting/evaluation/evaluation_metrics",
"/forecasting/evaluation/evaluation_utilsforecast"
]
},
{
"group": "Special topics",
"pages": [
"/forecasting/special-topics/irregular_timestamps",
"/forecasting/special-topics/bounded_forecasts",
"/forecasting/special-topics/hierarchical_forecasting",
"/forecasting/special-topics/temporal_hierarchical"
]
},
"/forecasting/improve_accuracy",
{
"group": "Forecasting at scale",
"pages": [
"/forecasting/forecasting-at-scale/computing_at_scale",
"/forecasting/forecasting-at-scale/spark",
"/forecasting/forecasting-at-scale/dask",
"/forecasting/forecasting-at-scale/ray"
]
}
]
},
{
"group": "ANOMALY DETECTION",
"pages": [
"/anomaly_detection/historical_anomaly_detection",
"/anomaly_detection/exogenous_variables",
{
"group": "Real-time anomaly detection",
"pages": [
"/anomaly_detection/real-time/introduction",
"/anomaly_detection/real-time/adjusting_detection",
"/anomaly_detection/real-time/univariate_multivariate"
]
}
]
},
{
"group": "USE CASES",
"pages": [
"/use_cases/forecasting_web_traffic",
"/use_cases/bitcoin_price_prediction",
"/use_cases/forecasting_energy_demand",
"/use_cases/forecasting_intermittent_demand",
"/use_cases/what_if_forecasting_price_effects_in_retail"
]
},
{
"group": "REFERENCE",
"pages": [
"/reference/sdk_reference",
"/reference/date_features",
"/reference/timegpt_excel_add_in_beta_",
"/reference/timegpt_in_r"
],
"openapi": "./openapi.json"
},
{
"group": "About",
"pages": [
"/about/key-concepts",
"/about/sub-categoria",
"/about/terms-and-conditions",
"/about/privacy-notice"
]
}
],
"global": {
"anchors": [
{
"anchor": "Home",
"href": "https://www.nixtla.io",
"icon": "book-open-cover"
},
{
"anchor": "Get in touch",
"href": "https://share.hsforms.com/2kPRkvHcfRHO5m4Qqc9wKqArbxr6",
"icon": "envelope"
},
{
"anchor": "Meet with us",
"href": "https://meetings.hubspot.com/cristian-challu/enterprise-contact-us?uuid=dc037f5a-d93b-4%5B…%5D90b-a611dd9460af&utm_source=docs&utm_medium=docs",
"icon": "calendar"
}
]
}
},
"logo": {
"light": "/logo/light.svg",
"dark": "/logo/dark.svg"
},
"navbar": {
"links": [
{
"label": "Book a meeting",
"href": "https://meetings.hubspot.com/cristian-challu/enterprise-contact-us?uuid=dc037f5a-d93b-4%5B…%5D90b-a611dd9460af&utm_source=docs&utm_medium=docs"
}
],
"primary": {
"type": "button",
"label": "Get Started",
"href": "https://dashboard.nixtla.io"
}
},
"footer": {
"socials": {
"x": "https://x.com/nixtlainc",
"github": "https://github.com/nixtla",
"linkedin": "https://linkedin.com/company/nixtlainc"
}
},
"integrations": {
"amplitude": {
"apiKey": "799ca210229ab5f1a23493c6302eaae1"
},
"gtm": {
"tagId": "GTM-TBJ64S3X"
},
"posthog": {
"apiKey": "phc_hblNl72piphlbRCYfGM8QkgazW9NNrDrP6dMbGzMp82"
},
"intercom": {
"appId": "j7y9c2ep"
}
},
"redirects": [
{
"source": "/use-cases-forecasting_web_traffic",
"destination": "/use_cases/forecasting_web_traffic"
},
{
"source": "/use-cases-bitcoin_price_prediction",
"destination": "/use_cases/bitcoin_price_prediction"
},
{
"source": "/use-cases-forecasting_energy_demand",
"destination": "/use_cases/forecasting_energy_demand"
},
{
"source": "/use-cases-forecasting_intermittent_demand",
"destination": "/use_cases/forecasting_intermittent_demand"
},
{
"source": "/use-cases-what_if_forecasting_price_effects_in_retail",
"destination": "/use_cases/what_if_forecasting_price_effects_in_retail"
},
{
"source": "/getting-started-timegpt_quickstart",
"destination": "/forecasting/timegpt_quickstart"
},
{
"source": "/tutorials-special_topics-tutorials-improve_forecast_accuracy_with_timegpt",
"destination": "/forecasting/improve_accuracy"
},
{
"source": "/capabilities-anomaly-detection-anomaly_detection",
"destination": "/anomaly_detection/historical_anomaly_detection"
},
{
"source": "/getting-started-timegen_1_quickstart_azure_",
"destination": "/setup/azureai"
},
{
"source": "/capabilities-forecast-cross_validation",
"destination": "/forecasting/evaluation/cross_validation"
},
{
"source": "/capabilities-forecast-multiple_series_forecasting",
"destination": "/data_requirements/multiple_series"
},
{
"source": "/tutorials-computing_at_scale",
"destination": "/forecasting/forecasting-at-scale/computing_at_scale"
},
{
"source": "/tutorials-special_topics-tutorials-hierarchical_forecasting",
"destination": "/forecasting/special-topics/temporal_hierarchical"
},
{
"source": "/tutorials-exogenous_variables-tutorials-shap_values_for_timegpt_and_timegen",
"destination": "/forecasting/exogenous-variables/interpretability_with_shap"
},
{
"source": "/tutorials-uncertainty_quantification-tutorials-prediction_intervals",
"destination": "/forecasting/probabilistic/introduction#uncertainty-quantification-with-timegpt"
}
],
"contextual": {
"options": [
"copy",
"view",
"chatgpt",
"claude"
]
}
}