custom_integrations.qmd 3.16 KB
Newer Older
chenzk's avatar
v1.0  
chenzk committed
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
---
title: Custom Integrations
toc: true
toc-depth: 3
---

```{python}
#| echo: false

import re

def process_readme(integration_name):
    try:
        path = f'../src/axolotl/integrations/{integration_name}/README.md'
        with open(path, 'r') as f:
            txt = f.read()
            # Remove h1 headings
            txt = re.sub(r'^# .*\n?', '', txt, flags=re.MULTILINE)
            # Convert h2 to h3
            txt = re.sub(r'^## ', '### ', txt, flags=re.MULTILINE)
            return txt
    except FileNotFoundError:
        return None

def print_section(name, folder_name):
    output = f"\n## {name}\n"
    content = process_readme(folder_name)
    if content:
        output += content
    output += f"\nPlease see reference [here](https://github.com/axolotl-ai-cloud/axolotl/tree/main/src/axolotl/integrations/{folder_name})\n"
    return output
```

```{python}
#| output: asis
#| echo: false

# Introduction text
print("""
Axolotl adds custom features through `integrations`. They are located within the `src/axolotl/integrations` directory.

To enable them, please check the respective documentations.
""")

# Sections
sections = [
    ("Cut Cross Entropy", "cut_cross_entropy"),
    ("Grokfast", "grokfast"),
    ("Knowledge Distillation (KD)", "kd"),
    ("Liger Kernels", "liger"),
    ("Language Model Evaluation Harness (LM Eval)", "lm_eval"),
    ("Spectrum", "spectrum"),
    ("LLMCompressor", "llm_compressor")
]

for section_name, folder_name in sections:
    print(print_section(section_name, folder_name))
```

## Adding a new integration

Plugins can be used to customize the behavior of the training pipeline through [hooks](https://en.wikipedia.org/wiki/Hooking). See [`axolotl.integrations.BasePlugin`](https://github.com/axolotl-ai-cloud/axolotl/blob/main/src/axolotl/integrations/base.py) for the possible hooks.

To add a new integration, please follow these steps:

1. Create a new folder in the `src/axolotl/integrations` directory.
2. Add any relevant files (`LICENSE`, `README.md`, `ACKNOWLEDGEMENTS.md`, etc.) to the new folder.
3. Add `__init__.py` and `args.py` files to the new folder.
  - `__init__.py` should import the integration and hook into the appropriate functions.
  - `args.py` should define the arguments for the integration.
4. (If applicable) Add CPU tests under `tests/integrations` or GPU tests under `tests/e2e/integrations`.

::: {.callout-tip}

See [src/axolotl/integrations/cut_cross_entropy](https://github.com/axolotl-ai-cloud/axolotl/tree/main/src/axolotl/integrations/cut_cross_entropy) for a minimal integration example.

:::

::: {.callout-warning}

If you could not load your integration, please ensure you are pip installing in editable mode.

```bash
pip install -e .
```

and correctly spelled the integration name in the config file.

```yaml
plugins:
  - axolotl.integrations.your_integration_name.YourIntegrationPlugin
```

:::

::: {.callout-note}

It is not necessary to place your integration in the `integrations` folder. It can be in any location, so long as it's installed in a package in your python env.

See this repo for an example: [https://github.com/axolotl-ai-cloud/diff-transformer](https://github.com/axolotl-ai-cloud/diff-transformer)

:::