# Serving Private & Gated Models

If the model you wish to serve is gated or lives in a private repository on the Hugging Face Hub, and your account has access to it, you can provide your Hugging Face Hub access token. You can generate and copy a read token from the [Hugging Face Hub tokens page](https://huggingface.co/settings/tokens).

If you're using the CLI, set the `HF_TOKEN` environment variable. For example:

```bash
export HF_TOKEN=<YOUR READ TOKEN>
```
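
A missing or empty token may only surface later as a failed model download, so it can help to check the variable before launching. The snippet below is a minimal sketch of such a check; `check_hf_token` is a hypothetical helper written for this example and is not part of text-generation-inference. It only verifies that `HF_TOKEN` is set and non-empty in the current shell, not that the token is valid or has access to the model.

```shell
# Hypothetical helper (not shipped with text-generation-inference):
# succeed only if HF_TOKEN is set to a non-empty value in this shell.
check_hf_token() {
    if [ -z "${HF_TOKEN:-}" ]; then
        echo "HF_TOKEN is not set" >&2
        return 1
    fi
    echo "HF_TOKEN is set"
}
```

For example, running `check_hf_token || exit 1` at the top of a launch script stops the script early when the token has not been set.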

If you are serving the model through Docker, pass the token to the container as the `HF_TOKEN` environment variable, as shown below.

```bash
model=meta-llama/Llama-2-7b-chat-hf
volume=$PWD/data
token=<your READ token>

docker run --gpus all \
    --shm-size 1g \
    -e HF_TOKEN=$token \
    -p 8080:80 \
    -v $volume:/data ghcr.io/huggingface/text-generation-inference:2.0.4 \
    --model-id $model
```