- 10 Jul, 2023 1 commit
OlivierDehaene authored
Close #571
- 01 Jul, 2023 1 commit
OlivierDehaene authored
- 30 Jun, 2023 1 commit
OlivierDehaene authored
Closes #478
- 28 Jun, 2023 1 commit
Robert Kimball authored
This PR adds an HTTP header option to disable buffering for the `generate_stream` endpoint response stream.

Problem: if a model is run behind a proxy server such as nginx with buffering enabled, the response stream from `generate_stream` gets aggregated into a single response, which effectively disables streaming. Instead of a chunked response where each token arrives over time, everything is presented all at once.

Solution: this change adds the `X-Accel-Buffering` HTTP header, which disables buffering for the `generate_stream` response and allows it to stream properly.
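For context, a minimal sketch of how such a header can be attached to a streaming response in an axum-based router (the helper name `disable_proxy_buffering` is hypothetical; this is not necessarily how TGI wires it up):

```rust
use axum::{
    http::{header::HeaderName, HeaderValue},
    response::Response,
};

// Hypothetical helper: mark a streaming response so that X-Accel-aware
// proxies such as nginx pass chunks through instead of buffering them.
fn disable_proxy_buffering(mut response: Response) -> Response {
    response.headers_mut().insert(
        HeaderName::from_static("x-accel-buffering"),
        HeaderValue::from_static("no"),
    );
    response
}
```

nginx honours `X-Accel-Buffering: no` on a per-response basis, so this achieves the same effect as `proxy_buffering off;` in the nginx config, without requiring changes on the proxy side.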
- 16 Jun, 2023 1 commit
OlivierDehaene authored
- 02 Jun, 2023 1 commit
OlivierDehaene authored
Close #288
- 23 May, 2023 1 commit
OlivierDehaene authored
@njhill FYI
- 09 May, 2023 2 commits
OlivierDehaene authored
Sai Vinay G authored
- 02 May, 2023 1 commit
Nicolas Patry authored
- 26 Apr, 2023 2 commits
Nicolas Patry authored
Co-authored-by: OlivierDehaene <23298448+OlivierDehaene@users.noreply.github.com>
Co-authored-by: OlivierDehaene <olivier@huggingface.co>
Nicolas Patry authored
- 25 Apr, 2023 1 commit
OlivierDehaene authored
- 24 Apr, 2023 1 commit
OlivierDehaene authored
Co-authored-by: Nick Hill <nickhill@us.ibm.com>
- 21 Apr, 2023 1 commit
OlivierDehaene authored
- 20 Apr, 2023 1 commit
OlivierDehaene authored
- 18 Apr, 2023 1 commit
OlivierDehaene authored
close #125
- 09 Apr, 2023 2 commits
OlivierDehaene authored
OlivierDehaene authored
- 29 Mar, 2023 1 commit
OlivierDehaene authored
The only difference is that it now pushes to registry.internal.huggingface.tech/api-inference/community/text-generation-inference/sagemaker:... instead of registry.internal.huggingface.tech/api-inference/community/text-generation-inference:sagemaker-...

Co-authored-by: Philipp Schmid <32632186+philschmid@users.noreply.github.com>
- 09 Mar, 2023 3 commits
OlivierDehaene authored
OlivierDehaene authored
closes #111
OlivierDehaene authored
closes #112
- 07 Mar, 2023 1 commit
OlivierDehaene authored
- 02 Mar, 2023 2 commits
OlivierDehaene authored
OlivierDehaene authored
- 28 Feb, 2023 1 commit
OlivierDehaene authored
- 27 Feb, 2023 1 commit
OlivierDehaene authored
- 24 Feb, 2023 1 commit
OlivierDehaene authored
- 17 Feb, 2023 1 commit
OlivierDehaene authored
- 16 Feb, 2023 1 commit
OlivierDehaene authored
- 15 Feb, 2023 1 commit
OlivierDehaene authored
closes #65
- 13 Feb, 2023 1 commit
OlivierDehaene authored
- 08 Feb, 2023 1 commit
- 03 Feb, 2023 1 commit
OlivierDehaene authored
- 02 Feb, 2023 1 commit
OlivierDehaene authored
@njhill, @yk FYI: generated_text was concatenated to the user prompt for legacy reasons. We want to remove this behaviour, as we don't think it is useful and it is even detrimental to usability. We also remove the unused Vec.
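A before/after sketch of the behaviour change (prompt and completion values are invented; only the shape of `generated_text` matters):

```rust
fn main() {
    let prompt = "What is Rust?";
    let completion = " Rust is a systems programming language.";

    // Old behaviour: generated_text echoed the prompt followed by
    // the completion.
    let old_generated_text = format!("{prompt}{completion}");

    // New behaviour: generated_text contains only the newly
    // generated tokens.
    let new_generated_text = completion.to_string();

    assert_eq!(
        old_generated_text,
        "What is Rust? Rust is a systems programming language."
    );
    assert_eq!(new_generated_text, " Rust is a systems programming language.");
}
```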
- 01 Feb, 2023 1 commit
OlivierDehaene authored
- 31 Jan, 2023 3 commits
OlivierDehaene authored
OlivierDehaene authored
Reverts huggingface/text-generation-inference#36
OlivierDehaene authored
Add token streaming using Server-Sent Events (SSE). The signature of the SSE events is:

```rust
struct Details {
    finish_reason: String,
    generated_tokens: u32,
    seed: Option<u64>,
}

struct StreamResponse {
    token: Token,
    generated_text: Option<String>,
    details: Option<Details>,
}

struct ErrorResponse {
    error: String,
}
```
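As a sketch of how a client might decode a single event from this stream (the `Token` fields and the serde derives here are assumptions for illustration, not part of the commit above; consult the server's types for the authoritative definitions):

```rust
use serde::Deserialize;

#[derive(Deserialize, Debug)]
struct Details {
    finish_reason: String,
    generated_tokens: u32,
    seed: Option<u64>,
}

// Assumed shape of `Token` for this example.
#[derive(Deserialize, Debug)]
struct Token {
    id: u32,
    text: String,
    logprob: f32,
    special: bool,
}

#[derive(Deserialize, Debug)]
struct StreamResponse {
    token: Token,
    generated_text: Option<String>,
    details: Option<Details>,
}

fn main() -> serde_json::Result<()> {
    // One `data:` payload as a client would receive it over SSE.
    let payload = r#"{"token":{"id":42,"text":" world","logprob":-0.5,"special":false},"generated_text":null,"details":null}"#;
    let event: StreamResponse = serde_json::from_str(payload)?;
    println!("received token: {:?}", event.token.text);
    Ok(())
}
```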