- 25 Apr, 2023 1 commit
Nicolas Patry authored

- 24 Apr, 2023 1 commit
OlivierDehaene authored
Co-authored-by: Nick Hill <nickhill@us.ibm.com>

- 17 Apr, 2023 1 commit
OlivierDehaene authored
closes #189

- 09 Apr, 2023 1 commit
OlivierDehaene authored

- 30 Mar, 2023 1 commit
OlivierDehaene authored

- 16 Mar, 2023 1 commit
OlivierDehaene authored

- 09 Mar, 2023 3 commits
OlivierDehaene authored

OlivierDehaene authored
closes #111

OlivierDehaene authored
closes #112

- 07 Mar, 2023 1 commit
OlivierDehaene authored

- 02 Mar, 2023 1 commit
OlivierDehaene authored

- 16 Feb, 2023 1 commit
OlivierDehaene authored

- 15 Feb, 2023 1 commit
OlivierDehaene authored
closes #65

- 13 Feb, 2023 1 commit
OlivierDehaene authored

- 03 Feb, 2023 1 commit
OlivierDehaene authored

- 02 Feb, 2023 1 commit
OlivierDehaene authored
Co-authored-by: Nick Hill <nickhill@us.ibm.com>

- 01 Feb, 2023 1 commit
OlivierDehaene authored

- 31 Jan, 2023 4 commits
OlivierDehaene authored

OlivierDehaene authored

OlivierDehaene authored
Reverts huggingface/text-generation-inference#36

OlivierDehaene authored
Add token streaming using Server-Sent Events (SSE). The signature of the SSE events is:
```rust
struct Details {
    finish_reason: String,
    generated_tokens: u32,
    seed: Option<u64>,
}

struct StreamResponse {
    token: Token,
    generated_text: Option<String>,
    details: Option<Details>,
}

struct ErrorResponse {
    error: String,
}
```
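The structs above define the payloads; the sketch below shows roughly how one streamed token could be framed as an SSE `data:` event. The `Token` fields (`id`, `text`) and the use of serde/serde_json are assumptions for illustration, not taken from the commit; `Details` and `StreamResponse` are repeated from the commit message so the example is self-contained.

```rust
use serde::Serialize;

// Hypothetical token payload; the real `Token` fields are not shown in the commit message.
#[derive(Serialize)]
struct Token {
    id: u32,
    text: String,
}

#[derive(Serialize)]
struct Details {
    finish_reason: String,
    generated_tokens: u32,
    seed: Option<u64>,
}

#[derive(Serialize)]
struct StreamResponse {
    token: Token,
    generated_text: Option<String>,
    details: Option<Details>,
}

// An SSE frame is a `data: <payload>` line followed by a blank line.
fn sse_frame(event: &StreamResponse) -> String {
    format!("data: {}\n\n", serde_json::to_string(event).unwrap())
}

fn main() {
    let event = StreamResponse {
        token: Token { id: 42, text: "hello".into() },
        generated_text: None,
        details: None,
    };
    print!("{}", sse_frame(&event));
}
```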
- 20 Jan, 2023 1 commit
OlivierDehaene authored

- 03 Jan, 2023 1 commit
Nick Hill authored
I noticed some opportunity to collapse some of the logic, in case you are interested.

- 30 Dec, 2022 1 commit
Nick Hill authored
There's currently a discrepancy in the tokenization between the router and python server code. The latter includes special tokens but the former does not. This results in a token count mismatch for seq2seq models such as mt0, where the tokenizer emits an EOS token at the end. This in turn results in some unexpected/incorrect output, in particular when batch concatenation is involved, because the python code uses the input length passed from the router for each row. As far as I can tell, it is better to include this token in the encoder `input_ids`, so it seems best to just adjust on the router side.
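A minimal sketch of the mismatch being described, assuming the Rust `tokenizers` crate; the `tokenizer.json` path and the input string are placeholders:

```rust
use tokenizers::Tokenizer;

fn main() -> Result<(), Box<dyn std::error::Error + Send + Sync>> {
    // Load a tokenizer for a seq2seq model such as mt0 (path is illustrative).
    let tokenizer = Tokenizer::from_file("tokenizer.json")?;

    // Router-style count: no special tokens added.
    let without_special = tokenizer.encode("Hello world", false)?;
    // Server-style count: special tokens added, so mt0-like tokenizers append an EOS.
    let with_special = tokenizer.encode("Hello world", true)?;

    // For such models the second count is one higher, which is the
    // router/server discrepancy described above.
    println!("without special tokens: {}", without_special.get_ids().len());
    println!("with special tokens:    {}", with_special.get_ids().len());
    Ok(())
}
```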
- 12 Dec, 2022 1 commit
OlivierDehaene authored

- 05 Dec, 2022 1 commit
Nick Hill authored
  - Avoid theoretical hang in batcher loop
  - Avoid a couple of clones in the router generate method
  - Keep attention mask tensors as integers
  - Remove num_heads attribute
Co-authored-by: OlivierDehaene <Olivier.dehaene@gmail.com>

- 14 Nov, 2022 2 commits
OlivierDehaene authored

OlivierDehaene authored

- 27 Oct, 2022 1 commit
OlivierDehaene authored

- 21 Oct, 2022 2 commits
OlivierDehaene authored

OlivierDehaene authored

- 20 Oct, 2022 1 commit
Olivier Dehaene authored

- 17 Oct, 2022 2 commits
Olivier Dehaene authored

Olivier Dehaene authored

- 11 Oct, 2022 1 commit
Olivier Dehaene authored
Added validation logic