- 09 Mar, 2023 1 commit
  - OlivierDehaene authored
    closes #112
- 07 Mar, 2023 1 commit
  - OlivierDehaene authored
- 06 Mar, 2023 1 commit
  - OlivierDehaene authored
    closes #99
- 03 Mar, 2023 1 commit
  - OlivierDehaene authored
- 02 Mar, 2023 2 commits
  - OlivierDehaene authored
  - OlivierDehaene authored
- 28 Feb, 2023 1 commit
  - OlivierDehaene authored
- 27 Feb, 2023 1 commit
  - OlivierDehaene authored
- 24 Feb, 2023 2 commits
  - OlivierDehaene authored
  - OlivierDehaene authored
- 17 Feb, 2023 1 commit
  - OlivierDehaene authored
- 16 Feb, 2023 2 commits
  - OlivierDehaene authored
  - OlivierDehaene authored
- 15 Feb, 2023 1 commit
  - OlivierDehaene authored
    closes #65
- 13 Feb, 2023 1 commit
  - OlivierDehaene authored
- 08 Feb, 2023 1 commit
- 07 Feb, 2023 1 commit
  - OlivierDehaene authored
- 03 Feb, 2023 1 commit
  - OlivierDehaene authored
- 02 Feb, 2023 2 commits
  - OlivierDehaene authored
    @njhill, @yk FYI: generated_text was concatenated to the user prompt for legacy reasons. We want to remove this behaviour, as we don't think it is useful and it is even detrimental to usability. We also remove the unused Vec.
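    A minimal sketch of the behaviour change, in Rust; the function and variable names are illustrative assumptions, not the project's actual API:
    ```rust
    // Sketch only: contrasts the legacy response text with the new one.
    fn response_text(prompt: &str, generated: &str, legacy_concat: bool) -> String {
        if legacy_concat {
            // Legacy behaviour: the prompt was echoed back in generated_text.
            format!("{prompt}{generated}")
        } else {
            // New behaviour: generated_text contains only the newly generated text.
            generated.to_string()
        }
    }

    fn main() {
        let prompt = "Once upon a time";
        let generated = ", a llama appeared.";
        assert_eq!(response_text(prompt, generated, true), "Once upon a time, a llama appeared.");
        assert_eq!(response_text(prompt, generated, false), generated);
    }
    ```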
  - OlivierDehaene authored
    Co-authored-by: Nick Hill <nickhill@us.ibm.com>
- 01 Feb, 2023 1 commit
  - OlivierDehaene authored
- 31 Jan, 2023 4 commits
  - OlivierDehaene authored
  - OlivierDehaene authored
  - OlivierDehaene authored
    Reverts huggingface/text-generation-inference#36
  - OlivierDehaene authored
    Add token streaming using Server-Sent Events (SSE). The signature of the SSE events is:
    ```rust
    struct Details {
        finish_reason: String,
        generated_tokens: u32,
        seed: Option<u64>,
    }

    struct StreamResponse {
        token: Token,
        generated_text: Option<String>,
        details: Option<Details>,
    }

    struct ErrorResponse {
        error: String,
    }
    ```
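    As a rough illustration of what one streamed event could look like on the wire, here is a sketch that serializes a `StreamResponse` with serde; the fields of `Token` are an assumption, since the commit message does not define them:
    ```rust
    // Sketch only: prints the JSON payload of one SSE `data:` line.
    use serde::Serialize;

    #[derive(Serialize)]
    struct Token {
        // Assumed fields; not taken from the commit message.
        id: u32,
        text: String,
        logprob: f32,
    }

    #[derive(Serialize)]
    struct Details {
        finish_reason: String,
        generated_tokens: u32,
        seed: Option<u64>,
    }

    #[derive(Serialize)]
    struct StreamResponse {
        token: Token,
        generated_text: Option<String>,
        details: Option<Details>,
    }

    fn main() -> serde_json::Result<()> {
        // Intermediate events carry only the token; a final event would also
        // set `generated_text` and `details`.
        let event = StreamResponse {
            token: Token { id: 42, text: " world".into(), logprob: -0.5 },
            generated_text: None,
            details: None,
        };
        println!("data: {}", serde_json::to_string(&event)?);
        Ok(())
    }
    ```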
- 30 Jan, 2023 1 commit
  - OlivierDehaene authored
    Co-authored-by: Yannic Kilcher <yk@users.noreply.github.com>
- 26 Jan, 2023 1 commit
  - OlivierDehaene authored
    @njhill
- 23 Jan, 2023 2 commits
  - OlivierDehaene authored
  - OlivierDehaene authored
- 20 Jan, 2023 1 commit
  - OlivierDehaene authored
- 17 Jan, 2023 2 commits
- 03 Jan, 2023 1 commit
  - Nick Hill authored
    I noticed some opportunities to collapse some of the logic, in case you are interested.
- 30 Dec, 2022 1 commit
  - Nick Hill authored
    There is currently a discrepancy in tokenization between the router and the Python server code: the latter includes special tokens but the former does not. This results in a token-count mismatch for seq2seq models such as mt0, where the tokenizer emits an EOS token at the end. That in turn produces unexpected or incorrect output, in particular when batch concatenation is involved, because the Python code uses the input length passed from the router for each row. As far as I can tell, it is better to include this token in the encoder `input_ids`, so I guess it's best to just adjust on the router side.
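    A minimal sketch of the mismatch using the `tokenizers` crate (the `from_pretrained` helper assumes the crate's `http` feature; the model name is just an example):
    ```rust
    // Sketch only: the second argument to `encode` controls whether special
    // tokens (e.g. a trailing EOS for seq2seq models) are added.
    use tokenizers::Tokenizer;

    fn main() -> tokenizers::Result<()> {
        let tokenizer = Tokenizer::from_pretrained("bigscience/mt0-base", None)?;

        let router_side = tokenizer.encode("Hello world", false)?; // no special tokens
        let server_side = tokenizer.encode("Hello world", true)?;  // with special tokens

        // The lengths differ by the special tokens (here the EOS), which is
        // exactly the input-length mismatch described above.
        println!(
            "router: {} tokens, server: {} tokens",
            router_side.get_ids().len(),
            server_side.get_ids().len()
        );
        Ok(())
    }
    ```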
- 15 Dec, 2022 1 commit
  - OlivierDehaene authored
- 12 Dec, 2022 1 commit
  - OlivierDehaene authored
- 08 Dec, 2022 1 commit
  - OlivierDehaene authored
- 05 Dec, 2022 1 commit
  - Nick Hill authored
    - Avoid theoretical hang in batcher loop
    - Avoid a couple of clones in the router generate method
    - Keep attention mask tensors as integers
    - Remove num_heads attribute

    Co-authored-by: OlivierDehaene <Olivier.dehaene@gmail.com>
- 14 Nov, 2022 2 commits
  - OlivierDehaene authored
  - OlivierDehaene authored