- 10 Jul, 2023 1 commit
OlivierDehaene authored
Close #571
- 01 Jul, 2023 1 commit
OlivierDehaene authored
- 30 Jun, 2023 1 commit
OlivierDehaene authored
Closes #478
- 28 Jun, 2023 1 commit
Robert Kimball authored
This PR adds an HTTP header option to disable buffering for the `generate_stream` endpoint response stream.

Problem: if a model is run behind a proxy server such as nginx with buffering enabled, the response stream from `generate_stream` gets aggregated into a single response, which effectively disables streaming. Instead of a chunked response where each token arrives over time, everything is presented all at once.

Solution: this change adds the `X-Accel-Buffering` HTTP header, which disables buffering for the `generate_stream` response and allows it to stream properly.
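For context, a minimal sketch of how such a header can be attached to a streaming response in an axum-based router (the helper name `disable_proxy_buffering` is hypothetical; this is not necessarily how TGI wires it up):

```rust
use axum::{
    http::{header::HeaderName, HeaderValue},
    response::Response,
};

// Hypothetical helper: mark a streaming response so that X-Accel-aware
// proxies such as nginx pass chunks through instead of buffering them.
fn disable_proxy_buffering(mut response: Response) -> Response {
    response.headers_mut().insert(
        HeaderName::from_static("x-accel-buffering"),
        HeaderValue::from_static("no"),
    );
    response
}
```

nginx honours `X-Accel-Buffering: no` on a per-response basis, so this achieves the same effect as `proxy_buffering off;` in the nginx config, without requiring changes on the proxy side.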
- 16 Jun, 2023 1 commit
OlivierDehaene authored
- 02 Jun, 2023 1 commit
OlivierDehaene authored
Close #288
- 23 May, 2023 1 commit
OlivierDehaene authored
@njhill FYI
- 09 May, 2023 2 commits
OlivierDehaene authored
Sai Vinay G authored
- 02 May, 2023 1 commit
Nicolas Patry authored
- 26 Apr, 2023 2 commits
Nicolas Patry authored
Co-authored-by: OlivierDehaene <23298448+OlivierDehaene@users.noreply.github.com>
Co-authored-by: OlivierDehaene <olivier@huggingface.co>
Nicolas Patry authored
- 25 Apr, 2023 1 commit
OlivierDehaene authored
- 24 Apr, 2023 1 commit
OlivierDehaene authored
Co-authored-by: Nick Hill <nickhill@us.ibm.com>
- 21 Apr, 2023 1 commit
OlivierDehaene authored
- 20 Apr, 2023 1 commit
OlivierDehaene authored
- 18 Apr, 2023 1 commit
OlivierDehaene authored
close #125
- 09 Apr, 2023 2 commits
OlivierDehaene authored
OlivierDehaene authored
- 29 Mar, 2023 1 commit
OlivierDehaene authored
The only difference is that it now pushes to registry.internal.huggingface.tech/api-inference/community/text-generation-inference/sagemaker:... instead of registry.internal.huggingface.tech/api-inference/community/text-generation-inference:sagemaker-...

Co-authored-by: Philipp Schmid <32632186+philschmid@users.noreply.github.com>
- 09 Mar, 2023 3 commits
OlivierDehaene authored
OlivierDehaene authored
closes #111
OlivierDehaene authored
closes #112
- 07 Mar, 2023 1 commit
OlivierDehaene authored
- 02 Mar, 2023 2 commits
OlivierDehaene authored
OlivierDehaene authored
- 28 Feb, 2023 1 commit
OlivierDehaene authored
- 27 Feb, 2023 1 commit
OlivierDehaene authored
- 24 Feb, 2023 1 commit
OlivierDehaene authored
- 17 Feb, 2023 1 commit
OlivierDehaene authored
- 16 Feb, 2023 1 commit
OlivierDehaene authored
- 15 Feb, 2023 1 commit
OlivierDehaene authored
closes #65
- 13 Feb, 2023 1 commit
OlivierDehaene authored
- 08 Feb, 2023 1 commit
- 03 Feb, 2023 1 commit
OlivierDehaene authored
- 02 Feb, 2023 1 commit
OlivierDehaene authored
@njhill, @yk FYI: generated_text was concatenated to the user prompt for legacy reasons. We want to remove this behaviour, as we don't think it is useful and it is even detrimental to usability. We also remove the unused Vec.
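A before/after sketch of the behaviour change (prompt and completion values are invented; only the shape of `generated_text` matters):

```rust
fn main() {
    let prompt = "What is Rust?";
    let completion = " Rust is a systems programming language.";

    // Old behaviour: generated_text echoed the prompt followed by
    // the completion.
    let old_generated_text = format!("{prompt}{completion}");

    // New behaviour: generated_text contains only the newly
    // generated tokens.
    let new_generated_text = completion.to_string();

    assert_eq!(
        old_generated_text,
        "What is Rust? Rust is a systems programming language."
    );
    assert_eq!(new_generated_text, " Rust is a systems programming language.");
}
```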
- 01 Feb, 2023 1 commit
OlivierDehaene authored
- 31 Jan, 2023 3 commits
OlivierDehaene authored
OlivierDehaene authored
Reverts huggingface/text-generation-inference#36
OlivierDehaene authored
Add token streaming using Server-Sent Events (SSE). The signature of the SSE events is:

```rust
struct Details {
    finish_reason: String,
    generated_tokens: u32,
    seed: Option<u64>,
}

struct StreamResponse {
    token: Token,
    generated_text: Option<String>,
    details: Option<Details>,
}

struct ErrorResponse {
    error: String,
}
```
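As a sketch of how a client might decode a single event from this stream (the `Token` fields and the serde derives here are assumptions for illustration, not part of the commit above; consult the server's types for the authoritative definitions):

```rust
use serde::Deserialize;

#[derive(Deserialize, Debug)]
struct Details {
    finish_reason: String,
    generated_tokens: u32,
    seed: Option<u64>,
}

// Assumed shape of `Token` for this example.
#[derive(Deserialize, Debug)]
struct Token {
    id: u32,
    text: String,
    logprob: f32,
    special: bool,
}

#[derive(Deserialize, Debug)]
struct StreamResponse {
    token: Token,
    generated_text: Option<String>,
    details: Option<Details>,
}

fn main() -> serde_json::Result<()> {
    // One `data:` payload as a client would receive it over SSE.
    let payload = r#"{"token":{"id":42,"text":" world","logprob":-0.5,"special":false},"generated_text":null,"details":null}"#;
    let event: StreamResponse = serde_json::from_str(payload)?;
    println!("received token: {:?}", event.token.text);
    Ok(())
}
```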