- 09 Dec, 2024 2 commits
-
-
Nicolas Patry authored
* New version. * Link fixup. * Update docs. * Fixup.
-
Nicolas Patry authored
* V3 document. * Updating asset.
-
- 22 Nov, 2024 1 commit
-
-
OlivierDehaene authored
* chore: prepare 2.4.1 release * fix tests * fmt
-
- 25 Oct, 2024 1 commit
-
-
OlivierDehaene authored
-
- 10 Oct, 2024 1 commit
-
-
vb authored
Update to most recent stable version of TGI.
-
- 02 Oct, 2024 1 commit
-
-
drbh authored
allow revision for lora adapters from launcher Co-authored-by:
Sida <sida@kulamind.com> Co-authored-by:
teamclouday <teamclouday@gmail.com>
-
- 24 Sep, 2024 1 commit
-
-
Nicholas Broad authored
specify how to call local adapters
-
- 06 Sep, 2024 1 commit
-
-
Martin Iglesias Goyanes authored
* Add links to Adyen blogpost * Adding to toctree. * Update external.md * Update _toctree.yml --------- Co-authored-by:Nicolas Patry <patry.nicolas@protonmail.com>
-
- 05 Sep, 2024 1 commit
-
-
Nicolas Patry authored
-
- 16 Aug, 2024 1 commit
-
-
Vaibhav Srivastav authored
* Improve the Consuming TGI docs. * Fix erroneous update to . * add info about Open AI client. * More updates. * Apply suggestions from code review Co-authored-by:
Erik Kaunismäki <erik.kaum@gmail.com> * Suggestions from Lucain. * Update Gradio snippet. * Up. * Apply suggestions from code review Co-authored-by:
Lucain <lucainp@gmail.com> * Update docs/source/basic_tutorials/consuming_tgi.md Co-authored-by:
Lucain <lucainp@gmail.com> * Up. * Apply suggestions from code review Co-authored-by:
Omar Sanseviero <osanseviero@gmail.com> * Up. * Up. * Doc review from Nico. * Doc review from Nico. x2 * Last nit --------- Co-authored-by:
Erik Kaunismäki <erik.kaum@gmail.com> Co-authored-by:
Lucain <lucainp@gmail.com> Co-authored-by:
Omar Sanseviero <osanseviero@gmail.com>
-
- 09 Aug, 2024 2 commits
-
-
Nicolas Patry authored
* Using an enum for flash backends (paged/flashdecoding/flashinfer) * Early exit on server too. * Clippy. * Fix clippy and fmt.
-
Vaibhav Srivastav authored
* Minor doc fixes * up. * Other minor updates.
-
- 08 Aug, 2024 1 commit
-
-
Vaibhav Srivastav authored
* Update Quantization docs and minor doc fix. * update readme with latest quants info * Apply suggestions from code review Co-authored-by:
Pedro Cuenca <pedro@huggingface.co> * up --------- Co-authored-by:
Pedro Cuenca <pedro@huggingface.co>
-
- 05 Aug, 2024 1 commit
-
-
drbh authored
-
- 25 Jun, 2024 1 commit
-
-
drbh authored
* feat: first draft load multiple lora * feat: load weights within layer and refactor lora pass * fix: refactor and reduce lora math * feat: baseline impl single request multi lora support * feat: prefer lorax implementation and port loading logic * fix: prefer adapter_data and refactors * feat: prefer lorax's custom punica kernels and add mlp loras * fix: adjust batch for bgmv * fix: adjust adapter_segments logic when in batch * fix: refactor and move changes to v3 proto * fix: pass model_id for all flash causal lms * fix: pass model_id for all causal and seq2seq lms * fix: add model_id to model test * feat: add lora support to mistral and refactors * feat: prefer model id in request * fix: include rust code for adapter id * feat: bump launcher and add new lora docs * feat: support base model generation and refactors * fix: rename doc to retry ci build * feat: support if vlm models * fix: add adapter_data param and avoid missing layers * fix: add adapter_data param to phi and neox * fix: update all models forwards to include adapter_data * fix: add model_id to IdeficsCausalLM * Update lora.md Fixed a typo * Update lora.md Fixing spam image * fix: add lora kernel to dockerfile, support running without kernels and refactors * fix: avoid dockerfile conflict * fix: refactors and adjust flash llama lora logic * fix: skip llama test due to CI issue (temp) * fix: skip llama test CI (temp) 2 * fix: revert skips and prefer updated ci token for tests * fix: refactors and helpful comments * fix: add noop in TensorParallelAdapterRowLinear too * fix: refactor and move shard_lora_weights logic * fix: exit early if no adapter_data --------- Co-authored-by:Derek <datavistics@gmail.com>
-
- 30 May, 2024 1 commit
-
-
Daniël de Kok authored
Mostly straightforward changes to existing code: * Wrap quantizer parameters in a small wrapper to avoid passing around untyped tuples and needing to repack them as a dict. * Move scratch space computation to warmup, because we need the maximum input sequence length to avoid allocating huge scratch buffers that OOM.
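A minimal sketch of the two ideas in this message, not the code from the commit itself. The class name `QuantizerParams`, its fields, and the sizing rule in `scratch_elements` are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QuantizerParams:
    """Hypothetical typed wrapper replacing an untyped (bits, groupsize, desc_act) tuple."""

    bits: int
    groupsize: int
    desc_act: bool = False


def describe(params: QuantizerParams) -> str:
    # Fields are read by name, so callers never unpack a tuple or rebuild a dict.
    return f"{params.bits}-bit, groupsize={params.groupsize}, desc_act={params.desc_act}"


def scratch_elements(max_input_length: int, hidden_size: int) -> int:
    # Hypothetical sizing rule: the scratch buffer scales with the longest input,
    # which is only known at warmup, so it is computed there instead of at load time.
    return max_input_length * hidden_size


if __name__ == "__main__":
    params = QuantizerParams(bits=4, groupsize=128)
    print(describe(params))
    print(scratch_elements(max_input_length=1024, hidden_size=4096))
```

Sizing the buffer from the real maximum input length, rather than from a worst-case constant, is what avoids the oversized allocations the message mentions.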
-
- 28 May, 2024 1 commit
-
-
Nicolas Patry authored
- Axum upgraded to hyper 1.0 and most of the ecosystem switched, so it's our time now. - [ngrok-rust](https://github.com/ngrok/ngrok-rust/pull/137/files) hasn't yet, and hasn't for several months now, so let's disable the feature for the time being.
-
- 27 May, 2024 1 commit
-
-
Moritz Laurer authored
Fix a typo; fix a broken link; add one sentence in the guidance docs to make the word "grammar" less abstract. (Review requested from @drbh.)
-
- 23 May, 2024 1 commit
-
-
drbh authored
This PR adds a tutorial to self distill and train medusa heads for a specific model --------- Co-authored-by:Nicolas Patry <patry.nicolas@protonmail.com>
-
- 14 May, 2024 1 commit
-
-
Brandon Lockaby authored
Fix typo in link to 'using guidance' article
-
- 01 May, 2024 1 commit
-
-
drbh authored
This PR improves the guidance docs and adds a section that explains how grammars are applied on a technical level
-
- 30 Apr, 2024 1 commit
-
-
drbh authored
This PR adds a short "how it works" section to guidance, includes a mention of the outlines library that enables grammars/tools, and makes a small formatting change --------- Co-authored-by:Mishig <mishig.davaadorj@coloradocollege.edu>
-
- 25 Apr, 2024 1 commit
-
-
dr3s authored
Update guidance docs to reflect grammar support in the API. The previous wording was vague and made it sound like the OpenAI API supported the grammar parameter. https://github.com/huggingface/text-generation-inference/blob/main/router/src/server.rs#L654 confirms that support for grammar is TGI only at this time.
-
- 23 Apr, 2024 1 commit
-
-
Nicolas Patry authored
-
- 22 Apr, 2024 1 commit
-
-
Moritz Laurer authored
Fix some small typos in the docs; add minor clarifications; add guidance to features on landing page. (Review requested from @OlivierDehaene.)
-
- 12 Apr, 2024 1 commit
-
-
Ikko Eltociear Ashimine authored
compliation -> compilation
-
- 28 Feb, 2024 3 commits
-
-
OlivierDehaene authored
-
Nicolas Patry authored
-
Nicolas Patry authored
It was meant to be in seconds float
-
- 16 Feb, 2024 1 commit
-
-
OlivierDehaene authored
-
- 03 Oct, 2023 1 commit
-
-
Fluder-Paradyne authored
Just removed `--` from the arguments. With `--`, bitsandbytes and bitsandbytes-nf4 are treated as options, which they are not.
-
- 12 Sep, 2023 2 commits
-
-
Merve Noyan authored
Co-authored-by:
Nicolas Patry <patry.nicolas@protonmail.com> Co-authored-by:
Pedro Cuenca <pedro@huggingface.co>
-
Merve Noyan authored
Co-authored-by:
Nicolas Patry <patry.nicolas@protonmail.com> Co-authored-by:
Omar Sanseviero <osanseviero@gmail.com> Co-authored-by:
Pedro Cuenca <pedro@huggingface.co>
-
- 08 Sep, 2023 1 commit
-
-
Merve Noyan authored
-
- 07 Sep, 2023 1 commit
-
-
Merve Noyan authored
IDK what else to add in this guide, I looked for relevant code in TGI codebase and saw that it's used in quantization as well (maybe I could add that?)
-
- 06 Sep, 2023 3 commits
-
-
Omar Sanseviero authored
Co-authored-by:OlivierDehaene <olivier@huggingface.co>
-
Merve Noyan authored
PR for conceptual guide on flash attention. I will add more info unless I'm told otherwise. --------- Co-authored-by:
Nicolas Patry <patry.nicolas@protonmail.com> Co-authored-by:
Omar Sanseviero <osanseviero@gmail.com>
-
Julien Bouquillon authored
Looks like an error
-
- 18 Aug, 2023 1 commit
-
-
Omar Sanseviero authored
Co-authored-by:
Lucain <lucainp@gmail.com> Co-authored-by:
Merve Noyan <merveenoyan@gmail.com>
-