- 07 Jul, 2023 1 commit
-
-
Nicolas Patry authored
  - Look at `transformers` base class to check for `_key_to_ignore_on_load_missing` or `_tied_weights` which are the standard attributes to select the keys to NOT save on disk (since they are ignored)
  - Modified safetensors code (to be reflected in safetensors even if it's an internal function).
  - Will not work for trust_remote_code=True repos (like santacoder).

Should help with:
  - https://github.com/huggingface/text-generation-inference/issues/555
  - https://github.com/huggingface/text-generation-inference/pull/501
  - https://github.com/huggingface/text-generation-inference/issues/556
  - https://github.com/huggingface/text-generation-inference/issues/482#issuecomment-1623713593
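The key-selection step in this commit can be pictured with a small sketch. It is written in Rust only to match the other code in this log and is not the project's implementation. It assumes that tensor names aliasing the same weights have already been grouped, and that the names read from the base-class attributes form a discard set. `name_to_keep`, `group`, and `discard` are hypothetical names introduced for illustration.

```rust
use std::collections::BTreeSet;

/// For one group of tensor names that alias the same weights, pick the single
/// name to write to disk. Names in `discard` (hypothetical: the keys taken from
/// the base-class attributes) are avoided whenever another name is available.
fn name_to_keep<'a>(group: &[&'a str], discard: &BTreeSet<String>) -> Option<&'a str> {
    // Sort so the choice is deterministic regardless of input order.
    let mut sorted: Vec<&'a str> = group.to_vec();
    sorted.sort();

    sorted
        .iter()
        .copied()
        // Prefer the first name that is not marked as ignorable.
        .find(|name| !discard.contains(*name))
        // If every name is ignorable, one copy still has to be kept.
        .or_else(|| sorted.first().copied())
}
```

Every other name in the group would then be skipped when saving, since loading ignores it anyway.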
-
- 08 Jun, 2023 1 commit
-
-
Nicolas Patry authored
# What does this PR do?

Reworked the loading logic. The idea is to use cleaner loading code:
  - Remove the need for `no_init_weights`.
  - Remove all the weird `bnb_linear`, `load_weights` and `post_load_weights`.

New code layout:
  - New class `Weights` in charge of loading the weights from multiple files into the appropriate tensors (potentially sharded).
  - TP layers are now "shells": they contain the code to know what kind of sharding we need, plus an eventual `all_reduce`. They do not inherit from linear; they contain some kind of Linear instead.
  - The contained linear can be either FastLinear, BnbLinear or GPTq Linear next.
  - All modeling code is explicitly made for sharding; the process group is just no-ops for non-sharded code (removes a lot of test cases).

---------
Co-authored-by: Ubuntu <ubuntu@ip-172-31-41-161.taildb5d.ts.net>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-41-161.ec2.internal>
Co-authored-by: OlivierDehaene <olivier@huggingface.co>
Co-authored-by: OlivierDehaene <23298448+OlivierDehaene@users.noreply.github.com>
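The sharding idea described in this commit can be illustrated with a minimal sketch. It is not the project's code: it assumes an even split of one dimension across ranks, a row-major `f32` matrix, and the hypothetical helper names `shard_range` and `shard_rows`. With a world size of 1 the range covers the whole dimension, which mirrors the "no-op for non-sharded code" point in the message.

```rust
use std::ops::Range;

/// Indices of a dimension of length `size` owned by `rank`.
/// Assumes an even split; returns None if `size` is not divisible by
/// `world_size` or if `rank` is out of bounds.
fn shard_range(size: usize, rank: usize, world_size: usize) -> Option<Range<usize>> {
    if world_size == 0 || rank >= world_size || size % world_size != 0 {
        return None;
    }
    let block = size / world_size;
    Some(rank * block..(rank + 1) * block)
}

/// Copy this rank's rows out of a row-major `rows x cols` matrix.
fn shard_rows(
    weight: &[f32],
    rows: usize,
    cols: usize,
    rank: usize,
    world_size: usize,
) -> Option<Vec<f32>> {
    if weight.len() != rows * cols {
        return None;
    }
    let owned = shard_range(rows, rank, world_size)?;
    // Rows are contiguous in row-major layout, so one slice is enough.
    Some(weight[owned.start * cols..owned.end * cols].to_vec())
}
```

Splitting along columns instead would need a strided copy; which layers split along which dimension is exactly the knowledge the message says the TP "shells" carry.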
-
- 02 Jun, 2023 1 commit
-
-
OlivierDehaene authored
Close #288
-
- 26 May, 2023 1 commit
-
-
OlivierDehaene authored
Co-authored-by: Joel Lamy-Poirier <joel.lamy-poirier@servicenow.com>
-
- 24 May, 2023 1 commit
-
-
OlivierDehaene authored
Closes #307 #308
-
- 16 May, 2023 1 commit
-
-
OlivierDehaene authored
Fixes #333
---------
Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>
-
- 27 Apr, 2023 1 commit
-
-
Ehsan M. Kermani authored
-
- 24 Apr, 2023 2 commits
-
-
OlivierDehaene authored
Co-authored-by: Nick Hill <nickhill@us.ibm.com>
-
Nick Hill authored
-
- 20 Apr, 2023 1 commit
-
-
OlivierDehaene authored
-
- 11 Apr, 2023 1 commit
-
-
OlivierDehaene authored
-
- 09 Apr, 2023 1 commit
-
-
OlivierDehaene authored
-
- 05 Apr, 2023 1 commit
-
-
OlivierDehaene authored
-
- 16 Mar, 2023 1 commit
-
-
OlivierDehaene authored
-
- 09 Mar, 2023 1 commit
-
-
OlivierDehaene authored
closes #112
-
- 07 Mar, 2023 1 commit
-
-
OlivierDehaene authored
-
- 06 Mar, 2023 1 commit
-
-
OlivierDehaene authored
-
- 24 Feb, 2023 1 commit
-
-
OlivierDehaene authored
-
- 14 Feb, 2023 1 commit
-
-
OlivierDehaene authored
-
- 03 Feb, 2023 1 commit
-
-
OlivierDehaene authored
-
- 02 Feb, 2023 1 commit
-
-
OlivierDehaene authored
@njhill, @yk FYI: `generated_text` was concatenated to the user prompt for legacy reasons. We want to remove this behaviour, as we think it is not useful and is even detrimental to usability. We also remove the unused Vec.
-
- 01 Feb, 2023 1 commit
-
-
OlivierDehaene authored
-
- 31 Jan, 2023 4 commits
-
-
OlivierDehaene authored
-
OlivierDehaene authored
-
OlivierDehaene authored
Reverts huggingface/text-generation-inference#36
-
OlivierDehaene authored
Add token streaming using Server-Sent Events (SSE). The signature of the SSE events is:

```rust
struct Details {
    finish_reason: String,
    generated_tokens: u32,
    seed: Option<u64>,
}

// `Token` is not defined in this commit message.
struct StreamResponse {
    token: Token,
    generated_text: Option<String>,
    details: Option<Details>,
}

struct ErrorResponse {
    error: String,
}
```
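For readers unfamiliar with SSE framing, here is a rough client-side sketch of splitting a stream body into event payloads. It follows the generic SSE text format (events separated by a blank line, payload carried on `data:` lines) and is not taken from this repository. `sse_data_payloads` is a hypothetical helper, and whether each payload is a JSON serialization of `StreamResponse` or `ErrorResponse` is an assumption the commit message does not confirm.

```rust
/// Split a Server-Sent Events body into the `data` payload of each event.
/// An event is only emitted once its terminating blank line has been seen,
/// so a trailing, unterminated event is dropped.
fn sse_data_payloads(body: &str) -> Vec<String> {
    let mut events = Vec::new();
    let mut current: Vec<&str> = Vec::new();

    for line in body.lines() {
        if line.is_empty() {
            // A blank line closes the current event.
            if !current.is_empty() {
                events.push(current.join("\n"));
                current.clear();
            }
        } else if let Some(rest) = line.strip_prefix("data:") {
            // A single space after the colon is not part of the payload.
            current.push(rest.strip_prefix(' ').unwrap_or(rest));
        }
        // Other fields (`event:`, `id:`, `retry:`) and comment lines are
        // ignored in this sketch.
    }
    events
}
```

Each returned string could then be handed to a JSON parser. Multi-line payloads are joined with `\n`, as the SSE format prescribes for consecutive `data:` lines.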
-
- 20 Jan, 2023 2 commits
-
-
OlivierDehaene authored
-
OlivierDehaene authored
-
- 16 Dec, 2022 1 commit
-
-
OlivierDehaene authored
-
- 15 Dec, 2022 1 commit
-
-
OlivierDehaene authored
-
- 12 Dec, 2022 1 commit
-
-
OlivierDehaene authored
-
- 08 Dec, 2022 1 commit
-
-
OlivierDehaene authored
-