- 01 Jul, 2023 1 commit
-
-
OlivierDehaene authored
-
- 08 Jun, 2023 1 commit
-
-
Nicolas Patry authored
# What does this PR do?

Reworked the loading logic. The idea is to use cleaner loading code:

- Remove the need for `no_init_weights`
- Remove all the weird `bnb_linear`, `load_weights` and `post_load_weights` code.

New code layout:

- New class `Weights` in charge of loading the weights from multiple files into the appropriate tensors (potentially sharded)
- TP layers are now "shells": they contain the code that knows what kind of sharding is needed, plus an `all_reduce` where required. They do not inherit from linear; they contain some kind of Linear instead
- The contained linear can be either FastLinear, BnbLinear or, next, a GPTQ Linear.
- All modeling code is explicitly written for sharding; the process group is just no-ops for non-sharded code (this removes a lot of test cases)

---------

Co-authored-by: Ubuntu <ubuntu@ip-172-31-41-161.taildb5d.ts.net>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-41-161.ec2.internal>
Co-authored-by: OlivierDehaene <olivier@huggingface.co>
Co-authored-by: OlivierDehaene <23298448+OlivierDehaene@users.noreply.github.com>
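The layout described above can be illustrated with a minimal sketch. It is not the code from this PR: it uses plain Python lists instead of tensors, and `FakeProcessGroup`, `get_sharded`, `RowParallelLinear` and `load` are hypothetical names chosen for illustration. Only `Weights` and `FastLinear` are names taken from the description, and their bodies here are assumptions.

```python
from typing import Dict, List

Matrix = List[List[float]]


class FakeProcessGroup:
    """Hypothetical stand-in for a distributed process group."""

    def __init__(self, rank: int = 0, size: int = 1):
        self.rank, self.size = rank, size

    def all_reduce(self, values: List[float]) -> List[float]:
        # A real group would sum `values` across ranks. With a single
        # process there is nothing to combine, so this is a no-op.
        return values


class Weights:
    """Finds a named tensor among several 'files' and returns this rank's shard."""

    def __init__(self, files: List[Dict[str, Matrix]], group: FakeProcessGroup):
        self.group = group
        self.routing: Dict[str, Dict[str, Matrix]] = {}
        for file in files:
            for name in file:
                if name in self.routing:
                    raise ValueError(f"{name} found in more than one file")
                self.routing[name] = file

    def get_tensor(self, name: str) -> Matrix:
        return self.routing[name][name]

    def get_sharded(self, name: str, dim: int) -> Matrix:
        tensor = self.get_tensor(name)
        length = len(tensor) if dim == 0 else len(tensor[0])
        if length % self.group.size != 0:
            raise ValueError(f"{name} cannot be split evenly along dim {dim}")
        block = length // self.group.size
        start, stop = self.group.rank * block, (self.group.rank + 1) * block
        if dim == 0:
            return tensor[start:stop]
        return [row[start:stop] for row in tensor]


class FastLinear:
    """Plain linear layer: y = W x, with W stored as rows."""

    def __init__(self, weight: Matrix):
        self.weight = weight

    def forward(self, x: List[float]) -> List[float]:
        return [sum(w * v for w, v in zip(row, x)) for row in self.weight]


class RowParallelLinear:
    """Shell layer: knows the sharding and the all_reduce, and wraps a linear."""

    def __init__(self, linear: FastLinear, group: FakeProcessGroup):
        self.linear, self.group = linear, group

    @classmethod
    def load(cls, name: str, weights: Weights) -> "RowParallelLinear":
        # Shard along the input dimension; each rank holds a column block.
        return cls(FastLinear(weights.get_sharded(name, dim=1)), weights.group)

    def forward(self, x_local: List[float]) -> List[float]:
        # Each rank computes a partial result; all_reduce combines them.
        return self.group.all_reduce(self.linear.forward(x_local))


files = [{"fc.weight": [[1.0, 2.0], [3.0, 4.0]]}, {"other.weight": [[0.0]]}]
layer = RowParallelLinear.load("fc.weight", Weights(files, FakeProcessGroup()))
result = layer.forward([1.0, 1.0])
print(result)  # [3.0, 7.0]
```

In this sketch the shell composes a linear rather than inheriting from one, so a different inner linear could be swapped in without touching the sharding logic, and the single-process case runs the same code path with `all_reduce` reduced to a no-op.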
-
- 01 Jun, 2023 1 commit
-
-
OlivierDehaene authored
-
- 31 May, 2023 1 commit
-
-
OlivierDehaene authored
-
- 30 May, 2023 1 commit
-
-
OlivierDehaene authored
-
- 23 May, 2023 2 commits
-
-
OlivierDehaene authored
-
OlivierDehaene authored
Fixes #338
-
- 22 May, 2023 1 commit
-
-
OlivierDehaene authored
Fixes #347
-
- 25 Apr, 2023 1 commit
-
-
OlivierDehaene authored
-
- 24 Apr, 2023 1 commit
-
-
OlivierDehaene authored
-
- 21 Apr, 2023 1 commit
-
-
OlivierDehaene authored
-
- 19 Apr, 2023 2 commits
-
-
OlivierDehaene authored
-
OlivierDehaene authored
-
- 16 Apr, 2023 1 commit
-
-
OlivierDehaene authored
-
- 13 Apr, 2023 1 commit
-
-
OlivierDehaene authored
-
- 11 Apr, 2023 2 commits
-
-
OlivierDehaene authored
-
OlivierDehaene authored
-
- 30 Mar, 2023 2 commits
-
-
OlivierDehaene authored
-
OlivierDehaene authored
-
- 26 Mar, 2023 1 commit
-
-
OlivierDehaene authored
-
- 09 Mar, 2023 1 commit
-
-
OlivierDehaene authored
-
- 07 Mar, 2023 1 commit
-
-
OlivierDehaene authored
-
- 03 Mar, 2023 2 commits
-
-
OlivierDehaene authored
-
OlivierDehaene authored
-
- 24 Feb, 2023 1 commit
-
-
OlivierDehaene authored
-
- 18 Feb, 2023 1 commit
-
-
OlivierDehaene authored
-
- 16 Feb, 2023 1 commit
-
-
OlivierDehaene authored
-
- 13 Feb, 2023 1 commit
-
-
OlivierDehaene authored
-
- 07 Feb, 2023 1 commit
-
-
OlivierDehaene authored
-
- 03 Feb, 2023 1 commit
-
-
OlivierDehaene authored
-
- 31 Jan, 2023 1 commit
-
-
OlivierDehaene authored
-
- 05 Jan, 2023 1 commit
-
-
OlivierDehaene authored
Co-authored-by: Nick Hill <nickhill@us.ibm.com>
-
- 08 Dec, 2022 1 commit
-
-
OlivierDehaene authored
-
- 07 Nov, 2022 1 commit
-
-
OlivierDehaene authored
-
- 02 Nov, 2022 1 commit
-
-
OlivierDehaene authored
-
- 28 Oct, 2022 1 commit
-
-
OlivierDehaene authored
-
- 27 Oct, 2022 1 commit
-
-
OlivierDehaene authored
-
- 20 Oct, 2022 1 commit
-
-
Olivier Dehaene authored
-
- 17 Oct, 2022 1 commit
-
-
Olivier Dehaene authored
-
- 14 Oct, 2022 1 commit
-
-
Olivier Dehaene authored
-