- 12 Mar, 2025 1 commit
-
Bruce MacDonald authored
Softcap isn't in the whitepaper/implementation for the language model, so we should remove it. There is no discernible difference in output with it removed.
-
- 11 Mar, 2025 13 commits
-
Jesse Gross authored
Currently we are using positions, which are relative to a sequence and may not be unique.
-
Jesse Gross authored
-
Michael Yang authored
-
Michael Yang authored
-
Michael Yang authored
-
Michael Yang authored
-
Patrick Devine authored
-
Michael Yang authored
-
Patrick Devine authored
-
Michael Yang authored
-
Jesse Gross authored
-
Michael Yang authored
-
Patrick Devine authored