OpenDAS / deepspeed / Commits / 7925d0c3
Commit 7925d0c3: small tweaks (#839)
Authored Mar 11, 2021 by Stas Bekman
Committed by GitHub, Mar 11, 2021
Parent: e0f36ed5
Showing 2 changed files with 5 additions and 5 deletions
docs/_tutorials/zero.md (+1, -1)
docs/code-docs/source/zero3.rst (+4, -4)
docs/_tutorials/zero.md
@@ -227,7 +227,7 @@ class ParallelTransformerLayer(MegatronModule):
 #### Allocating Massive Megatron-LM Models
 We make two further changes to model initalization in order to support models
-that exceed *local* system memory, but not not *total* system memory.
+that exceed *local* system memory, but not *total* system memory.
 1. Allocate the model in a memory-scalable fashion. The model parameters will
 be allocated and immediately partitioned across the data parallel group. If
...
docs/code-docs/source/zero3.rst
@@ -21,13 +21,13 @@ Getting Started
 If you are new to DeepSpeed, check out our `Getting Started <https://www.deepspeed.ai/getting-started/>`_ page.
-Once you are training with DeepSpeed, enabling ZeRO-3 offload is as simple as enabling it
+Once you are training with DeepSpeed, enabling ZeRO-3 Offload is as simple as enabling it
 in your DeepSpeed configuration! Below are a few examples of ZeRO-3 configurations. Please see
 our `config guide <https://www.deepspeed.ai/docs/config-json/#zero-optimizations-for-fp16-training>`_
 for a complete list of options for configuration and performance tuning.
 .. note::
-    ZeRO-Offload works best with our heavily optimized
+    ZeRO-3 Offload works best with our heavily optimized
     :class:`deepspeed.ops.adam.DeepSpeedCPUAdam` optimizer. We recommend using
     our `optimizer config <https://www.deepspeed.ai/docs/config-json/#optimizer-parameters>`_
     to instruct :meth:`deepspeed.initialize` to build the optimizer for you.
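The hunk above says ZeRO-3 Offload is enabled through the DeepSpeed configuration and recommends letting :meth:`deepspeed.initialize` build the optimizer. A minimal sketch of such a configuration follows, written as a Python dict and serialized to JSON. The key names (``offload_optimizer``, ``offload_param``) are an assumption based on the config guide for later DeepSpeed releases, and older releases may spell these options differently; the batch size and learning rate are placeholder values, not recommendations.

```python
import json

# Hypothetical ZeRO-3 Offload configuration. Key names are assumed from the
# DeepSpeed config guide and may differ between releases; values are placeholders.
ds_config = {
    "train_batch_size": 16,
    "fp16": {"enabled": True},
    "zero_optimization": {
        "stage": 3,                              # ZeRO stage 3: partition parameters too
        "offload_optimizer": {"device": "cpu"},  # keep optimizer state in CPU memory
        "offload_param": {"device": "cpu"},      # keep partitioned parameters in CPU memory
    },
    # Declaring the optimizer here lets deepspeed.initialize construct it,
    # as the note in the diff recommends.
    "optimizer": {"type": "Adam", "params": {"lr": 1e-4}},
}

# Serialize to the JSON text that would be saved as the DeepSpeed config file.
config_text = json.dumps(ds_config, indent=2)
```

The resulting ``config_text`` can be written to a file and passed to the training script in the usual way; the config guide linked in the diff remains the authority on which options a given release accepts.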
...
@@ -149,8 +149,8 @@ DeepSpeed provides mechanisms for collecting (or *gathering*) a partitioned para
 Some models partitioned with :class:`deepspeed.zero.Init` may need to access
 a module’s weights outside of the class constructor or its ``forward()``
-method. We refer to these weights as **external parameters**, since they
-parameters are accessed outside of the module that created it. To do so, use
+method. We refer to these weights as **external parameters**, since these
+parameters are accessed outside of the module that created them. To do so, use
 :class:`deepspeed.zero.GatheredParameters` or :meth:`deepspeed.zero.register_external_parameter`.
 .. autoclass:: deepspeed.zero.GatheredParameters
...
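The paragraph touched by this hunk names two ways to reach a partitioned weight outside the module that created it. The sketch below illustrates both under stated assumptions: the helper names ``zero_weight_on_rank0`` and ``share_weight_with_parent`` are hypothetical, the modules are assumed to have been built under :class:`deepspeed.zero.Init` with ``torch.distributed`` already initialized, and the imports are deferred so the definitions load without DeepSpeed installed.

```python
def zero_weight_on_rank0(module):
    """Gather a partitioned weight, zero it on rank 0, then let it be re-partitioned."""
    # Deferred imports: only needed when the helper is actually called.
    import torch
    import deepspeed

    # Inside the context the full weight is available on every rank.
    # modifier_rank=0 tells DeepSpeed that rank 0's edits are the ones to keep.
    with deepspeed.zero.GatheredParameters(module.weight, modifier_rank=0):
        if torch.distributed.get_rank() == 0:
            with torch.no_grad():
                module.weight.zero_()


def share_weight_with_parent(parent, child):
    """Register child.weight as an external parameter of parent."""
    import deepspeed

    # After registration, DeepSpeed gathers child.weight whenever parent runs,
    # so parent.forward() can read a weight it did not create.
    deepspeed.zero.register_external_parameter(parent, child.weight)
```

Roughly, ``GatheredParameters`` suits one-off access such as custom initialization, while ``register_external_parameter`` suits a weight that another module reads on every forward pass (tied embeddings are a common case).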