Environment Variables
================================================

OneFlow provides an extensive set of environment variables for tuning its behavior in specific scenarios.

`ONEFLOW_COMM_NET_IB_HCA <https://github.com/Oneflow-Inc/oneflow/blob/v0.9.0/oneflow/core/comm_network/ibverbs/ibverbs_comm_network.cpp#L47>`_
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

When there are multiple IB NICs (which can be checked with ``ibstatus`` on the server), the system uses the first IB NIC for comm_net communication by default.

When this environment variable is set, the system checks all IB NICs and selects the one with the matching name. `#5626 <https://github.com/Oneflow-Inc/oneflow/pull/5626>`_

Values accepted
^^^^^^^^^^^^^^^
The default value is empty. Example values are ``mlx5_0:1`` and ``mlx5_1:1``. When the port is given as 0, it defaults to 1, which represents the first port.
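The ``name:port`` format above can be illustrated with a minimal Python sketch. The helper ``parse_ib_hca`` is hypothetical and is not part of OneFlow; treating a missing port the same way as port 0 is an assumption made for this example.

```python
from typing import Tuple


def parse_ib_hca(value: str) -> Tuple[str, int]:
    """Split a 'device:port' string into (device, port).

    Hypothetical helper for illustration only. A port of 0 becomes 1,
    as described above; a missing port is treated like 0 (assumption).
    """
    name, _, port_text = value.partition(":")
    port = int(port_text) if port_text else 0
    return name, (port if port != 0 else 1)


print(parse_ib_hca("mlx5_0:1"))
print(parse_ib_hca("mlx5_1:0"))
```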

`ONEFLOW_COMM_NET_IB_GID_INDEX <https://github.com/Oneflow-Inc/oneflow/blob/v0.9.0/oneflow/core/comm_network/ibverbs/ibverbs_comm_network.cpp#L142>`_
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

The GID index used when calling `ibv_query_gid <https://www.ibm.com/docs/en/aix/7.2?topic=management-ibv-query-gid>`_, which returns 0 on success. It is often used together with ``ONEFLOW_COMM_NET_IB_HCA``. GID stands for Global ID. Under a RoCE network, a QP must be established with this value, instead of using only the LID as in an IB network. `#5626 <https://github.com/Oneflow-Inc/oneflow/pull/5626>`_

Values accepted
^^^^^^^^^^^^^^^
The default value is 0, representing the port index value

`ONEFLOW_COMM_NET_IB_QUEUE_DEPTH <https://github.com/Oneflow-Inc/oneflow/blob/v0.9.0/oneflow/core/comm_network/ibverbs/ibverbs_qp.cpp#L44>`_
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

The queue depth used for jobs in the IB network.

Like ``ONEFLOW_COMM_NET_IB_MEM_BLOCK_SIZE``, this value explicitly controls the size instead of relying on IB's default size.

Values accepted
^^^^^^^^^^^^^^^
The default value is ``1024``, parsed as ``int64_t``. The system compares it with ``max_qp_wr`` (the maximum number of outstanding WRs on any work queue) and uses the smaller of the two.
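The clamping rule above amounts to taking a minimum. The following Python sketch only illustrates that rule; ``effective_queue_depth`` is a hypothetical name, and the ``max_qp_wr`` values are made-up device limits.

```python
def effective_queue_depth(requested: int, max_qp_wr: int) -> int:
    """Return the smaller of the requested depth and the device limit.

    Illustration only: 'requested' stands for the value of
    ONEFLOW_COMM_NET_IB_QUEUE_DEPTH (default 1024), and 'max_qp_wr'
    for the limit reported by the device.
    """
    return min(requested, max_qp_wr)


print(effective_queue_depth(1024, 16384))  # device limit is larger
print(effective_queue_depth(1024, 512))    # device limit is smaller
```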

`ONEFLOW_COMM_NET_IB_MEM_BLOCK_SIZE <https://github.com/Oneflow-Inc/oneflow/blob/v0.9.0/oneflow/core/comm_network/ibverbs/ibverbs_qp.cpp#L68>`_
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

The size of the memory block read during communication.

This value determines how many blocks the data is divided into. Each block is encapsulated and then transmitted.

Values accepted
^^^^^^^^^^^^^^^
The default value is ``8388608`` (8 MiB).
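As a rough illustration of how a block size relates to a block count, the sketch below uses ceiling division. It assumes that a message is simply split into fixed-size blocks, which is an interpretation of the description above and not confirmed OneFlow behavior; ``num_blocks`` is a hypothetical helper.

```python
DEFAULT_BLOCK_BYTES = 8388608  # default of ONEFLOW_COMM_NET_IB_MEM_BLOCK_SIZE


def num_blocks(message_bytes: int, block_bytes: int = DEFAULT_BLOCK_BYTES) -> int:
    """Number of fixed-size blocks needed to cover message_bytes.

    Assumption: the message is cut into blocks of block_bytes, with a
    possibly shorter final block (ceiling division).
    """
    return -(-message_bytes // block_bytes)


print(num_blocks(20 * 1024 * 1024))  # a 20 MiB message with 8 MiB blocks
```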

`ONEFLOW_STREAM_CUDA_EVENT_FLAG_BLOCKING_SYNC <https://github.com/Oneflow-Inc/oneflow/blob/v0.9.0/oneflow/core/ep/cuda/cuda_device.cpp#L59>`_
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Applies to streams and marks blocking synchronization for CUDA events. `Detailed information <https://www.cnblogs.com/1024incn/p/5891051.html>`_, `#5612 <https://github.com/Oneflow-Inc/oneflow/pull/5612>`_, `#5837 <https://github.com/Oneflow-Inc/oneflow/pull/5837>`_

Values accepted
^^^^^^^^^^^^^^^
The default value is ``false``. The value is treated as ``true`` only when it is one of ``1``, ``true``, ``yes``, ``on`` or ``y``.
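Many variables on this page share the boolean rule described above. A minimal Python sketch of that rule follows. ``env_flag`` is a hypothetical helper, not OneFlow's parser. It compares the value exactly against the listed strings, because case handling is not specified here.

```python
import os

# The accepted spellings listed above.
TRUTHY = ("1", "true", "yes", "on", "y")


def env_flag(name, default=False, environ=None):
    """Return True only when the variable is one of the accepted strings.

    Hypothetical helper for illustration. An unset variable yields
    'default'; any other string yields False.
    """
    env = os.environ if environ is None else environ
    raw = env.get(name)
    if raw is None:
        return default
    return raw in TRUTHY


# Example with an explicit mapping instead of the real environment.
print(env_flag("ONEFLOW_DEBUG_KERNEL_SYNC_CHECK",
               environ={"ONEFLOW_DEBUG_KERNEL_SYNC_CHECK": "on"}))
```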

`ONEFLOW_LIBIBVERBS_PATH <https://github.com/Oneflow-Inc/oneflow/blob/v0.9.0/oneflow/core/platform/lib/ibv_wrapper.cpp#L24>`_
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Used to load the ibverbs dynamic library with ``dlopen`` at runtime. The symbols of ibverbs functions are looked up at runtime instead of being linked at compile time, for better compatibility. `#4852 <https://github.com/Oneflow-Inc/oneflow/pull/4852>`_.

If loading fails, OneFlow outputs ``libibverbs not available, ibv_fork_init skipped``. If it succeeds, ``import oneflow`` outputs a message such as ``loaded library: /usr/lib/x86_64-linux-gnu/libibverbs.so.1``.

Values accepted
^^^^^^^^^^^^^^^
The default value is empty, in which case OneFlow loads ``libibverbs.so.1`` or ``libibverbs.so``.

`ONEFLOW_DEBUG_MODE <https://github.com/Oneflow-Inc/oneflow/blob/v0.9.0/oneflow/core/common/env_var/debug_mode.h#L23>`_
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Enables ``debug`` mode. ``ONEFLOW_DEBUG`` has the same effect.

When ``debug`` mode is on, OneFlow outputs more INFO-level logs and writes various ``prototxt`` and ``dot`` files. In eager global mode, the automatically inserted boxing information is printed to the log file.

Values accepted
^^^^^^^^^^^^^^^
The default value is empty. Any string is accepted.

`ONEFLOW_DRY_RUN <https://github.com/Oneflow-Inc/oneflow/blob/v0.9.0/oneflow/core/job/resource_desc.cpp#L65>`_
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Used only for test runs. It can generate log files such as ``dot`` files.

The process exits once the test run succeeds and does not start real training.

Values accepted
^^^^^^^^^^^^^^^
The default value is empty. Any string is accepted.

`ONEFLOW_DEBUG_KERNEL_SYNC_CHECK_NUMERICS <https://github.com/Oneflow-Inc/oneflow/blob/v0.9.0/oneflow/core/lazy/stream_context/cuda/cuda_stream_context.cpp#L66>`_
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Use this only when debugging, because it affects performance. It detects which op in the network produces NaN or Inf.

It creates a ``CpuCheckNumericsKernelObserver`` under ``cpu`` and a ``CudaCheckNumericsKernelObserver`` under ``cuda``. `#6052 <https://github.com/Oneflow-Inc/oneflow/pull/6052>`_

Values accepted
^^^^^^^^^^^^^^^
The default value is ``false``. The value is treated as ``true`` only when it is one of ``1``, ``true``, ``yes``, ``on`` or ``y``.

`ONEFLOW_DEBUG_KERNEL_SYNC_CHECK <https://github.com/Oneflow-Inc/oneflow/blob/v0.9.0/oneflow/core/job/env_global_objects_scope.cpp#L193>`_
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Use this only when debugging, because it affects performance.

It creates a ``SyncCheckKernelObserver``, which synchronizes after each kernel.

It could be used to debug cuda errors. `#6052 <https://github.com/Oneflow-Inc/oneflow/pull/6052>`_

Values accepted
^^^^^^^^^^^^^^^
The default value is ``false``. The value is treated as ``true`` only when it is one of ``1``, ``true``, ``yes``, ``on`` or ``y``.

`ONEFLOW_PROFILER_KERNEL_PROFILE_CUDA_MEMORY_BANDWIDTH <https://github.com/Oneflow-Inc/oneflow/blob/v0.9.0/oneflow/core/profiler/kernel.cpp#L34>`_
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Used when generating profiler files with nsys.

The profiler currently works only in lazy mode.

It estimates the memory bandwidth achieved by a GPU kernel from the kernel's execution time and the size of its input and output memory, which helps find kernels that could be optimized. `Details <https://github.com/Oneflow-Inc/oneflow/blob/02e29f9648f63a4d936cd818061e90064d027005/oneflow/core/profiler/kernel.cpp#L53>`_
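The estimate described above is a ratio of bytes moved to elapsed time. The following Python sketch shows only that arithmetic. The function name and the sample numbers are hypothetical, and this is not the profiler's actual code.

```python
def estimated_bandwidth_gb_per_s(input_bytes: float,
                                 output_bytes: float,
                                 kernel_time_s: float) -> float:
    """Bytes read plus bytes written, divided by kernel time, in GB/s.

    Illustration of the idea only; sizes and timings are assumed to be
    supplied by the caller.
    """
    return (input_bytes + output_bytes) / kernel_time_s / 1e9


# Hypothetical kernel: reads 4 GB and writes 4 GB in 10 ms.
print(estimated_bandwidth_gb_per_s(4e9, 4e9, 0.01))
```

Comparing such an estimate against the device's peak memory bandwidth indicates how much headroom a memory-bound kernel has.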

Values accepted
^^^^^^^^^^^^^^^
The default value is ``false``. To use it, OneFlow must be built with ``BUILD_PROFILER`` enabled.

`ONEFLOW_PROFILER_KERNEL_PROFILE_KERNEL_FORWARD_RANGE <https://github.com/Oneflow-Inc/oneflow/blob/v0.9.0/oneflow/core/profiler/kernel.cpp#L36>`_
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Same as above, but collects the `op name <https://github.com/Oneflow-Inc/oneflow/blob/v0.9.0/oneflow/core/profiler/kernel.cpp#L62>`_.

Values accepted
^^^^^^^^^^^^^^^
The default value is ``false``. To use it, OneFlow must be built with ``BUILD_PROFILER`` enabled.

`ONEFLOW_KERNEL_DISABLE_BLOB_ACCESS_CHECKER <https://github.com/Oneflow-Inc/oneflow/blob/v0.9.0/oneflow/core/job/env_global_objects_scope.cpp#L199>`_
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Disables blob_access_checker when set. blob_access_checker exists for correctness assurance, and turning it off can reduce kernel overhead in some cases. `#5728 <https://github.com/Oneflow-Inc/oneflow/pull/5728>`_

Values accepted
^^^^^^^^^^^^^^^
The default value is ``false``. The value is treated as ``true`` only when it is one of ``1``, ``true``, ``yes``, ``on`` or ``y``.

`ONEFLOW_KERNEL_ENABLE_CUDA_GRAPH <https://github.com/Oneflow-Inc/oneflow/blob/v0.9.0/oneflow/core/kernel/user_kernel.cpp#L692>`_
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Takes effect only under ``WITH_CUDA_GRAPHS``; the default value is ``false``.

Enabling CUDA Graphs support uses more memory, so a job that has just enough memory without it may fail to run. `#5868 <https://github.com/Oneflow-Inc/oneflow/pull/5868>`_

Values accepted
^^^^^^^^^^^^^^^
The default value is ``false``. The value is treated as ``true`` only when it is one of ``1``, ``true``, ``yes``, ``on`` or ``y``.

`ONEFLOW_ACTOR_ENABLE_LIGHT_ACTOR <https://github.com/Oneflow-Inc/oneflow/blob/v0.9.0/oneflow/core/thread/thread.cpp#L30>`_
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

LightActor is a new type of Actor that only handles NormalForward and similar tasks where every regst_num is 1, or tasks with only one kernel. `#5868 <https://github.com/Oneflow-Inc/oneflow/pull/5868>`_

The following settings can be used together: ``export ONEFLOW_KERNEL_ENABLE_CUDA_GRAPH=1`` (uses more memory), ``export ONEFLOW_THREAD_ENABLE_LOCAL_MESSAGE_QUEUE=1``, ``export ONEFLOW_KERNEL_DISABLE_BLOB_ACCESS_CHECKER=1``, ``export ONEFLOW_ACTOR_ENABLE_LIGHT_ACTOR=1`` and ``export ONEFLOW_STREAM_REUSE_CUDA_EVENT=1``.

Values accepted
^^^^^^^^^^^^^^^
The default value is ``false``. The value is treated as ``true`` only when it is one of ``1``, ``true``, ``yes``, ``on`` or ``y``.
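For scripts that prefer to set the combination listed in this section from Python rather than with ``export``, a minimal sketch could look like the following. The names ``LIGHT_ACTOR_BUNDLE`` and ``apply_bundle`` are hypothetical. The sketch assumes the variables must be in the environment before OneFlow reads them, for example before ``import oneflow``.

```python
import os

# The five settings listed in this section, all switched on.
LIGHT_ACTOR_BUNDLE = {
    "ONEFLOW_KERNEL_ENABLE_CUDA_GRAPH": "1",  # uses more memory
    "ONEFLOW_THREAD_ENABLE_LOCAL_MESSAGE_QUEUE": "1",
    "ONEFLOW_KERNEL_DISABLE_BLOB_ACCESS_CHECKER": "1",
    "ONEFLOW_ACTOR_ENABLE_LIGHT_ACTOR": "1",
    "ONEFLOW_STREAM_REUSE_CUDA_EVENT": "1",
}


def apply_bundle(environ=None):
    """Write the bundle into 'environ' (os.environ by default) and return it."""
    env = os.environ if environ is None else environ
    env.update(LIGHT_ACTOR_BUNDLE)
    return env


# Demonstration on a plain dict so the real environment is left untouched.
demo_env = apply_bundle({})
print(sorted(demo_env))
```

Calling ``apply_bundle()`` without arguments modifies the current process environment, which is inherited by child processes started afterwards.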

`ONEFLOW_THREAD_ENABLE_LOCAL_MESSAGE_QUEUE <https://github.com/Oneflow-Inc/oneflow/blob/v0.9.0/oneflow/core/thread/thread.cpp#L29>`_
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Enables the local message queue. ``oneflow.config.thread_enable_local_message_queue(True)`` is no longer used. `#5720 <https://github.com/Oneflow-Inc/oneflow/pull/5720>`_

Values accepted
^^^^^^^^^^^^^^^
The default value is ``false``. The value is treated as ``true`` only when it is one of ``1``, ``true``, ``yes``, ``on`` or ``y``.

`ONEFLOW_PERSISTENT_IN_STREAM_BUFFER_SIZE_BYTES <https://github.com/Oneflow-Inc/oneflow/blob/v0.9.0/oneflow/core/persistence/persistent_in_stream.cpp#L30>`_
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Represents the size of each read from disk. `#5162 <https://github.com/Oneflow-Inc/oneflow/pull/5162>`_

Values accepted
^^^^^^^^^^^^^^^
The default value is empty. If an invalid string or a negative number is given, ``32 * 1024`` (32 KB) is used.
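The fallback rule above can be sketched in Python as follows. ``in_stream_buffer_size`` is a hypothetical helper, not OneFlow's parser. Treating an unset or empty value like an invalid one is an assumption. The behavior for ``0`` is not described above, so the sketch passes it through unchanged.

```python
FALLBACK_BYTES = 32 * 1024  # 32 KB, as stated above


def in_stream_buffer_size(raw):
    """Parse the raw variable value, falling back to 32 KB when unusable.

    Hypothetical helper: None, non-numeric strings and negative numbers
    all yield the fallback.
    """
    if raw is None:
        return FALLBACK_BYTES
    try:
        value = int(raw)
    except ValueError:
        return FALLBACK_BYTES
    return FALLBACK_BYTES if value < 0 else value


print(in_stream_buffer_size("65536"))
print(in_stream_buffer_size("not-a-number"))
```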

`ONEFLOW_DECODER_ENABLE_NVJPEG_HARDWARE_ACCELERATION <https://github.com/Oneflow-Inc/oneflow/blob/v0.9.0/oneflow/core/kernel/image_decoder_random_crop_resize_kernel.cpp#L290>`_
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Requires ``NVJPEG_VER_MAJOR`` to be greater than ``11``. It enables nvJPEG hardware acceleration and warms up the jpeg decoder and the hw_jpeg decoder. `#5851 <https://github.com/Oneflow-Inc/oneflow/pull/5851>`_

This relates to the hardware JPEG decoder and the NVIDIA nvJPEG library on NVIDIA A100 GPUs.

Values accepted
^^^^^^^^^^^^^^^
The default value is ``true``. A given value is treated as ``true`` only when it is one of ``1``, ``true``, ``yes``, ``on`` or ``y``.

`ONEFLOW_SERVING_DEBUG <https://github.com/Oneflow-Inc/oneflow/blob/v0.9.0/oneflow/api/cpp/framework/graph.cpp#L213>`_
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Prints debug information for OneFlow Serving.

Values accepted
^^^^^^^^^^^^^^^
The default value is ``false``

`ONEFLOW_DISABLE_VIEW <https://github.com/Oneflow-Inc/oneflow/blob/v0.9.0/oneflow/core/framework/tensor_methods.cpp#L35>`_
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Disables the view mechanism, which means that ops related to view stop running.

Values accepted
^^^^^^^^^^^^^^^
The default value is ``false``

`ONEFLOW_BOXING_DISABLE_MIDDLE_NODE_AND_CHECK <https://github.com/Oneflow-Inc/oneflow/blob/v0.9.0/oneflow/core/auto_parallel/boxing_collector.cpp#L82>`_
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Whether to disable Middle Node. When it is false, all inter-SBP communication is supported

Values accepted
^^^^^^^^^^^^^^^
The default value is ``false``

`ONEFLOW_ONE_EMBEDDING_DISABLE_NUMA_AWARE_ALLOCATION <https://github.com/Oneflow-Inc/oneflow/blob/v0.9.0/oneflow/core/embedding/full_cache.cu#L414>`_
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Whether to disable NUMA_AWARE memory allocation when the OneEmbedding module allocates video memory.

NUMA_AWARE memory allocation means that, when allocating pinned host memory, the CPU closest to the GPU is taken into account (for example, for GPU 0 and GPU 1, memory is allocated on CPU 0).

Values accepted
^^^^^^^^^^^^^^^
The default value is ``false``

`ONEFLOW_EP_CUDA_ENABLE_TF32_EXECUTION <https://github.com/Oneflow-Inc/oneflow/blob/v0.9.0/oneflow/core/ep/cuda/cuda_stream.cpp#L96>`_
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Whether to allow CUDA to use TF32 numeric types for computation

Values accepted
^^^^^^^^^^^^^^^
The default value is ``true``

`ONEFLOW_FUNCTOR_DISABLE_FUSED_MLP <https://github.com/Oneflow-Inc/oneflow/blob/v0.9.0/oneflow/core/functional/impl/nn_functor.cpp#L554>`_
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Whether to disable the fused_mlp operator implemented with cublasLt in FusedMLPFunctor. If disabled, it falls back to multiple matrix multiplication operations.

Values accepted
^^^^^^^^^^^^^^^
The default value is ``false``

`ONEFLOW_ONE_EMBEDDING_EMBEDDING_SHUFFLE_INDEPENTENT_STREAM <https://github.com/Oneflow-Inc/oneflow/blob/v0.9.0/oneflow/core/job_rewriter/replace_embedding_ops_pass.cpp#L192>`_
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Whether to put the EmbeddingShuffle of the OneEmbedding module on a separate stream for overlapping execution.

Values accepted
^^^^^^^^^^^^^^^
The default value is ``false``

`ONEFLOW_ONE_EMBEDDING_GRADIENT_SHUFFLE_USE_FP16 <https://github.com/Oneflow-Inc/oneflow/blob/v0.9.0/oneflow/core/job_rewriter/replace_embedding_ops_pass.cpp#L209>`_
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Whether to allow the EmbeddingGradientShuffle operator of the OneEmbedding module to use the FP16 data type in the AMP case.

Values accepted
^^^^^^^^^^^^^^^
The default value is ``true``

`ONEFLOW_ONE_EMBEDDING_NOT_FUSE_CAST_TO_UPDATE <https://github.com/Oneflow-Inc/oneflow/blob/v0.9.0/oneflow/core/job_rewriter/replace_embedding_ops_pass.cpp#L260>`_
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Whether to disable the fusion of cast type conversion and parameter update of OneEmbedding parameters into one operator in the case of AMP

Values accepted
^^^^^^^^^^^^^^^
The default value is ``false``

`ONEFLOW_DEBUG_KERNEL_SYNC_CHECK_NUMERICS_DUMP <https://github.com/Oneflow-Inc/oneflow/blob/v0.9.0/oneflow/core/kernel/cpu_numerics_kernel_observer.cpp#L65>`_
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

When NaN or Inf values appear, save a dump of the data.

Values accepted
^^^^^^^^^^^^^^^
The default value is ``false``

`ONEFLOW_MLIR_ENABLE_IR_PRINTING <https://github.com/Oneflow-Inc/oneflow/blob/v0.9.0/oneflow/ir/lib/OneFlow/Passes.cpp#L768>`_
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Controls whether to print the IR when running each pass during debugging.

Values accepted
^^^^^^^^^^^^^^^
The default value is ``false``

`ONEFLOW_MLIR_STDOUT <https://github.com/Oneflow-Inc/oneflow/blob/v0.9.0/oneflow/ir/oneflow-extension/extension.cpp#L151>`_
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Control whether MLIR outputs log information in the console

Values accepted
^^^^^^^^^^^^^^^
The default value is ``false``

`ONEFLOW_MLIR_DUMP_IR <https://github.com/Oneflow-Inc/oneflow/blob/v0.9.0/oneflow/ir/oneflow-extension/extension.cpp#L152>`_
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Control whether to dump ir files

Values accepted
^^^^^^^^^^^^^^^
The default value is ``false``

`ONEFLOW_MLIR_ENABLE_ROUND_TRIP <https://github.com/Oneflow-Inc/oneflow/blob/v0.9.0/oneflow/ir/oneflow-extension/ir_pass.cpp#L157>`_
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Controls whether a OneFlow Job goes through MLIR.

Values accepted
^^^^^^^^^^^^^^^
The default value is ``false``

`ONEFLOW_KERNEL_REDUCE_SUM_USE_MATMUL <https://github.com/Oneflow-Inc/oneflow/blob/v0.9.0/oneflow/user/kernels/reduce_kernel.cpp#L333>`_
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Whether to use matrix multiplication for reduce_sum.

Values accepted
^^^^^^^^^^^^^^^
The default value is ``false``

`ONEFLOW_ONE_EMBEDDING_ENABLE_QUANTIZED_COMM <https://github.com/Oneflow-Inc/oneflow/blob/dd580f21ffb6e4d23a899c7e0ac6d2bc502f3f1a/oneflow/core/job_rewriter/fuse_embedding_interaction_pass.cpp#L35>`_
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Whether to quantize the shuffle communication when OneEmbedding runs on multiple GPUs.

Values accepted
^^^^^^^^^^^^^^^
The default value is ``false``

`ONEFLOW_TENSOR_BUFFER_ALIGNED_SIZE <https://github.com/Oneflow-Inc/oneflow/blob/v0.9.0/oneflow/core/common/tensor_buffer.cpp#L29>`_
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Align size when allocating TensorBuffer memory

Values accepted
^^^^^^^^^^^^^^^
The default value is ``1024``

`ONEFLOW_TENSOR_BUFFER_POOL_THREAD_LOCAL_CACHE_SIZE <https://github.com/Oneflow-Inc/oneflow/blob/v0.9.0/oneflow/core/common/tensor_buffer.cpp#L206>`_
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Control the size of ``thread_local_cache`` in TensorBufferPool

Values accepted
^^^^^^^^^^^^^^^
The default value is ``64``

`ONEFLOW_GRPC_MAX_MESSAGE_BYTE_SIZE <https://github.com/Oneflow-Inc/oneflow/blob/v0.9.0/oneflow/core/control/ctrl_service.cpp#L45>`_
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Set the maximum size of the gRPC transport message

Values accepted
^^^^^^^^^^^^^^^
The default value is ``-1``

`ONEFLOW_ONE_EMBEDDING_PERSISTENT_TABLE_CAPACITY_HINT <https://github.com/Oneflow-Inc/oneflow/blob/v0.9.0/oneflow/core/embedding/persistent_table.cpp#L410>`_
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Control the initial capacity of the PersistentTable of OneEmbedding to avoid frequent expansion

Values accepted
^^^^^^^^^^^^^^^
OneEmbedding will calculate according to the actual situation, and users can also choose to configure a larger capacity.

`ONEFLOW_ONE_EMBEDDING_PERSISTENT_TABLE_NUM_WORKERS <https://github.com/Oneflow-Inc/oneflow/blob/v0.9.0/oneflow/core/embedding/persistent_table.cpp#L435>`_
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

The number of threads used for reading and writing the PersistentTable of OneEmbedding

Values accepted
^^^^^^^^^^^^^^^
The default value is ``4``

`ONEFLOW_EP_CUDA_CONST_BUFFER_ELEMENT_COUNT <https://github.com/Oneflow-Inc/oneflow/blob/v0.9.0/oneflow/core/ep/cuda/cuda_device.cpp#L62>`_
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Specify the size of the all zero and all one buffers on the CUDA device.

This buffer can be used with matrix multiplication to implement operations such as reduce_sum

Values accepted
^^^^^^^^^^^^^^^
The default value is ``1024x1024``

`OMP_NUM_THREADS <https://github.com/Oneflow-Inc/oneflow/blob/v0.9.0/oneflow/core/job/env_global_objects_scope.cpp#L96>`_
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Set the number of threads used by OMP

Values accepted
^^^^^^^^^^^^^^^
The default value will be generated by specific `computational logic <https://github.com/Oneflow-Inc/oneflow/blob/v0.9.0/oneflow/core/job/env_global_objects_scope.cpp#L106-L108>`_.

`SBP_INFER_RULE_TAG <https://github.com/Oneflow-Inc/oneflow/blob/v0.9.0/oneflow/core/operator/operator.cpp#L718>`_
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Specify SBP derivation rules

Values accepted
^^^^^^^^^^^^^^^
When the value is ``1``, select the SBP that satisfies the producer, or the SBP with the smallest cost, as far as possible.

When the value is ``2``, select the SBP that matches the most.

When the value is ``3``, select the SBP with the smallest cost.

`ONEFLOW_TENSOR_BUFFER_GROWTH_FACTOR <https://github.com/Oneflow-Inc/oneflow/blob/v0.9.0/oneflow/core/common/tensor_buffer.cpp#L35>`_
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Control the growth factor of TensorBuffer

Values accepted
^^^^^^^^^^^^^^^
The default value is ``1.0``

`ONEFLOW_TENSOR_BUFFER_SHRINK_FACTOR <https://github.com/Oneflow-Inc/oneflow/blob/v0.9.0/oneflow/core/common/tensor_buffer.cpp#L41>`_
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Controls the shrink factor of TensorBuffer

Values accepted
^^^^^^^^^^^^^^^
The default value is ``0.7``

`ONEFLOW_TENSOR_BUFFER_POOL_SIZE_FACTOR <https://github.com/Oneflow-Inc/oneflow/blob/v0.9.0/oneflow/core/common/tensor_buffer.cpp#L200>`_
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Controls the size factor of the TensorBuffer pool.

Values accepted
^^^^^^^^^^^^^^^
The default value is ``2.0``

`AUTO_PARALLEL_TRANSFER_COST <https://github.com/Oneflow-Inc/oneflow/blob/v0.9.0/oneflow/core/framework/sbp_infer_util.cpp#L544>`_
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Control the size of the automatic parallel transfer cost

Values accepted
^^^^^^^^^^^^^^^
The default value is ``1.65e8``


`ONEFLOW_DEBUG_PASS <https://github.com/Oneflow-Inc/oneflow/blob/v0.9.0/oneflow/core/job/job_build_and_infer_ctx.cpp#L991>`_
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Specifies pass names. The job is printed before and after each specified pass, for example ``export ONEFLOW_DEBUG_PASS="FuseAddToOutputPass"``.

Alternatively, set it to ``ALL`` to print the job before and after every pass, for example ``export ONEFLOW_DEBUG_PASS="ALL"``.

Values accepted
^^^^^^^^^^^^^^^
The default value is empty.

`ONEFLOW_PROFILER_HOST_THREAD_NAME_PREFIX <https://github.com/Oneflow-Inc/oneflow/blob/v0.9.0/oneflow/core/profiler/profiler.cpp#L39>`_
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Add a prefix to the name of the named host thread in the profiling context to facilitate sorting in the visualization tool (nsight)

Values accepted
^^^^^^^^^^^^^^^
The default value is empty.