Testing
=======================================================================================================================


Let's take a look at how 🤗 Transformer models are tested and how you can write new tests and improve the existing ones.

There are 2 test suites in the repository:

1. ``tests`` -- tests for the general API
2. ``examples`` -- tests primarily for various applications that aren't part of the API

How transformers are tested
-----------------------------------------------------------------------------------------------------------------------

1. Once a PR is submitted it gets tested with 9 CircleCI jobs. Every new commit to that PR gets retested. These jobs
   are defined in this `config file <https://github.com/huggingface/transformers/blob/master/.circleci/config.yml>`__,
   so that if needed you can reproduce the same environment on your machine.

   These CI jobs don't run ``@slow`` tests.

2. There are 3 jobs run by `GitHub Actions <https://github.com/huggingface/transformers/actions>`__:

   * `torch hub integration
     <https://github.com/huggingface/transformers/blob/master/.github/workflows/github-torch-hub.yml>`__: checks
     whether torch hub integration works.

   * `self-hosted (push) <https://github.com/huggingface/transformers/blob/master/.github/workflows/self-push.yml>`__:
     runs fast tests on GPU only on commits on ``master``. It only runs if a commit on ``master`` has updated the code
     in one of the following folders: ``src``, ``tests``, ``.github`` (to prevent running on added model cards,
     notebooks, etc.)

   * `self-hosted runner
     <https://github.com/huggingface/transformers/blob/master/.github/workflows/self-scheduled.yml>`__: runs normal and
     slow tests on GPU in ``tests`` and ``examples``:

   .. code-block:: bash

    RUN_SLOW=1 pytest tests/
    RUN_SLOW=1 pytest examples/

   The results can be observed `here <https://github.com/huggingface/transformers/actions>`__.



Running tests
-----------------------------------------------------------------------------------------------------------------------





Choosing which tests to run
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

This document goes into many details of how tests can be run. If after reading everything, you need even more details
you will find them `here <https://docs.pytest.org/en/latest/usage.html>`__.

Here are some of the most useful ways of running tests.

Run all:

.. code-block:: console

   pytest

or:

.. code-block:: bash

   make test

Note that the latter is defined as:

.. code-block:: bash

   python -m pytest -n auto --dist=loadfile -s -v ./tests/

which tells pytest to:

* run as many test processes as there are CPU cores (which could be too many if you don't have a ton of RAM!)
* ensure that all tests from the same file will be run by the same test process
* do not capture output
* run in verbose mode



Getting the list of all tests
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

All tests of the test suite:

.. code-block:: bash

   pytest --collect-only -q

All tests of a given test file:

.. code-block:: bash

   pytest tests/test_optimization.py --collect-only -q


Run a specific test module
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

To run an individual test module:

.. code-block:: bash

   pytest tests/test_logging.py

Run specific tests
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Since unittest is used inside most of the tests, to run specific subtests you need to know the name of the unittest
class containing those tests. For example, it could be:

.. code-block:: bash

   pytest tests/test_optimization.py::OptimizationTest::test_adam_w

Here:

* ``tests/test_optimization.py`` - the file with tests
* ``OptimizationTest`` - the name of the class
* ``test_adam_w`` - the name of the specific test function

If the file contains multiple classes, you can choose to run only tests of a given class. For example:

.. code-block:: bash

   pytest tests/test_optimization.py::OptimizationTest


will run all the tests inside that class.

As mentioned earlier you can see what tests are contained inside the ``OptimizationTest`` class by running:

.. code-block:: bash

   pytest tests/test_optimization.py::OptimizationTest --collect-only -q

You can run tests by keyword expressions.

To run only tests whose name contains ``adam``:

.. code-block:: bash

   pytest -k adam tests/test_optimization.py

To run all tests except those whose name contains ``adam``:

.. code-block:: bash

   pytest -k "not adam" tests/test_optimization.py

And you can combine the two patterns in one:


.. code-block:: bash

   pytest -k "ada and not adam" tests/test_optimization.py
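
To make the semantics of the keyword expressions above concrete, here is a minimal sketch that models ``-k`` as case-insensitive substring matching combined with ``and``, ``or`` and ``not``. It is a simplified illustration, not pytest's implementation; the helper ``keyword_match`` and the test names other than ``test_adam_w`` are hypothetical.

```python
import re


def keyword_match(expression: str, test_name: str) -> bool:
    """Evaluate a simplified ``-k`` expression against a single test name."""
    tokens = re.findall(r"\(|\)|[^\s()]+", expression)
    translated = []
    for token in tokens:
        if token in ("and", "or", "not", "(", ")"):
            translated.append(token)
        else:
            # Every other word becomes a case-insensitive substring check.
            translated.append(str(token.lower() in test_name.lower()))
    # Only True/False, the boolean operators and parentheses are left to evaluate.
    return bool(eval(" ".join(translated)))


# Hypothetical test names, used only for illustration.
names = ["test_adam_w", "test_adafactor", "test_noam_lr"]
selected = [name for name in names if keyword_match("ada and not adam", name)]
print(selected)
```

In this model ``ada and not adam`` keeps names that contain ``ada`` but drops those that also contain ``adam``.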



Run only modified tests
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

You can run the tests related to the unstaged files or the current branch (according to Git) by using `pytest-picked
<https://github.com/anapaulagomes/pytest-picked>`__. This is a great way of quickly checking that your changes didn't
break anything, since it won't run the tests related to files you didn't touch.

.. code-block:: bash

    pip install pytest-picked

.. code-block:: bash

    pytest --picked

All tests will be run from files and folders which are modified, but not yet committed.

Automatically rerun failed tests on source modification
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

`pytest-xdist <https://github.com/pytest-dev/pytest-xdist>`__ provides a very useful feature: it detects all failed
tests, then waits for you to modify files and continuously re-runs those failing tests while you fix them, so that you
don't need to restart pytest after each fix. This is repeated until all tests pass, after which a full run is
performed again.

.. code-block:: bash

    pip install pytest-xdist

To enter the mode: ``pytest -f`` or ``pytest --looponfail``

File changes are detected by looking at ``looponfailroots`` root directories and all of their contents (recursively).
If the default for this value does not work for you, you can change it in your project by setting a configuration
option in ``setup.cfg``:

.. code-block:: ini

    [tool:pytest]
    looponfailroots = transformers tests

or ``pytest.ini``/``tox.ini`` files:

.. code-block:: ini

    [pytest]
    looponfailroots = transformers tests

This would lead to only looking for file changes in the respective directories, specified relative to the ini-file's
directory.

`pytest-watch <https://github.com/joeyespo/pytest-watch>`__ is an alternative implementation of this functionality.


Skip a test module
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

If you want to run all test modules except a few, you can exclude them by giving an explicit list of tests to run. For
example, to run all except ``test_modeling_*.py`` tests:

.. code-block:: bash

   pytest `ls -1 tests/*py | grep -v test_modeling`


Clearing state
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

For CI builds, and whenever isolation is more important than speed, the cache should be cleared:

.. code-block:: bash

    pytest --cache-clear tests

Running tests in parallel
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

As mentioned earlier ``make test`` runs tests in parallel via ``pytest-xdist`` plugin (``-n X`` argument, e.g. ``-n 2``
to run 2 parallel jobs).

``pytest-xdist``'s ``--dist=`` option allows one to control how the tests are grouped. ``--dist=loadfile`` puts the
tests located in one file onto the same process.
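
As a rough mental model of ``--dist=loadfile``, the sketch below groups test ids by file and hands each whole file to one worker. The round-robin assignment is an assumption made for illustration (the real plugin schedules work dynamically), and ``assign_by_file`` and most of the test ids are hypothetical.

```python
from collections import defaultdict
from itertools import cycle


def assign_by_file(test_ids, num_workers):
    """Group test ids by file, then deal whole files out to workers round-robin."""
    by_file = defaultdict(list)
    for test_id in test_ids:
        # The part before the first "::" is the file that contains the test.
        by_file[test_id.split("::")[0]].append(test_id)
    workers = defaultdict(list)
    for worker, tests in zip(cycle(range(num_workers)), by_file.values()):
        workers[worker].extend(tests)
    return dict(workers)


# Hypothetical test ids, used only for illustration.
test_ids = [
    "tests/test_optimization.py::OptimizationTest::test_adam_w",
    "tests/test_logging.py::LoggingTest::test_set_level",
    "tests/test_optimization.py::OptimizationTest::test_adafactor",
]
plan = assign_by_file(test_ids, num_workers=2)
for worker, tests in sorted(plan.items()):
    print(worker, tests)
```

The property that matters is that no file is ever split across two workers, which keeps tests that share module-level state in the same process.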

Since the order of executed tests is different and unpredictable, if running the test suite with ``pytest-xdist``
produces failures (meaning we have some undetected coupled tests), use `pytest-replay
<https://github.com/ESSS/pytest-replay>`__ to replay the tests in the same order, which should help with reducing that
failing sequence to a minimum.

Test order and repetition
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

It's good to repeat the tests several times, in sequence, randomly, or in sets, to detect any potential
inter-dependency and state-related bugs (tear down). Plain repetition is also good at detecting problems that are only
uncovered by the randomness of DL.


Repeat tests
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

* `pytest-flakefinder <https://github.com/dropbox/pytest-flakefinder>`__:

.. code-block:: bash

   pip install pytest-flakefinder

And then run every test multiple times (50 by default):

.. code-block:: bash

   pytest --flake-finder --flake-runs=5 tests/test_failing_test.py

.. note::
   This plugin doesn't work with ``-n`` flag from ``pytest-xdist``.

.. note::
   There is another plugin ``pytest-repeat``, but it doesn't work with ``unittest``.


Run tests in a random order
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. code-block:: bash

    pip install pytest-random-order

Important: the presence of ``pytest-random-order`` will automatically randomize tests; no configuration change or
command line option is required.

As explained earlier, this allows detection of coupled tests, where one test's state affects the state of another. When
``pytest-random-order`` is installed it will print the random seed it used for that session, e.g.:

.. code-block:: bash

   pytest tests
   [...]
   Using --random-order-bucket=module
   Using --random-order-seed=573663

So that if the given particular sequence fails, you can reproduce it by adding that exact seed, e.g.:

.. code-block:: bash

   pytest --random-order-seed=573663
   [...]
   Using --random-order-bucket=module
   Using --random-order-seed=573663

It will only reproduce the exact order if you use the exact same list of tests (or no list at all). Once you start
manually narrowing down the list you can no longer rely on the seed, but have to list the tests manually in the exact
order they failed and tell pytest not to randomize them by using ``--random-order-bucket=none``, e.g.:

.. code-block:: bash

   pytest --random-order-bucket=none tests/test_a.py tests/test_c.py tests/test_b.py

To disable the shuffling for all tests:

.. code-block:: bash

    pytest --random-order-bucket=none

By default ``--random-order-bucket=module`` is implied, which will shuffle the files on the module levels. It can also
shuffle on ``class``, ``package``, ``global`` and ``none`` levels. For the complete details please see its
`documentation <https://github.com/jbasko/pytest-random-order>`__.
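
The reason a seed reproduces an order only for the same list of tests can be illustrated with Python's own ``random`` module. This is a sketch of the general idea of seeded shuffling, not of the plugin's internals; ``shuffled`` and the test names are hypothetical.

```python
import random


def shuffled(tests, seed):
    """Return a copy of ``tests`` shuffled by a dedicated generator seeded with ``seed``."""
    order = list(tests)
    random.Random(seed).shuffle(order)
    return order


# Hypothetical test names, used only for illustration.
full_list = ["test_a", "test_b", "test_c", "test_d", "test_e"]

first_run = shuffled(full_list, seed=573663)
second_run = shuffled(full_list, seed=573663)
print(first_run == second_run)
```

The shuffle is a function of both the seed and the input list, so once the list changes the seed alone no longer determines the earlier sequence.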

Another randomization alternative is `pytest-randomly <https://github.com/pytest-dev/pytest-randomly>`__. This module
has very similar functionality and interface, but it doesn't have the bucket modes available in
``pytest-random-order``. It has the same problem of imposing itself once installed.

Look and feel variations
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

pytest-sugar
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

`pytest-sugar <https://github.com/Frozenball/pytest-sugar>`__ is a plugin that improves the look-n-feel, adds a
progress bar, and shows failing tests and their assertions instantly. It gets activated automatically upon installation.

.. code-block:: bash

   pip install pytest-sugar

To run tests without it, run:

.. code-block:: bash

    pytest -p no:sugar

or uninstall it.



Report each sub-test name and its progress
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

For a single or a group of tests via ``pytest`` (after ``pip install pytest-pspec``):

.. code-block:: bash

   pytest --pspec tests/test_optimization.py 



Instantly shows failed tests
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

`pytest-instafail <https://github.com/pytest-dev/pytest-instafail>`__ shows failures and errors instantly instead of
waiting until the end of test session.

.. code-block:: bash

    pip install pytest-instafail

.. code-block:: bash

    pytest --instafail

To GPU or not to GPU
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

On a GPU-enabled setup, to test in CPU-only mode add ``CUDA_VISIBLE_DEVICES=""``:

.. code-block:: bash

    CUDA_VISIBLE_DEVICES="" pytest tests/test_logging.py

Or, if you have multiple GPUs, you can specify which one is to be used by ``pytest``. For example, to use only the
second GPU if you have GPUs ``0`` and ``1``, you can run:

.. code-block:: bash

    CUDA_VISIBLE_DEVICES="1" pytest tests/test_logging.py

This is handy when you want to run different tasks on different GPUs.

Some tests must be run on CPU-only, others on either CPU or GPU or TPU, yet others on multiple-GPUs. The following skip
decorators are used to set the requirements of tests CPU/GPU/TPU-wise:

* ``require_torch`` - this test will run only under torch
* ``require_torch_gpu`` - as ``require_torch`` plus requires at least 1 GPU
* ``require_torch_multigpu`` - as ``require_torch`` plus requires at least 2 GPUs
* ``require_torch_non_multigpu`` - as ``require_torch`` plus requires 0 or 1 GPUs
* ``require_torch_tpu`` - as ``require_torch`` plus requires at least 1 TPU

For example, here is a test that must be run only when there are 2 or more GPUs available and pytorch is installed:

.. code-block:: python

    @require_torch_multigpu
    def test_example_with_multigpu():
        ...  # test body goes here

If a test requires ``tensorflow`` use the ``require_tf`` decorator. For example:

.. code-block:: python

    @require_tf
    def test_tf_thing_with_tensorflow():
        ...  # test body goes here

These decorators can be stacked. For example, if a test is slow and requires at least one GPU under pytorch, here is
how to set it up:

.. code-block:: python

    @require_torch_gpu
    @slow
    def test_example_slow_on_gpu():
        ...  # test body goes here

Some decorators like ``@parameterized`` rewrite test names, therefore ``@require_*`` skip decorators have to be listed
last for them to work correctly. Here is an example of the correct usage:

.. code-block:: python

    @parameterized.expand(...)
    @require_torch_multigpu
    def test_integration_foo():
        ...  # test body goes here

This order problem doesn't exist with ``@pytest.mark.parametrize``: you can put it first or last and it will still
work. But it only works with non-unittests.

Inside tests:

* How many GPUs are available:

.. code-block:: python

   torch.cuda.device_count()


Distributed training
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

``pytest`` can't deal with distributed training directly. If this is attempted, the sub-processes don't do the right
thing and end up thinking they are ``pytest`` and start running the test suite in loops. It works, however, if one
spawns a normal process that then spawns off multiple workers and manages the IO pipes.

This is still under development but you can study 2 different tests that perform this successfully:

* `test_seq2seq_examples_multi_gpu.py
  <https://github.com/huggingface/transformers/blob/master/examples/seq2seq/test_seq2seq_examples_multi_gpu.py>`__ - a
  ``pytorch-lightning``-running test (had to use PL's ``ddp`` spawning method which is the default)
* `test_finetune_trainer.py
  <https://github.com/huggingface/transformers/blob/master/examples/seq2seq/test_finetune_trainer.py>`__ - a normal
  (non-PL) test

To jump right into the execution point, search for the ``execute_async_std`` function in those tests.

You will need at least 2 GPUs to see these tests in action:

.. code-block:: bash

   CUDA_VISIBLE_DEVICES="0,1" RUN_SLOW=1 pytest -sv examples/seq2seq/test_finetune_trainer.py \
   examples/seq2seq/test_seq2seq_examples_multi_gpu.py
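
The general pattern of launching an ordinary child process from a test and collecting its output through pipes can be sketched with the standard library alone. This is not the ``execute_async_std`` helper used by the tests above; ``run_in_subprocess`` is a hypothetical name, and the child here is a trivial script rather than a distributed training job.

```python
import subprocess
import sys


def run_in_subprocess(code: str, timeout: float = 60.0) -> subprocess.CompletedProcess:
    """Run ``code`` in a fresh Python interpreter and capture its stdout and stderr."""
    return subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,  # collect both pipes instead of inheriting the parent's
        text=True,            # decode bytes to str
        timeout=timeout,
        check=False,          # let the caller inspect the return code
    )


result = run_in_subprocess("print('worker finished')")
print(result.returncode, result.stdout.strip())
```

Because the child is a separate interpreter, it never sees the ``pytest`` session of the parent, so it cannot mistake itself for the test runner.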


Output capture
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

During test execution any output sent to ``stdout`` and ``stderr`` is captured. If a test or a setup method fails, its
corresponding captured output will usually be shown along with the failure traceback.

To disable output capturing and to get the ``stdout`` and ``stderr`` normally, use ``-s`` or ``--capture=no``:

.. code-block:: bash

   pytest -s tests/test_logging.py
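
As an analogy for what capturing means, the sketch below redirects ``stdout`` into a buffer while a function runs. It only illustrates the idea; pytest's own capture mechanism works differently, and the function ``noisy`` is hypothetical.

```python
import contextlib
import io


def noisy():
    """A hypothetical function that prints while it works."""
    print("loading weights")
    return 42


buffer = io.StringIO()
with contextlib.redirect_stdout(buffer):
    # Anything printed inside this block ends up in ``buffer``, not on the terminal.
    value = noisy()

captured = buffer.getvalue()
print(repr(captured), value)
```

Running with ``-s`` corresponds to skipping the redirection, so the printed text goes straight to the terminal.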

To send test results to JUnit format output:

.. code-block:: bash

   pytest tests --junitxml=result.xml


Color control
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

To have no color (e.g., yellow on white background is not readable):

.. code-block:: bash

   pytest --color=no tests/test_logging.py



Sending test report to online pastebin service
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Creating a URL for each test failure:

.. code-block:: bash

   pytest --pastebin=failed tests/test_logging.py

This will submit test run information to a remote Paste service and provide a URL for each failure. You may select
tests as usual or add, for example, ``-x`` if you only want to send one particular failure.

Creating a URL for a whole test session log:

.. code-block:: bash

   pytest --pastebin=all tests/test_logging.py



Writing tests
-----------------------------------------------------------------------------------------------------------------------

🤗 transformers tests are based on ``unittest``, but run by ``pytest``, so most of the time features from both systems
can be used.

You can read `here <https://docs.pytest.org/en/stable/unittest.html>`__ which features are supported, but the important
thing to remember is that most ``pytest`` fixtures don't work. Neither does parametrization, but we use the module
``parameterized``, which works in a similar way.


Parametrization
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Often, there is a need to run the same test multiple times, but with different arguments. It could be done from within
the test, but then there is no way of running that test for just one set of arguments.

.. code-block:: python

    # test_this1.py
    import math
    import unittest

    from parameterized import parameterized


    class TestMathUnitTest(unittest.TestCase):
        # Each tuple is one sub-test: (name, input, expected).
        @parameterized.expand([
            ("negative", -1.5, -2.0),
            ("integer", 1, 1.0),
            ("large fraction", 1.6, 1),
        ])
        def test_floor(self, name, input, expected):
            self.assertEqual(math.floor(input), expected)

Now, by default this test will be run 3 times, each time with the last 3 arguments of ``test_floor`` being assigned the
corresponding arguments in the parameter list.

And you could run just the ``negative`` and ``integer`` sets of params with:

.. code-block:: bash

   pytest -k "negative or integer" tests/test_mytest.py

or all but ``negative`` sub-tests, with:

.. code-block:: bash

   pytest -k "not negative" tests/test_mytest.py

Besides using the ``-k`` filter that was just mentioned, you can find out the exact name of each sub-test and run any
or all of them using their exact names.

.. code-block:: bash

    pytest test_this1.py --collect-only -q

and it will list:

.. code-block:: bash

    test_this1.py::TestMathUnitTest::test_floor_0_negative
    test_this1.py::TestMathUnitTest::test_floor_1_integer
    test_this1.py::TestMathUnitTest::test_floor_2_large_fraction
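
The names listed above follow a visible pattern: the function name, the position of the parameter set, and the first parameter with unsafe characters replaced. The sketch below reproduces that pattern for this example; it approximates the naming seen in the listing and is not the library's code, and ``expanded_name`` is a hypothetical helper.

```python
import re


def expanded_name(func_name, index, first_param):
    """Build a sub-test name in the ``<function>_<index>_<first param>`` pattern."""
    # Replace every run of characters that is not a letter, digit or underscore.
    safe = re.sub(r"[^a-zA-Z0-9_]+", "_", str(first_param))
    return f"{func_name}_{index}_{safe}"


cases = [
    ("negative", -1.5, -2.0),
    ("integer", 1, 1.0),
    ("large fraction", 1.6, 1),
]
names = [expanded_name("test_floor", i, case[0]) for i, case in enumerate(cases)]
print(names)
```

This also shows why the space in ``large fraction`` appears as an underscore in the collected name.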

So now you can run just 2 specific sub-tests:

.. code-block:: bash

    pytest test_this1.py::TestMathUnitTest::test_floor_0_negative  test_this1.py::TestMathUnitTest::test_floor_1_integer

The module `parameterized <https://pypi.org/project/parameterized/>`__, which is already in the developer dependencies
of ``transformers``, works for both ``unittest`` and ``pytest`` tests.

If, however, the test is not a ``unittest``, you may use ``pytest.mark.parametrize`` (or you may see it being used in
some existing tests, mostly under ``examples``).

Here is the same example, this time using ``pytest``'s ``parametrize`` marker:

.. code-block:: python

    # test_this2.py
    import math

    import pytest


    # Each tuple is one sub-test: (name, input, expected).
    @pytest.mark.parametrize(
        "name, input, expected",
        [
            ("negative", -1.5, -2.0),
            ("integer", 1, 1.0),
            ("large fraction", 1.6, 1),
        ],
    )
    def test_floor(name, input, expected):
        assert math.floor(input) == expected

As with ``parameterized``, ``pytest.mark.parametrize`` gives you fine-grained control over which sub-tests are run if
the ``-k`` filter doesn't do the job. However, this parametrization function creates a slightly different set of names
for the sub-tests. Here is what they look like:

.. code-block:: bash

    pytest test_this2.py --collect-only -q

and it will list:

.. code-block:: bash

    test_this2.py::test_floor[integer-1-1.0]
    test_this2.py::test_floor[negative--1.5--2.0]
    test_this2.py::test_floor[large fraction-1.6-1]       

So now you can run just the specific test:

.. code-block:: bash

    pytest test_this2.py::test_floor[negative--1.5--2.0] test_this2.py::test_floor[integer-1-1.0]

as in the previous example.
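
The bracketed ids in the listing can be read as the parameter values joined with dashes, which is why negative numbers produce a double dash. The sketch below reproduces the ids for this example with simple values only; it is an illustration of the pattern rather than pytest's id generation, and ``param_id`` is a hypothetical helper.

```python
def param_id(values):
    """Join the string form of each parameter value with a dash."""
    return "-".join(str(value) for value in values)


cases = [
    ("negative", -1.5, -2.0),
    ("integer", 1, 1.0),
    ("large fraction", 1.6, 1),
]
ids = [f"test_floor[{param_id(case)}]" for case in cases]
print(ids)
```

In ``negative--1.5--2.0`` the first dash of each pair is the separator and the second is the minus sign of the value.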

Temporary files and directories
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Using unique temporary files and directories is essential for parallel test running, so that the tests won't overwrite
each other's data. We also want the temp files and directories to be removed at the end of each test that created
them. Therefore, using packages like ``tempfile``, which address these needs, is essential.
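
As a minimal sketch of these two needs using only the standard library, the test below creates a unique directory with ``tempfile.mkdtemp`` and registers its removal with ``addCleanup``. The class and test names are hypothetical, and the ``created`` list exists only so the cleanup can be checked afterwards.

```python
import os
import shutil
import tempfile
import unittest


class TmpDirExampleTest(unittest.TestCase):
    created = []  # paths recorded so the cleanup can be verified after the run

    def test_writes_into_private_dir(self):
        # mkdtemp returns a fresh, uniquely named directory on every call.
        tmp_dir = tempfile.mkdtemp()
        # Remove the directory when the test ends, whether it passes or fails.
        self.addCleanup(shutil.rmtree, tmp_dir, ignore_errors=True)
        self.created.append(tmp_dir)

        out_file = os.path.join(tmp_dir, "out.txt")
        with open(out_file, "w") as f:
            f.write("data")
        self.assertTrue(os.path.isfile(out_file))


suite = unittest.defaultTestLoader.loadTestsFromTestCase(TmpDirExampleTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
print(result.wasSuccessful(), os.path.exists(TmpDirExampleTest.created[-1]))
```

The trade-off is that the path is different on every run, which is convenient for isolation but awkward for debugging.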

However, when debugging tests, you need to be able to see what goes into the temp file or directory, and you want to
know its exact path rather than having it randomized on every test re-run.

A helper class :obj:`transformers.testing_utils.TestCasePlus` is best used for such purposes. It's a sub-class of
:obj:`unittest.TestCase`, so we can easily inherit from it in the test modules.

Here is an example of its usage:

.. code-block:: python

    from transformers.testing_utils import TestCasePlus


    class ExamplesTests(TestCasePlus):
        def test_whatever(self):
            tmp_dir = self.get_auto_remove_tmp_dir()

This code creates a unique temporary directory, and sets :obj:`tmp_dir` to its location.

In this and all the following scenarios the temporary directory will be auto-removed at the end of test, unless
``after=False`` is passed to the helper function.

* Create a temporary directory of my choice and delete it at the end - useful for debugging when you want to monitor a
  specific directory:

.. code-block:: python

    def test_whatever(self):
        tmp_dir = self.get_auto_remove_tmp_dir(tmp_dir="./tmp/run/test")

* Create a temporary directory of my choice and do not delete it at the end---useful for when you want to look at the
  temp results:

.. code-block:: python

    def test_whatever(self):
        tmp_dir = self.get_auto_remove_tmp_dir(tmp_dir="./tmp/run/test", after=False)

* Create a temporary directory of my choice and ensure to delete it right away - useful for when you disabled deletion
  in the previous test run and want to make sure that the temporary directory is empty before the new test is run:

.. code-block:: python

    def test_whatever(self):
        tmp_dir = self.get_auto_remove_tmp_dir(tmp_dir="./tmp/run/test", before=True)

.. note::
   In order to run the equivalent of ``rm -r`` safely, only subdirs of the project repository checkout are allowed if
   an explicit :obj:`tmp_dir` is used, so that by mistake no ``/tmp`` or similar important part of the filesystem will
   get nuked. I.e., please always pass paths that start with ``./``.
704
705

.. note::

   Each test can register multiple temporary directories and they all will get auto-removed, unless requested
   otherwise.
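
To make the cleanup and path-safety rules above more concrete, here is a minimal standard-library sketch of the same
idea. It is not the ``TestCasePlus`` implementation: the ``TmpDirTestCase`` class, the ``make_tmp_dir`` method and its
defaults are hypothetical names chosen for illustration, and the current working directory is assumed to be the
repository checkout.

.. code-block:: python

    import shutil
    import tempfile
    import unittest
    from pathlib import Path


    class TmpDirTestCase(unittest.TestCase):
        """Hypothetical stand-in for illustration; not the ``TestCasePlus`` code."""

        def make_tmp_dir(self, tmp_dir=None, before=False, after=True):
            if tmp_dir is None:
                # no explicit path: let the OS pick a unique location
                path = Path(tempfile.mkdtemp())
            else:
                path = Path(tmp_dir).resolve()
                # assumption: the current working directory is the repo checkout;
                # refuse anything that is not strictly below it
                if Path.cwd().resolve() not in path.parents:
                    raise ValueError(f"{tmp_dir} must be a subdir of the checkout")
                if before:
                    # start from an empty directory
                    shutil.rmtree(path, ignore_errors=True)
                path.mkdir(parents=True, exist_ok=True)
            if after:
                # registered cleanups run at the end of the test, pass or fail
                self.addCleanup(shutil.rmtree, path, ignore_errors=True)
            return str(path)

Because every call registers its own cleanup, a single test can create several directories and all of them are
removed once the test finishes.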


Skipping tests
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

This is useful when a bug is found and a new test is written, but the bug is not yet fixed. In order to be able to
commit the test to the main repository, we need to make sure it's skipped during ``make test``.

Methods:

-  A **skip** means that you expect your test to pass only if some conditions are met, otherwise pytest should skip
   running the test altogether. Common examples are skipping Windows-only tests on non-Windows platforms, or skipping
   tests that depend on an external resource which is not available at the moment (for example a database).

-  An **xfail** means that you expect a test to fail for some reason. A common example is a test for a feature not yet
   implemented, or a bug not yet fixed. When a test passes despite being expected to fail (marked with
   ``pytest.mark.xfail``), it's an xpass and will be reported in the test summary.

One of the important differences between the two is that ``skip`` doesn't run the test, and ``xfail`` does. So if the
code that's buggy causes some bad state that will affect other tests, do not use ``xfail``.
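
The difference can be observed with the standard library alone. The sketch below uses ``unittest.expectedFailure``,
which is the closest ``unittest`` counterpart to ``xfail`` (it is not the pytest marker itself), and a module-level
``calls`` list, introduced here only for illustration, to record which test bodies actually executed.

.. code-block:: python

    import unittest

    calls = []  # records which test bodies actually executed


    class SkipVsExpectedFailure(unittest.TestCase):
        @unittest.skip("bug not fixed yet")
        def test_skipped(self):
            calls.append("skipped")  # never reached: the body is not run

        @unittest.expectedFailure
        def test_expected_failure(self):
            calls.append("expected_failure")  # the body runs, then fails as expected
            self.assertEqual(1, 2)


    def run_demo():
        """Run both tests and return the collected result object."""
        suite = unittest.defaultTestLoader.loadTestsFromTestCase(SkipVsExpectedFailure)
        result = unittest.TestResult()
        suite.run(result)
        return result

After ``run_demo()`` only the expected-failure body has left a trace in ``calls``, which is exactly why a test whose
body corrupts shared state should be skipped rather than expected to fail.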

Implementation
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

-  Here is how to skip a whole test unconditionally:

.. code-block:: python

    @unittest.skip("this bug needs to be fixed")
    def test_feature_x():
        ...

or via pytest:

.. code-block:: python

    @pytest.mark.skip(reason="this bug needs to be fixed")

or the ``xfail`` way:

.. code-block:: python

    @pytest.mark.xfail
    def test_feature_x():
        ...

Here is how to skip a test based on some internal check inside the test:

.. code-block:: python

    def test_feature_x():
        if not has_something():
            pytest.skip("unsupported configuration")

or the whole module:

.. code-block:: python

    import sys
    import pytest

    # pytest.config was removed in pytest 5.0; test the condition directly instead
    if not sys.platform.startswith("win"):
        pytest.skip("skipping windows-only tests", allow_module_level=True)

or the ``xfail`` way:

.. code-block:: python

    def test_feature_x():
        pytest.xfail("expected to fail until bug XYZ is fixed")

Here is how to skip all tests in a module if some import is missing:

.. code-block:: python

    docutils = pytest.importorskip("docutils", minversion="0.3")

-  Skip a test based on a condition:

.. code-block:: python

    @pytest.mark.skipif(sys.version_info < (3, 6), reason="requires python3.6 or higher")
    def test_feature_x():
        ...

or:

.. code-block:: python

    @unittest.skipIf(torch_device == "cpu", "Can't do half precision")
    def test_feature_x():
        ...

or skip a whole class:

.. code-block:: python

    @pytest.mark.skipif(sys.platform == "win32", reason="does not run on windows")
    class TestClass:
        def test_feature_x(self):
            ...

More details and examples are available `here <https://docs.pytest.org/en/latest/skipping.html>`__.

Slow tests
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The library of tests is ever-growing, and some of the tests take minutes to run, so we can't afford waiting for an hour
for the test suite to complete on CI. Therefore, with some exceptions for essential tests, slow tests should be marked
as in the example below:

.. code-block:: python

    from transformers.testing_utils import slow


    @slow
    def test_integration_foo():
        ...

Once a test is marked as ``@slow``, set the ``RUN_SLOW=1`` environment variable to run it, e.g.:

.. code-block:: bash

    RUN_SLOW=1 pytest tests
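
For readers curious how an environment-driven marker of this kind can work, here is one plausible sketch built on
``unittest.skipUnless``. It is an assumption for illustration only: the names ``env_flag`` and ``slow_sketch`` are
hypothetical, and the real ``transformers.testing_utils.slow`` may be implemented differently.

.. code-block:: python

    import os
    import unittest


    def env_flag(name, default=False):
        """Interpret an environment variable as a boolean (hypothetical helper)."""
        value = os.environ.get(name)
        if value is None:
            return default
        return value.strip().lower() in {"1", "true", "yes", "on"}


    def slow_sketch(test_item):
        """Skip ``test_item`` unless RUN_SLOW is set; the flag is read at decoration time."""
        return unittest.skipUnless(env_flag("RUN_SLOW"), "test is slow")(test_item)

Since the flag is read when the decorator is applied, the variable has to be set before the test module is imported,
which is what happens when it is passed on the command line as shown above.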

Some decorators like ``@parameterized`` rewrite test names, therefore ``@slow`` and the rest of the skip decorators
``@require_*`` have to be listed last for them to work correctly. Here is an example of the correct usage:

.. code-block:: python

    @parameterized.expand(...)
    @slow
    def test_integration_foo():
        ...

As explained at the beginning of this document, slow tests run on a scheduled basis, rather than in PR CI checks. So
it's possible that some problems will be missed during a PR submission and get merged. Such problems will get caught
during the next scheduled CI job. But it also means that it's important to run the slow tests on your machine before
submitting the PR.

Here is a rough decision-making mechanism for choosing which tests should be marked as slow:

If the test is focused on one of the library's internal components (e.g., modeling files, tokenization files,
pipelines), then we should run that test in the non-slow test suite. If it's focused on another aspect of the library,
such as the documentation or the examples, then we should run these tests in the slow test suite. And then, to refine
this approach we should have exceptions:

* All tests that need to download a heavy set of weights (e.g., model or tokenizer integration tests, pipeline
  integration tests) should be set to slow. If you're adding a new model, you should create and upload to the hub a
  tiny version of it (with random weights) for integration tests. This is discussed in the following paragraphs.
* All tests that need to do training that is not specifically optimized to be fast should be set to slow.
* We can introduce exceptions if some of these should-be-non-slow tests are excruciatingly slow, and set them to
  ``@slow``. Auto-modeling tests, which save and load large files to disk, are a good example of tests that are marked
  as ``@slow``.
* If a test completes under 1 second on CI (including downloads if any) then it should be a normal test regardless.

Collectively, all the non-slow tests need to entirely cover the different internals, while remaining fast. For example,
significant coverage can be achieved by testing with specially created tiny models with random weights. Such models
have a minimal number of layers (e.g., 2), vocab size (e.g., 1000), etc. Then the ``@slow`` tests can use large
slow models to do qualitative testing. To see the use of these, simply look for *tiny* models with:

.. code-block:: bash

    grep -r tiny tests examples

Here is an example of a `script
<https://github.com/huggingface/transformers/blob/master/scripts/fsmt/fsmt-make-tiny-model.py>`__ that created the tiny
model `stas/tiny-wmt19-en-de <https://huggingface.co/stas/tiny-wmt19-en-de>`__. You can easily adjust it to your
specific model's architecture.

It's easy to measure the run-time incorrectly if, for example, there is an overhead of downloading a huge model: when
you test locally, the downloaded files are cached and thus the download time is not measured. Hence check the
execution speed report in CI logs instead (the output of ``pytest --durations=0 tests``).

That report is also useful to find slow outliers that aren't marked as such, or which need to be re-written to be fast.
If you notice that the test suite starts getting slow on CI, the top listing of this report will show the slowest
tests.


Testing the stdout/stderr output
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

In order to test functions that write to ``stdout`` and/or ``stderr``, the test can access those streams using
``pytest``'s `capsys system <https://docs.pytest.org/en/latest/capture.html>`__. Here is how this is accomplished:

.. code-block:: python

    import sys


    def print_to_stdout(s):
        print(s)


    def print_to_stderr(s):
        sys.stderr.write(s)


    def test_result_and_stdout(capsys):
        msg = "Hello"
        print_to_stdout(msg)
        print_to_stderr(msg)
        out, err = capsys.readouterr()  # consume the captured output streams
        # optional: if you want to replay the consumed streams:
        sys.stdout.write(out)
        sys.stderr.write(err)
        # test:
        assert msg in out
        assert msg in err

And, of course, most of the time, ``stderr`` will come as a part of an exception, so try/except has to be used in such
a case:

.. code-block:: python

    def raise_exception(msg):
        raise ValueError(msg)


    def test_something_exception():
        msg = "Not a good value"
        error = ""
        try:
            raise_exception(msg)
        except Exception as e:
            error = str(e)
        # checked outside the ``except`` block so the test also fails when nothing is raised
        assert msg in error, f"{msg} is not in the exception:\n{error}"
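
If the tests live in a ``unittest.TestCase`` subclass, the same check can be written more compactly with the standard
``assertRaises`` context manager, which also fails the test when no exception is raised at all. The class and test
names below are made up for this sketch:

.. code-block:: python

    import unittest


    def raise_exception(msg):
        raise ValueError(msg)


    class ExceptionMessageTest(unittest.TestCase):
        def test_message(self):
            msg = "Not a good value"
            # leaving the block without a ValueError fails the test
            with self.assertRaises(ValueError) as ctx:
                raise_exception(msg)
            # the caught exception is available for further checks
            self.assertIn(msg, str(ctx.exception))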

Another approach to capturing stdout is via ``contextlib.redirect_stdout``:

.. code-block:: python

    import sys
    from contextlib import redirect_stdout
    from io import StringIO


    def print_to_stdout(s):
        print(s)


    def test_result_and_stdout():
        msg = "Hello"
        buffer = StringIO()
        with redirect_stdout(buffer):
            print_to_stdout(msg)
        out = buffer.getvalue()
        # optional: if you want to replay the consumed streams:
        sys.stdout.write(out)
        # test:
        assert msg in out
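
The standard library also provides ``contextlib.redirect_stderr``, so both streams can be captured in one go. The
following is a small sketch; ``capture_both`` and ``noisy`` are hypothetical helper names used only here:

.. code-block:: python

    import sys
    from contextlib import redirect_stderr, redirect_stdout
    from io import StringIO


    def capture_both(func, *args, **kwargs):
        """Run ``func`` and return a ``(stdout_text, stderr_text)`` tuple."""
        out, err = StringIO(), StringIO()
        with redirect_stdout(out), redirect_stderr(err):
            func(*args, **kwargs)
        return out.getvalue(), err.getvalue()


    def noisy(msg):
        print(msg)
        print(msg.upper(), file=sys.stderr)

Note that these context managers only replace ``sys.stdout`` and ``sys.stderr`` at the Python level, so output
written by subprocesses or directly to the underlying file descriptors is not captured this way.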

An important potential issue with capturing stdout is that it may contain ``\r`` characters that in normal ``print``
reset everything that has been printed so far. There is no problem with ``pytest``, but with ``pytest -s`` these
characters get included in the buffer, so to be able to have the test run with and without ``-s``, you have to make an
extra cleanup to the captured output, using ``re.sub(r'^.*\r', '', buf, flags=re.M)``.
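
As a quick illustration of that cleanup, the sketch below wraps the substitution in a function (the name
``apply_print_resets_sketch`` is made up for this example) that keeps only what follows the last ``\r`` on each line:

.. code-block:: python

    import re


    def apply_print_resets_sketch(buf):
        """Drop everything up to and including the last carriage return on each line."""
        # with re.M, ``^`` matches at the start of every line, and the greedy ``.*``
        # extends to the last ``\r`` found before the line break
        return re.sub(r"^.*\r", "", buf, flags=re.M)

For instance, ``apply_print_resets_sketch("Secret message\rHello World\n")`` returns ``"Hello World\n"``. This is only
an approximation of what a terminal shows, since a terminal overwrites characters in place rather than clearing the
line.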

However, we have a helper context manager wrapper that takes care of it all automatically, regardless of whether the
output contains any ``\r`` characters, so it's as simple as:

.. code-block:: python

    from transformers.testing_utils import CaptureStdout
    with CaptureStdout() as cs:
        function_that_writes_to_stdout()
    print(cs.out)

Here is a full test example:

.. code-block:: python

    from transformers.testing_utils import CaptureStdout
    msg = "Secret message\r"
    final = "Hello World"
    with CaptureStdout() as cs:
        print(msg + final)
    assert cs.out == final+"\n", f"captured: {cs.out}, expecting {final}"

If you'd like to capture ``stderr`` use the :obj:`CaptureStderr` class instead:

.. code-block:: python

    from transformers.testing_utils import CaptureStderr
    with CaptureStderr() as cs:
        function_that_writes_to_stderr()
    print(cs.err)

If you need to capture both streams at once, use the parent :obj:`CaptureStd` class:

.. code-block:: python

    from transformers.testing_utils import CaptureStd
    with CaptureStd() as cs:
        function_that_writes_to_stdout_and_stderr()
    print(cs.err, cs.out)



Capturing logger stream
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

If you need to validate the output of a logger, you can use :obj:`CaptureLogger`:

.. code-block:: python

    from transformers import logging
    from transformers.testing_utils import CaptureLogger

    msg = "Testing 1, 2, 3"
    logging.set_verbosity_info()
    logger = logging.get_logger("transformers.tokenization_bart")
    with CaptureLogger(logger) as cl:
        logger.info(msg)
    assert cl.out == msg + "\n"
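
To give an idea of what such a capture involves, here is a standard-library sketch that temporarily attaches a
``StringIO``-backed handler to a logger. It is an illustration under our own assumptions, not the actual
:obj:`CaptureLogger` code, and ``capture_logger_sketch`` is a hypothetical name:

.. code-block:: python

    import logging
    from contextlib import contextmanager
    from io import StringIO


    @contextmanager
    def capture_logger_sketch(logger):
        """Temporarily attach a StringIO-backed handler to ``logger``."""
        buffer = StringIO()
        handler = logging.StreamHandler(buffer)
        logger.addHandler(handler)
        try:
            yield buffer
        finally:
            # always detach the handler, even if the body raised
            logger.removeHandler(handler)

Inside the ``with`` block every record accepted by the logger is also written to the buffer, one message per line,
and ``buffer.getvalue()`` returns the captured text.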


Testing with environment variables
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

If you want to test the impact of environment variables for a specific test, you can use the helper decorator
``transformers.testing_utils.mockenv``:

.. code-block:: python

    import os
    import unittest

    from transformers.testing_utils import mockenv


    class HfArgumentParserTest(unittest.TestCase):
        @mockenv(TRANSFORMERS_VERBOSITY="error")
        def test_env_override(self):
            env_level_str = os.getenv("TRANSFORMERS_VERBOSITY", None)
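
A comparable effect can be obtained with the standard library's ``unittest.mock.patch.dict``, which is handy outside
of this repository. The sketch below makes no claim about how ``mockenv`` is implemented, and the class name is
made up:

.. code-block:: python

    import os
    import unittest
    from unittest import mock


    class EnvOverrideTest(unittest.TestCase):
        @mock.patch.dict(os.environ, {"TRANSFORMERS_VERBOSITY": "error"})
        def test_env_override(self):
            # the variable is visible only while the test runs
            self.assertEqual(os.getenv("TRANSFORMERS_VERBOSITY"), "error")

Once the test returns, ``os.environ`` is restored to its previous content, so the override cannot leak into other
tests.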


Getting reproducible results
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

In some situations you may want to remove randomness from your tests. To get identical, reproducible results, you
will need to fix the seed:

.. code-block:: python

    seed = 42

    # python RNG
    import random
    random.seed(seed)

    # pytorch RNGs
    import torch
    torch.manual_seed(seed)
    torch.backends.cudnn.deterministic = True
    if torch.cuda.is_available():
        torch.cuda.manual_seed_all(seed)

    # numpy RNG
    import numpy as np
    np.random.seed(seed)

    # tf RNG
    import tensorflow as tf
    tf.random.set_seed(seed)
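
A simple way to convince yourself that seeding works is to draw the same values twice. This sketch uses only Python's
own RNG, and ``sample`` is a hypothetical helper; the same pattern applies to the other libraries listed above:

.. code-block:: python

    import random


    def sample(seed, n=5):
        """Return ``n`` pseudo-random floats drawn right after seeding Python's RNG."""
        random.seed(seed)
        return [random.random() for _ in range(n)]

Calling ``sample(42)`` twice yields the same list, whereas a different seed yields a different one.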

Debugging tests
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

To start a debugger at the point of the warning, do this:

.. code-block:: bash

    pytest tests/test_logging.py -W error::UserWarning --pdb
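
This works because ``-W error::UserWarning`` turns every ``UserWarning`` into an exception, and ``--pdb`` opens the
debugger at the point where a test fails. The effect of the filter can be reproduced in plain Python as follows; the
function names are made up for this sketch:

.. code-block:: python

    import warnings


    def noisy_function():
        warnings.warn("something looks off", UserWarning)
        return "finished"


    def call_with_warnings_as_errors(func):
        """Call ``func`` with ``UserWarning`` promoted to an exception."""
        with warnings.catch_warnings():
            # same filter that ``-W error::UserWarning`` installs
            warnings.simplefilter("error", UserWarning)
            return func()

With the filter active, ``noisy_function`` raises ``UserWarning`` at the ``warnings.warn`` call instead of returning,
which is the spot where the debugger would stop.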