typing_recursion.rst 4.92 KB
Newer Older
dugupeiwen's avatar
dugupeiwen committed
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
========================
NBEP 6: Typing Recursion
========================

:Author: Siu Kwan Lam
:Date: Sept 2016
:Status: Draft

Introduction
============

This document proposes an enhancement to the type inference algorithm to
support recursion without explicitly annotating the function signature.
As a result, the proposal enables numba to type-infer both self-recursive and
mutual-recursive functions under some limitations.  In practice, these
limitations can be easily overcome by specifying a compilation order.


The Current State
=================

Recursion support in numba is currently limited to self-recursion with explicit
type annotation for the function.  This limitation comes from the inability to
determine the return type of a recursive call.  This is because the callee is
either the current function (for self-recursion) or a parent function
(mutual-recursion) and its type inference process has been suspended while waiting for
the function-type of its callee.  This results in the formation of a cyclic
dependency.  For example, given a function ``foo()`` that calls ``bar()``,
which in turns call ``foo()``::

    def foo(x):
        if x > 0:
            return bar(x)
        else:
            return 1

    def bar(x):
        return foo(x - 1)


The type inference process of ``foo()`` depends on that of ``bar()``,
which depends on ``foo()``.  Therefore ``foo()`` depends on itself and the type
inference algorithm cannot terminate.


The Solution
============

The proposed solution has two components:

1. The introduction of a compile-time *callstack* that tracks the compiling functions.
2. The allowance of a partial type inference on functions by leveraging the return type
   on non-recursive control-flow paths.

The compile-time callstack stores typing information of the functions being
compiled.  Like an ordinary callstack, it pushes a new record every time a
function is "called".  Since this occurs at compile-time, a "call" triggers
a compilation of the callee.

To detect recursion, the compile-time callstack is searched bottom-up
(stack grows downward) for a record that matches the callee.
As the record contains a reference to the type inference state,
the type inference process can be resumed to determine the return type.

Recall that the type inference process cannot be resumed normally because of the cyclic
dependency of the return type.  In practice, we can assume that a useful
program must have a terminating condition, a path that does not recurse.  So,
the type inference process can make an initial guess for the return-type at the recursive
call by using the return-type determined by the non-recursive paths.  This
allows type information to propagate on the recursive paths to generate the
final return type, which is used to refine the type information by the
subsequent iteration in the type inference process.

The following figure illustrates the compile-time callstack when the compiler
reaches the recursive call to ``foo()`` from ``bar()``:

.. image:: recursion_callstack.svg
    :width: 400px

At this time, the type inference process of ``foo()`` is suspended and that of ``bar()``
is active.  The compiler can see that the callee is already compiling by
searching the callstack.  Knowing that it is a recursive call, the compiler
can resume the type-inference on ``foo()`` by ignoring the paths that contain
recursive calls.  This means only the ``else`` branch is considered and we can
easily tell that ``foo()`` returns an ``int`` in this case.  The compiler will
then set the initial return type of ``foo()`` and ``bar()`` to ``int``.  The
subsequent type propagation can use this information to complete the type
inference of both functions, unifying the return-type of all returning paths.


Limitations
===========

For the proposed type inference algorithm to terminate, it assumes that
at least one of the control path leads to a return-statement without undertaking
a recursive call.  Should this not be the case, the algorithm will raise an
exception indicating a potential runaway recursion.

For example::

    @jit
    def first(x):
        # The recursing call must have a path that is non-recursing.
        if x > 0:
            return second(x)
        else:
            return 1

    @jit
    def second(x):
        return third(x)

    @jit
    def third(x):
        return first(x - 1)


The ``first()`` function must be the compiled first for the type inference algorithm to
complete successfully.  Compiling any other function first will lead to a failure
in type inference.  The type inference algorithm will treat it as a runaway
recursion due to the lack of a non-recursive exit in the recursive callee.

For example, compiling ``second()`` first will move the recursive call to
``first()``.  When the compiler tries to resume the type inference process of
``second()``, it will fail to find a non-recursive path.

This is a small limitation and can be overcome easily by code restructuring or
precompiling in a specific order.