python-guidelines.md 12.3 KB
Newer Older
1
2
3
4
5
6
7
8
9
10
11
# Python Guidelines

Rules and conventions for Python code in this repository.

## Critical Rules

These are the conventions that matter most for code quality and maintainability.
Exceptions exist, but they should be rare and justified in a code comment.

### Keep imports at the top of the file

12
13
14
15
**Always flag** any `import` statement that appears inside a function body, method,
or class. Imports inside functions hide dependencies, make the module harder to
understand at a glance, and can mask missing packages until a specific code path
is hit at runtime.
16

17
18
19
20
**Always flag** `try/except ImportError` around imports (except for the documented
exceptions below). This pattern creates two execution modes -- one with the library
and one without -- which doubles the testing surface and produces confusing
behavior when the dependency is unexpectedly absent.
21
22
23

```python
# BAD -- import inside function; dependency is invisible until this path runs
24
# ALWAYS FLAG THIS PATTERN
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
def process_data():
    import json
    return json.loads(data)

# BAD -- try/except hides a missing dependency behind a flag
try:
    import requests
    HAS_REQUESTS = True
except ImportError:
    HAS_REQUESTS = False

# GOOD -- direct import at module top; fails immediately if missing
import json
import requests
```

Failing immediately on a missing dependency is better than hiding the problem
until a user hits an obscure code path in production. If a package is required,
add it to `requirements.txt` or `pyproject.toml` so it's installed upfront.

**Exception: optional dependencies and pytest collection.** This repo has multiple
optional backends (vLLM, SGLang, TRT-LLM, cupy, etc.) that are not installed in
every environment. Using `try/except ImportError` is the correct pattern when:

- An optional backend dependency (e.g. `tritonclient.grpc`, `vllm_omni`,
  `torch_memory_saver`) may not be installed, and the code provides a
  fallback or sets the import to `None`.
- A test file needs to skip collection when optional packages are absent
  (e.g. `except ImportError: pytest.skip(..., allow_module_level=True)`).
- A stdlib module has version-dependent availability
  (e.g. `tomllib` on Python 3.11+ vs `tomli` fallback).

```python
# OK -- optional backend, graceful fallback to None
try:
    from vllm_omni.diffusion.data import DiffusionParallelConfig
except ImportError:
    DiffusionParallelConfig = None

# OK -- skip test collection when optional deps are missing
try:
    from dynamo.profiler.rapid import WorkloadSpec
except ImportError as e:
    pytest.skip(f"Skip (missing dependency): {e}", allow_module_level=True)

# OK -- stdlib version compatibility
try:
    import tomllib
except ImportError:
    import tomli as tomllib
```

### Prefer failing fast over hiding errors

From PEP 20 (The Zen of Python):

> *"Errors should never pass silently. Unless explicitly silenced."*
> *"Explicit is better than implicit."*

Failing immediately when something goes wrong is better than silently continuing
with bad state, because:

- The person who caused the error sees it right away, while the context is fresh.
- The stack trace points directly at the root cause, not at a downstream symptom
  three function calls later.
- Hidden errors compound -- a swallowed exception in one place produces confusing
  behavior somewhere else, and the debugging cost grows exponentially with distance
  from the original failure.

```python
# BAD -- all of these hide errors
except Exception:
    pass

except Exception as e:
    logging.error(e)        # logs but silently continues!
    return []               # returns a fake default

# BAD -- bare except catches KeyboardInterrupt and SystemExit too,
# making the process impossible to kill with Ctrl-C or sys.exit()
try:
    do_work()
except:
    log_error()

# GOOD -- just let it crash
result = something()

# GOOD -- catch SPECIFIC exceptions you can actually handle
try:
    result = json.loads(text)
except json.JSONDecodeError:
    result = {}

# GOOD -- if you must catch broad, catch Exception (not bare except)
# and re-raise after logging
try:
    result = something()
except Exception as e:
    logger.error(f"Failed: {e}")
    raise
```

**Three rules:**
1. Remove the try/except if possible -- let it crash.
2. Catch **specific** exceptions only (`FileNotFoundError`, `ValueError`, `json.JSONDecodeError`, etc.).
3. If you must catch broadly, use `except Exception:` (never bare `except:`) and **always** re-raise after logging.

### NO defensive `getattr()` on known types

135
136
137
138
139
**Always flag** `getattr(obj, "attr", default)` when the object's type is known
and the attribute is part of its definition (class attribute, `__init__` parameter,
dataclass field, etc.). Using `getattr()` with a default hides bugs by silently
returning a fallback when the attribute should always exist. Direct attribute
access fails loudly if the type contract changes, which is what you want.
140
141

```python
142
143
144
145
146
# BAD -- cfg is a ServiceConfig with host/port; getattr hides AttributeError
# ALWAYS FLAG THIS PATTERN
cfg = ServiceConfig(host="0.0.0.0", port=8080)
host = getattr(cfg, "host", "localhost")
port = getattr(cfg, "port", 9999)
147
148

# GOOD -- direct access, fails loudly if something is wrong
149
150
host = cfg.host
port = cfg.port
151
152
153
154
```

---

155
156
157
158
159
## Anti-Patterns That Must Be Flagged in Review

Every item below is a **mandatory review check**. If any of these patterns appear
in a pull request, flag it and request changes. These are not style preferences --
they are sources of real bugs, resource leaks, and CI flakiness.
160
161
162

### Mutable default arguments

163
164
165
**Always flag** any function whose default argument is a mutable object (`[]`, `{}`,
`set()`). Default values are evaluated once at function definition time and shared
across all calls, so mutations accumulate silently between invocations.
166
167

```python
168
# BAD -- the list is shared across all calls; flag this
169
170
171
172
173
174
175
def add_item(item, items=[]):
    items.append(item)
    return items

add_item("a")  # ["a"]
add_item("b")  # ["a", "b"] -- not ["b"]!

176
# GOOD -- use None sentinel, create a new list each call
177
178
179
180
181
182
183
def add_item(item, items=None):
    if items is None:
        items = []
    items.append(item)
    return items
```

184
### Leaked file handles -- always use context managers
185

186
187
188
189
**Always flag** any `open()` call that is not wrapped in a `with` statement.
Files, network connections, subprocesses, and locks must be opened with `with`
so they are released even if an exception occurs. Bare `open()` followed by
manual `.close()` leaks the handle when an exception fires between the two calls.
190
191

```python
192
# BAD -- file handle leaks if json.load raises; flag this
193
194
195
196
197
198
199
200
201
202
203
204
205
206
f = open("data.json")
data = json.load(f)
f.close()

# GOOD
with open("data.json") as f:
    data = json.load(f)
```

This is especially important in this project where tests manage subprocesses,
etcd/NATS connections, and temp directories. Use `ManagedProcess`,
`tempfile.TemporaryDirectory`, and similar context managers rather than
manual setup/teardown.

207
### Shadowing built-in names
208

209
210
211
212
213
**Always flag** any variable named `list`, `dict`, `id`, `type`, `input`, `open`,
`format`, `set`, `map`, `filter`, `range`, `str`, `int`, `float`, `bool`, `bytes`,
`tuple`, `hash`, `len`, `min`, `max`, `sum`, `any`, `all`, `zip`, `enumerate`,
`sorted`, `reversed`, or `next`. Assigning to these names overwrites the built-in
and causes confusing `TypeError`s later in the same scope.
214
215

```python
216
217
# BAD -- shadows built-in list(); flag this
list = get_items()
218
219
220
221
222
223
224
225
226
filtered = list(some_gen)   # TypeError: 'list' object is not callable

# GOOD
items = get_items()
filtered = list(some_gen)
```

### Use `is` for None / True / False comparisons

227
228
229
**Always flag** `== None`, `== True`, `== False`, `!= None`, `!= True`, and
`!= False`. These invoke `__eq__`, which can be overridden and produce
surprising results. Use `is` / `is not` for singleton comparisons.
230
231

```python
232
# BAD -- flag these
233
234
if result == None:
if flag == True:
235
if done == False:
236
237
238
239

# GOOD
if result is None:
if flag is True:     # or just: if flag:
240
if not done:
241
242
243
244
```

### Do not modify a collection while iterating

245
246
247
**Always flag** any loop that adds, removes, or deletes from the collection it is
iterating over. This causes skipped elements, `RuntimeError` (for dicts), or
infinite loops. Build a new collection or iterate over a copy.
248
249

```python
250
# BAD -- RuntimeError on dict, skips elements on list; flag this
251
252
253
254
255
256
257
258
259
260
for item in items:
    if item.is_stale():
        items.remove(item)

# GOOD
items = [item for item in items if not item.is_stale()]
```

### Prefer `join()` over string concatenation in loops

261
262
**Always flag** `+=` on a string variable inside a loop. Repeated `+=` on strings
creates a new string object each time, which is O(n^2) for large loops.
263
264

```python
265
# BAD -- O(n^2) string building; flag this
266
267
268
269
result = ""
for line in lines:
    result += line + "\n"

270
# GOOD -- O(n) with join
271
272
273
result = "\n".join(lines)
```

274
### Late-binding closures in loops
275

276
277
278
279
**Always flag** lambdas or inner functions created inside a loop that reference the
loop variable without binding it as a default argument. Closures capture the
variable reference, not its value at the time of creation, so all closures end up
with the final loop value.
280
281

```python
282
# BAD -- all lambdas return 4 (the final value of i); flag this
283
284
285
286
287
288
289
290
291
292
fns = [lambda: i for i in range(5)]
[f() for f in fns]  # [4, 4, 4, 4, 4]

# GOOD -- default argument captures current value
fns = [lambda i=i: i for i in range(5)]
[f() for f in fns]  # [0, 1, 2, 3, 4]
```

### Do not use `assert` for runtime validation

293
294
295
296
**Always flag** `assert` statements used to validate function arguments, request
payloads, configuration, or any data that comes from outside the current function.
Assertions are stripped when Python runs with `-O` (optimize), silently removing
the validation. Use explicit `if/raise` for checks that must always execute.
297
298

```python
299
# BAD -- silently skipped under python -O; flag this
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
assert user_id is not None, "user_id required"

# GOOD
if user_id is None:
    raise ValueError("user_id required")
```

---

## Code Style

- Follow PEP 8.
- `snake_case` for variables and functions.
- `PascalCase` for classes.
- Add type hints where they improve readability.
- Use docstrings for public functions and classes.
- Use `dataclass` instead of plain dicts when a structure has >4 fields (better
  type inference and IDE support).

## File Organization

- `__init__.py` for package initialization.
- Clear module separation.
- Tests in `tests/` directory.

## Formatting and Linting

### Preferred workflow

Auto-fix formatting, then lint:

```bash
ruff format <touched_paths>
ruff check --fix <touched_paths>
```

Or use pre-commit (runs isort, black, flake8, ruff, and more):

```bash
pre-commit run --files <touched_files>
pre-commit run --all-files    # for broad changes
```

### Pre-commit hooks

The repo's `.pre-commit-config.yaml` runs these Python hooks:

- **isort** -- import sorting (`profile = "black"`, configured in `pyproject.toml`)
- **black** -- code formatting
- **flake8** -- style checks (`max-line-length=88`)
- **ruff** -- fast linting with auto-fix
- **codespell** -- spelling checks
- **trailing-whitespace**, **end-of-file-fixer**, **check-yaml**, **check-json**, **check-toml**

### Before committing

Always run:

```bash
pre-commit run --files <changed_files>
```

For broader changes or if in doubt:

```bash
pre-commit run --all-files
```

### Indentation verification

Indentation errors are common and hard to spot visually. After editing Python
files, verify mechanically:

```bash
ruff format <touched_paths>                     # auto-fix (preferred)
python3 -m compileall -q <touched_paths>        # fast parse-only check
```

When fixing indentation errors, always read 20-30 lines of surrounding context
and fix the whole block -- adjacent lines often share the same mistake.

## Error Handling

See the **Critical Rules** section above for the full policy. Summary:

- Let exceptions propagate by default.
- Catch only specific exceptions you can actually handle.
- If you catch `Exception`, you must re-raise after logging.
- No `except Exception: pass`. Ever.

### Regex caution

Be careful with escaping in raw strings (`\s` vs `\\s`). When changing a critical
regex, add a one-line test to prove it matches.

## Import Order

Imports are sorted by isort with `profile = "black"` (configured in `pyproject.toml`).
The order is:

1. Standard library
2. Third-party (known: `vllm`, `tensorrt_llm`, `sglang`, `aiconfigurator`)
3. First-party (`dynamo`, `deploy`)

Run `isort` or `pre-commit run isort` to auto-fix ordering.