D — Driver
Rules that flag operations that pull data from the distributed cluster to the driver node. These are the most dangerous antipatterns: they can exhaust driver memory (OOM) without warning or stall a pipeline entirely.

| Rule | Title |
|---|---|
| D001 | Avoid using collect() |
| D002 | Avoid accessing .rdd |
| D003 | Avoid .show() in production |
| D004 | Avoid .count() on large DataFrames |
| D005 | Avoid .rdd.isEmpty() — use .isEmpty() directly |
| D006 | Avoid df.count() == 0 — use .isEmpty() |
| D007 | Avoid .filter(...).count() == 0 — use .filter(...).isEmpty() |
| D008 | Avoid .display() in production |
| D009 | Avoid .count() as a boolean — use .isEmpty() |
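
As a rough illustration of the emptiness-check rules (D006 and D007), the sketch below puts the flagged pattern next to the replacement named in the table. The function names are hypothetical and chosen only for this example; they are not part of the linter. It assumes PySpark 3.3 or later, where `DataFrame.isEmpty()` is available, and that `condition` is a SQL expression string accepted by `DataFrame.filter`.

```python
from __future__ import annotations

from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Imported only for type hints, so this sketch also loads without PySpark installed.
    from pyspark.sql import DataFrame


def is_empty_flagged(df: DataFrame) -> bool:
    # D006 pattern: counts every row just to compare the total with zero.
    return df.count() == 0


def is_empty_preferred(df: DataFrame) -> bool:
    # Replacement named in the table (assumes PySpark >= 3.3 for DataFrame.isEmpty()).
    return df.isEmpty()


def has_no_match_flagged(df: DataFrame, condition: str) -> bool:
    # D007 pattern: a full count of the filtered rows, used only as a yes/no answer.
    return df.filter(condition).count() == 0


def has_no_match_preferred(df: DataFrame, condition: str) -> bool:
    # Same question, asked through isEmpty() on the filtered DataFrame.
    return df.filter(condition).isEmpty()
```

Both variants in each pair return the same boolean; the difference is that the preferred form asks only whether at least one row exists, so it does not need a complete count.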