Rule F009
Avoid nested when(); prefer stacked when().when().otherwise()
Severity
🟢 LOW — Minor performance impact.
PySpark version
Compatible with PySpark 1.4 and later.
Information
Using a when() inside another when() can lead to:
- Hard-to-read and complex conditional logic
- Difficulty in understanding which conditions are applied in which order
- Increased chance of errors and maintenance issues
Best practices
- Use stacked when().when().otherwise() to handle multiple conditions clearly
- Each condition should be a separate chained when() call for readability
- Always include a final otherwise() to handle unmatched cases
Rule of thumb: Replace nested when() calls with stacked when().when().otherwise() for clarity and maintainability.
Example
Bad:
Good: