Rule F009

Avoid nested when(); prefer stacked when().when().otherwise()

Severity

🟢 LOW — Minor performance impact.

PySpark version

Compatible with PySpark 1.4 and later.

Information

Using a when() inside another when() can lead to:

  • Hard-to-read and complex conditional logic
  • Difficulty seeing the order in which conditions are evaluated
  • Increased chance of errors and maintenance issues

Best practices

  • Use stacked when().when().otherwise() to handle multiple conditions clearly
  • Each condition should be a separate chained when() call for readability
  • Always include a final otherwise() to handle unmatched cases; without it, rows that match no condition evaluate to null

Rule of thumb: Replace nested when() calls with stacked when().when().otherwise() for clarity and maintainability.

Example

Bad:

when(col("score") > 90, "A").otherwise(when(col("score") > 70, "B").otherwise("C"))

Good:

when(col("score") > 90, "A").when(col("score") > 70, "B").otherwise("C")