
Rule F010

Always include otherwise() with when()

Severity

🟢 LOW — Minor performance impact.

PySpark version

Compatible with PySpark 1.4 and later.

Information

Using when() without a corresponding otherwise() can lead to:

  • Unexpected null values: every row that matches no condition silently gets null
  • Reduced readability and unclear logic flow
  • Harder maintenance and debugging for future changes

Best practices

  • Always chain an otherwise() after when(), even if the default is None
  • State the default explicitly so the conditional logic is easier to understand
  • Keep the fallback branch in place so conditions can be added later without changing the default by accident

Rule of thumb: Include otherwise() with every when() to improve clarity, maintainability, and robustness.

Example

Bad:

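from pyspark.sql.functions import col, when
df.withColumn("grade", when(col("score") > 90, "A"))  # scores of 90 or below, and null scores, get a null grade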

Good:

from pyspark.sql.functions import col, when
df.withColumn("grade", when(col("score") > 90, "A").otherwise("B"))  # every other row, including null scores, gets "B"