F — Format

Rules that enforce idiomatic use of the PySpark DataFrame API — covering code style, readability, and correct function usage.

| Rule | Title |
| --- | --- |
| F001 | Avoid chaining withColumn() and withColumnRenamed() |
| F002 | Avoid drop() — use select() for explicit columns |
| F003 | Avoid selectExpr() — prefer select() with col() |
| F004 | Avoid spark.sql() — prefer native DataFrame API |
| F005 | Avoid stacking multiple withColumn() — use withColumns() |
| F006 | Avoid stacking multiple withColumnRenamed() — use withColumnsRenamed() |
| F007 | Prefer filter() before select() for clarity |
| F008 | Avoid print() — prefer the logging module |
| F009 | Avoid nested when() — use stacked .when().when().otherwise() |
| F010 | Always include otherwise() at the end of a when() chain |
| F011 | Avoid backslash line continuation — use parentheses |
| F012 | Always wrap literal values with lit() |
| F013 | Avoid reserved column names with __ prefix and __ suffix |
| F014 | Avoid explode_outer() — handle nulls with higher-order functions |
| F015 | Avoid multiple consecutive filter() calls — combine conditions |
| F016 | Avoid long DataFrame renaming chains — overwrite the same variable |
| F017 | Avoid expr() — use native PySpark functions instead |
| F018 | Use Spark native datetime functions instead of Python datetime objects |
| F019 | Avoid inferSchema=True and mergeSchema=True in Spark read options |
| F020 | Avoid select("*") — use explicit column names |