This document explains why Impala requires partition refresh operations, what happens internally, what breaks if you skip them, and the available strategies for making newly written Parquet data queryable and performant.
It is written to be accessible to new engineers while remaining accurate for production systems.