Using and Abusing Auto Loader's Inferred Schema
Problem Databricks' Auto Loader has the ability to infer a schema from a sample of files. This means that you don’t have to provide a schema, which is really handy when you’re dealing with an unknown schema or a wide and complex schema, which you don’t always want to define up-front.
But what happens if the schema that has been inferred isn’t the schema you were expecting or it contains fields which you definitely don’t want to ingest - like PCI or PII data fields?