In any data-oriented process, “garbage in, garbage out” is a constant risk. To mitigate it, we use data validation: a set of rules that checks whether the data meets a minimum quality standard. Two common examples of validation checks are:
- Data type validation: Checks whether the data is of the expected type (e.g., integer, string) and conforms to the expected format.
- Range and constraint validation: Checks whether the observed values fall within a valid range. For example, temperature values must be above absolute zero (or, more likely, above a higher minimum set by the operating range of the equipment used to record them).
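
As a rough illustration, both kinds of check can be combined in one small function. This is a minimal sketch and not a reference implementation: the function name `validate_reading`, the record fields `timestamp` and `temperature_c`, and the `min_c`/`max_c` parameters are all hypothetical, and the lower bound is treated as inclusive for simplicity.

```python
import math
from datetime import datetime

# Physical lower bound in degrees Celsius; real equipment will usually
# have a narrower operating range (assumed to be passed in by the caller).
ABSOLUTE_ZERO_C = -273.15


def validate_reading(reading, min_c=ABSOLUTE_ZERO_C, max_c=None):
    """Return a list of error messages; an empty list means the reading passed."""
    errors = []

    # Type and format validation: the timestamp must be a string
    # that datetime.fromisoformat can parse.
    ts = reading.get("timestamp")
    if not isinstance(ts, str):
        errors.append("timestamp: expected a string")
    else:
        try:
            datetime.fromisoformat(ts)
        except ValueError:
            errors.append("timestamp: not in ISO 8601 format")

    # Type validation: the temperature must be a real number.
    # bool is excluded explicitly because it is a subclass of int.
    temp = reading.get("temperature_c")
    if isinstance(temp, bool) or not isinstance(temp, (int, float)):
        errors.append("temperature_c: expected a number")
    elif math.isnan(temp):
        errors.append("temperature_c: value is NaN")
    # Range validation: only meaningful once the type check has passed.
    elif temp < min_c:
        errors.append(f"temperature_c: {temp} is below the minimum {min_c}")
    elif max_c is not None and temp > max_c:
        errors.append(f"temperature_c: {temp} is above the maximum {max_c}")

    return errors
```

Calling `validate_reading({"timestamp": "2024-01-15T08:30:00", "temperature_c": 21.5})` returns an empty list, while a reading of `-300.0` or a temperature supplied as the string `"21.5"` each produce one error message. Returning a list of messages rather than raising on the first failure is a design choice that lets a pipeline report every problem with a record at once.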