
Which quality dimensions are considered?

The Data Quality Assistant considers six different dimensions:

Missing values: Empty cells, or cells filled only with a placeholder such as “.”, “-”, or "".
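As a rough illustration of this kind of check (not the Data Quality Assistant's actual implementation), a missing-value test could look like the sketch below. The name `is_missing` and the exact placeholder set are assumptions made for this example.

```python
# Hypothetical placeholder set, based on the examples in the description above.
PLACEHOLDERS = {"", ".", "-", '""'}

def is_missing(cell):
    """Return True for None, empty or whitespace-only cells, and placeholder-only cells."""
    if cell is None:
        return True
    return str(cell).strip() in PLACEHOLDERS

column = ["42", "", "-", " . ", None, "n/a", '""']

# Collect the positions of all cells that count as missing.
missing_rows = [i for i, cell in enumerate(column) if is_missing(cell)]
print(missing_rows)
```

Note that a value such as "n/a" is not flagged here because it is not among the placeholders listed above.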

Cells whose format differs from the main format in their column, e.g., date types, numbers vs. letters, different currencies, or regional format differences (such as European vs. American dates).

Values that are extremely high or low compared to the others in the same column (based on the interquartile range). The sensitivity of the outlier detection can be adjusted in the settings, where further outlier specifications can also be customized.
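To make the interquartile-range idea concrete, here is a minimal sketch of a common IQR rule. It is not necessarily how the Data Quality Assistant computes outliers: the function name `iqr_outliers` and the multiplier `k` (standing in for the sensitivity setting) are assumptions for illustration.

```python
import statistics

def iqr_outliers(values, k=1.5):
    """Return the values lying more than k * IQR below Q1 or above Q3.

    k is a hypothetical stand-in for the sensitivity setting:
    a smaller k flags more values, a larger k flags fewer.
    """
    q1, _, q3 = statistics.quantiles(values, n=4)  # the three quartile cut points
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]

prices = [10, 12, 11, 13, 12, 11, 10, 250]
print(iqr_outliers(prices))
```

In this example only the value 250 falls outside the computed bounds. Raising `k` widens the bounds, so fewer values are reported.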

Identification of similar rows. The sensitivity for detecting duplicates (the percentage of characters that must match between two rows) can be adjusted in the settings menu.
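The following sketch shows one way a character-based similarity threshold can work. It uses the similarity ratio from Python's standard `difflib` module as an approximation of "percentage of matching characters"; the actual measure used by the Data Quality Assistant is not documented here, and `similar_rows` and `threshold` are hypothetical names.

```python
from difflib import SequenceMatcher
from itertools import combinations

def similar_rows(rows, threshold=0.9):
    """Return index pairs of rows whose text similarity is at least `threshold`."""
    # Normalize each row into one lowercase string so that cells are compared together.
    joined = ["|".join(str(cell).strip().lower() for cell in row) for row in rows]
    pairs = []
    for (i, a), (j, b) in combinations(enumerate(joined), 2):
        if SequenceMatcher(None, a, b).ratio() >= threshold:
            pairs.append((i, j))
    return pairs

rows = [
    ("Anna Schmidt", "Berlin", "2021-03-01"),
    ("Anna Schmitt", "Berlin", "2021-03-01"),
    ("Bob Jones", "London", "2020-11-15"),
]
print(similar_rows(rows))
```

The first two rows differ by a single character and are reported as a likely duplicate pair. Comparing every pair of rows is fine for a small example but becomes slow on large tables.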

Identification of formulas that diverge from the rest of the column. This can be, e.g., a different function (sum() vs. average()) or a different range (sum(F12:H12) vs. sum(F13:G13)).

Additional data checks to further customize an analysis

Identification of Personally Identifiable Information (PII). This includes, e.g., names, email addresses, phone numbers, and IP addresses. The categories can be adjusted in the settings.
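For some of these categories, pattern matching gives a reasonable first impression of how such a check can work. The sketch below is an illustration only, with deliberately simplified regular expressions. The names `PII_PATTERNS` and `find_pii` are assumptions, and detecting personal names usually requires dictionaries or language models rather than patterns, so names are left out.

```python
import re

# Simplified, hypothetical patterns; real-world PII detection needs stricter rules.
PII_PATTERNS = {
    "email": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    "ipv4": re.compile(
        r"\b(?:(?:25[0-5]|2[0-4]\d|1?\d?\d)\.){3}(?:25[0-5]|2[0-4]\d|1?\d?\d)\b"
    ),
    "phone": re.compile(r"\+?\d[\d\s()/-]{6,}\d"),
}

def find_pii(cell):
    """Return the sorted names of all PII categories whose pattern occurs in the cell."""
    text = str(cell)
    return sorted(name for name, pattern in PII_PATTERNS.items() if pattern.search(text))

for value in ["jane.doe@example.com", "192.168.0.1", "+49 30 1234567", "Quarterly revenue"]:
    print(value, find_pii(value))
```

Patterns this loose produce false positives: the phone pattern, for instance, also matches ISO dates such as 2021-03-01, which is one reason adjustable categories are useful.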

See below for an overview:

[Image: overview of the quality dimensions]