Dataframe validator
DataFrameMissingColumnError
Bases: ValueError
Raised when a DataFrame is missing a required column.
Source code in quinn/dataframe_validator.py
DataFrameMissingStructFieldError
Bases: ValueError
Raised when a DataFrame's schema is missing a required StructField.
Source code in quinn/dataframe_validator.py
DataFrameProhibitedColumnError
Bases: ValueError
Raise this when a DataFrame includes prohibited columns.
Source code in quinn/dataframe_validator.py
validate_absence_of_columns(df, prohibited_col_names)
Validate that none of the prohibited column names are present among the DataFrame's columns.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
df | DataFrame | DataFrame containing columns to be checked. | required |
prohibited_col_names | list[str] | List of prohibited column names. | required |
Raises:
Type | Description |
---|---|
DataFrameProhibitedColumnError | If the prohibited column names are present among the specified DataFrame columns. |
Source code in quinn/dataframe_validator.py
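The contract described above can be sketched in plain Python. This is a minimal illustration, not quinn's implementation: `check_absence_of_columns` is a hypothetical helper that takes a list of column names (standing in for `df.columns` on a Spark DataFrame) instead of a DataFrame, and the exception class is a local stand-in for the one documented on this page.

```python
class DataFrameProhibitedColumnError(ValueError):
    """Local stand-in for the exception documented above."""


def check_absence_of_columns(columns, prohibited_col_names):
    """Hypothetical helper: raise if any prohibited name appears in `columns`."""
    # Collect every prohibited name that is actually present.
    found = [name for name in prohibited_col_names if name in columns]
    if found:
        raise DataFrameProhibitedColumnError(
            f"Prohibited columns {found} found among {list(columns)}"
        )


columns = ["name", "age"]  # stands in for df.columns

check_absence_of_columns(columns, ["ssn"])  # passes silently

try:
    check_absence_of_columns(columns, ["age", "ssn"])
except ValueError as err:  # the error subclasses ValueError
    caught = err
```

With quinn and PySpark installed, the corresponding call on a real DataFrame would be `validate_absence_of_columns(df, ["age", "ssn"])`.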
validate_presence_of_columns(df, required_col_names)
Validate the presence of column names in a DataFrame.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
df | DataFrame | A Spark DataFrame. | required |
required_col_names | list[str] | List of the required column names for the DataFrame. | required |
Returns:
Type | Description |
---|---|
None | None. |
Raises:
Type | Description |
---|---|
DataFrameMissingColumnError | If any of the required column names are not present in the DataFrame. |
Source code in quinn/dataframe_validator.py
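A rough sketch of the behavior described above, written without Spark so it runs anywhere. `check_presence_of_columns` is a hypothetical helper operating on a plain list of column names (standing in for `df.columns`), and the exception class is a local stand-in; neither is quinn's actual code.

```python
class DataFrameMissingColumnError(ValueError):
    """Local stand-in for the exception documented above."""


def check_presence_of_columns(columns, required_col_names):
    """Hypothetical helper: raise if any required name is absent from `columns`."""
    # Collect every required name that the column list lacks.
    missing = [name for name in required_col_names if name not in columns]
    if missing:
        raise DataFrameMissingColumnError(
            f"Required columns {missing} not found among {list(columns)}"
        )


columns = ["name", "age"]  # stands in for df.columns

result = check_presence_of_columns(columns, ["name"])  # returns None on success

try:
    check_presence_of_columns(columns, ["name", "email"])
except ValueError as err:  # the error subclasses ValueError
    caught = err
```

With quinn and PySpark installed, the corresponding call on a real DataFrame would be `validate_presence_of_columns(df, ["name", "email"])`.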
validate_schema(df, required_schema, ignore_nullable=False)
Validate that a DataFrame's schema includes every StructField of a required StructType.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
df | DataFrame | DataFrame to validate. | required |
required_schema | StructType | StructType required for the DataFrame. | required |
ignore_nullable | bool | (Optional) Flag indicating whether field nullability should be ignored during validation. | False |
Raises:
Type | Description |
---|---|
DataFrameMissingStructFieldError | If any StructFields from the required schema are not included in the DataFrame schema. |
Source code in quinn/dataframe_validator.py
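The role of `ignore_nullable` can be illustrated with a Spark-free sketch. Everything here is an assumption made for illustration: `Field` is a hypothetical simplification of a StructField (name, type name, nullable flag), `check_schema` is a hypothetical helper, and the exception class is a local stand-in. It is not quinn's implementation.

```python
from typing import NamedTuple, Optional


class DataFrameMissingStructFieldError(ValueError):
    """Local stand-in for the exception documented above."""


class Field(NamedTuple):
    """Hypothetical simplification of a StructField."""
    name: str
    data_type: str
    nullable: Optional[bool] = True


def check_schema(actual, required, ignore_nullable=False):
    """Hypothetical helper: raise if any required field is absent from `actual`."""
    if ignore_nullable:
        # Blank out nullability on both sides so it cannot cause a mismatch.
        actual = [f._replace(nullable=None) for f in actual]
        required = [f._replace(nullable=None) for f in required]
    missing = [f for f in required if f not in actual]
    if missing:
        raise DataFrameMissingStructFieldError(
            f"Missing fields {missing}; DataFrame has {actual}"
        )


actual = [Field("name", "string", True), Field("age", "integer", False)]
required = [Field("age", "integer", True)]  # same field, different nullability

relaxed = check_schema(actual, required, ignore_nullable=True)  # passes

try:
    # In this sketch, a nullability difference makes the field count as missing.
    check_schema(actual, required)
except ValueError as err:
    caught = err
```

With quinn and PySpark installed, the corresponding call would be `validate_schema(df, required_schema, ignore_nullable=True)`, where `required_schema` is a `StructType`.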