Config
loadx.scd2.config
SCD2ColumnNames
dataclass
Custom names for SCD2 output columns.
All fields are optional and default to their standard names. Pass an instance
of this class to SCD2Loader.slowly_changing_dimension() via scd_columns
to rename any subset of output columns.
Attributes:
| Name | Type | Description |
|---|---|---|
| `valid_from` | `str` | Date when the record became active. |
| `valid_until` | `str` | Date when the record was superseded. |
| `active_flag` | `str` | |
| `delete_flag` | `str` | |
| `row_hash` | `str` | SHA-256 hash of non-key columns, used for change detection. |
| `insert_date` | `str` | Timestamp when this record version was written to the target table. |
| `latest_record_flag` | `str` | |
Example
from loadx import SCD2ColumnNames
SCD2ColumnNames(valid_from="eff_start_date", valid_until="eff_end_date")
Source code in loadx/scd2/config.py
from_dict(data: dict[str, Any]) -> SCD2ColumnNames
classmethod
Create SCD2ColumnNames instance from dictionary, filtering valid fields.
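The filtering behaviour described above can be illustrated with a minimal sketch. `ColumnNamesSketch` is a hypothetical stand-in, not the loadx class: it carries only three of the documented fields, and its default values are placeholders chosen for the example. The filtering approach shown (comparing keys against `dataclasses.fields`) is an assumption about how such a method might be written.

```python
from dataclasses import dataclass, fields
from typing import Any


@dataclass
class ColumnNamesSketch:
    """Hypothetical stand-in for SCD2ColumnNames (defaults are placeholders)."""

    valid_from: str = "valid_from"
    valid_until: str = "valid_until"
    active_flag: str = "active_flag"

    @classmethod
    def from_dict(cls, data: dict[str, Any]) -> "ColumnNamesSketch":
        # Keep only keys that match a declared dataclass field;
        # anything else is silently dropped instead of raising TypeError.
        known = {f.name for f in fields(cls)}
        return cls(**{k: v for k, v in data.items() if k in known})


names = ColumnNamesSketch.from_dict(
    {"valid_from": "eff_start_date", "not_a_field": "ignored"}
)
```

In this sketch the unknown key `not_a_field` is discarded, `valid_from` is renamed, and the remaining fields keep their defaults.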
field_list() -> list[str]
Get list of all field attribute names.
column_list() -> list[str]
Get list of all actual output column names (respects user-defined renames).
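The difference between `field_list()` and `column_list()` is easiest to see side by side. The following sketch uses a hypothetical stand-in class with three of the documented fields and placeholder defaults; the method bodies are one plausible way to get the documented behaviour, not the loadx source.

```python
from dataclasses import dataclass, fields


@dataclass
class ColumnNamesSketch:
    """Hypothetical stand-in for SCD2ColumnNames (defaults are placeholders)."""

    valid_from: str = "valid_from"
    valid_until: str = "valid_until"
    row_hash: str = "row_hash"

    def field_list(self) -> list[str]:
        # Attribute names are fixed by the class definition.
        return [f.name for f in fields(self)]

    def column_list(self) -> list[str]:
        # Attribute values reflect any renames passed to the constructor.
        return [getattr(self, f.name) for f in fields(self)]


names = ColumnNamesSketch(valid_from="eff_start_date", valid_until="eff_end_date")
print(names.field_list())   # ['valid_from', 'valid_until', 'row_hash']
print(names.column_list())  # ['eff_start_date', 'eff_end_date', 'row_hash']
```

The field names stay stable regardless of renames, so they are suited to internal lookups, while the column names are what would appear in the output table.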
SCD2Config
dataclass
Configuration for SCD2 processing.
__post_init__() -> None
Initialize default values for optional fields.
create(business_keys: list[str] | str, date_column: str = DEFAULT_DATE_COLUMN, ignore_columns: list[str] | None = None, non_copy_fields: list[str] | None = None, open_end_date: datetime | None = OPEN_END_DATE, scd_columns: SCD2ColumnNames | dict[str, str] | None = None, enable_latest_record_flag: bool = False, source_type: SourceType = SourceType.FULL) -> SCD2Config
classmethod
Create an SCD2Config instance with coercion and defaults applied.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `business_keys` | `list[str] \| str` | Column(s) that uniquely identify a dimension row. A single string is automatically wrapped in a list. | required |
| `date_column` | `str` | Column containing the snapshot date. | `DEFAULT_DATE_COLUMN` |
| `ignore_columns` | `list[str] \| None` | Columns excluded from hash-based change detection. | `None` |
| `non_copy_fields` | `list[str] \| None` | Source columns excluded from the output DataFrame. | `None` |
| `open_end_date` | `datetime \| None` | Value written to | `OPEN_END_DATE` |
| `scd_columns` | `SCD2ColumnNames \| dict[str, str] \| None` | Override default SCD2 output column names. Accepts an `SCD2ColumnNames` instance or a `dict[str, str]`. | `None` |
| `enable_latest_record_flag` | `bool` | When | `False` |
| `source_type` | `SourceType` | Whether the source is a | `FULL` |
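The coercion that `create()` applies can be sketched roughly as follows. `ConfigSketch` is a hypothetical stand-in that covers only three of the documented parameters and represents `scd_columns` as a plain dict. Wrapping a single string in a list follows the parameter table; replacing `None` with empty containers is an assumption made for this example.

```python
from dataclasses import dataclass, field
from typing import Optional, Union


@dataclass
class ConfigSketch:
    """Hypothetical stand-in for SCD2Config with a subset of its parameters."""

    business_keys: list[str]
    ignore_columns: list[str] = field(default_factory=list)
    scd_columns: dict[str, str] = field(default_factory=dict)

    @classmethod
    def create(
        cls,
        business_keys: Union[list[str], str],
        ignore_columns: Optional[list[str]] = None,
        scd_columns: Optional[dict[str, str]] = None,
    ) -> "ConfigSketch":
        # A single key string is wrapped in a list, as the table describes.
        if isinstance(business_keys, str):
            business_keys = [business_keys]
        # Assumption: None is normalised to an empty container.
        return cls(
            business_keys=list(business_keys),
            ignore_columns=list(ignore_columns or []),
            scd_columns=dict(scd_columns or {}),
        )


cfg = ConfigSketch.create("customer_id", scd_columns={"valid_from": "eff_start_date"})
```

Routing construction through a classmethod like this keeps the dataclass fields strictly typed while letting callers pass the more convenient forms.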