Lineage Client
The DataHub Lineage Client lets you add lineage information to DataHub and retrieve it.
If you’re looking for a higher-level introduction to adding and getting lineage using the SDK, see the lineage guide.
LineageClient
Bases: object
- Parameters:
  - client (DataHubClient)
add_datajob_lineage(*, datajob, upstreams=None, downstreams=None)
Add lineage between a datajob and datasets/datajobs.
- Parameters:
  - datajob (Union[str, DataJobUrn]) – The datajob URN to connect lineage with
  - upstreams (Optional[List[Union[str, DatasetUrn, DataJobUrn]]]) – List of upstream datasets or datajobs that serve as inputs to the datajob
  - downstreams (Optional[List[Union[str, DatasetUrn]]]) – List of downstream datasets that are outputs of the datajob
- Return type:
None
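A minimal sketch of calling add_datajob_lineage with keyword-only arguments, as the signature above requires. The URNs are made-up examples, and `link_job` is a hypothetical helper. It assumes `lineage` is a LineageClient, for instance the one reachable as `DataHubClient.from_env().lineage` (an assumed entry point, not stated in this reference).

```python
from typing import Any

# Example URNs; the platform, DAG, and table names are made up for illustration.
JOB = "urn:li:dataJob:(urn:li:dataFlow:(airflow,daily_sales,PROD),aggregate_orders)"
RAW = "urn:li:dataset:(urn:li:dataPlatform:snowflake,shop.raw.orders,PROD)"
AGG = "urn:li:dataset:(urn:li:dataPlatform:snowflake,shop.mart.daily_orders,PROD)"


def link_job(lineage: Any) -> None:
    """Record that JOB reads RAW and writes AGG.

    `lineage` is expected to be a LineageClient (assumption: obtained via
    something like `DataHubClient.from_env().lineage`).
    """
    # Every argument is passed by keyword because the signature is keyword-only.
    lineage.add_datajob_lineage(datajob=JOB, upstreams=[RAW], downstreams=[AGG])
```

Passing the client in as a parameter keeps the helper easy to exercise with a stand-in object before pointing it at a real DataHub instance.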
add_dataset_copy_lineage(*, upstream, downstream, column_lineage='auto_fuzzy')
- Parameters:
  - upstream (Union[str, DatasetUrn])
  - downstream (Union[str, DatasetUrn])
  - column_lineage (Union[None, Dict[str, List[str]], Literal['auto_fuzzy', 'auto_strict']])
- Return type:
None
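The following sketch shows the three shapes that column_lineage accepts according to the type annotation above: None, one of the two literal modes, or an explicit dictionary. The `ColumnLineage` alias, the `record_copy` helper, and the URNs are hypothetical, and `lineage` is assumed to be a LineageClient.

```python
from typing import Any, Dict, List, Literal, Union

# Alias mirroring the documented annotation; the alias name itself is made up.
ColumnLineage = Union[None, Dict[str, List[str]], Literal["auto_fuzzy", "auto_strict"]]

# Example URNs for illustration only.
SRC = "urn:li:dataset:(urn:li:dataPlatform:postgres,app.public.users,PROD)"
DST = "urn:li:dataset:(urn:li:dataPlatform:snowflake,lake.raw.users,PROD)"


def record_copy(lineage: Any, column_lineage: ColumnLineage = "auto_fuzzy") -> None:
    """Record that DST is a copy of SRC, using the documented default mode."""
    lineage.add_dataset_copy_lineage(
        upstream=SRC,
        downstream=DST,
        column_lineage=column_lineage,
    )
```

Calling `record_copy(lineage, column_lineage=None)` would pass None through unchanged. What each mode does to column matching is not described in this reference, so consult the lineage guide before relying on a particular mode.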
add_dataset_transform_lineage(*, upstream, downstream, column_lineage=None, transformation_text=None)
- Parameters:
  - upstream (Union[str, DatasetUrn])
  - downstream (Union[str, DatasetUrn])
  - column_lineage (Optional[Dict[str, List[str]]])
  - transformation_text (Optional[str])
- Return type:
None
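As a hedged illustration, a transform edge with an explicit column mapping and the SQL that produced it might look like the sketch below. The direction of the mapping (downstream column to the upstream columns feeding it) is an assumption that this reference does not confirm. The helper name, URNs, column names, and SQL text are all made up.

```python
from typing import Any, Dict, List

SRC = "urn:li:dataset:(urn:li:dataPlatform:snowflake,shop.raw.orders,PROD)"
DST = "urn:li:dataset:(urn:li:dataPlatform:snowflake,shop.mart.daily_orders,PROD)"

SQL = "SELECT order_date, SUM(amount) AS total FROM orders GROUP BY order_date"

# Assumed direction: downstream column -> list of upstream columns.
COLUMNS: Dict[str, List[str]] = {
    "order_date": ["order_date"],
    "total": ["amount"],
}


def record_transform(lineage: Any) -> None:
    """Record that DST is derived from SRC via SQL, with column-level detail."""
    lineage.add_dataset_transform_lineage(
        upstream=SRC,
        downstream=DST,
        column_lineage=COLUMNS,
        transformation_text=SQL,
    )
```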
add_lineage(*, upstream, downstream, column_lineage=False, transformation_text=None)
Add lineage between two entities.
This flexible method handles different combinations of entity types:
- dataset to dataset
- dataset to datajob
- datajob to dataset
- datajob to datajob
- dashboard to dataset
- dashboard to chart
- dashboard to dashboard
- dataset to chart
- Parameters:
  - upstream (Union[str, DatasetUrn, DataJobUrn, DashboardUrn, ChartUrn]) – URN of the upstream entity (dataset, datajob, dashboard, or chart)
  - downstream (Union[str, DatasetUrn, DataJobUrn, DashboardUrn, ChartUrn]) – URN of the downstream entity (dataset, datajob, dashboard, or chart)
  - column_lineage (Union[bool, Dict[str, List[str]], Literal['auto_fuzzy', 'auto_strict']]) – Whether to add column-level lineage: a boolean, a matching mode ('auto_fuzzy' or 'auto_strict'), or an explicit mapping of column-level lineage
  - transformation_text (Optional[str]) – Optional SQL query text that defines the transformation (only applicable for dataset-to-dataset lineage)
- Raises:
  - InvalidUrnError – If the URNs provided are invalid
  - SdkUsageError – If certain parameter combinations are not supported
- Return type:
None
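Because transformation_text only applies to dataset-to-dataset lineage, a caller may want to guard against passing it for other entity pairs. The sketch below is one possible approach, not part of the SDK: `add_edge` is a hypothetical wrapper that inspects the URN prefix and forwards the SQL text only when both ends are datasets. `lineage` is assumed to be a LineageClient.

```python
from typing import Any, Dict, Optional

DATASET_PREFIX = "urn:li:dataset:"


def add_edge(lineage: Any, upstream: str, downstream: str, sql: Optional[str] = None) -> None:
    """Add one lineage edge, attaching SQL text only to dataset-to-dataset edges."""
    kwargs: Dict[str, Any] = {"upstream": upstream, "downstream": downstream}
    both_datasets = upstream.startswith(DATASET_PREFIX) and downstream.startswith(DATASET_PREFIX)
    if sql is not None and both_datasets:
        kwargs["transformation_text"] = sql
    # column_lineage is left at its documented default (False).
    lineage.add_lineage(**kwargs)
```

Dropping the SQL text silently is a design choice for this sketch. Raising an error instead may be preferable when a mismatched call indicates a bug in the caller.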
get_lineage(*, source_urn, source_column=None, direction='upstream', max_hops=1, filter=None, count=500)
Retrieve lineage entities connected to a source entity.
- Parameters:
  - source_urn (Union[str, Urn]) – Source URN for the lineage search
  - source_column (Optional[str]) – Source column for the lineage search
  - direction (Literal['upstream', 'downstream']) – Direction of lineage traversal
  - max_hops (int) – Maximum number of hops to traverse
  - filter (Union[_And, _Or, _Not, _EntityTypeFilter, _EntitySubtypeFilter, _StatusFilter, _PlatformFilter, _DomainFilter, _EnvFilter, _CustomCondition, None]) – Filters to apply to the lineage search
  - count (int) – Maximum number of results to return
- Returns: List of lineage results
- Raises:
  - SdkUsageError – For invalid filter values
- Return type:
List[LineageResult]
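A short sketch of querying downstream lineage and grouping the results by distance. It relies only on the `urn` and `hops` fields documented for LineageResult. The helper name `downstream_by_hops` is hypothetical, and `lineage` is assumed to be a LineageClient.

```python
from collections import defaultdict
from typing import Any, Dict, List


def downstream_by_hops(lineage: Any, source_urn: str, max_hops: int = 2) -> Dict[int, List[str]]:
    """Return downstream URNs grouped by how many hops away they are."""
    results = lineage.get_lineage(
        source_urn=source_urn,
        direction="downstream",
        max_hops=max_hops,
    )
    grouped: Dict[int, List[str]] = defaultdict(list)
    for result in results:
        grouped[result.hops].append(result.urn)
    return dict(grouped)
```

The count argument is left at its default of 500 here. For large graphs, a smaller count or a filter may keep the response manageable.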
infer_lineage_from_sql(*, query_text, platform, platform_instance=None, env='PROD', default_db=None, default_schema=None, override_dialect=None)
Add lineage by parsing a SQL query.
- Parameters:
  - query_text (str)
  - platform (str)
  - platform_instance (Optional[str])
  - env (str)
  - default_db (Optional[str])
  - default_schema (Optional[str])
  - override_dialect (Optional[str])
- Return type:
None
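The sketch below passes a query to infer_lineage_from_sql together with a default database and schema. It assumes, without confirmation from this reference, that default_db and default_schema are used to resolve unqualified table names such as `orders`. The query, the names, and the `record_sql_lineage` helper are made up, and `lineage` is assumed to be a LineageClient.

```python
from typing import Any

# Illustrative query; table and column names are invented.
QUERY = """
CREATE TABLE daily_orders AS
SELECT order_date, SUM(amount) AS total
FROM orders
GROUP BY order_date
"""


def record_sql_lineage(lineage: Any) -> None:
    """Ask the client to derive lineage from QUERY on an assumed Snowflake platform."""
    lineage.infer_lineage_from_sql(
        query_text=QUERY,
        platform="snowflake",
        env="PROD",
        default_db="shop",       # assumed to qualify bare table names
        default_schema="mart",   # assumed to qualify bare table names
    )
```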
LineagePath
Bases: object
- Parameters:
  - urn (str)
  - entity_name (str)
  - column_name (Optional[str])
column_name : Optional[str] = None
entity_name : str
urn : str
LineageResult
Bases: object
- Parameters:
  - urn (str)
  - type (str)
  - hops (int)
  - direction (Literal['upstream', 'downstream'])
  - platform (Optional[str])
  - name (Optional[str])
  - description (Optional[str])
  - paths (Optional[List[LineagePath]])
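To show how the fields of LineageResult and LineagePath fit together, here is a small formatting sketch. `PathStandIn` and `ResultStandIn` are local stand-ins that mirror the documented field names so the example runs without the SDK. They are not the real classes, and the defaults, the `"DATASET"` type value, and the `summarize` helper are assumptions.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PathStandIn:
    """Stand-in with the fields documented for LineagePath."""
    urn: str
    entity_name: str
    column_name: Optional[str] = None


@dataclass
class ResultStandIn:
    """Stand-in with the fields documented for LineageResult."""
    urn: str
    type: str
    hops: int
    direction: str
    platform: Optional[str] = None
    name: Optional[str] = None
    description: Optional[str] = None
    paths: Optional[List[PathStandIn]] = None


def summarize(result) -> str:
    """Format one result as a single line, including its path when present."""
    label = result.name or result.urn
    line = f"{result.direction} ({result.hops} hop(s)): {label}"
    if result.paths:
        steps = [p.entity_name + (f".{p.column_name}" if p.column_name else "") for p in result.paths]
        line += " via " + " -> ".join(steps)
    return line


example = ResultStandIn(
    urn="urn:li:dataset:(urn:li:dataPlatform:snowflake,shop.mart.daily_orders,PROD)",
    type="DATASET",  # hypothetical value
    hops=1,
    direction="downstream",
    name="daily_orders",
    paths=[
        PathStandIn(urn="urn:li:dataset:(urn:li:dataPlatform:snowflake,shop.raw.orders,PROD)",
                    entity_name="orders", column_name="amount"),
        PathStandIn(urn="urn:li:dataset:(urn:li:dataPlatform:snowflake,shop.mart.daily_orders,PROD)",
                    entity_name="daily_orders", column_name="total"),
    ],
)
print(summarize(example))
# prints: downstream (1 hop(s)): daily_orders via orders.amount -> daily_orders.total
```

Since `summarize` only reads attributes, it should work equally on objects returned by get_lineage, provided they expose the fields listed above.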