Entities
The DataHub SDK provides a set of entities that can be used to interact with DataHub’s metadata.
Dataset
Bases: HasPlatformInstance, HasSubtype, HasContainer, HasOwnership, HasInstitutionalMemory, HasTags, HasTerms, HasDomain, HasStructuredProperties, Entity
Represents a dataset in DataHub.
A dataset represents a collection of data, such as a table, view, or file. This class provides methods for managing dataset metadata including schema, lineage, and various aspects like ownership, tags, and terms.
- Parameters:
- platform (str)
- name (str)
- platform_instance (Optional[str])
- env (str)
- description (Optional[str])
- display_name (Optional[str])
- qualified_name (Optional[str])
- external_url (Optional[str])
- custom_properties (Optional[Dict[str, str]])
- created (Optional[datetime])
- last_modified (Optional[datetime])
- parent_container (ParentContainerInputType | Unset)
- subtype (Optional[str])
- owners (Optional[OwnersInputType])
- links (Optional[LinksInputType])
- tags (Optional[TagsInputType])
- terms (Optional[TermsInputType])
- domain (Optional[DomainInputType])
- schema (Optional[SchemaFieldsInputType])
- upstreams (Optional[models.UpstreamLineageClass])
- structured_properties (Optional[StructuredPropertyInputType])
- extra_aspects (ExtraAspectsType)
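The constructor parameters above can be exercised with a short sketch. It assumes the class is importable as `datahub.sdk.Dataset` (the import path is not stated in this reference), and the platform, table name, and property values are made up for illustration. The import is guarded so the snippet still runs when the SDK is not installed.

```python
# Assumption: the SDK exposes Dataset at datahub.sdk; adjust if your version differs.
try:
    from datahub.sdk import Dataset
except ImportError:
    Dataset = None  # SDK not installed; the sketch below becomes a no-op


def build_orders_dataset():
    """Create an in-memory Dataset entity; nothing is sent to a server."""
    if Dataset is None:
        return None
    dataset = Dataset(
        platform="snowflake",              # illustrative platform
        name="analytics.public.orders",    # illustrative table name
        description="One row per customer order.",
        custom_properties={"team": "analytics"},
    )
    # Setters documented in this section update the entity in place.
    dataset.set_display_name("Orders")
    return dataset


dataset = build_orders_dataset()
if dataset is not None:
    print(dataset.urn)
    print(dataset.display_name)
```

Only the entity object is built here; writing it to a DataHub instance requires a client, which is outside the scope of this section.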
property created : datetime | None
Get the creation timestamp of the dataset.
- Returns: The creation timestamp if set, None otherwise.
property custom_properties : Dict[str, str]
Get the custom properties of the dataset.
- Returns: Dictionary of custom properties.
property description : str | None
Get the description of the dataset.
- Returns: The description if set, None otherwise.
property display_name : str | None
Get the display name of the dataset.
- Returns: The display name if set, None otherwise.
property external_url : str | None
Get the external URL of the dataset.
- Returns: The external URL if set, None otherwise.
classmethod get_urn_type()
Get the URN type for datasets.
- Return type: Type[DatasetUrn]
- Returns: The DatasetUrn class.
property last_modified : datetime | None
Get the last modification timestamp of the dataset.
- Returns: The last modification timestamp if set, None otherwise.
property qualified_name : str | None
Get the qualified name of the dataset.
- Returns: The qualified name if set, None otherwise.
property schema : List[SchemaField]
Get the schema fields of the dataset.
- Returns: List of SchemaField objects representing the dataset’s schema.
set_created(created)
Set the creation timestamp of the dataset.
- Parameters: created (datetime) – The creation timestamp to set.
- Return type: None
set_custom_properties(custom_properties)
Set the custom properties of the dataset.
- Parameters: custom_properties (Dict[str, str]) – Dictionary of custom properties to set.
- Return type: None
set_description(description)
Set the description of the dataset.
- Parameters: description (str) – The description to set.
- Return type: None
NOTE
If called during ingestion, this will warn if overwriting a non-ingestion description.
set_display_name(display_name)
Set the display name of the dataset.
- Parameters: display_name (str) – The display name to set.
- Return type: None
set_external_url(external_url)
Set the external URL of the dataset.
- Parameters: external_url (str) – The external URL to set.
- Return type: None
set_last_modified(last_modified)
- Parameters: last_modified (datetime)
- Return type: None
set_qualified_name(qualified_name)
Set the qualified name of the dataset.
- Parameters: qualified_name (str) – The qualified name to set.
- Return type: None
set_upstreams(upstreams)
- Parameters: upstreams (Union[UpstreamLineageClass, List[Union[str, DatasetUrn, UpstreamClass, FineGrainedLineageClass]], Dict[Union[str, DatasetUrn], Dict[str, List[str]]]])
- Return type: None
property upstreams : UpstreamLineageClass | None
property urn : DatasetUrn
Get the entity’s URN.
- Returns: The URN that uniquely identifies this entity.
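For orientation, a dataset URN in DataHub is a string that combines the data platform, the dataset name, and the environment. The helper below is a hypothetical stand-in written with plain string formatting; it is not part of the SDK and only illustrates the layout of the value that `urn` renders to. Platform instances are not handled in this sketch.

```python
def dataset_urn_string(platform: str, name: str, env: str = "PROD") -> str:
    """Hypothetical helper: format a dataset URN string from its three parts."""
    platform_urn = f"urn:li:dataPlatform:{platform}"
    return f"urn:li:dataset:({platform_urn},{name},{env})"


urn = dataset_urn_string("snowflake", "analytics.public.orders")
print(urn)
```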
SchemaField
Bases: object
- Parameters:
- parent (Dataset)
- field_path (str)
add_tag(tag)
- Parameters: tag (Union[str, TagUrn, TagAssociationClass])
- Return type: None
add_term(term)
- Parameters: term (Union[str, GlossaryTermUrn, GlossaryTermAssociationClass])
- Return type: None
property description : str | None
property field_path : str
property mapped_type : SchemaFieldDataTypeClass
property native_type : str
remove_tag(tag)
- Parameters: tag (Union[str, TagUrn, TagAssociationClass])
- Return type: None
remove_term(term)
- Parameters: term (Union[str, GlossaryTermUrn, GlossaryTermAssociationClass])
- Return type: None
set_description(description)
- Parameters: description (str)
- Return type: None
set_tags(tags)
- Parameters: tags (List[Union[str, TagUrn, TagAssociationClass]])
- Return type: None
set_terms(terms)
- Parameters: terms (List[Union[str, GlossaryTermUrn, GlossaryTermAssociationClass]])
- Return type: None
property tags : List[TagAssociationClass] | None
property terms : List[GlossaryTermAssociationClass] | None
parse_cll_mapping
- Parameters:
- upstream (Union[str, DatasetUrn])
- downstream (Union[str, DatasetUrn])
- cll_mapping (Dict[str, List[str]])
- Return type: List[FineGrainedLineageClass]
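The shape of `cll_mapping` can be illustrated without the SDK. The sketch below assumes that each key is a downstream column and each value lists the upstream columns that feed it; this reference does not state the direction, so treat it as an assumption. `FieldLineage`, `field_urn`, and `expand_cll_mapping` are hypothetical stand-ins, not SDK names, and `FieldLineage` only mimics the upstream and downstream lists of a `FineGrainedLineageClass`.

```python
from typing import Dict, List, NamedTuple


class FieldLineage(NamedTuple):
    """Stand-in for one fine-grained lineage entry (not the SDK class)."""
    upstreams: List[str]
    downstreams: List[str]


def field_urn(dataset_urn: str, field_path: str) -> str:
    """Format a schema field URN for a column of the given dataset."""
    return f"urn:li:schemaField:({dataset_urn},{field_path})"


def expand_cll_mapping(
    upstream: str, downstream: str, cll_mapping: Dict[str, List[str]]
) -> List[FieldLineage]:
    # Assumption: keys are downstream columns, values are upstream columns.
    return [
        FieldLineage(
            upstreams=[field_urn(upstream, col) for col in upstream_cols],
            downstreams=[field_urn(downstream, downstream_col)],
        )
        for downstream_col, upstream_cols in cll_mapping.items()
    ]


src = "urn:li:dataset:(urn:li:dataPlatform:snowflake,raw.orders,PROD)"
dst = "urn:li:dataset:(urn:li:dataPlatform:snowflake,analytics.orders,PROD)"
lineage = expand_cll_mapping(src, dst, {"total": ["price", "quantity"]})
for entry in lineage:
    print(entry.upstreams, "->", entry.downstreams)
```

Each mapping entry yields one lineage record, so a mapping with three downstream columns would produce three records.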
Container
Bases: HasPlatformInstance, HasSubtype, HasContainer, HasOwnership, HasInstitutionalMemory, HasStructuredProperties, HasTags, HasTerms, HasDomain, Entity
- Parameters:
- container_key (ContainerKey)
- display_name (str)
- qualified_name (Optional[str])
- description (Optional[str])
- external_url (Optional[str])
- extra_properties (Optional[Dict[str, str]])
- created (Optional[datetime])
- last_modified (Optional[datetime])
- parent_container (Union[Auto, Container, ContainerKey, List[Union[Urn, str]], None])
- subtype (Optional[str])
- owners (Optional[List[Union[CorpUserUrn, CorpGroupUrn, Tuple[Union[CorpUserUrn, CorpGroupUrn], Union[str, OwnershipTypeUrn]], OwnerClass]]])
- links (Optional[Sequence[Union[str, Tuple[str, str], InstitutionalMemoryMetadataClass]]])
- tags (Optional[List[Union[str, TagUrn, TagAssociationClass]]])
- terms (Optional[List[Union[str, GlossaryTermUrn, GlossaryTermAssociationClass]]])
- domain (Union[str, DomainUrn, None])
- structured_properties (Optional[Dict[Union[str, StructuredPropertyUrn], Sequence[Union[str, float, int]]]])
- extra_aspects (Optional[List[TypeVar(Aspect, bound=_Aspect)]])
property created : datetime | None
property custom_properties : Dict[str, str] | None
property description : str | None
property display_name : str
property external_url : str | None
classmethod get_urn_type()
Get the URN type for this entity class.
- Return type: Type[ContainerUrn]
- Returns: The URN type class that corresponds to this entity type.
property last_modified : datetime | None
property qualified_name : str | None
set_created(created)
- Parameters: created (datetime)
- Return type: None
set_custom_properties(custom_properties)
- Parameters: custom_properties (Dict[str, str])
- Return type: None
set_description(description)
- Parameters: description (str)
- Return type: None
set_display_name(value)
- Parameters: value (str)
- Return type: None
set_external_url(external_url)
- Parameters: external_url (str)
- Return type: None
set_last_modified(last_modified)
- Parameters: last_modified (datetime)
- Return type: None
set_qualified_name(qualified_name)
- Parameters: qualified_name (str)
- Return type: None
MLModel
Bases: HasPlatformInstance, HasOwnership, HasInstitutionalMemory, HasTags, HasTerms, HasDomain, HasVersion, HasStructuredProperties, Entity
- Parameters:
- id (str)
- platform (str)
- version (Optional[str])
- aliases (Optional[List[str]])
- platform_instance (Optional[str])
- env (str)
- name (Optional[str])
- description (Optional[str])
- training_metrics (Union[List[MLMetricClass], Dict[str, Optional[str]], None])
- hyper_params (Union[List[MLHyperParamClass], Dict[str, Optional[str]], None])
- external_url (Optional[str])
- custom_properties (Optional[Dict[str, str]])
- created (Optional[datetime])
- last_modified (Optional[datetime])
- owners (Optional[List[Union[CorpUserUrn, CorpGroupUrn, Tuple[Union[CorpUserUrn, CorpGroupUrn], Union[str, OwnershipTypeUrn]], OwnerClass]]])
- links (Optional[Sequence[Union[str, Tuple[str, str], InstitutionalMemoryMetadataClass]]])
- tags (Optional[List[Union[str, TagUrn, TagAssociationClass]]])
- terms (Optional[List[Union[str, GlossaryTermUrn, GlossaryTermAssociationClass]]])
- domain (Union[str, DomainUrn, None])
- model_group (Union[str, MlModelGroupUrn, None])
- training_jobs (Optional[Sequence[Union[str, DataProcessInstanceUrn]]])
- downstream_jobs (Optional[Sequence[Union[str, DataProcessInstanceUrn]]])
- structured_properties (Optional[Dict[Union[str, StructuredPropertyUrn], Sequence[Union[str, float, int]]]])
- extra_aspects (Optional[List[TypeVar(Aspect, bound=_Aspect)]])
add_downstream_job(downstream_job)
- Parameters: downstream_job (Union[str, DataProcessInstanceUrn])
- Return type: None
add_hyper_params(params)
- Parameters: params (Union[List[MLHyperParamClass], Dict[str, Optional[str]]])
- Return type: None
add_training_job(training_job)
- Parameters: training_job (Union[str, DataProcessInstanceUrn])
- Return type: None
add_training_metrics(metrics)
- Parameters: metrics (Union[List[MLMetricClass], Dict[str, Optional[str]]])
- Return type: None
property created : datetime | None
property custom_properties : Dict[str, str] | None
property description : str | None
property downstream_jobs : List[str] | None
property external_url : str | None
classmethod get_urn_type()
Get the URN type for this entity class.
- Return type: Type[MlModelUrn]
- Returns: The URN type class that corresponds to this entity type.
property hyper_params : List[MLHyperParamClass] | None
property last_modified : datetime | None
property model_group : str | None
property name : str | None
remove_downstream_job(downstream_job)
- Parameters: downstream_job (Union[str, DataProcessInstanceUrn])
- Return type: None
remove_training_job(training_job)
- Parameters: training_job (Union[str, DataProcessInstanceUrn])
- Return type: None
set_created(created)
- Parameters: created (datetime)
- Return type: None
set_custom_properties(custom_properties)
- Parameters: custom_properties (Dict[str, str])
- Return type: None
set_description(description)
- Parameters: description (str)
- Return type: None
set_downstream_jobs(downstream_jobs)
- Parameters: downstream_jobs (Sequence[Union[str, DataProcessInstanceUrn]])
- Return type: None
set_external_url(external_url)
- Parameters: external_url (str)
- Return type: None
set_hyper_params(params)
- Parameters: params (Union[List[MLHyperParamClass], Dict[str, Optional[str]]])
- Return type: None
set_last_modified(last_modified)
- Parameters: last_modified (datetime)
- Return type: None
set_model_group(group)
- Parameters: group (Union[str, MlModelGroupUrn])
- Return type: None
set_name(name)
- Parameters: name (str)
- Return type: None
set_training_jobs(training_jobs)
- Parameters: training_jobs (Sequence[Union[str, DataProcessInstanceUrn]])
- Return type: None
set_training_metrics(metrics)
- Parameters: metrics (Union[List[MLMetricClass], Dict[str, Optional[str]]])
- Return type: None
property training_jobs : List[str] | None
property training_metrics : List[MLMetricClass] | None
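The metric and hyperparameter setters accept either a list of metric objects or a plain dictionary, while the corresponding properties return a list. The sketch below shows one plausible way such input could be normalized into a list. `Metric` is a hypothetical stand-in that models only a name and a value, and the conversion rule (one entry per dictionary key) is an assumption rather than documented SDK behavior.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Union


@dataclass
class Metric:
    """Stand-in for MLMetricClass; only name and value are modelled."""
    name: str
    value: Optional[str] = None


MetricsInput = Union[List[Metric], Dict[str, Optional[str]]]


def normalize_metrics(metrics: MetricsInput) -> List[Metric]:
    """Turn either accepted input shape into a list of Metric objects."""
    if isinstance(metrics, dict):
        # Assumption: each key/value pair becomes one metric entry.
        return [Metric(name=key, value=value) for key, value in metrics.items()]
    return list(metrics)


from_dict = normalize_metrics({"accuracy": "0.91", "f1": None})
from_list = normalize_metrics([Metric("accuracy", "0.91")])
print(from_dict)
print(from_list)
```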
property urn : MlModelUrn
Get the entity’s URN.
- Returns: The URN that uniquely identifies this entity.
MLModelGroup
Bases: HasPlatformInstance, HasOwnership, HasInstitutionalMemory, HasTags, HasTerms, HasDomain, HasStructuredProperties, Entity
- Parameters:
- id (str)
- platform (str)
- name (Optional[str])
- platform_instance (Optional[str])
- env (str)
- description (Optional[str])
- display_name (Optional[str])
- external_url (Optional[str])
- custom_properties (Optional[Dict[str, str]])
- created (Optional[datetime])
- last_modified (Optional[datetime])
- owners (Optional[List[Union[CorpUserUrn, CorpGroupUrn, Tuple[Union[CorpUserUrn, CorpGroupUrn], Union[str, OwnershipTypeUrn]], OwnerClass]]])
- links (Optional[Sequence[Union[str, Tuple[str, str], InstitutionalMemoryMetadataClass]]])
- tags (Optional[List[Union[str, TagUrn, TagAssociationClass]]])
- terms (Optional[List[Union[str, GlossaryTermUrn, GlossaryTermAssociationClass]]])
- domain (Union[str, DomainUrn, None])
- training_jobs (Optional[Sequence[Union[str, DataProcessInstanceUrn]]])
- downstream_jobs (Optional[Sequence[Union[str, DataProcessInstanceUrn]]])
- structured_properties (Optional[Dict[Union[str, StructuredPropertyUrn], Sequence[Union[str, float, int]]]])
- extra_aspects (Optional[List[TypeVar(Aspect, bound=_Aspect)]])
add_downstream_job(downstream_job)
- Parameters: downstream_job (Union[str, DataProcessInstanceUrn])
- Return type: None
add_training_job(training_job)
- Parameters: training_job (Union[str, DataProcessInstanceUrn])
- Return type: None
property created : datetime | None
property custom_properties : Dict[str, str] | None
property description : str | None
property downstream_jobs : List[str] | None
property external_url : str | None
classmethod get_urn_type()
Get the URN type for this entity class.
- Return type: Type[MlModelGroupUrn]
- Returns: The URN type class that corresponds to this entity type.
property last_modified : datetime | None
property name : str | None
remove_downstream_job(downstream_job)
- Parameters: downstream_job (Union[str, DataProcessInstanceUrn])
- Return type: None
remove_training_job(training_job)
- Parameters: training_job (Union[str, DataProcessInstanceUrn])
- Return type: None
set_created(created)
- Parameters: created (datetime)
- Return type: None
set_custom_properties(custom_properties)
- Parameters: custom_properties (Dict[str, str])
- Return type: None
set_description(description)
- Parameters: description (str)
- Return type: None
set_downstream_jobs(downstream_jobs)
- Parameters: downstream_jobs (Sequence[Union[str, DataProcessInstanceUrn]])
- Return type: None
set_external_url(external_url)
- Parameters: external_url (str)
- Return type: None
set_last_modified(last_modified)
- Parameters: last_modified (datetime)
- Return type: None
set_name(display_name)
- Parameters: display_name (str)
- Return type: None
set_training_jobs(training_jobs)
- Parameters: training_jobs (Sequence[Union[str, DataProcessInstanceUrn]])
- Return type: None
property training_jobs : List[str] | None
property urn : MlModelGroupUrn
Get the entity’s URN.
- Returns: The URN that uniquely identifies this entity.
Dashboard
Bases: HasPlatformInstance, HasSubtype, HasOwnership, HasContainer, HasInstitutionalMemory, HasTags, HasTerms, HasDomain, Entity
Represents a dashboard in DataHub.
- Parameters:
- name (str)
- platform (Union[str, DataPlatformUrn])
- display_name (Optional[str])
- platform_instance (Union[str, DataPlatformInstanceUrn, None])
- description (str)
- external_url (Optional[str])
- dashboard_url (Optional[str])
- custom_properties (Optional[Dict[str, str]])
- last_modified (Optional[datetime])
- last_refreshed (Optional[datetime])
- input_datasets (Optional[List[Union[str, DatasetUrn, Dataset]]])
- charts (Optional[List[Union[str, ChartUrn, Chart]]])
- dashboards (Optional[List[Union[str, DashboardUrn, Dashboard]]])
- subtype (Optional[str])
- owners (Optional[List[Union[CorpUserUrn, CorpGroupUrn, Tuple[Union[CorpUserUrn, CorpGroupUrn], Union[str, OwnershipTypeUrn]], OwnerClass]]])
- links (Optional[Sequence[Union[str, Tuple[str, str], InstitutionalMemoryMetadataClass]]])
- tags (Optional[List[Union[str, TagUrn, TagAssociationClass]]])
- terms (Optional[List[Union[str, GlossaryTermUrn, GlossaryTermAssociationClass]]])
- domain (Union[str, DomainUrn, None])
- extra_aspects (Optional[List[TypeVar(Aspect, bound=_Aspect)]])
add_chart(chart)
Add a chart to the dashboard.
add_dashboard(dashboard)
Add a dashboard to the dashboard.
- Parameters: dashboard (Union[str, DashboardUrn, Dashboard])
- Return type: None
add_input_dataset(input_dataset)
Add an input dataset to the dashboard.
- Parameters: input_dataset (Union[str, DatasetUrn, Dataset])
- Return type: None
property charts : List[ChartUrn]
Get the charts of the dashboard.
property custom_properties : Dict[str, str]
Get the custom properties of the dashboard.
property dashboard_url : str | None
Get the dashboard URL.
property dashboards : List[DashboardUrn]
Get the dashboards of the dashboard.
property description : str | None
Get the description of the dashboard.
property display_name : str | None
Get the display name of the dashboard.
property external_url : str | None
Get the external URL of the dashboard.
classmethod get_urn_type()
Get the URN type for dashboards.
- Return type: Type[DashboardUrn]
- Returns: The DashboardUrn class.
property input_datasets : List[DatasetUrn]
Get the input datasets of the dashboard.
property last_modified : datetime | None
Get the last modification timestamp of the dashboard.
property last_refreshed : datetime | None
Get the last refresh timestamp of the dashboard.
property name : str
Get the name of the dashboard.
remove_chart(chart)
Remove a chart from the dashboard.
remove_input_dataset(input_dataset)
Remove an input dataset from the dashboard.
- Parameters: input_dataset (Union[str, DatasetUrn, Dataset])
- Return type: None
set_charts(charts)
Set the charts of the dashboard.
set_custom_properties(custom_properties)
Set the custom properties of the dashboard.
- Parameters: custom_properties (Dict[str, str])
- Return type: None
set_dashboard_url(dashboard_url)
Set the dashboard URL.
- Parameters: dashboard_url (str)
- Return type: None
set_dashboards(dashboards)
Set the dashboards of the dashboard.
- Parameters: dashboards (List[Union[str, DashboardUrn, Dashboard]])
- Return type: None
set_description(description)
Set the description of the dashboard.
- Parameters: description (str)
- Return type: None
set_display_name(display_name)
Set the display name of the dashboard.
- Parameters: display_name (str)
- Return type: None
set_external_url(external_url)
Set the external URL of the dashboard.
- Parameters: external_url (str)
- Return type: None
set_input_datasets(input_datasets)
Set the input datasets of the dashboard.
- Parameters: input_datasets (List[Union[str, DatasetUrn, Dataset]])
- Return type: None
set_last_modified(last_modified)
Set the last modification timestamp of the dashboard.
- Parameters: last_modified (datetime)
- Return type: None
set_last_refreshed(last_refreshed)
Set the last refresh timestamp of the dashboard.
- Parameters: last_refreshed (datetime)
- Return type: None
set_title(title)
Set the title of the dashboard.
- Parameters: title (str)
- Return type: None
property title : str
Get the title of the dashboard.
property urn : DashboardUrn
Get the entity’s URN.
- Returns: The URN that uniquely identifies this entity.
Chart
Bases: HasPlatformInstance, HasSubtype, HasOwnership, HasContainer, HasInstitutionalMemory, HasTags, HasTerms, HasDomain, Entity
Represents a chart in DataHub.
- Parameters:
- name (str)
- platform (Union[str, DataPlatformUrn])
- display_name (Optional[str])
- platform_instance (Union[str, DataPlatformInstanceUrn, None])
- description (Optional[str])
- external_url (Optional[str])
- chart_url (Optional[str])
- custom_properties (Optional[Dict[str, str]])
- last_modified (Optional[datetime])
- last_refreshed (Optional[datetime])
- chart_type (Union[str, ChartTypeClass, None])
- access (Optional[str])
- subtype (Optional[str])
- owners (Optional[List[Union[CorpUserUrn, CorpGroupUrn, Tuple[Union[CorpUserUrn, CorpGroupUrn], Union[str, OwnershipTypeUrn]], OwnerClass]]])
- links (Optional[Sequence[Union[str, Tuple[str, str], InstitutionalMemoryMetadataClass]]])
- tags (Optional[List[Union[str, TagUrn, TagAssociationClass]]])
- terms (Optional[List[Union[str, GlossaryTermUrn, GlossaryTermAssociationClass]]])
- domain (Union[str, DomainUrn, None])
- input_datasets (Optional[List[Union[str, DatasetUrn, Dataset]]])
- extra_aspects (Optional[List[TypeVar(Aspect, bound=_Aspect)]])
property access : str | None
Get the access level of the chart as a string.
add_input_dataset(input_dataset)
Add an input to the chart.
- Parameters: input_dataset (Union[str, DatasetUrn, Dataset])
- Return type: None
property chart_type : str | None
Get the type of the chart as a string.
property chart_url : str | None
Get the chart URL.
property custom_properties : Dict[str, str]
Get the custom properties of the chart.
property description : str | None
Get the description of the chart.
property display_name : str | None
Get the display name of the chart.
property external_url : str | None
Get the external URL of the chart.
classmethod get_urn_type()
Get the URN type for charts.
- Return type: Type[ChartUrn]
- Returns: The ChartUrn class.
property input_datasets : List[DatasetUrn]
Get the input datasets of the chart.
property last_modified : datetime | None
Get the last modification timestamp of the chart.
property last_refreshed : datetime | None
Get the last refresh timestamp of the chart.
property name : str
Get the name of the chart.
remove_input_dataset(input_dataset)
Remove an input from the chart.
- Parameters: input_dataset (Union[str, DatasetUrn, Dataset])
- Return type: None
set_access(access)
Set the access level of the chart.
- Parameters: access (Union[str, AccessLevelClass])
- Return type: None
set_chart_type(chart_type)
Set the type of the chart.
- Parameters: chart_type (Union[str, ChartTypeClass])
- Return type: None
set_chart_url(chart_url)
Set the chart URL.
- Parameters: chart_url (str)
- Return type: None
set_custom_properties(custom_properties)
Set the custom properties of the chart.
- Parameters: custom_properties (Dict[str, str])
- Return type: None
set_description(description)
Set the description of the chart.
- Parameters: description (str)
- Return type: None
set_display_name(display_name)
Set the display name of the chart.
- Parameters: display_name (str)
- Return type: None
set_external_url(external_url)
Set the external URL of the chart.
- Parameters: external_url (str)
- Return type: None
set_input_datasets(input_datasets)
Set the input datasets of the chart.
- Parameters: input_datasets (List[Union[str, DatasetUrn, Dataset]])
- Return type: None
set_last_modified(last_modified)
Set the last modification timestamp of the chart.
- Parameters: last_modified (datetime)
- Return type: None
set_last_refreshed(last_refreshed)
Set the last refresh timestamp of the chart.
- Parameters: last_refreshed (datetime)
- Return type: None
set_title(title)
Set the title of the chart.
- Parameters: title (str)
- Return type: None
property title : str
Get the title of the chart.
property urn : ChartUrn
Get the entity’s URN.
- Returns: The URN that uniquely identifies this entity.
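Chart and dashboard URNs share a two-part layout: the tool or platform that hosts the asset, followed by an identifier within that tool. The helpers below are hypothetical string formatters, not SDK functions, and the tool name and identifiers are illustrative. Platform instances are left out of this sketch.

```python
def dashboard_urn_string(tool: str, dashboard_id: str) -> str:
    """Hypothetical helper: format a dashboard URN string."""
    return f"urn:li:dashboard:({tool},{dashboard_id})"


def chart_urn_string(tool: str, chart_id: str) -> str:
    """Hypothetical helper: format a chart URN string."""
    return f"urn:li:chart:({tool},{chart_id})"


dashboard_urn = dashboard_urn_string("looker", "sales_overview")
chart_urn = chart_urn_string("looker", "revenue_by_region")
print(dashboard_urn)
print(chart_urn)
```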
DataJob
Bases: HasPlatformInstance, HasSubtype, HasContainer, HasOwnership, HasInstitutionalMemory, HasTags, HasTerms, HasDomain, HasStructuredProperties, Entity
Represents a data job in DataHub. A data job is an executable unit of a data pipeline, such as an Airflow task or a Spark job.
- Parameters:
- name (str)
- flow (Optional[DataFlow])
- flow_urn (Union[str, DataFlowUrn, None])
- platform_instance (Optional[str])
- display_name (Optional[str])
- description (Optional[str])
- external_url (Optional[str])
- custom_properties (Optional[Dict[str, str]])
- created (Optional[datetime])
- last_modified (Optional[datetime])
- subtype (Optional[str])
- owners (Optional[List[Union[CorpUserUrn, CorpGroupUrn, Tuple[Union[CorpUserUrn, CorpGroupUrn], Union[str, OwnershipTypeUrn]], OwnerClass]]])
- links (Optional[Sequence[Union[str, Tuple[str, str], InstitutionalMemoryMetadataClass]]])
- tags (Optional[List[Union[str, TagUrn, TagAssociationClass]]])
- terms (Optional[List[Union[str, GlossaryTermUrn, GlossaryTermAssociationClass]]])
- domain (Union[str, DomainUrn, None])
- inlets (Optional[List[Union[str, DatasetUrn]]])
- outlets (Optional[List[Union[str, DatasetUrn]]])
- fine_grained_lineages (Optional[List[FineGrainedLineageClass]])
- structured_properties (Optional[Dict[Union[str, StructuredPropertyUrn], Sequence[Union[str, float, int]]]])
- extra_aspects (Optional[List[TypeVar(Aspect, bound=_Aspect)]])
property created : datetime | None
Get the creation timestamp of the data job.
property custom_properties : Dict[str, str]
Get the custom properties of the data job.
property description : str | None
Get the description of the data job.
property display_name : str | None
Get the display name of the data job.
property env : str | None
Get the environment of the data job.
property external_url : str | None
Get the external URL of the data job.
property fine_grained_lineages : List[FineGrainedLineageClass]
property flow_urn : DataFlowUrn
Get the data flow associated with the data job.
classmethod get_urn_type()
Get the URN type for data jobs.
- Return type: Type[DataJobUrn]
property inlets : List[DatasetUrn]
Get the inlets of the data job.
property last_modified : datetime | None
Get the last modification timestamp of the data job.
property name : str
Get the name of the data job.
property outlets : List[DatasetUrn]
Get the outlets of the data job.
set_created(created)
Set the creation timestamp of the data job.
- Parameters: created (datetime)
- Return type: None
set_custom_properties(custom_properties)
Set the custom properties of the data job.
- Parameters: custom_properties (Dict[str, str])
- Return type: None
set_description(description)
Set the description of the data job.
- Parameters: description (str)
- Return type: None
set_display_name(display_name)
Set the display name of the data job.
- Parameters: display_name (str)
- Return type: None
set_external_url(external_url)
Set the external URL of the data job.
- Parameters: external_url (str)
- Return type: None
set_fine_grained_lineages(lineages)
- Parameters: lineages (List[FineGrainedLineageClass])
- Return type: None
set_inlets(inlets)
Set the inlets of the data job.
- Parameters: inlets (List[Union[str, DatasetUrn]])
- Return type: None
set_last_modified(last_modified)
Set the last modification timestamp of the data job.
- Parameters: last_modified (datetime)
- Return type: None
set_outlets(outlets)
Set the outlets of the data job.
- Parameters: outlets (List[Union[str, DatasetUrn]])
- Return type: None
property urn : DataJobUrn
Get the entity’s URN.
- Returns: The URN that uniquely identifies this entity.
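A data job URN nests the URN of the flow it belongs to, which is why a data job needs either `flow` or `flow_urn`. The helpers below are hypothetical string formatters that illustrate this nesting; they are not SDK functions, and the orchestrator, flow, and task names are made up.

```python
def data_flow_urn_string(orchestrator: str, flow_id: str, cluster: str = "PROD") -> str:
    """Hypothetical helper: format a data flow URN string."""
    return f"urn:li:dataFlow:({orchestrator},{flow_id},{cluster})"


def data_job_urn_string(flow_urn: str, job_id: str) -> str:
    """Hypothetical helper: format a data job URN that embeds its flow URN."""
    return f"urn:li:dataJob:({flow_urn},{job_id})"


flow_urn = data_flow_urn_string("airflow", "daily_etl")
job_urn = data_job_urn_string(flow_urn, "load_orders")
print(flow_urn)
print(job_urn)
```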
DataFlow
Bases: HasPlatformInstance, HasSubtype, HasOwnership, HasContainer, HasInstitutionalMemory, HasTags, HasTerms, HasDomain, HasStructuredProperties, Entity
Represents a dataflow in DataHub. A dataflow represents a pipeline or workflow that groups related data jobs, such as an Airflow DAG. This class provides methods for managing dataflow metadata, including aspects like ownership, tags, and terms.
- Parameters:
- name (str)
- platform (str)
- display_name (Optional[str])
- platform_instance (Optional[str])
- env (str)
- description (Optional[str])
- external_url (Optional[str])
- custom_properties (Optional[Dict[str, str]])
- created (Optional[datetime])
- last_modified (Optional[datetime])
- subtype (Optional[str])
- owners (Optional[List[Union[CorpUserUrn, CorpGroupUrn, Tuple[Union[CorpUserUrn, CorpGroupUrn], Union[str, OwnershipTypeUrn]], OwnerClass]]])
- links (Optional[Sequence[Union[str, Tuple[str, str], InstitutionalMemoryMetadataClass]]])
- tags (Optional[List[Union[str, TagUrn, TagAssociationClass]]])
- terms (Optional[List[Union[str, GlossaryTermUrn, GlossaryTermAssociationClass]]])
- domain (Union[str, DomainUrn, None])
- parent_container (Union[Container, ContainerKey, List[Union[Urn, str]], Unset])
- structured_properties (Optional[Dict[Union[str, StructuredPropertyUrn], Sequence[Union[str, float, int]]]])
- extra_aspects (Optional[List[TypeVar(Aspect, bound=_Aspect)]])
property created : datetime | None
Get the creation timestamp of the dataflow.
- Returns: The creation timestamp if set, None otherwise.
property custom_properties : Dict[str, str]
Get the custom properties of the dataflow.
- Returns: Dictionary of custom properties.
property description : str | None
Get the description of the dataflow.
- Returns: The description if set, None otherwise.
property display_name : str | None
Get the display name of the dataflow.
- Returns: The display name if set, None otherwise.
property env : str | FabricTypeClass | None
Get the environment of the dataflow.
property external_url : str | None
Get the external URL of the dataflow.
- Returns: The external URL if set, None otherwise.
classmethod get_urn_type()
Get the URN type for dataflows.
- Return type: Type[DataFlowUrn]
- Returns: The DataFlowUrn class.
property last_modified : datetime | None
Get the last modification timestamp of the dataflow.
- Returns: The last modification timestamp if set, None otherwise.
property name : str
Get the name of the dataflow.
- Returns: The name of the dataflow.
set_created(created)
Set the creation timestamp of the dataflow.
- Parameters: created (datetime) – The creation timestamp to set.
- Return type: None
set_custom_properties(custom_properties)
Set the custom properties of the dataflow.
- Parameters: custom_properties (Dict[str, str]) – Dictionary of custom properties to set.
- Return type: None
set_description(description)
Set the description of the dataflow.
- Parameters: description (str) – The description to set.
- Return type: None
NOTE
If called during ingestion, this will warn if overwriting a non-ingestion description.
set_display_name(display_name)
Set the display name of the dataflow.
- Parameters: display_name (str) – The display name to set.
- Return type: None
set_external_url(external_url)
Set the external URL of the dataflow.
- Parameters: external_url (str) – The external URL to set.
- Return type: None
set_last_modified(last_modified)
- Parameters: last_modified (datetime)
- Return type: None
property urn : DataFlowUrn
Get the entity’s URN.
- Returns: The URN that uniquely identifies this entity.