BigQuery Linked-Dataset lineage generator

Establishes lineage for linked datasets created by subscribing to a Listing on a BigQuery Analytics-Hub Data-Exchange.

Subscribing to a Listing creates a linked-dataset in the target project that is essentially a symlink to the dataset in the source project associated with the Listing. All tables within the source dataset are also available under the linked-dataset and are immutable. This package determines the linkages between the linked and source datasets, and uses that information to build lineage across the tables in both datasets.

Configuration

Connection

Specify the BigQuery connection that contains the linked-datasets for which lineage should be established.

What it does

The package performs the following steps:

  • Analyzes the dataset metadata (Schema assets) to identify the linked datasets and their associated source datasets.
  • Matches ingested Tables in both datasets and creates a lineage path between each matched pair.
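
The two steps above can be illustrated with a minimal sketch. It is not the package's actual implementation. The `Schema` and `Table` classes, the `source` field, and the function names are hypothetical stand-ins for the ingested metadata. The sketch also assumes that tables are matched by identical name and that lineage flows from the source table to the linked table.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

DatasetKey = Tuple[str, str]      # (project, dataset)
TableKey = Tuple[str, str, str]   # (project, dataset, table)


@dataclass(frozen=True)
class Schema:
    """Hypothetical stand-in for an ingested Schema (dataset) asset."""
    project: str
    name: str
    # Assumption: a linked dataset records its source as (project, dataset).
    source: Optional[DatasetKey] = None


@dataclass(frozen=True)
class Table:
    """Hypothetical stand-in for an ingested Table asset."""
    project: str
    dataset: str
    name: str


def find_linked_datasets(schemas: List[Schema]) -> Dict[DatasetKey, DatasetKey]:
    """Step 1: map each linked dataset to its source dataset."""
    return {(s.project, s.name): s.source for s in schemas if s.source is not None}


def build_lineage(schemas: List[Schema], tables: List[Table]) -> List[Tuple[TableKey, TableKey]]:
    """Step 2: pair same-named tables and emit (source, linked) edges."""
    links = find_linked_datasets(schemas)
    ingested = {(t.project, t.dataset, t.name) for t in tables}
    edges = []
    for t in tables:
        source = links.get((t.project, t.dataset))
        if source is None:
            continue  # table does not live in a linked dataset
        upstream = (source[0], source[1], t.name)
        # Only create an edge when the source table was ingested as well.
        if upstream in ingested:
            edges.append((upstream, (t.project, t.dataset, t.name)))
    return edges


# Illustrative data only; project, dataset, and table names are made up.
schemas = [
    Schema("src-proj", "sales"),
    Schema("tgt-proj", "sales_linked", source=("src-proj", "sales")),
]
tables = [
    Table("src-proj", "sales", "orders"),
    Table("src-proj", "sales", "customers"),
    Table("tgt-proj", "sales_linked", "orders"),
]
edges = build_lineage(schemas, tables)
```

In this example only `orders` has been ingested on both sides, so `edges` holds a single pair; `customers` produces no edge because its linked counterpart is missing from the ingested tables.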