Google Cloud Storage -> BigQuery (external table) lineage

The Google Cloud Storage -> BigQuery (external table) lineage package generates lineage between BigQuery external tables and their upstream GCS objects. Both the BigQuery tables and the GCS objects must be crawled in Atlan before you run this package.

Configuration

  • BigQuery connection: the BigQuery connection where the external tables are stored.
  • Regex to match characters to replace: regular expression that matches the characters to replace. It is applied to the full file name (without the bucket prefix).
  • Regex with replacement characters: the replacement expression substituted for each match. It is applied to the full file name (without the bucket prefix).
  • Operation (default: Generate Lineage):
    • Generate Lineage: to generate the lineage on Atlan.
    • Delete Lineage: to delete the lineage on Atlan (only the lineage previously created by this package will be deleted).
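
To illustrate how the two regex settings might interact, here is a minimal Python sketch. It assumes that GCS locations are written as gs:// URIs and that the replacement behaves like Python's re.sub; the package's actual regex engine and matching rules are not documented here. The function names strip_bucket_prefix and apply_replacement, and the bucket and path, are hypothetical.

```python
import re


def strip_bucket_prefix(uri: str) -> str:
    """Return the object path of a gs:// URI, without the bucket prefix."""
    prefix = "gs://"
    if not uri.startswith(prefix):
        raise ValueError(f"not a GCS URI: {uri!r}")
    # Everything before the first "/" after the scheme is the bucket name.
    _bucket, _, object_path = uri[len(prefix):].partition("/")
    return object_path


def apply_replacement(uri: str, pattern: str, replacement: str) -> str:
    """Apply the match/replace pair to the full file name (no bucket prefix).

    Assumption: the two settings behave like re.sub(pattern, replacement, name).
    """
    return re.sub(pattern, replacement, strip_bucket_prefix(uri))


# Hypothetical example: drop a trailing wildcard from an external table's source URI.
print(apply_replacement("gs://my-bucket/sales/2024/*.parquet", r"\*\.parquet$", ""))
# prints: sales/2024/
```

Because the pattern only ever sees the path after the bucket name, anchors such as ^ and $ refer to the start and end of the object path, not of the whole URI.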

What it does

The package performs the following steps:

  • Retrieve the external tables from BigQuery.
  • Retrieve the GCS objects associated with each BigQuery table from the BigQuery metadata.
  • Search Atlan for all GCS objects that match the file name extracted from the BigQuery metadata.
  • If Operation = Generate Lineage, publish the lineage to Atlan; if Operation = Delete Lineage, delete the lineage from Atlan.
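
The steps above can be sketched in Python with in-memory stand-ins. This is not the package's implementation: no BigQuery or Atlan API is called, ExternalTable, object_name, plan_lineage and apply_operation are hypothetical names, and matching by exact equality on the object path is an assumption, since the actual matching rule is not specified.

```python
import re
from dataclasses import dataclass
from typing import List, Set, Tuple

Edge = Tuple[str, str]  # (upstream GCS object, downstream BigQuery table)


@dataclass(frozen=True)
class ExternalTable:
    """Hypothetical stand-in for an external table and its source URIs."""
    name: str
    source_uris: Tuple[str, ...]


def object_name(uri: str, pattern: str = "", replacement: str = "") -> str:
    """File name without the bucket prefix, after the optional regex replacement."""
    path = uri[len("gs://"):] if uri.startswith("gs://") else uri
    path = path.partition("/")[2]  # drop the bucket name
    return re.sub(pattern, replacement, path) if pattern else path


def plan_lineage(tables: List[ExternalTable], crawled_objects: List[str],
                 pattern: str = "", replacement: str = "") -> List[Edge]:
    """Pair each external table with the crawled GCS objects it reads from."""
    edges = []
    for table in tables:
        for uri in table.source_uris:
            wanted = object_name(uri, pattern, replacement)
            # Assumption: a crawled object matches when its path equals `wanted`.
            edges.extend((obj, table.name) for obj in crawled_objects if obj == wanted)
    return edges


def apply_operation(operation: str, edges: List[Edge], published: Set[Edge]) -> Set[Edge]:
    """Publish or delete edges, depending on the Operation setting."""
    if operation == "Generate Lineage":
        return published | set(edges)
    if operation == "Delete Lineage":
        return published - set(edges)
    raise ValueError(f"unknown operation: {operation!r}")


tables = [ExternalTable("proj.ds.orders_ext", ("gs://raw/exports/orders.csv",))]
crawled = ["exports/orders.csv", "exports/customers.csv"]
edges = plan_lineage(tables, crawled)
print(edges)  # prints: [('exports/orders.csv', 'proj.ds.orders_ext')]
print(apply_operation("Generate Lineage", edges, set()))
```

Keeping the planning step separate from the publish/delete step means the same set of edges drives both operations, which mirrors the statement that Delete Lineage removes only lineage previously created by this package.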