
Metadata impact report


The metadata impact report package calculates various metrics about your data landscape from the metadata you manage in Atlan.

Use cases

  • Monitoring the rollout of Atlan over time, or as your data landscape evolves
  • Identifying opportunities for cost savings, based on indicators in the metadata (such as assets that are unused)
  • Tracking adoption and enrichment of assets in Atlan, possibly as part of a gamification drive during onboarding or rollout sprints

Configuration

Outputs

The report will always produce an Excel file, but you can choose whether to also create a glossary of metrics in Atlan:

  • Generate glossary?: whether to also create a glossary of metrics in Atlan.

    • Yes: creates (or updates, if it already exists) a glossary describing the various metrics the report calculates. The quantified number for each metric is also updated each time the report runs.

      • Glossary name: name of the glossary in which to store the metadata metrics.

    • No: only creates the Excel output.

  • Include details: whether to include detailed results (Yes), or only the overall quantified metrics (No), in the Excel file produced.

Delivery

  • Export via: how to deliver the results. The results will always be downloadable from the workflow.

    • Direct: if a direct download from the workflow is all you require, you can leave this default selected.

    • Email: the results will also be emailed to the list of addresses provided.

      • Email address(es): a comma-separated list of email addresses to which to send the results as an attachment.

    • Object storage: the results will also be uploaded to the object storage location provided.

      • Prefix (path): the directory (path) within the object store into which to upload the exported file.
      • Object key (filename): the object key (filename), including its extension, within the object store and prefix.
      • Cloud object store: the object store into which to upload the results.

      • S3

        • AWS access key: your AWS access key.
        • AWS secret key: your AWS secret key.
        • Region: your AWS region.
        • Bucket: your AWS bucket.

        Reusing Atlan's backing S3 store

        When your Atlan tenant is deployed in AWS, you can leave all of these blank to reuse the backing store of Atlan itself.

      • GCS

        • Project ID: the ID of your GCP project.
        • Service account JSON: your service account credentials, as JSON.
        • Bucket: your GCS bucket.

        Reusing Atlan's backing GCS store

        When your Atlan tenant is deployed in GCP, you can leave all of these blank to reuse the backing store of Atlan itself.

      • ADLS

        • Azure client ID: the unique application (client) ID assigned to your app by Azure AD when the app was registered.
        • Azure client secret: your Azure client secret (its actual value, not its identifier).
        • Azure tenant ID: the unique identifier of your Azure Active Directory instance.
        • Storage account name: name of your storage account.
        • Container: your ADLS container.

        Reusing Atlan's backing ADLS store

        When your Atlan tenant is deployed in Azure, you can leave all of these blank to reuse the backing store of Atlan itself.
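Conceptually, the prefix and object key simply combine into the full upload path within the chosen bucket or container. A minimal Python sketch of that composition (the helper name is hypothetical, not part of the package):

```python
import posixpath

def build_object_path(prefix: str, object_key: str) -> str:
    """Combine the configured prefix (path) and object key (filename)
    into the full path within the object store bucket or container."""
    # Strip stray slashes so "reports/" + "/impact.xlsx" still joins cleanly
    return posixpath.join(prefix.strip("/"), object_key.lstrip("/"))

print(build_object_path("metadata/reports/", "impact-report.xlsx"))
# metadata/reports/impact-report.xlsx
```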

What it does

The package tracks metadata metrics in Atlan through a dedicated glossary, and also produces an Excel report that consolidates all of the information into an easy-to-share format.

Glossary

A single glossary, named as specified in the input, will be created (or updated, if it already exists).

Terms

One term will be created (or updated, if it already exists) for each of the metrics defined in the report. Each term will be complete with:

  • Name and acronym
  • Description of what it represents
  • A quantified number
  • A certificate indicating whether there are any caveats (draft) or not (verified)
  • A warning announcement describing any caveats with the metric
  • An information announcement describing any other points of note with the metric
  • Categorization of each term to represent its typical use

For details of each metric, refer to the glossary

For details about each metric, we recommend running the report in your Atlan environment and reviewing the metrics in the glossary it produces. There you can also capture any further information specific to your own needs and use of each metric: either by updating the description1 or by providing even more detail in a README.

Excel file

In addition to the terms, an Excel file will be produced that includes these sheets:

Overview

An Overview sheet summarizes every metric, one per row:

  • Metric: acronym and name of the metric.
  • Description: an explanation of what the metric measures, and how it can be used.
  • Result: quantified number for the metric, produced from the metadata in your Atlan environment.
  • Caveats: an explanation of any caveats to be aware of with the quantified number.
  • Notes: any other notes to be aware of regarding the metric.
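As an illustration, each metric's Overview row could be assembled like this. The sketch below is hypothetical (the metric dict, its keys, and the example metric are invented for illustration, not taken from the package); a library such as openpyxl would then write the rows out to Excel:

```python
# Column order of the Overview sheet: Metric, Description, Result, Caveats, Notes
OVERVIEW_COLUMNS = ["Metric", "Description", "Result", "Caveats", "Notes"]

def overview_row(metric: dict) -> list:
    """Map one metric's details onto the Overview sheet's columns.
    The metric dict and its keys are hypothetical, for illustration only."""
    return [
        f"{metric['acronym']} - {metric['name']}",
        metric["description"],
        metric["result"],
        metric.get("caveats", ""),  # blank when the metric has no caveats
        metric.get("notes", ""),
    ]

row = overview_row({
    "acronym": "AUM",
    "name": "Assets unused (months)",  # invented example metric
    "description": "Assets not queried recently.",
    "result": 42,
})
```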

Per-metric worksheet

If requested (using "Include details"), a separate worksheet listing every asset that was counted as part of calculating each quantified metric is also included in the Excel report. The columns vary depending on the metric, but will typically include:

  • Connector: type of connector for the asset, for example Snowflake.
  • Database: name of the database for the asset.
  • Schema: name of the schema for the asset.
  • Name: name of the asset.
  • Type: type of the asset, for example table vs view.
  • Size (GB): amount of storage, in gigabytes, used by the asset (at source).
  • Cost: approximate compute cost associated with queries run against the asset.
  • Link: a link directly to this asset's detailed profile within Atlan.

How it works

The report is very modular, with each metric defined individually as:

  • A name, acronym, description, and any possible caveats or notes
  • A query that counts the assets that match the metric
  • A tabular heading and per-record format for listing out the detailed assets that fit the metric

The report itself simply iterates through each of these metrics and:

  1. Runs the query.
  2. Retrieves a total quantity from the result, to use as the overall quantified number.
  3. Iterates through the individual results of the search to output each detailed asset (if requested).

The glossary (shared by all metrics) and the term for each metric are idempotently created or updated at each step of the process with the calculated details.
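The flow above can be sketched as a simple loop. This is a hypothetical illustration, not the package's actual code: the `Metric` dataclass, `run_report`, and the dict standing in for the glossary are all invented to show the shape of the iteration and the idempotent upsert:

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Metric:
    # Hypothetical structure mirroring how each metric is defined:
    # identity and description, a counting query, and a detail formatter.
    name: str
    acronym: str
    description: str
    query: Callable[[], list]           # returns the matching assets
    detail_row: Callable[[dict], list]  # formats one asset for the worksheet

def run_report(metrics: list, glossary: dict, include_details: bool = True) -> dict:
    """Iterate each metric: run its query, record the quantified total,
    optionally collect detail rows, and idempotently upsert the term."""
    results = {}
    for m in metrics:
        assets = m.query()                            # 1. run the query
        total = len(assets)                           # 2. overall quantified number
        details = [m.detail_row(a) for a in assets] if include_details else []
        # Idempotent create-or-update of the term, keyed by acronym:
        # re-running replaces the same entry rather than duplicating it
        glossary[m.acronym] = {"name": m.name, "result": total}
        results[m.acronym] = (total, details)
    return results

glossary: dict = {}
metrics = [Metric("Unused tables", "UT", "Tables never queried.",
                  query=lambda: [{"name": "t1"}, {"name": "t2"}],
                  detail_row=lambda a: [a["name"]])]
out = run_report(metrics, glossary)
```

Re-running `run_report` with the same metrics overwrites each term's entry in place, mirroring the create-or-update behavior described above.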


  1. If you update the description through the UI, your own description will always take precedence. Even re-running the package will only ever update the background description, which will not appear on the UI and will not clobber or replace your overridden description.