
Snowflake Asset description reverse sync

The Asset description reverse sync package enables users to reverse-sync updated asset descriptions from Atlan back to Snowflake by adding them as comments on the corresponding Snowflake objects. The package generates a stored procedure that users can execute to update the comments on those objects.

In-Scope: Snowflake

Mode: Offline

Prerequisites for Execution

  • Things to note before running the workflow

    Important

    • Atlan must be the sole platform for managing data asset descriptions. Using alternative methods to add descriptions may result in loss of content due to overwrites.
    • Ensure that the view definitions in Atlan are up to date if descriptions of view columns are to be enriched.
  • To successfully execute the artifacts generated by this package on Snowflake, the Snowflake user account performing the operation must have the necessary roles and permissions to update comments on the relevant Snowflake objects.

  • Users can use the Dry Run option when calling the stored procedure (after loading the object data) to preview the changes that will be made to the objects before actually executing the reverse-sync, for example: call procedure('YES');
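    A minimal sketch of the two modes, assuming the generated procedure is named procedure as in the example above (the actual name is defined in procedure.sql):
    call procedure('YES'); -- dry run: preview the comment updates only
    call procedure('NO');  -- actual run: apply the comment updates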
  • The snowflake.account_usage.<object-type-view-name>.<object-type>_owner column must not be empty for comments to be updated successfully. Objects can include tables, schemas, databases, functions, etc. For example, you can use the following query to check the owners of tables -
    select table_name, table_owner from snowflake.account_usage.tables where table_catalog = '<yourDatabaseName>';
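    To list tables whose owner is empty (and whose comments therefore cannot be updated), you can filter on the owner column; a minimal sketch against the same account_usage view:
    select table_name from snowflake.account_usage.tables where table_catalog = '<yourDatabaseName>' and table_owner is null and deleted is null;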
    
  • To retrieve all roles available to the current user, run the SQL below on Snowflake.
    SELECT REPLACE(VALUE, '"') AS AVAILABLE_ROLE FROM TABLE(FLATTEN(input => PARSE_JSON(CURRENT_AVAILABLE_ROLES())));
    
  • To grant OWNERSHIP privileges on an object to an existing role, run the SQL below on Snowflake.
    GRANT OWNERSHIP ON <object_type> <object_name> TO ROLE <role_name> COPY CURRENT GRANTS;
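    For example, with hypothetical names, to transfer ownership of a table:
    GRANT OWNERSHIP ON TABLE my_db.my_schema.my_table TO ROLE my_role COPY CURRENT GRANTS;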
    

Credentials

In this section, you will configure the output destination to which the extracted asset data is written.


  • Atlan API Token: this is required to generate the dynamic dropdowns (see the Asset types in scope screens)

Warning

The generated SQL files can be downloaded from the Atlan UI; follow the steps in the Downloading Artifacts section.


  • The bucket used to store the output is, by default, the S3 bucket attached to the tenant.
  • Atlan API Token: this is required to generate the dynamic dropdowns (see the Asset types in scope screens)

  • S3 Bucket Name

  • S3 Folder Path: the folder path with a trailing slash, for example atlan/output/ (make sure to add a slash / at the end of the folder name)
  • Region: the S3 bucket's region
  • For example, if the bucket name is Atlan_bucket, the path is reverse_description_sync/, and the region is eu-west-2, navigate to Atlan_bucket and then into the reverse_description_sync directory. Inside, you will find two SQL files: procedure.sql and loader.sql.

Configuration


  • Extraction mode - Defaults to Offline.
  • Connection - Select the Snowflake connection you want to use for syncing descriptions from Atlan to Snowflake.
  • Asset types in scope - Choose the asset types you want to include in the reverse-sync operation, such as ['Table', 'Column']. This selection will ensure that only Tables and Columns with updated user descriptions are included in the process.
  • Recompile Views? - If YES is selected, the workflow will update descriptions for columns in views, which requires recompiling the updated view definition SQL and could result in momentary unavailability of the impacted views (see the sketch after this list). If NO is selected, the workflow will not update descriptions for columns in views.
  • Interval for Run - Select the time interval during which the user description was updated on Atlan for the selected assets.
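
A minimal sketch of why recompilation is needed (the view and column names below are hypothetical): in Snowflake, a comment on a view column can only be set as part of the view definition, so updating it means re-creating, and thereby recompiling, the view:

    CREATE OR REPLACE VIEW my_db.my_schema.customer_view (
        customer_id COMMENT 'Synced from Atlan: unique customer identifier',
        customer_name
    ) AS
    SELECT customer_id, customer_name
    FROM my_db.my_schema.customers;

This recompilation is why the impacted views can be momentarily unavailable while the update runs.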

Audit Trail


Database Name and Schema Name are used to create two types of tables during the execution of the stored procedure:

  • tbl_atlan_reverse_sync_inventory (Temporary Table) - This table exists only for the duration of the current database session and stores the intermediate data required for updating comments. Once the session ends or the user logs out, it is automatically dropped, leaving no persistent footprint.

  • tbl_atlan_reverse_sync_audit (Persistent Table) - This table is created only if it does not already exist, i.e., the first time the stored procedure is executed in the environment. It persists across sessions and stores the data that must be retained beyond the current session, so that a record of each run remains available for future operations.
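
After a run, you can inspect the persistent audit table directly; a minimal sketch, assuming the Database Name and Schema Name configured above (the table's actual columns are defined by the generated artifacts):

    SELECT * FROM <database_name>.<schema_name>.tbl_atlan_reverse_sync_audit;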

Downloading Artifacts

  • The workflow generates two SQL files - loader.sql and procedure.sql. The former handles the creation of the temporary/audit tables and populating them appropriately, while the latter handles the actual reverse-sync of descriptions for the assets in scope.

    Download Files

    • You can download the files from the Atlan UI; see the Atlan download reference.
    • You can use the Atlan API/SDK to download the files from the object store as illustrated here. The keys would be <s3-prefix-value-provided-during-configuration>/loader.sql and <s3-prefix-value-provided-during-configuration>/procedure.sql.

Executing Artifacts

  • Steps to execute

    Execute on Snowflake UI
    • First, create the stored procedure on Snowflake. To do this, copy the content of procedure.sql and execute it on Snowflake. This will create the procedure on Snowflake. It will remain in the Snowflake environment until explicitly deleted.
    • Once this is done, copy all the content from loader.sql and paste it into a separate Snowflake worksheet.
    • Select the appropriate database and schema at the top of the worksheet.
    • Scroll to the end of the file where you will find the CALL statement. If you want to execute the stored procedure in Dry-run mode, pass YES as the parameter; otherwise, pass NO.
    • After that, select all the statements and execute them.
    Execute using SnowSQL
    • Install SnowSQL on your machine and configure it. Refer to the SnowSQL Installation and Configuration Guide for detailed instructions.
    • If your Snowflake account uses SSO for authentication, use the following commands to authenticate and execute the artifacts:
      snowsql -a <account_id> -u <username> --authenticator externalbrowser -r <role> -w <warehouse> -d <database> -s <schema> -f procedure.sql
      snowsql -a <account_id> -u <username> --authenticator externalbrowser -r <role> -w <warehouse> -d <database> -s <schema> -f loader.sql
    • If you use a password for login, execute the following commands:
      snowsql -a <account_name> -u <username> -r <role> -w <warehouse> -d <database> -s <schema> -f procedure.sql
      snowsql -a <account_name> -u <username> -r <role> -w <warehouse> -d <database> -s <schema> -f loader.sql

What it does

  • Gets assets whose user description was updated in the selected time range and applies those descriptions as comments on the corresponding Snowflake objects.

How it works

  • Extract assets from the selected Snowflake connection.
  • Transform the data and convert it into a single Python dictionary.
  • Create the main stored procedure and an SQL file containing multiple insert statements, along with a call statement for the main stored procedure, using this data and a predefined Jinja template.
  • Based on the selected output method, the files are either uploaded to an S3 bucket or made available for download as Argo artifacts.
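
A minimal sketch of what the two generated files conceptually contain (illustrative only: the inventory columns and values below are hypothetical, and the actual statements are rendered from the Jinja template):

    -- procedure.sql: creates the stored procedure that applies the comments
    -- loader.sql: inserts the in-scope assets into the inventory table, then calls the procedure
    INSERT INTO tbl_atlan_reverse_sync_inventory (object_type, object_name, user_description)
    VALUES ('TABLE', 'MY_DB.MY_SCHEMA.CUSTOMERS', 'Synced from Atlan');
    call procedure('NO');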
