Asset export (basic)¶
The asset export (basic) package identifies all assets that could have been enriched in some way through Atlan's UI and extracts them. The resulting CSV file can be modified or enriched, and then loaded back using the asset import package.
Use cases¶
- Back up any manual enrichment users have done through the UI
- Migrate manual enrichment done to one set of data assets to another (for example, when migrating from one data warehouse to another but retaining the same data structures)
Configuration¶
Intentionally limited
The configurable options of this package are intentionally limited, to make using it as simple as possible. For more extensive options, see the asset export (advanced) package.
Scope¶
- Export scope: which assets you want to export. Each choice behaves as follows:
  - Only glossaries, terms and categories will be exported.
  - Only those assets that were enriched by users will be exported.
    - Qualified name prefix (for assets): starting value for a qualifiedName that will determine which assets to export, default: default (all data assets).
    - Include description?: whether to extract only any user-provided description (No) or include system-crawled descriptions as well (Yes).
    - Include glossaries?: whether glossaries (and their terms and categories) should be exported, too (Yes) or not (No).
  - All assets, whether enriched by users or not, will be exported.
    - Qualified name prefix (for assets): starting value for a qualifiedName that will determine which assets to export, default: default (all data assets).
    - Include description?: whether to extract only any user-provided description (No) or include system-crawled descriptions as well (Yes).
    - Include glossaries?: whether glossaries (and their terms and categories) should be exported, too (Yes) or not (No).
Delivery¶
- Export via: the results will always be downloadable from the workflow. Each choice behaves as follows:
  - If a direct download from the workflow is all you require, you can leave this default selected.
  - The results will also be emailed to the list of addresses provided.
    - Email address(es): enter a list of email addresses, comma-separated, where you want the results to be sent as an attachment.
  - The results will also be uploaded to the object storage location provided.
    - Prefix (path): the directory (path) within the object store into which to upload the exported file.
    - Object key (filename): the object key (filename), including its extension, within the object store and prefix.
    - Cloud object store: the object store into which to upload the results.
      - For S3:
        - AWS access key: your AWS access key.
        - AWS secret key: your AWS secret key.
        - Region: your AWS region.
        - Bucket: your AWS bucket.
        Reusing Atlan's backing S3 store
        When your Atlan tenant is deployed in AWS, you can leave all of these blank to reuse the backing store of Atlan itself.
      - For GCS:
        - Project ID: the ID of your GCP project.
        - Service account JSON: your service account credentials, as JSON.
        - Bucket: your GCS bucket.
        Reusing Atlan's backing GCS store
        When your Atlan tenant is deployed in GCP, you can leave all of these blank to reuse the backing store of Atlan itself.
      - For ADLS:
        - Azure client ID: the unique application (client) ID assigned to your app by Azure AD when the app was registered.
        - Azure client secret: your Azure client secret (its actual value, not its identifier).
        - Azure tenant ID: the unique identifier of the Azure Active Directory instance.
        - Storage account name: name of your storage account.
        - Container: your ADLS container.
        Reusing Atlan's backing ADLS store
        When your Atlan tenant is deployed in Azure, you can leave all of these blank to reuse the backing store of Atlan itself.
What it does¶
Assets CSV file¶
The assets that match the supplied input criteria will be extracted into a CSV file, one per row. Each row will contain:
- The type of the asset
- The qualifiedName of the asset
- Values for all possible enrichment to that asset
Detailed information on the columns in the CSV file produced:
qualifiedName
¶
Required. Unique name of the asset. This combined with typeName will be unique in Atlan.
typeName
¶
Required. Type of the asset.
name
¶
Required. Name of the asset. This will be the technical name, as crawled from a source system.
connectionQualifiedName
¶
Required. Qualified name of the connection for the asset. Without this, discovery filters will not work for the asset.
connectorType
¶
Required. Name of the connector type for the asset (for example: snowflake). Without this, discovery filters will not work for the asset.
displayName
¶
An optional name you can give to the asset to override how it is displayed in the Atlan UI. If present, this will be shown in the UI instead of name.
description
¶
Explanation of the asset, possibly crawled from a source system.
userDescription
¶
Explanation of the asset, as entered or confirmed by a user through the Atlan UI. If present, this will be shown in the UI instead of description.
ownerUsers
¶
Individual users who are owners of the asset. Each user should be separated by a newline within the cell.
ownerGroups
¶
Groups of users who are owners of the asset. Each group should be separated by a newline within the cell.
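Because multi-valued cells such as ownerUsers and ownerGroups separate their values with a newline inside the cell, whatever reads the file has to honor quoted, embedded newlines. The following is a minimal sketch using Python's standard csv module, assuming the exported file uses ordinary CSV quoting. The helper name split_multi_value and the sample row are hypothetical and only serve the illustration.

```python
import csv
import io


def split_multi_value(cell: str) -> list[str]:
    """Split a newline-delimited cell into its individual values, skipping blanks."""
    return [value.strip() for value in cell.splitlines() if value.strip()]


# Hypothetical extract: the ownerUsers cell holds two users separated by a newline.
sample = (
    "qualifiedName,typeName,ownerUsers\r\n"
    '"default/snowflake/1234567890/DB","Database","alice\nbob"\r\n'
)

# newline="" leaves line endings untouched so the csv module can handle
# newlines that appear inside quoted cells.
rows = list(csv.DictReader(io.StringIO(sample, newline="")))
owners = split_multi_value(rows[0]["ownerUsers"])
```

When reading from disk rather than from a string, opening the file with newline="" serves the same purpose.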
certificateStatus
¶
Certificate on the asset. Must either be empty or one of:
VERIFIED
DRAFT
DEPRECATED
certificateStatusMessage
¶
An optional message that can be associated with the certificate (only used if certificateStatus is non-empty).
announcementType
¶
Type of announcement on the asset. Must either be empty or one of:
information
warning
issue
announcementTitle
¶
Heading line for the announcement on the asset (only used if announcementType is non-empty).
announcementMessage
¶
An optional detailed message that can be associated with the announcement (only used if announcementType is non-empty).
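If you edit the file before loading it back, it can help to check the enumerated columns first. The sketch below encodes only the rules stated above for certificateStatus and announcementType. The function name find_problems and the wording of its messages are hypothetical, and the asset import package may well apply its own, different validation.

```python
CERTIFICATE_STATUSES = {"VERIFIED", "DRAFT", "DEPRECATED"}
ANNOUNCEMENT_TYPES = {"information", "warning", "issue"}


def find_problems(row: dict) -> list:
    """Return human-readable problems found in one CSV row (empty list if none)."""
    problems = []

    certificate = row.get("certificateStatus", "")
    if certificate and certificate not in CERTIFICATE_STATUSES:
        problems.append(f"certificateStatus: unexpected value {certificate!r}")
    if not certificate and row.get("certificateStatusMessage"):
        problems.append("certificateStatusMessage: only used if certificateStatus is non-empty")

    announcement = row.get("announcementType", "")
    if announcement and announcement not in ANNOUNCEMENT_TYPES:
        problems.append(f"announcementType: unexpected value {announcement!r}")
    if not announcement:
        # Title and message are only meaningful alongside an announcement type.
        for column in ("announcementTitle", "announcementMessage"):
            if row.get(column):
                problems.append(f"{column}: only used if announcementType is non-empty")

    return problems
```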
assignedTerms
¶
Business terms that are assigned to the asset. Each term should be separated by a newline within the cell, and formatted as:
Term Name@@@Glossary Name
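As an illustration of this cell format, the following sketch splits an assignedTerms cell into (term, glossary) pairs. The names TermRef and parse_terms, and the sample terms in the usage note, are hypothetical. It assumes that @@@ does not occur inside the term name itself.

```python
from typing import NamedTuple


class TermRef(NamedTuple):
    term: str
    glossary: str


def parse_terms(cell: str) -> list:
    """Parse a newline-delimited cell of 'Term Name@@@Glossary Name' entries."""
    refs = []
    for line in cell.splitlines():
        if not line.strip():
            continue
        term, separator, glossary = line.partition("@@@")
        if not separator:
            raise ValueError(f"missing @@@ delimiter in {line!r}")
        refs.append(TermRef(term.strip(), glossary.strip()))
    return refs
```

For example, parse_terms("Revenue@@@Finance\nChurn@@@Marketing") yields two entries, one per line of the cell.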
atlanTags
¶
Atlan tags that are assigned to the asset. Each tag should be separated by a newline within the cell, and formatted as one of:
- Tag Name, for tags that should be directly assigned and should not be propagated
- Tag Name>>FULL, for tags that should be directly assigned to the asset and propagated down their hierarchy and through lineage
- Tag Name>>HIERARCHY_ONLY, for tags that should be directly assigned to the asset and only be propagated down their hierarchy (not through lineage)
- Tag Name<<PROPAGATED, for tags that have been propagated to the asset.
Propagated tags will be ignored on import
Any tag marked propagated (Tag Name<<PROPAGATED) will be ignored by an import. Only those tags that are directly applied will be imported, though any tags applied up-hierarchy or upstream that are marked to propagate will still propagate accordingly.
For source tags (with values), you can extend the tag name portion as follows:
Tag Name {{connectorType/connectorName@@sourceTagLocation??key=value}}
where:
- connectorType is the type of the source tag (snowflake, dbt, etc)
- connectorName is the name of the connection for the source the tag is synced from
- sourceTagLocation is the path within that connection where the source tag exists
- key is an optional key for the associated value for the tag
- value is the value for the associated tag
For example, this will associate the Confidential Atlan tag, which is synced with the CONFIDENTIAL Snowflake tag in the ANALYTICS database's WIDE_WORLD_IMPORTERS schema and is synced through the development Snowflake connection. It has a value of Not Restricted in Snowflake, and the tag itself should be fully propagated.
Confidential {{snowflake/development@@ANALYTICS/WIDE_WORLD_IMPORTERS/CONFIDENTIAL??=Not Restricted}}>>FULL
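To make the tag notation concrete, here is a rough sketch of a parser for a single atlanTags entry, written only from the formats described above. The names TagEntry, parse_tag and TAG_PATTERN are hypothetical, as is the "NONE" label used for tags without a propagation suffix. This is not the parser used by the import package.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Tag name, then an optional {{...}} source-tag section, then an optional suffix.
TAG_PATTERN = re.compile(
    r"^(?P<name>.*?)"
    r"(?:\s*\{\{(?P<ctype>[^/]+)/(?P<cname>[^@]+)@@(?P<location>.*?)"
    r"\?\?(?P<key>[^=]*)=(?P<value>.*?)\}\})?"
    r"(?P<suffix>>>FULL|>>HIERARCHY_ONLY|<<PROPAGATED)?$"
)


@dataclass
class TagEntry:
    name: str
    propagation: str  # "NONE", "FULL", "HIERARCHY_ONLY" or "PROPAGATED"
    source: Optional[dict] = None  # populated only for source tags


def parse_tag(entry: str) -> TagEntry:
    """Parse one line of an atlanTags cell."""
    match = TAG_PATTERN.match(entry.strip())
    if match is None or not match["name"].strip():
        raise ValueError(f"unrecognized tag entry: {entry!r}")
    suffix = match["suffix"] or ""
    propagation = suffix[2:] if suffix else "NONE"  # drop the leading >> or <<
    source = None
    if match["ctype"] is not None:
        source = {
            "connector_type": match["ctype"],
            "connector_name": match["cname"],
            "location": match["location"],
            "key": match["key"],
            "value": match["value"],
        }
    return TagEntry(match["name"].strip(), propagation, source)


def importable(cell: str) -> list:
    """Keep only directly applied tags, since propagated ones are ignored on import."""
    entries = [parse_tag(line) for line in cell.splitlines() if line.strip()]
    return [entry for entry in entries if entry.propagation != "PROPAGATED"]
```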
links
¶
List of resources (links) assigned to the asset. Each link should be separated by a newline within the cell, and formatted as embedded JSON:
{"typeName":"Link","attributes":{"name":"linkName","link":"https://www.example.com"}}
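A short sketch of reading and writing this cell format with Python's standard json module follows. The helper names parse_links and format_link are hypothetical. Only the name and link attributes shown in the example above are handled.

```python
import json


def parse_links(cell: str) -> list:
    """Return (name, url) pairs from a newline-delimited cell of embedded JSON links."""
    links = []
    for line in cell.splitlines():
        if not line.strip():
            continue
        attributes = json.loads(line)["attributes"]
        links.append((attributes["name"], attributes["link"]))
    return links


def format_link(name: str, url: str) -> str:
    """Render one link as a single compact line of JSON, suitable for one line of the cell."""
    document = {"typeName": "Link", "attributes": {"name": name, "link": url}}
    return json.dumps(document, separators=(",", ":"))
```

Keeping each link on a single line matters here, because a newline is what separates one link from the next within the cell.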
readme
¶
Richly-formatted, detailed documentation for the asset. This should be an HTML-formatted string containing everything that would be inside <body></body>, without the <body></body> wrapping.
starredDetails
¶
Details about users who have starred the asset. Each starred asset detail entry should be separated by a newline within the cell, and formatted as embedded JSON:
{"assetStarredBy":"someone","assetStarredAt":1698769268966}
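The sketch below reads such a cell and converts assetStarredAt into a timezone-aware datetime. It assumes the number is an epoch timestamp in milliseconds, which the magnitude of the example value suggests but the text above does not state. The names parse_starred_details and EPOCH are hypothetical.

```python
import json
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def parse_starred_details(cell: str) -> list:
    """Return (username, starred_at) pairs from a newline-delimited starredDetails cell."""
    details = []
    for line in cell.splitlines():
        if not line.strip():
            continue
        entry = json.loads(line)
        # Assumption: assetStarredAt counts milliseconds since the Unix epoch (UTC).
        starred_at = EPOCH + timedelta(milliseconds=entry["assetStarredAt"])
        details.append((entry["assetStarredBy"], starred_at))
    return details
```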
{CM}::{attribute}
¶
Any number of columns using a :: separator in their heading represent custom metadata.
- The {CM} portion must give the name of the custom metadata.
- The {attribute} portion must give the name of an attribute within the custom metadata.
Both are the human-readable names.
For multi-valued custom metadata attributes, each value should be separated by a newline within the cell. Date values should be provided as an epoch-style timestamp (purely numeric).
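As a small illustration, custom metadata columns can be picked out of the header row by looking for the :: separator. The function name and the sample column headings used in the usage note are hypothetical.

```python
def custom_metadata_columns(header: list) -> dict:
    """Map each '{CM}::{attribute}' heading to its (custom metadata, attribute) names."""
    columns = {}
    for heading in header:
        custom_metadata, separator, attribute = heading.partition("::")
        # Only headings with text on both sides of the separator qualify.
        if separator and custom_metadata and attribute:
            columns[heading] = (custom_metadata, attribute)
    return columns
```

For a header such as qualifiedName, typeName, Data Quality::Completeness, only the last heading would be returned, split into Data Quality and Completeness.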
.. (remaining columns) ..
¶
For an import, you can also supply any number of additional columns. These will be loaded as attributes on the asset on that row, and should contain the type of data expected for that attribute. (If a particular attribute does not apply to the type of asset on that row, its cell should be left blank.)
You can find a listing of all attributes available, per asset type, through the full model reference. (Search within the diagram for the asset type of interest, or browse for it under Types along the navigation bar on the left.)
For attributes that specify a relationship to another asset, use this format:
TypeName@qualifiedName
For example:
Table@default/snowflake/1234567890/DB/SCHEMA/TABLE_NAME
Glossaries CSV file¶
If requested, the glossaries, categories and terms will all be extracted into a separate CSV file, one per row. Each row will contain:
- The type of the asset
- The qualifiedName of the asset
- Values for all possible enrichment to that asset
Detailed information on the columns in the CSV file produced:
qualifiedName
¶
Purely for your own reference, this is ignored during any import (and can therefore be empty).
typeName
¶
Required. Type of the asset.
name
¶
Required. Name of the asset.
anchor
¶
Required. Name of the glossary in which the term or category is contained.
parentCategory
¶
Parent category when the category on the row is a subcategory. The categories should use @ as a path-delimiter, and should have @@@ followed by the glossary name at the end. For example, if the parent category is called Lowest, which itself is a subcategory of Middle, itself a subcategory of Top:
Top@Middle@Lowest@@@Glossary Name
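The two delimiters can be separated in two steps: first split off the glossary at @@@, then split the remaining path at @. A minimal sketch, with the hypothetical name parse_category_path, assuming category names do not themselves contain @:

```python
def parse_category_path(value: str) -> tuple:
    """Split 'Top@Middle@Lowest@@@Glossary Name' into (category path, glossary name)."""
    path, separator, glossary = value.rpartition("@@@")
    if not separator:
        raise ValueError(f"missing @@@ glossary delimiter in {value!r}")
    # The path lists categories from the top-most down to the parent itself.
    return path.split("@"), glossary
```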
categories
¶
Categories in which a term is organized. Each category should be separated by a newline within the cell, and formatted as indicated above: @ as a path-delimiter and @@@ as the glossary delimiter.
displayName
¶
An optional name you can give to the asset to override how it is displayed in the Atlan UI. If present, this will be shown in the UI instead of name.
description
¶
Explanation of the asset, as a fallback. For example, if you want to pre-populate the description for the asset but allow users to override it through Atlan's UI.
userDescription
¶
Explanation of the asset, as entered or confirmed by a user through the Atlan UI. If present, this will be shown in the UI instead of description.
ownerUsers
¶
Individual users who are owners of the asset. Each user should be separated by a newline within the cell.
ownerGroups
¶
Groups of users who are owners of the asset. Each group should be separated by a newline within the cell.
certificateStatus
¶
Certificate on the asset. Must either be empty or one of:
VERIFIED
DRAFT
DEPRECATED
certificateStatusMessage
¶
An optional message that can be associated with the certificate (only used if certificateStatus is non-empty).
announcementType
¶
Type of announcement on the asset. Must either be empty or one of:
information
warning
issue
announcementTitle
¶
Heading line for the announcement on the asset (only used if announcementType is non-empty).
announcementMessage
¶
An optional detailed message that can be associated with the announcement (only used if announcementType is non-empty).
atlanTags
¶
Atlan tags that are assigned to the asset. Each tag should be separated by a newline within the cell, and formatted as one of:
- Tag Name, for tags that should be directly assigned and should not be propagated
- Tag Name>>FULL, for tags that should be directly assigned to the asset and propagated down their hierarchy and through lineage
- Tag Name>>HIERARCHY_ONLY, for tags that should be directly assigned to the asset and only be propagated down their hierarchy (not through lineage)
- Tag Name<<PROPAGATED, for tags that have been propagated to the asset.
Propagated tags will be ignored on import
Any tag marked propagated (Tag Name<<PROPAGATED) will be ignored by an import. Only those tags that are directly applied will be imported, though any tags applied up-hierarchy or upstream that are marked to propagate will still propagate accordingly.
No source tags for glossaries
Note that source tags can only be related to physical assets, so you should not attempt to assign them to glossary objects.
links
¶
List of resources (links) assigned to the asset. Each link should be separated by a newline within the cell, and formatted as embedded JSON:
{"typeName":"Link","attributes":{"name":"linkName","link":"https://www.example.com"}}
readme
¶
Richly-formatted, detailed documentation for the asset. This should be an HTML-formatted string containing everything that would be inside <body></body>, without the <body></body> wrapping.
starredDetails
¶
Details about users who have starred the asset. Each starred asset detail entry should be separated by a newline within the cell, and formatted as embedded JSON:
{"assetStarredBy":"someone","assetStarredAt":1698769268966}
seeAlso
¶
Business terms that will show as related terms in the UI. Each term should be separated by a newline within the cell, and formatted as:
Term Name@@@Glossary Name
preferredTerms
¶
Business terms that will show as recommended terms in the UI. Each term should be separated by a newline within the cell, and formatted as:
Term Name@@@Glossary Name
synonyms
¶
Business terms that will show as synonyms in the UI. Each term should be separated by a newline within the cell, and formatted as:
Term Name@@@Glossary Name
antonyms
¶
Business terms that will show as antonyms in the UI. Each term should be separated by a newline within the cell, and formatted as:
Term Name@@@Glossary Name
translatedTerms
¶
Business terms that will show as translations for a term in the UI. Each term should be separated by a newline within the cell, and formatted as:
Term Name@@@Glossary Name
validValuesFor
¶
Business terms that will show as valid values for a term in the UI. Each term should be separated by a newline within the cell, and formatted as:
Term Name@@@Glossary Name
classifies
¶
Business terms that will show as classified by a term in the UI. Each term should be separated by a newline within the cell, and formatted as:
Term Name@@@Glossary Name
{CM}::{attribute}
¶
Any number of columns using a :: separator in their heading represent custom metadata.
- The {CM} portion must give the name of the custom metadata.
- The {attribute} portion must give the name of an attribute within the custom metadata.
Both are the human-readable names.
For multi-valued custom metadata attributes, each value should be separated by a newline within the cell. Date values should be provided as an epoch-style timestamp (purely numeric).
Data products CSV file¶
If requested, the data domains, subdomains and data products will all be extracted into a separate CSV file, one per row. Each row will contain:
- The type of the asset
- The qualifiedName of the asset
- Values for all possible enrichment to that asset
Detailed information on the columns in the CSV file produced:
qualifiedName
¶
Purely for your own reference, this is ignored during any import (and can therefore be empty).
typeName
¶
Required. Type of the asset.
name
¶
Required. Name of the asset.
parentDomain
¶
Name of the parent domain in which the subdomain is contained.
dataDomain
¶
Path of the data (sub)domain in which the data product is contained. The domains should use @ as a path-delimiter. For example, if the parent domain is called Lowest, which itself is a subdomain of Middle, itself a subdomain of Top:
Top@Middle@Lowest
assetCoverImage
¶
Image to use as the cover image for the asset.
assetThemeHex
¶
A hexadecimal RGB value to specify the color of the theme for the asset.
assetIcon
¶
Name of the Phosphor icon to use to represent the asset, in the form PhIconName.
displayName
¶
An optional name you can give to the asset to override how it is displayed in the Atlan UI. If present, this will be shown in the UI instead of name.
description
¶
Explanation of the asset, as a fallback. For example, if you want to pre-populate the description for the asset but allow users to override it through Atlan's UI.
userDescription
¶
Explanation of the asset, as entered or confirmed by a user through the Atlan UI. If present, this will be shown in the UI instead of description.
ownerUsers
¶
Individual users who are owners of the asset. Each user should be separated by a newline within the cell.
ownerGroups
¶
Groups of users who are owners of the asset. Each group should be separated by a newline within the cell.
certificateStatus
¶
Certificate on the asset. Must either be empty or one of:
VERIFIED
DRAFT
DEPRECATED
certificateStatusMessage
¶
An optional message that can be associated with the certificate (only used if certificateStatus is non-empty).
announcementType
¶
Type of announcement on the asset. Must either be empty or one of:
information
warning
issue
announcementTitle
¶
Heading line for the announcement on the asset (only used if announcementType is non-empty).
announcementMessage
¶
An optional detailed message that can be associated with the announcement (only used if announcementType is non-empty).
atlanTags
¶
Atlan tags that are assigned to the asset. Each tag should be separated by a newline within the cell, and formatted as one of:
- Tag Name, for tags that should be directly assigned and should not be propagated
- Tag Name>>FULL, for tags that should be directly assigned to the asset and propagated down their hierarchy and through lineage
- Tag Name>>HIERARCHY_ONLY, for tags that should be directly assigned to the asset and only be propagated down their hierarchy (not through lineage)
- Tag Name<<PROPAGATED, for tags that have been propagated to the asset.
Propagated tags will be ignored on import
Any tag marked propagated (Tag Name<<PROPAGATED) will be ignored by an import. Only those tags that are directly applied will be imported, though any tags applied up-hierarchy or upstream that are marked to propagate will still propagate accordingly.
No source tags for data products
Note that source tags can only be related to physical assets, so you should not attempt to assign them to data product objects.
links
¶
List of resources (links) assigned to the asset. Each link should be separated by a newline within the cell, and formatted as embedded JSON:
{"typeName":"Link","attributes":{"name":"linkName","link":"https://www.example.com"}}
readme
¶
Richly-formatted, detailed documentation for the asset. This should be an HTML-formatted string containing everything that would be inside <body></body>, without the <body></body> wrapping.
starredDetails
¶
Details about users who have starred the asset. Each starred asset detail entry should be separated by a newline within the cell, and formatted as embedded JSON:
{"assetStarredBy":"someone","assetStarredAt":1698769268966}
daapCriticality
¶
Criticality of the data product. Must either be empty or one of:
High
Medium
Low
daapSensitivity
¶
Sensitivity of the data product. Must either be empty or one of:
Public
Internal
Confidential
daapVisibility
¶
Visibility of the data product. Must either be empty or one of:
Private
Protected (shows as Restricted in the UI)
Public
daapVisibilityUsers
¶
List of usernames of users who should be able to see this data product. Each should be separated by a newline within the cell.
daapVisibilityGroups
¶
List of group aliases (internal Atlan group names) of groups who should be able to see this data product. Each should be separated by a newline within the cell.
dataProductAssetsPlaybookFilter
¶
JSON-based DSL specifying the criteria to retain for the UI-based filtering rule(s) to select the assets for the data product.
dataProductAssetsDSL
¶
Required. JSON-based Elasticsearch DSL specifying the criteria for selecting which assets are part of the data product.
{CM}::{attribute}
¶
Any number of columns using a :: separator in their heading represent custom metadata.
- The {CM} portion must give the name of the custom metadata.
- The {attribute} portion must give the name of an attribute within the custom metadata.
Both are the human-readable names.
For multi-valued custom metadata attributes, each value should be separated by a newline within the cell. Date values should be provided as an epoch-style timestamp (purely numeric).
How it works
Runs a search for all assets that match the supplied inputs:
- Only assets whose qualifiedName starts with the supplied qualified name prefix.
- Export scope:
  - Enriched only will only include assets that have at least one of the following:
    - system-provided description,
    - user-provided description,
    - owners,
    - assigned terms,
    - assigned tags, or
    - any custom metadata values.
  - All will include all assets with the supplied qualified name prefix.
Note that propagated Atlan tags are not included in the extract, since they will be propagated automatically.
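The selection logic described above can be approximated in memory as follows. This is only an illustrative sketch over rows shaped like the CSV columns, not the search the package actually runs against Atlan. The function names, the "ENRICHED_ONLY" and "ALL" scope labels, and the decision to disregard <<PROPAGATED tag entries when judging enrichment are all assumptions.

```python
ENRICHMENT_COLUMNS = ("description", "userDescription", "ownerUsers", "ownerGroups", "assignedTerms")


def is_enriched(asset: dict) -> bool:
    """True if the row carries a description, owners, terms, direct tags or custom metadata."""
    if any(asset.get(column) for column in ENRICHMENT_COLUMNS):
        return True
    # Assumption: propagated tags do not count as enrichment of this asset.
    tags = [t for t in asset.get("atlanTags", "").splitlines() if t.strip()]
    if any(not t.endswith("<<PROPAGATED") for t in tags):
        return True
    # Custom metadata columns are those whose heading contains '::'.
    return any(value for heading, value in asset.items() if "::" in heading)


def select_assets(assets: list, prefix: str = "default", scope: str = "ENRICHED_ONLY") -> list:
    """Filter rows by qualifiedName prefix, then by export scope."""
    selected = []
    for asset in assets:
        if not asset.get("qualifiedName", "").startswith(prefix):
            continue
        if scope == "ALL" or is_enriched(asset):
            selected.append(asset)
    return selected
```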