Azure Data Lake Storage crawler¶
The Azure Data Lake Storage crawler fetches assets from Azure Data Lake Storage and publishes them to Atlan for discovery. The assets crawled are:
- Account
- Container
- Object
Configuration¶
Credentials¶
- Azure Client ID: unique application (client) ID assigned to your app by Azure AD when the app was registered.
- Azure Client Secret: client secret generated for the registered app.
- Azure Tenant ID: unique identifier of the Azure Active Directory instance.
- Storage Account Name: name of the Azure storage account.
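As a rough illustration of how these four values fit together, the sketch below collects them in a small Python structure and derives the storage endpoint. The `AdlsCredentials` class, its field names, and the placeholder values are hypothetical and not part of Atlan; the endpoint format assumes a standard Azure Data Lake Storage Gen2 account.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AdlsCredentials:
    """Hypothetical holder for the four credential fields listed above."""

    client_id: str
    client_secret: str
    tenant_id: str
    storage_account_name: str

    def account_url(self) -> str:
        # Assumption: the account uses the standard ADLS Gen2 (dfs) endpoint.
        return f"https://{self.storage_account_name}.dfs.core.windows.net"

    def validate(self) -> None:
        # Reject blank values early, before any call to Azure is attempted.
        missing = [name for name, value in vars(self).items() if not value.strip()]
        if missing:
            raise ValueError(f"missing credential fields: {', '.join(missing)}")


# Placeholder values only; real values come from the Azure app registration.
creds = AdlsCredentials(
    client_id="00000000-0000-0000-0000-000000000000",
    client_secret="<client-secret>",
    tenant_id="11111111-1111-1111-1111-111111111111",
    storage_account_name="mystorageaccount",
)
creds.validate()
print(creds.account_url())
```

Validating the fields up front is a design choice for the sketch: a blank value fails with a clear message instead of surfacing later as an authentication error.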
Metadata¶
- Container prefix: publish to Atlan only the containers whose names start with the prefix specified in this parameter. Leave empty to publish all containers.
- Object prefix: publish to Atlan only the objects whose names start with the prefix specified in this parameter. Leave empty to publish all objects.
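The effect of the two prefixes can be sketched in Python as follows. This is an illustration rather than the crawler's actual code: the `select_assets` function and the sample data are hypothetical, and the sketch assumes that matching is case-sensitive and that the object prefix is compared against the object's name within its container.

```python
from typing import Dict, List


def select_assets(
    containers: Dict[str, List[str]],
    container_prefix: str = "",
    object_prefix: str = "",
) -> Dict[str, List[str]]:
    """Keep only the containers and objects whose names start with the prefixes.

    An empty prefix keeps everything, because every string starts with "".
    """
    selected: Dict[str, List[str]] = {}
    for container, objects in containers.items():
        if not container.startswith(container_prefix):
            continue  # container filtered out, so its objects are skipped too
        selected[container] = [o for o in objects if o.startswith(object_prefix)]
    return selected


# Hypothetical storage account contents: container name -> object names.
sample = {
    "raw-sales": ["2023/jan.csv", "2024/jan.csv"],
    "raw-hr": ["2024/staff.csv"],
    "tmp": ["2024/scratch.csv"],
}

filtered = select_assets(sample, container_prefix="raw-", object_prefix="2024/")
print(filtered)
```

With both prefixes left empty, `select_assets(sample)` returns every container and object unchanged.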
Configurations¶
- Connection: name of the connection that will be created in Atlan.
Warning
The connection name must be unique across all Azure Data Lake Storage connections.
What it does¶
The package performs the following steps:
- Create a connection in Atlan. If the connection already exists, this step is skipped.
- Fetch the list of containers in the storage account.
- For each container, fetch the list of objects.
- Publish containers and objects into Atlan.
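The steps above can be sketched in Python using in-memory stand-ins. Everything here is hypothetical: `InMemoryCatalog` plays the role of Atlan, a plain dictionary plays the role of the storage account, and none of the names correspond to a real Atlan or Azure API.

```python
from typing import Dict, List, Set


class InMemoryCatalog:
    """Hypothetical stand-in for Atlan that records connections and assets."""

    def __init__(self) -> None:
        self.connections: Set[str] = set()
        self.assets: Dict[str, Dict[str, List[str]]] = {}

    def ensure_connection(self, name: str) -> bool:
        """Create the connection if missing; return True only if it was created."""
        if name in self.connections:
            return False  # connection already exists, so creation is skipped
        self.connections.add(name)
        self.assets[name] = {}
        return True

    def publish(self, connection: str, container: str, objects: List[str]) -> None:
        self.assets[connection][container] = list(objects)


def crawl(storage: Dict[str, List[str]], catalog: InMemoryCatalog, connection: str) -> bool:
    created = catalog.ensure_connection(connection)      # step 1: connection
    containers = list(storage)                           # step 2: list containers
    for container in containers:
        objects = storage[container]                     # step 3: list objects
        catalog.publish(connection, container, objects)  # step 4: publish
    return created


# Hypothetical storage account contents: container name -> object names.
storage = {"raw": ["a.csv", "b.csv"], "curated": ["c.parquet"]}
catalog = InMemoryCatalog()

first_run = crawl(storage, catalog, "adls-prod")
second_run = crawl(storage, catalog, "adls-prod")
print(first_run, second_run, catalog.assets["adls-prod"])
```

Running the crawl twice shows the skip behavior: the connection is created on the first run only, while containers and objects are published on both.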
Warning
Containers and objects that are deleted or archived in Azure Data Lake Storage are automatically archived in Atlan as well.
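How the package detects removed assets is not described here. One plausible approach, shown purely as an assumption, is to compare the set of assets already published with the set found in the latest crawl; the `find_assets_to_archive` function and the sample names are hypothetical.

```python
from typing import Set


def find_assets_to_archive(published: Set[str], current: Set[str]) -> Set[str]:
    """Return assets that were published earlier but are absent from the latest crawl."""
    return published - current


# Hypothetical asset names in "container/object" form.
previously_published = {"raw/a.csv", "raw/b.csv", "tmp/c.csv"}
found_in_latest_crawl = {"raw/a.csv"}

to_archive = find_assets_to_archive(previously_published, found_in_latest_crawl)
print(sorted(to_archive))
```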