SAP Datasphere Crawler¶
The SAP Datasphere crawler package catalogs an SAP Datasphere system in Atlan, ingesting all views and, optionally, analytical models. The package catalogs all assets under the connector name sap-sql.
Warning
- Analytical models, if selected, are cataloged as views in Atlan.
- If required, Atlan IPs must be whitelisted in SAP Datasphere before running the package.
The assets crawled are:
- Spaces
- Views
- Analytical models
- Columns
The relationships between the crawled assets are described in the developer portal reference.
Configuration¶
Credential¶
- Authentication: Choose an authentication method to access SAP Datasphere. The package supports OAuth-based client ID and secret authentication with SAP Datasphere.
- Client ID: OAuth client ID used for authentication and REST API calls.
- Client Secret: OAuth client secret used for authentication and REST API calls.
- Refresh Token: Refresh token used for authentication and REST API calls. The refresh token has a limited validity period; once it expires, a new authorization code must be obtained and the package must be re-run.
- Asset Endpoint: Asset endpoint used for extracting asset information. For example, somename-ds-prod.us10.hcs.cloud.sap/api/v1/dwc/catalog/assets.
- Token Endpoint: Token endpoint used for generating a token from the refresh token. For example, somename-ds-prod.us10.us10.hana.ondemand.com/oauth/token.
- Database Name: Database name used in Atlan for the cataloged SAP Datasphere spaces. For example, somename-ds-prod.
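As an illustration of how the credential fields above fit together, here is a minimal sketch that builds (but does not send) a standard OAuth 2.0 refresh-token request. The function name `build_token_request` is hypothetical, and passing the client ID and secret via HTTP Basic authentication is an assumption; the package's actual request format is not documented here.

```python
import base64
import urllib.parse
import urllib.request


def build_token_request(token_endpoint, client_id, client_secret, refresh_token):
    """Build, without sending, an OAuth 2.0 refresh-token grant request."""
    # Assumption: the endpoint may be given without a scheme, as in the example value.
    if token_endpoint.startswith("https://"):
        url = token_endpoint
    else:
        url = "https://" + token_endpoint

    # Standard form-encoded body for the refresh_token grant (RFC 6749, section 6).
    body = urllib.parse.urlencode(
        {"grant_type": "refresh_token", "refresh_token": refresh_token}
    ).encode("ascii")

    # Assumption: client credentials are sent via HTTP Basic authentication.
    basic = base64.b64encode(
        f"{client_id}:{client_secret}".encode("utf-8")
    ).decode("ascii")

    return urllib.request.Request(
        url,
        data=body,
        headers={
            "Authorization": f"Basic {basic}",
            "Content-Type": "application/x-www-form-urlencoded",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send the request. A successful OAuth 2.0 token response is a JSON document containing an `access_token` field.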
Connection¶
- Connection Name: Name of the connection that will be created in Atlan to associate it with the catalog. The connection name must be unique across all SAP Datasphere connections. Below the name input, select the connection admins for the connection.
Metadata¶
- Analytical Models: Choose whether to catalog analytical models.
Data type mapping¶
SAP DS Data Type | Atlan Data Type | Condition |
---|---|---|
Edm.String | NVARCHAR | If @MaxLength exists: NVARCHAR(@MaxLength) |
Edm.Decimal | DECIMAL | DECIMAL(@Precision, @Scale); defaults: DECIMAL(18,0) |
Edm.Int64 | BIGINT | Always BIGINT |
Edm.Int32 | INT | Always INT |
Edm.Int16 | SMALLINT | Always SMALLINT |
Edm.Date | DATE | Always DATE |
Edm.Byte | BYTE | Always BYTE |
Edm.Binary | BINARY | Always BINARY |
Edm.Boolean | BOOLEAN | Always BOOLEAN |
Edm.DateTimeOffset | DATETIMEOFFSET | If @Precision exists: DATETIMEOFFSET(@Precision) |
Edm.Double | DOUBLE | Always DOUBLE |
Edm.Guid | VARCHAR | Always VARCHAR |
Edm.SByte | TINYINT | Always TINYINT |
Edm.Single | REAL | Always REAL |
Edm.DateTime | TIMESTAMP | Always TIMESTAMP |
Edm.Time | TIME | Always TIME |
Edm.Stream | BLOB | Always BLOB |
Edm.ComplexType | NVARCHAR | Always NVARCHAR |
Edm.EnumType | NVARCHAR | Always NVARCHAR |
Edm.Geography | NVARCHAR | Always NVARCHAR |
Edm.Geometry | NVARCHAR | Always NVARCHAR |
Other | UNKNOWN | Any other unrecognized type |
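The mapping table can be read as a small lookup function. The sketch below is illustrative only and is not the package's code: `map_edm_type` and `FIXED_TYPES` are hypothetical names, and it assumes that the DECIMAL(18,0) default is applied per facet when @Precision or @Scale is missing.

```python
# Types that always map to one Atlan type, taken from the table above.
FIXED_TYPES = {
    "Edm.Int64": "BIGINT",
    "Edm.Int32": "INT",
    "Edm.Int16": "SMALLINT",
    "Edm.Date": "DATE",
    "Edm.Byte": "BYTE",
    "Edm.Binary": "BINARY",
    "Edm.Boolean": "BOOLEAN",
    "Edm.Double": "DOUBLE",
    "Edm.Guid": "VARCHAR",
    "Edm.SByte": "TINYINT",
    "Edm.Single": "REAL",
    "Edm.DateTime": "TIMESTAMP",
    "Edm.Time": "TIME",
    "Edm.Stream": "BLOB",
    "Edm.ComplexType": "NVARCHAR",
    "Edm.EnumType": "NVARCHAR",
    "Edm.Geography": "NVARCHAR",
    "Edm.Geometry": "NVARCHAR",
}


def map_edm_type(edm_type, max_length=None, precision=None, scale=None):
    """Return the Atlan data type string for an SAP DS (OData Edm) type."""
    if edm_type == "Edm.String":
        # The length is appended only when @MaxLength is present.
        return f"NVARCHAR({max_length})" if max_length is not None else "NVARCHAR"
    if edm_type == "Edm.Decimal":
        # Assumption: a missing facet falls back to its default (18 or 0).
        p = 18 if precision is None else precision
        s = 0 if scale is None else scale
        return f"DECIMAL({p},{s})"
    if edm_type == "Edm.DateTimeOffset":
        # The precision is appended only when @Precision is present.
        if precision is not None:
            return f"DATETIMEOFFSET({precision})"
        return "DATETIMEOFFSET"
    # Fixed mappings; anything unrecognized becomes UNKNOWN.
    return FIXED_TYPES.get(edm_type, "UNKNOWN")
```

For example, `map_edm_type("Edm.String", max_length=50)` yields `NVARCHAR(50)`, while an unlisted type such as `Edm.Foo` yields `UNKNOWN`.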
What it does¶
The package performs the following steps:
- The crawler uses the client ID, client secret, and refresh token to obtain a token from the token endpoint URL.
- Next, the crawler gets all "exposed" assets from the asset URL of SAP Datasphere.
- Every object returned by the previous step contains an asset along with its metadata URL. The crawler gets the metadata of each asset and stores it locally for further processing.
- After asset extraction is complete, the crawler transforms these assets into Atlan assets and starts the publishing process.
- On the first run, the crawler creates a new connection and ingests the assets.
- On subsequent runs, the crawler compares the asset listing derived from SAP Datasphere against the asset catalog in Atlan, then adds, updates, or removes assets as needed to address the delta.
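The delta step in the last bullet can be pictured as a set comparison. The following is a minimal sketch under stated assumptions, not the crawler's actual logic: `compute_delta` is a hypothetical helper, and it assumes each asset is identified by a unique name and described by a plain metadata dictionary.

```python
def compute_delta(source_assets, catalog_assets):
    """Compare two {asset_name: metadata} mappings and classify the differences.

    source_assets:  assets extracted from SAP Datasphere in this run.
    catalog_assets: assets currently cataloged in Atlan for the connection.
    """
    source_names = set(source_assets)
    catalog_names = set(catalog_assets)

    # Present in the source but not yet cataloged.
    to_add = sorted(source_names - catalog_names)
    # Cataloged but no longer present in the source.
    to_remove = sorted(catalog_names - source_names)
    # Present on both sides with differing metadata.
    to_update = sorted(
        name
        for name in source_names & catalog_names
        if source_assets[name] != catalog_assets[name]
    )
    return {"add": to_add, "update": to_update, "remove": to_remove}


# Example with made-up asset names and metadata.
delta = compute_delta(
    {"SALES_VIEW": {"columns": 5}, "NEW_VIEW": {"columns": 2}},
    {"SALES_VIEW": {"columns": 4}, "OLD_VIEW": {"columns": 3}},
)
```

In this example, `NEW_VIEW` would be added, `SALES_VIEW` updated, and `OLD_VIEW` removed.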