SAP Datasphere Crawler

The SAP Datasphere crawler package catalogs an SAP Datasphere system in Atlan, ingesting all views and, optionally, analytical models. The package catalogs all assets under the connector name sap-sql.

Warning

  • Analytical models (if chosen) are cataloged as views in Atlan.
  • If SAP Datasphere restricts access by IP address, the Atlan IPs must be allowlisted in SAP Datasphere before running the package.

The assets crawled are:

  • Spaces
  • Views
  • Analytical models
  • Columns

The relationship between the crawled assets can be understood from the developer portal reference.

Configuration

Credential

  1. Authentication: Choose an authentication method to access SAP Datasphere.

    The package supports OAuth-based authentication with SAP Datasphere using a client ID and client secret.

    • Client ID: OAuth client ID to be used for authentication and REST API calls.

    • Client Secret: OAuth client secret to be used for authentication and REST API calls.

    • Refresh Token: Refresh token to be used for authentication and REST API calls. The refresh token has a limited validity period. Once it expires, a new authorization code must be obtained and the package must be re-run.

    • Asset Endpoint: Asset endpoint to be used for extracting asset information. For example, somename-ds-prod.us10.hcs.cloud.sap/api/v1/dwc/catalog/assets.

    • Token Endpoint: Token endpoint to be used for token generation using the refresh token. For example, somename-ds-prod.us10.us10.hana.ondemand.com/oauth/token.

    • Database Name: Database name to be used in Atlan for the SAP Datasphere spaces to be cataloged. For example, somename-ds-prod.
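
As an illustration of how the credential fields above fit together, the following is a minimal sketch of a standard OAuth 2.0 refresh-token request. It is not the package's actual implementation: the function name build_token_request is hypothetical, and sending the client ID and secret via HTTP Basic authentication and prepending https:// to the endpoint are assumptions. The sketch only builds the request and does not send it.

```python
import base64
import urllib.parse
import urllib.request


def build_token_request(token_endpoint, client_id, client_secret, refresh_token):
    """Build (but do not send) an OAuth 2.0 refresh-token request.

    Hypothetical helper for illustration; not part of the package.
    """
    # Assumption: the endpoint may be given without a scheme, as in the examples above.
    if token_endpoint.startswith("https://"):
        url = token_endpoint
    else:
        url = "https://" + token_endpoint

    # Standard form-encoded body for the refresh_token grant.
    body = urllib.parse.urlencode(
        {"grant_type": "refresh_token", "refresh_token": refresh_token}
    ).encode("ascii")

    # Assumption: client credentials are passed with HTTP Basic authentication.
    basic = base64.b64encode(
        f"{client_id}:{client_secret}".encode("utf-8")
    ).decode("ascii")

    return urllib.request.Request(
        url,
        data=body,
        headers={
            "Authorization": f"Basic {basic}",
            "Content-Type": "application/x-www-form-urlencoded",
        },
        method="POST",
    )
```

Passing the returned object to urllib.request.urlopen would send the request. In a standard OAuth 2.0 response, the JSON body carries the new token in the access_token field.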

Connection

  1. Connection Name: Name of the connection that will be created in Atlan to associate it with the catalog. The connection name must be unique across all SAP Datasphere connections. Below the name input, select the connection admins for the connection.

Metadata

  1. Analytical Models: Choose whether to catalog analytical models.

Data type mapping

| SAP DS Data Type   | Atlan Data Type | Condition                                                  |
| ------------------ | --------------- | ---------------------------------------------------------- |
| Edm.String         | NVARCHAR        | If @MaxLength exists: NVARCHAR(@MaxLength)                 |
| Edm.Decimal        | DECIMAL         | DECIMAL(@Precision, @Scale); defaults to DECIMAL(18,0)     |
| Edm.Int64          | BIGINT          | Always BIGINT                                              |
| Edm.Int32          | INT             | Always INT                                                 |
| Edm.Int16          | SMALLINT        | Always SMALLINT                                            |
| Edm.Date           | DATE            | Always DATE                                                |
| Edm.Byte           | BYTE            | Always BYTE                                                |
| Edm.Binary         | BINARY          | Always BINARY                                              |
| Edm.Boolean        | BOOLEAN         | Always BOOLEAN                                             |
| Edm.DateTimeOffset | DATETIMEOFFSET  | If @Precision exists: DATETIMEOFFSET(@Precision)           |
| Edm.Double         | DOUBLE          | Always DOUBLE                                              |
| Edm.Guid           | VARCHAR         | Always VARCHAR                                             |
| Edm.SByte          | TINYINT         | Always TINYINT                                             |
| Edm.Single         | REAL            | Always REAL                                                |
| Edm.DateTime       | TIMESTAMP       | Always TIMESTAMP                                           |
| Edm.Time           | TIME            | Always TIME                                                |
| Edm.Stream         | BLOB            | Always BLOB                                                |
| Edm.ComplexType    | NVARCHAR        | Always NVARCHAR                                            |
| Edm.EnumType       | NVARCHAR        | Always NVARCHAR                                            |
| Edm.Geography      | NVARCHAR        | Always NVARCHAR                                            |
| Edm.Geometry       | NVARCHAR        | Always NVARCHAR                                            |
| Other              | UNKNOWN         | Any other unrecognized type                                |
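
The mapping rules above can be expressed as a small lookup function. This is a sketch for illustration only, not the package's code: the name map_edm_type and its parameters are hypothetical, and it assumes that a missing precision or scale on Edm.Decimal falls back individually to 18 and 0.

```python
# Types that always map to the same Atlan data type, per the mapping above.
FIXED_TYPES = {
    "Edm.Int64": "BIGINT",
    "Edm.Int32": "INT",
    "Edm.Int16": "SMALLINT",
    "Edm.Date": "DATE",
    "Edm.Byte": "BYTE",
    "Edm.Binary": "BINARY",
    "Edm.Boolean": "BOOLEAN",
    "Edm.Double": "DOUBLE",
    "Edm.Guid": "VARCHAR",
    "Edm.SByte": "TINYINT",
    "Edm.Single": "REAL",
    "Edm.DateTime": "TIMESTAMP",
    "Edm.Time": "TIME",
    "Edm.Stream": "BLOB",
    "Edm.ComplexType": "NVARCHAR",
    "Edm.EnumType": "NVARCHAR",
    "Edm.Geography": "NVARCHAR",
    "Edm.Geometry": "NVARCHAR",
}


def map_edm_type(edm_type, max_length=None, precision=None, scale=None):
    """Return the Atlan data type for an SAP DS (Edm) type and its facets."""
    if edm_type == "Edm.String":
        # Length is appended only when @MaxLength is present.
        return "NVARCHAR" if max_length is None else f"NVARCHAR({max_length})"
    if edm_type == "Edm.Decimal":
        # Assumption: each missing facet falls back to its default (18 and 0).
        p = 18 if precision is None else precision
        s = 0 if scale is None else scale
        return f"DECIMAL({p},{s})"
    if edm_type == "Edm.DateTimeOffset":
        # Precision is appended only when @Precision is present.
        if precision is None:
            return "DATETIMEOFFSET"
        return f"DATETIMEOFFSET({precision})"
    # Any unrecognized type maps to UNKNOWN.
    return FIXED_TYPES.get(edm_type, "UNKNOWN")
```

For example, map_edm_type("Edm.String", max_length=50) yields NVARCHAR(50), while a type that is not in the table yields UNKNOWN.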

What it does

The package performs the following steps:

  • The crawler uses the client ID, client secret, and refresh token to obtain an access token from the token endpoint.
  • Next, the crawler retrieves all "exposed" assets from the asset endpoint of SAP Datasphere.
  • Every object returned by the previous step contains an asset along with its metadata URL. The crawler fetches the metadata of each asset and stores it locally for further processing.
  • After asset extraction is complete, the crawler transforms these assets into Atlan assets and starts the publishing process.
  • On the first run, the crawler creates a new connection and ingests the assets.
  • On subsequent runs, the crawler compares the asset listing from SAP Datasphere against the asset catalog in Atlan, then adds, updates, or removes assets as needed to reconcile the difference.
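
The comparison in the last step can be sketched as a set difference over asset names. This is an illustrative sketch under stated assumptions, not the package's implementation: the function name compute_delta is hypothetical, and it assumes each side is available as a dictionary keyed by a unique asset name, with a metadata dictionary as the value.

```python
def compute_delta(source_assets, catalog_assets):
    """Compare assets from the source system with those already cataloged.

    Both arguments map a unique asset name to a metadata dictionary
    (an assumed representation for this sketch).
    """
    source_names = set(source_assets)
    catalog_names = set(catalog_assets)

    # Present in the source but not yet cataloged.
    to_add = sorted(source_names - catalog_names)
    # Cataloged but no longer present in the source.
    to_remove = sorted(catalog_names - source_names)
    # Present on both sides with differing metadata.
    to_update = sorted(
        name
        for name in source_names & catalog_names
        if source_assets[name] != catalog_assets[name]
    )
    return {"add": to_add, "update": to_update, "remove": to_remove}
```

Keying both sides by a stable, unique name keeps the comparison independent of the order in which assets are returned.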