Oracle Miner¶

Supported Oracle Version

This package supports enterprise Oracle versions 12c or higher, which have Diagnostics Pack and AWR support.

The Oracle Miner package provides means to extract query history from Oracle and store it for further processing. The data is then used to create lineage using the extracted queries.

---
title:
---
graph LR
    subgraph "Generate Query 
                History"
    A["Workflow
        Configuration"]-->|Inputs|B[("Oracle")]
    B-->|Run Query|C("Write Extracted
                        Queries")
    end
    subgraph "Generate Lineage"
    D("Read Query 
        File")-->|Filter Queries|E("Call Package:
                                            'Generic Miner'")
    end
    E-->F(("End"))
    C-->|Generate|D
    C-->|Preview|F

Limitations

This package uses Oracle’s diagnostic views and AWR (available in 12c and later) but may not capture all SQL statements. Queries that are short-lived, removed from memory before a snapshot, or affected by retention policies might be missing. These views are mainly for performance analysis and prioritize compatibility with older Oracle versions. While Oracle 23+ includes V$SQL_HISTORY for more detailed query tracking, this package does not use it.

Prerequisites¶

This package requires the user to first verify some of the necessary information regarding their oracle connection at source. These are the following steps that need to be performed in order to verify that this package will be compatible with the oracle connection :

Verify Oracle version: The Oracle version should be 12c or higher for this package to work with the connection. Check your Oracle version with the following SQL query:
```
SELECT VERSION FROM V$INSTANCE;
```
Enable AWR: The package uses diagnostic views from the AWR to extract the query history from the connection. This part requires the AWR (Automatic Workload Repository) to be enabled.
- To check if the AWR is enabled, run the following command:
```
SELECT NAME, VALUE FROM V$PARAMETER WHERE NAME = 'statistics_level';
```
  If the value of statistics_level is either TYPICAL or ALL, it means that the AWR is enabled.
- To enable AWR if the above query results in the value of statistics_level as BASIC:
```
ALTER SYSTEM SET statistics_level=TYPICAL SCOPE=BOTH;
```
  statistics_level can be set to two values:
  - TYPICAL: (default) activates standard level of collection
  - ALL: same as typical but includes execution plans and timing info from the O/S
Check AWR settings: The package uses snapshots captured by the AWR to extract the query history. Check if the snapshot capture and retention is set in accordance with the schedule set on the miner workflow.
- To check the retention and snapshot interval set in AWR for the database this workflow will extract query history for, run the following command:
```
SELECT * FROM DBA_HIST_WR_CONTROL WHERE SRC_DBNAME='<database-name>';
```
  The snapshot and retention are duration strings. E.g., if the output of the above command has the SNAP_INTERVAL value as 0 0:30:0.0, it means that the snapshot will be taken automatically every 30 minutes. Similarly, if the RETENTION value is set to 14 0:0:0.0, it means the retention of these snapshots is 14 days.
- To change the snapshot and retention values for your database, run the following command:
```
BEGIN
    DBMS_WORKLOAD_REPOSITORY.modify_snapshot_settings(
        retention => 20160,  -- (1)
        interval  => 60      -- (2)
    );
END;
```
  1. Retention period in minutes (14 days).
  2. Interval between snapshots in minutes.
Grant Select Privileges: The package requires select privilage to some AWR-related views. Grant access to these views to the user used in the Oracle crawler workflow, since this package uses the same credentials as the Oracle crawler workflow to establish connection to the source database.
- Grant SELECT ANY DICTIONARY access to the user provided in the Oracle crawler:
```
GRANT SELECT ANY DICTIONARY TO your_user; -- (1)
```
  1. Replace your_user with the appropriate username.
- Alternatively, grant select access to views used by this package to the user provided in the Oracle crawler configuration:
```
GRANT SELECT ON DBA_HIST_SQLTEXT TO your_user; -- (1)
GRANT SELECT ON DBA_HIST_SQLSTAT TO your_user;
GRANT SELECT ON DBA_HIST_DATABASE_INSTANCE TO your_user;
```
  1. Replace your_user with the appropriate username.
Check snapshot generation: Before configuring the workflow, check if snapshots are being taken by the source.
- Check for snapshot ID and interval time to verify the working of AWR snapshot generation.
```
SELECT SNAP_ID, BEGIN_INTERVAL_TIME from DBA_HIST_SNAPSHOT;
```
- Generate snapshot manually if no interval exists on running the above query.
```
BEGIN 
    DBMS_WORKLOAD_REPOSITORY.create_snapshot;
END;
```

Configuration¶

Credential¶

Extraction method: Determines the method the package will use to extract metadata.

DirectAgent

Note

The crawler associated with the connection should also be setup with 'Direct' extraction method.

When Direct method is selected, package will directly connect to the database using credentials stored in atlan's credential store.

Note

For this option to appear, switch on Secure Agent toggle in Labs > Admin. Contact your workspace admin or atlan support.

When Agent method is selected, package will do extraction in the offline agent mode. This requires the users to setup secrets in one of the supported secret stores. Contact atlan support for more details on the prerequisites for using this mode and the required inputs for the miner setup.

Connection¶

Oracle Connection: Select an Oracle connection for which you want to run this workflow.

Output¶

Output Option: Select the output option for the workflow after query extraction.
Generate LineagePreview
Reads the extracted query file and generates lineage for the selected connection.
Only write the extracted query file to the output artifact and to the bucket associated with the tenant. Does not generate lineage or create any process assets.
Download Preview File

From workflow centre: Go to the run and select the node extract-query-direct(0) from the graph. On the right sidebar that appears, click on Outputs and download query-history.

From tenant bucket: Go to the tenant bucket and navigate to this prefix: argo-artifacts/default/{workflow_run_name} to download the file query_history.json.

What it does¶

Runs queries to extract the query history captured by the AWR snapshot. Stores these queries and calls generic miner to create lineage if generate lineage is selected.

How it works

Extracts connection credentials from the selected Oracle connection, if 'direct' extraction method is selected. In case of 'agent' mode, extraction of credentials happen on a separate pod which has access to the secret store selected in the configuration.
Downloads the Oracle JDBC driver jar to establish JDBC connection to the source.
Executes query to extract query history from Oracle source and writes it to a query_history.json file.
Moves generated file to an object store bucket and calls Generic Miner package if generate lineage is selected.