Skip to content

Oracle Miner

  • This package supports enterprise Oracle versions 12c or higher, which have Diagnostics Pack and AWR support.
  • This package works for AWS deployed tenants that have a S3 bucket associated with the tenant.

The Oracle Miner package provides means to extract query history from Oracle and store it for further processing. The data is then used to create lineage using the extracted queries.

---
title:
---
graph LR
    subgraph "Generate Query 
                History"
    A["Workflow
        Configuration"]-->|Inputs|B[("Oracle")]
    B-->|Run Query|C("Write Extracted
                        Queries")
    end
    subgraph "Generate Lineage"
    D("Read S3 
        Query Files")-->|Filter Queries|E("Call Package:
                                            'Generic Miner'")
    end
    E-->F(("End"))
    C-->|Generate|D
    C-->|Preview|F

Prerequisites

This package requires the user to first verify some of the necessary information regarding their oracle connection at source. These are the following steps that need to be performed in order to verify that this package will be compatible with the oracle connection :

  1. Verify Oracle version: The Oracle version should be 12c or higher for this package to work with the connection. Check your Oracle version with the following SQL query:

    SELECT VERSION FROM V$INSTANCE;
    
  2. Enable AWR: The package uses diagnostic views from the AWR to extract the query history from the connection. This part requires the AWR (Automatic Workload Repository) to be enabled.

    • To check if the AWR is enabled, run the following command:

      SELECT NAME, VALUE FROM V$PARAMETER WHERE NAME = 'statistics_level';
      

      If the value of statistics_level is either TYPICAL or ALL, it means that the AWR is enabled.

    • To enable AWR if the above query results in the value of statistics_level as BASIC:

      ALTER SYSTEM SET statistics_level=TYPICAL SCOPE=BOTH;
      

      statistics_level can be set to two values:

      • TYPICAL: (default) activates standard level of collection
      • ALL: same as typical but includes execution plans and timing info from the O/S
  3. Check AWR settings: The package uses snapshots captured by the AWR to extract the query history. Check if the snapshot capture and retention is set in accordance with the schedule set on the miner workflow.

    • To check the retention and snapshot interval set in AWR for the database this workflow will extract query history for, run the following command:

      SELECT * FROM DBA_HIST_WR_CONTROL WHERE SRC_DBNAME='<database-name>';
      

      The snapshot and retention are duration strings. E.g., if the output of the above command has the SNAP_INTERVAL value as 0 0:30:0.0, it means that the snapshot will be taken automatically every 30 minutes. Similarly, if the RETENTION value is set to 14 0:0:0.0, it means the retention of these snapshots is 14 days.

    • To change the snapshot and retention values for your database, run the following command:

      BEGIN
          DBMS_WORKLOAD_REPOSITORY.modify_snapshot_settings(
              retention => 20160,  -- (1)
              interval  => 60      -- (2)
          );
      END;
      
      1. Retention period in minutes (14 days).
      2. Interval between snapshots in minutes.
  4. Grant Select Privileges: The package requires select privilage to some AWR-related views. Grant access to these views to the user used in the Oracle crawler workflow, since this package uses the same credentials as the Oracle crawler workflow to establish connection to the source database.

    • Grant SELECT ANY DICTIONARY access to the user provided in the Oracle crawler:

      GRANT SELECT ANY DICTIONARY TO your_user; -- (1)
      
      1. Replace your_user with the appropriate username.
    • Alternatively, grant select access to views used by this package to the user provided in the Oracle crawler configuration:

      GRANT SELECT ON DBA_HIST_SQLTEXT TO your_user; -- (1)
      GRANT SELECT ON DBA_HIST_SQLSTAT TO your_user;
      GRANT SELECT ON DBA_HIST_DATABASE_INSTANCE TO your_user;
      
      1. Replace your_user with the appropriate username.
  5. Check snapshot generation: Before configuring the workflow, check if snapshots are being taken by the source.

    • Check for snapshot ID and interval time to verify the working of AWR snapshot generation.

      SELECT SNAP_ID, BEGIN_INTERVAL_TIME from DBA_HIST_SNAPSHOT;
      
    • Generate snapshot manually if no interval exists on running the above query.

      BEGIN 
          DBMS_WORKLOAD_REPOSITORY.create_snapshot;
      END;
      

Configuration

Configuration

  • Oracle Connection: Select an Oracle connection for which you want to run this workflow.
  • Output Option: Select the output option for the workflow after query extraction.

    • Generate Lineage: Reads the extracted query file and generates lineage for the selected connection.
    • Preview: Only write the extracted query file to the output artifact and S3 bucket associated with the tenant. Does not generate lineage or create any process assets.

    Download Preview File

    • From workflow centre: Go to the run and select the node extract-creds-and-run(0) from the graph. On the right sidebar that appears, click on Outputs and download query-history.
    • From tenant S3: Go to the tenant S3 bucket and navigate to this prefix: argo-artifacts/default/{workflow_name}/query_history.json to download query-history.

What it does

Extracts connection credentials from selected Oracle connection and runs queries to extract the query history captured by the AWR snapshot. Stores these queries and calls generic miner to create lineage if generate lineage is selected.

How it works
  • Extracts connection credentials from the selected Oracle connection.
  • Downloads the Oracle JDBC driver jar to establish JDBC connection to the source.
  • Executes query to extract query history from Oracle source and writes it to a query_history.json file.
  • Moves generated file to S3 and calls Generic Miner package if generate lineage is selected.