Oracle Miner¶
- This package supports enterprise Oracle versions 12c or higher, which have Diagnostics Pack and AWR support.
- This package works for AWS deployed tenants that have a S3 bucket associated with the tenant.
The Oracle Miner package provides means to extract query history from Oracle and store it for further processing. The data is then used to create lineage using the extracted queries.
---
title:
---
graph LR
subgraph "Generate Query
History"
A["Workflow
Configuration"]-->|Inputs|B[("Oracle")]
B-->|Run Query|C("Write Extracted
Queries")
end
subgraph "Generate Lineage"
D("Read S3
Query Files")-->|Filter Queries|E("Call Package:
'Generic Miner'")
end
E-->F(("End"))
C-->|Generate|D
C-->|Preview|F
Prerequisites¶
This package requires the user to first verify some of the necessary information regarding their oracle connection at source. These are the following steps that need to be performed in order to verify that this package will be compatible with the oracle connection :
-
Verify Oracle version
: The Oracle version should be 12c or higher for this package to work with the connection. Check your Oracle version with the following SQL query:SELECT VERSION FROM V$INSTANCE;
-
Enable AWR
: The package uses diagnostic views from the AWR to extract the query history from the connection. This part requires the AWR (Automatic Workload Repository) to be enabled.-
To check if the AWR is enabled, run the following command:
SELECT NAME, VALUE FROM V$PARAMETER WHERE NAME = 'statistics_level';
If the value of
statistics_level
is eitherTYPICAL
orALL
, it means that the AWR is enabled. -
To enable AWR if the above query results in the value of
statistics_level
asBASIC
:ALTER SYSTEM SET statistics_level=TYPICAL SCOPE=BOTH;
statistics_level
can be set to two values:TYPICAL
: (default) activates standard level of collectionALL
: same as typical but includes execution plans and timing info from the O/S
-
-
Check AWR settings
: The package uses snapshots captured by the AWR to extract the query history. Check if the snapshot capture and retention is set in accordance with the schedule set on the miner workflow.-
To check the retention and snapshot interval set in AWR for the database this workflow will extract query history for, run the following command:
SELECT * FROM DBA_HIST_WR_CONTROL WHERE SRC_DBNAME='<database-name>';
The snapshot and retention are duration strings. E.g., if the output of the above command has the
SNAP_INTERVAL
value as0 0:30:0.0
, it means that the snapshot will be taken automatically every 30 minutes. Similarly, if theRETENTION
value is set to14 0:0:0.0
, it means the retention of these snapshots is 14 days. -
To change the snapshot and retention values for your database, run the following command:
BEGIN DBMS_WORKLOAD_REPOSITORY.modify_snapshot_settings( retention => 20160, -- (1) interval => 60 -- (2) ); END;
- Retention period in minutes (14 days).
- Interval between snapshots in minutes.
-
-
Grant Select Privileges
: The package requires select privilage to some AWR-related views. Grant access to these views to the user used in the Oracle crawler workflow, since this package uses the same credentials as the Oracle crawler workflow to establish connection to the source database.-
Grant
SELECT ANY DICTIONARY
access to the user provided in the Oracle crawler:GRANT SELECT ANY DICTIONARY TO your_user; -- (1)
- Replace your_user with the appropriate username.
-
Alternatively, grant select access to views used by this package to the user provided in the Oracle crawler configuration:
GRANT SELECT ON DBA_HIST_SQLTEXT TO your_user; -- (1) GRANT SELECT ON DBA_HIST_SQLSTAT TO your_user; GRANT SELECT ON DBA_HIST_DATABASE_INSTANCE TO your_user;
- Replace your_user with the appropriate username.
-
-
Check snapshot generation
: Before configuring the workflow, check if snapshots are being taken by the source.-
Check for snapshot ID and interval time to verify the working of AWR snapshot generation.
SELECT SNAP_ID, BEGIN_INTERVAL_TIME from DBA_HIST_SNAPSHOT;
-
Generate snapshot manually if no interval exists on running the above query.
BEGIN DBMS_WORKLOAD_REPOSITORY.create_snapshot; END;
-
Configuration¶
Configuration¶
Oracle Connection
: Select an Oracle connection for which you want to run this workflow.-
Output Option
: Select the output option for the workflow after query extraction.Generate Lineage
: Reads the extracted query file and generates lineage for the selected connection.Preview
: Only write the extracted query file to the output artifact and S3 bucket associated with the tenant. Does not generate lineage or create any process assets.
Download Preview File
From workflow centre
: Go to the run and select the nodeextract-creds-and-run(0)
from the graph. On the right sidebar that appears, click on Outputs and download query-history.From tenant S3
: Go to the tenant S3 bucket and navigate to this prefix:argo-artifacts/default/{workflow_name}/query_history.json
to download query-history.
What it does¶
Extracts connection credentials from selected Oracle connection and runs queries to extract the query history captured by the AWR snapshot. Stores these queries and calls generic miner to create lineage if generate lineage is selected.
How it works
- Extracts connection credentials from the selected Oracle connection.
- Downloads the Oracle JDBC driver jar to establish JDBC connection to the source.
- Executes query to extract query history from Oracle source and writes it to a query_history.json file.
- Moves generated file to S3 and calls
Generic Miner
package if generate lineage is selected.