Oracle Database Performance Tuning – 5 Steps to Resolve Issues

Text Size 100%:

Oracle Database performance tuning is the collection of activities to identify and remove the top limiting factor in Database efficiency.

It’s typically an iterative exercise, you’ll squash one factor, then you need to rinse and repeat and squash the next… until you get the database performance you need.

 

Follow these step by step instructions to get started on your database performance tuning journey using Oracle Autonomous Health Framework.

 

Database performance tuning - Fix Flow

 

This post is part of the Autonomous Health Framework Fix Flow series, review them all to learn how to resolve the following problems:

 

Understanding Oracle Database Performance Tuning

Common Oracle Database performance issues fall into one of two categories.

Either they are something affecting general database wide performance or they are query specific.

Database performance tuning - Causes of slow performance

 

Database wide issues can be caused by:

  • A bottleneck with some form of resource, such as CPU, IO, Memory, Network or processes
  • A database misconfiguration of some type
  • A database hang of some form
  • Bugs

Query specific issues can be caused by:

  • Expensive SQL, where the structure of the query forces it to take a long time
  • Poor query optimization – This can be caused by things such as incorrect indexes, old statistics, and unexpected changes in execution plan etc.

 

When starting database performance tuning use this as a mental model to narrow down on the root cause.

Oracle Database Performance Tuning Steps

The basic Fix Flow steps for Database performance tuning are:

 

  1. Collect database performance diagnostics
  2. Use Autonomous Health Framework Insights

  3. Understand and protect against noisy neighbors
  4. Use diagnostics to identify and tune poor performing database workloads
  5. Log a new SR using the diagnostic collection

1. Use Autonomous Health Framework to collect database performance diagnostics

In the first step we’re going to use Autonomous Health Framework to generate a database performance diagnostic collection.

 

The Autonomous Health Framework provides a series of capabilities aimed at making it easier for you to maintain your database.

It’s the same technology Oracle uses in the cloud to keep the Autonomous Database running, and it’s available for you at no extra cost as part of your existing software support agreement.

AHF Services

You can download Autonomous Health Framework by logging into My Oracle Support and searching – or you can just go to http://bit.ly/oracleahf, which will take you directly to the download page.

 

AHF diagnostic collection service

Autonomous Health Framework is capable of collecting various types of diagnostic collections, with problem specific SRDCs being one of them.

Before Autonomous Health Framework, Oracle Support might have asked you to go through many different steps as part of SRDC or Service Request Data Collection for database performance tuning. This would involve collecting and trimming different logs and running scripts, to understand how to perform it correctly required watching a video and reading other knowledge documents.

Now all this SRDC logic is packaged within Autonomous Health Framework and you can capture it all with a single command.

 

Login to the machine where the database performance issue was seen and as the Oracle user run the command:

tfactl diagcollect srdc –dbperf –database <database_name>
 

Autonomous Health Framework will prompt if you have a performance issue now and then guide you through a series of questions and answers so it can collect all the necessary diagnostics.

For example

$ tfactl diagcollect -srdc dbperf -database db23cdb1
Do you have a performance issue now [Y|N] [Y]: Y
Enter duration of the issue in hours [<RETURN>=1h]: 1h
As you have indicated that the performance issue is currently happening,
Performance Reports will be collected for the following periods:
Start time when the performance was bad: 2024-05-01 15:13:32
Stop time when the performance was bad: 2024-05-01 16:13:32
For comparison, it is useful to gather data from another period with similar load where problems are not seen. Typically this is likely to be the same time period on a previous day. To compare to the same time period on a previous day enter the number of days ago you wish to use. [<RETURN> to provide other time range]: 1
Start time when the performance was good 2024-04-30 15:13:32
Stop time when the performance was good 2024-04-30 16:13:32
Has any SQL been identified to contribute to the performance issue?[Y|N]: N
Do you wish to take an AWR Dump as part of this collection? [Y|N]: N

Ending AWR snapshot successfully created.
Found 3 snapshot(s) for Bad Performance time range in  ORCL
Found 3 snapshot(s) for baseline range in  ORCL
"Automatic Workload Repository (AWR) is a licensed feature. Refer to My Oracle Support Document ID 1490798.1 for more information"

Components included in this collection: DATABASE CHMOS CHA OS INSIGHT
Preparing to execute support diagnostic scripts.
 Executing DB Script srdc_db_lfsdiag.sql on db23cdb1 with timeout of 120 seconds...
 Executing DB Script srdc_real_time_addm.sql on db23cdb1 with timeout of 120 seconds...
 Executing DB Script srdc_statsadvisor_report.sql on db23cdb1 with timeout of 300 seconds...
 Executing DB Script collect_logon_logoff_triggers.sql on db23cdb1 with timeout of 300 seconds...
 Executing OS Script get_perfhub_report with timeout of 600 seconds...
Collecting data for all nodes
TFA is using system timezone for collection, All times shown in UTC.
Scanning files from 2024-05-01 15:13:32 UTC to 2024-05-01 16:13:32 UTC
Collection Id : 20240501161522machine1
Detailed Logging at : /u01/app/giusr/oracle.ahf/data/repository/srdc_dbperf_collection_Wed_May_01_16_15_27_UTC_2024_node_all/diagcollect_20240501161522_machine1.log
Waiting up to 120 seconds for collection to start
2024/05/01 16:15:43 UTC : NOTE : Any file or directory name containing the string .com will be renamed to replace .com with dotcom
2024/05/01 16:15:43 UTC : Collection Name : tfa_srdc_dbperf_Wed_May_01_16_15_26_UTC_2024.zip
2024/05/01 16:15:43 UTC : Collecting diagnostics from hosts : [machine1, machine3, machine2, machine4]
2024/05/01 16:15:45 UTC : Getting list of files satisfying time range [05/01/2024 15:13:32, 05/01/2024 16:13:32]
2024/05/01 16:15:45 UTC : Collecting Additional Diagnostic Information...
2024/05/01 16:16:48 UTC : Executing DB Script awr_reports on db23cdb1 with timeout of 3600 seconds...
2024/05/01 16:16:57 UTC : Completed Collection of Additional Diagnostic Information for Insights...
2024/05/01 16:18:56 UTC : Collecting ADR incident files...
2024/05/01 16:20:32 UTC : Executing Applicable ORAchk Validations with timeout of 600 seconds...
2024/05/01 16:24:16 UTC : Executing IPS Incident Package Collection(s)...
2024/05/01 16:24:25 UTC : No ADR Incidents for db23cdb1 covering period "2024-05-01 15:13:32" to "2024-05-01 16:13:32" were generated, IPS Pack will not be collected.
2024/05/01 16:24:25 UTC : Executing SQL Script db_feature_usage.sql on db23cdb1 with timeout of 600 seconds...
2024/05/01 16:24:25 UTC : Executing Collection for OS with timeout of 1800 seconds...
2024/05/01 16:24:39 UTC : Completed Collection of Additional Diagnostic Information...
2024/05/01 16:24:47 UTC : Completed Local Collection
2024/05/01 16:24:47 UTC : Not Redacting this Collection ...
2024/05/01 16:24:47 UTC : Remote Collection in Progress...
2024/05/01 16:24:48 UTC : Collection completed on host: machine2
2024/05/01 16:24:48 UTC : Collection completed on host: machine3
2024/05/01 16:24:48 UTC : Collection completed on host: machine4
2024/05/01 16:24:52 UTC : Executing Creation of insights zip with timeout of 900 seconds...
2024/05/01 16:28:49 UTC : Report is generated at : /u01/app/giusr/oracle.ahf/data/repository/srdc_dbperf_collection_Wed_May_01_16_15_27_UTC_2024_node_all/machine1_insights_2024_05_01_16_24_55.zip
2024/05/01 16:28:49 UTC : Finished creation of insights zip with status 0
2024/05/01 16:28:50 UTC : Collection completed on host: machine1
2024/05/01 16:28:49 UTC : Completed collection of zip files.
.-----------------------------------------.
|            Collection Summary           |
+---------------+-----------+------+------+
| Host          | Status    | Size | Time |
+---------------+-----------+------+------+
| machine2 	    | Completed | 13MB | 289s |
| machine3 	    | Completed | 12MB | 332s |
| machine4 	    | Completed | 13MB | 420s |
| machine1 	    | Completed | 17MB | 544s |
'---------------+-----------+------+------'
Logs are being collected to: /u01/app/giusr/oracle.ahf/data/repository/srdc_dbperf_collection_Wed_May_01_16_15_27_UTC_2024_node_all
/u01/app/giusr/oracle.ahf/data/repository/srdc_dbperf_collection_Wed_May_01_16_15_27_UTC_2024_node_all/machine1.tfa_srdc_dbperf_Wed_May_01_16_15_26_UTC_2024.zip
/u01/app/giusr/oracle.ahf/data/repository/srdc_dbperf_collection_Wed_May_01_16_15_27_UTC_2024_node_all/machine2.tfa_srdc_dbperf_Wed_May_01_16_15_26_UTC_2024.zip
/u01/app/giusr/oracle.ahf/data/repository/srdc_dbperf_collection_Wed_May_01_16_15_27_UTC_2024_node_all/machine3.tfa_srdc_dbperf_Wed_May_01_16_15_26_UTC_2024.zip
/u01/app/giusr/oracle.ahf/data/repository/srdc_dbperf_collection_Wed_May_01_16_15_27_UTC_2024_node_all/machine4.tfa_srdc_dbperf_Wed_May_01_16_15_26_UTC_2024.zip
/u01/app/giusr/oracle.ahf/data/repository/srdc_dbperf_collection_Wed_May_01_16_15_27_UTC_2024_node_all/machine1_insights_2024_05_01_16_24_55.zip

 

Once it’s finished Autonomous Health Framework will package everything for you in a zip file for each machine.

It’s worth noting here, that in this example although Autonomous Health Framework has generated a zip for each machine, most of the time you’re only going to need the zip from the machine where you initiated the collect.

 

Now we can move on to step number two. Use Autonomous Health Framework Insights to find recommendations. 

 

2. Use Autonomous Health Framework Insights

Transfer the diagnostic zip from the machine where you initiated the collection to a machine with a web browser and unzip it.

Database performance tuning - Insights zip 1

 

Within here you’ll find another zip containing Autonomous Health Framework Insights. 

Extract that and open the index.html

Database performance tuning - Insights zip 2

 

a. Compare configuration against best practices

Autonomous Health Framework Insights provides you a bird’s eye view of your entire system. You can use it to spot problems, drill into the root cause and understand how to resolve.

First we’ll look for any relevant configuration problems to see if you’ve drifted from Oracle best practices.

Click on the Best Practices issues to dill down.

Database performance tuning - Insights Best Practice selection

 

The summary provides an overview of where your system has strayed from best practice and the relative severity.

Database performance tuning - Insights best practice graphs

 

As you scroll down you’ll see specific configuration issues.

Look down through the list of issues and see if you find anything that may be relevant.

If you find something that looks like it could be related to performance, expand it to see the recommendation.

Every best practice check explains the impact and risk of this configuration drift along with the repair steps.

Database performance tuning - Insights best practice results

 

b. Find and fix database anomalies and resource bottlenecks

Next we’ll look at the Database Anomalies Advisor.

Autonomous Health Framework uses AI to detect unusual events and recommend solutions.

 

Slow IO Example Scenario

Database performance tuning - Insights os issues selection

 

In this scenario Autonomous Health Framework has detected a number of unusual events suggesting the database and host IO is slow.

Database performance tuning - Insights anomaly IO

 

From here we can go back out to the home screen to explore the Operating System metrics.

 

In this example we can see Autonomous Health Framework has detected the Oracle Processes were blocked in D State, which means they were stuck waiting for IO to return

We can also see some disks had a long wait time

Database performance tuning - Insights os issues IO

 

Expanding this gives us graphs showing  disks with average wait time greater than 20 milliseconds and top IO consuming processes.

Database performance tuning - Insights os issues IO graphs

 

And scrolling further down we can see problematic snapshots.

This one shows us which disks were slow

Database performance tuning - Insights os issues IO disk speed

 

High CPU Usage Example Scenario

 

Here’s a different scenario looking at anomalies detected by Autonomous Health Framework, in this case we see the latency for Database global cache messages higher than expected. Autonomous Health Framework tells us the cause and recommended action.

So now we know CPU is a bottleneck

Database performance tuning - Insights anomaly CPU

 

Going to the OS issues report page in this scenario shows us a finding for high CPU usage.

Expanding that finding shows us details about when the CPU usage was high. As we scroll down we see CPU usage graphed over time.

 

Database performance tuning - Insights os issues cpu

 

Continuing down shows us the CPU usage for different processes.

Here we can see the database background and foreground processes are consuming an average of over 80% CPU and a Max of over 100% CPU

Database performance tuning - Insights os issues cpu processes

 

3. Use Autonomous Health Framework to understand and protect against noisy neighbors

This third step is only applicable if you’ve had repeated database performance problems on a host with multiple other databases and the host shows periods with more than 70% CPU usage.

In this step we’ll check for noisy neighbors.

In generic terms, a noisy neighbor is when one service consumes a larger than expected share of system resources, which can impact the performance of other services.

Autonomous Health Framework can detect noisy database neighbors based on CPU usage.

 

Autonomous Health Framework uses the Oracle Enterprise Manager repository as a source of historical data. So, configure it to connect to this repository by running:

ahf configuration set --type impact --user-name <EM user> --connect-string <EM repository connection string>

 

You’ll be prompted for the repository password and Autonomous Health Framework will setup the configuration.

For example:

ahf configuration set --type impact --user-name ahftest --connect-string "//mymachine.acme.com:1521/abcd.acme.com"

Enter EM Repository password:

Re-enter EM Repository password:

Configuration files created in /opt/oracle.ahf/data/mymachine/balance/user_john

Now that’s configured you can generate a cluster analysis report by running:

ahf analysis create --type impact --scope cluster --name cluster_name

For example:

ahf analysis create --type impact --scope cluster --name mycluster 

Starting analysis and collecting data for impact

Report is generated at : /opt/oracle.ahf/data/mymachine/diag/balance/user_john/cluster_160424_154432451_UTC.html

 

The first section in the report gives you a summary of your current noisy neighbor exposure and impact vs what it would be if you set the recommended CPU_COUNTs.

 

Database performance tuning - balance report

 

The next graph down shows the current noisy neighbor status over the past month vs what it would have been with the recommended CPU_COUNT.

Anywhere with a black line indicates a database was impacted by a noisy neighbor.

Database performance tuning - balance chart

As you hover over the graph these controls appear, which allow you to zoom in and pan around the data. Use this to understand if the black lines correlate with times you’ve had performance problems.

If the black lines correlate with the periods of slow database performance then then use the rest of the report to drill down further by database or machine and understand the changes you need to make to CPU_COUNT across your various databases.

4. Use Autonomous Health Framework diagnostics to identify and tune poor performing database workloads

This fourth step uses more of the diagnostics you captured to identify and tune poor performing database workloads

 

Looking at the same collection that was used previously for the high CPU utilization example, you can find a perfhub html file.

Database performance tuning - dbperf file

Opening that gives you the Database Performance Hub report with details about the database workload.

Database performance tuning - dbperf wait classes

The first thing to understand is what your active sessions are doing – an active session is one actively working on behalf of the client.

By default the Performance Hub Report opens on the ASH Analytics tab.

ASH stands for Active Session History, which is a database feature that samples active database sessions.

Expand the graph to get a better look.

Here we can see our highest wait class is CPU, which tells us the database is CPU bound.

Database performance tuning - dbperf cpu bound

Looking at the SQL by wait class graph we can identify SQL that are candidates for tuning.

 

Let’s collapse the graph again and take a look at the ADDM tab

ADDM stands for Automatic Database Diagnostic Monitor.

Database performance tuning - dbperf addm

The ADDM tab analyzes the data in Automatic Workload Repository or AWR and provides options to resolve performance problems.

 

Click into one of the finding to get a recommendation. The recommendation will tell you what to change and give you an estimated benefit.

Database performance tuning - dbperf addm recommendation

You can use this table to identify any particular SQL statements that have long running durations.

Once identified you can focus on tuning their individual performance.

 

Oracle has a whole documentation guide on this, which you can find at https://bit.ly/tunesql – Work through that guide to understand how the query optimizer works and how to improve execution times.

As said at the beginning Database performance tuning is often an iterative exercise.

You’ll most likely find once you’ve fixed one issue, you’ll need to repeat these steps again to find the next until you reach your desired performance goal.

 

5. Log a new SR using the Autonomous Health Framework diagnostic collection

If you still need help then the final step is to log an SR using the Autonomous Health Framework diagnostic collection

Log into My Oracle Support.

  1. Choose the Service Requests tab
  2. Then click the Create Technical SR button

 

Database performance tuning - create SR 1

 

Complete all the required Service Request fields.

Make sure to choose product as “Oracle Database – Enterprise Edition”

Then when you come to the Problem Type, select the option to Create an Express SR.

 

This is a fast track SR creation route using an Autonomous Health Framework collection.

Database performance tuning - create SR 2

Follow the rest of the steps including uploading the collection.

That’s it!

Oracle Support will take over and help resolve it.

 

Get started now

To get started today download Oracle Autonomous Health Framework

Gareth Chapman

Senior Principal Product Operations Manager

Gareth is Senior Principal Product Operations Manager for Oracle Autonomous Health Framework.