13 June, 2020

BW Performance issue Troubleshooting


Purpose

The purpose of this blog is to troubleshoot the BW performance issues that we sometimes face for obvious reasons. To troubleshoot, perform the mandatory checks below one by one and note the findings in a separate Notepad/Word document.

  • Basis Checks from Application
  • Basis Checks from OS
  • Basis Checks from DB
Basis Checks from Application


| Action/Tcode | Expectation | Follow-up |
|---|---|---|
| Check GUI | Should be able to log in | If not, use the Basis checks for OS/DB. |
| SM51 | All 3 application servers should be available | If not, check the application server from the OS |
| SM50 | Check Background/Dialog WPs on all application servers; both WP types should be available | If Background and Dialog WPs are not available, note it down and check which user/report is causing this. Steps |
| SM66 | Note down the long-running and high-memory-consuming jobs, as well as any WP in private (PRIV) mode | Steps: |
| ST22 | Check dumps and investigate | |
| SM21 | Check the log | |
| SM12 | Check lock entries | |
| SM13 | Check whether update is active, and check for failed updates | If update is disabled, enable it from SM13 (administration) or SM14. If it is enabled and there are many lock entries, check step: |
| SM37 | Check active jobs, long-running jobs, and jobs that failed today | If jobs are failing, check steps: If jobs have been running for a long time, check steps: |
| RSDANLCON | Check that the NLS database is available | If not, report this to the DB team |
| SMLG | Check logon group load | SMLG --> F5 |
| SM58 | Check entries in error | Inform the application team |
| SMQ1 | Check entries in error | Inform the application team |
| SMQ2 | Check entries in error | Inform the application team |


Basis Checks from Operating System (OS) 


| Action/Tcode | Expectation | Follow-up |
|---|---|---|
| Server logon | Try to log in to the server | If not, inform the Unix team |
| R3trans | Run R3trans -d and make sure the return code is 0000 | If not, check from the database end; most probably the database is down |
| DPMON | Go to the profile directory and run dpmon pf=<instance profile> | Check the dispatcher queue and the number of WPs |
| Log | Check log | Check the dispatcher and WP logs for errors |
| Timing | Check the time on the application and DB servers | If the time is not correct, inform the Unix team for NTP sync |
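
The R3trans check above can be wrapped in a small helper so the result is easy to paste into the troubleshooting notes. This is only a sketch: the function name `r3trans_verdict` and the message texts are made up for illustration, and it assumes that an exit status of 0 corresponds to "R3trans finished (0000)" while any other status means the connection test failed.

```shell
# Map an R3trans exit status to a short verdict line.
# Assumption: 0 means the database connection works; anything else is a failure.
r3trans_verdict() {
    rc="$1"
    if [ "$rc" -eq 0 ]; then
        echo "OK: database reachable"
    else
        echo "FAIL: R3trans returned $rc, check trans.log and the database"
    fi
}

# Intended use on an application server as <sid>adm (not executed here):
#   R3trans -d; r3trans_verdict $?
```

R3trans writes its details to trans.log in the current directory, so that file is the first place to look when the verdict is FAIL.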


Basis Checks from Database                                                                         


| Action/Tcode | Expectation | Follow-up |
|---|---|---|
| Server logon | Try to log in to the server | If not, inform the Unix team |
| Status of DB | sapcontrol -nr 00 -function GetSystemInstanceList | Check that all nodes are green |
| Status of services | HANA Studio: Landscape: Services | Check the last start time of the services |
| Memory | Check memory status | If it is red/yellow, find out the reason |
| MVCC | Check for MVCC blockers: HANA Studio: System Information: MVCC query | Cancel after confirmation |
| Alert | Check alerts for today/yesterday | HANA Studio: Alerts |
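
The "all nodes are green" check on the sapcontrol output can be scripted roughly as follows. This is a sketch, not an official tool: the function name `non_green_instances` is invented here, and it assumes the usual comma-separated layout of GetSystemInstanceList, where the last column (dispstatus) holds GREEN, YELLOW, RED or GRAY.

```shell
# Print "host status" for every instance whose dispstatus is not GREEN.
# Reads the output of GetSystemInstanceList on stdin.
# Returns exit status 1 if at least one non-GREEN instance was found.
non_green_instances() {
    awk -F', *' '
        # Data rows have 7 comma-separated fields; skip the column header row.
        NF >= 7 && $1 != "hostname" && $NF != "GREEN" {
            print $1 " " $NF
            bad = 1
        }
        END { exit bad + 0 }
    '
}

# Intended use as <sid>adm (not executed here):
#   sapcontrol -nr 00 -function GetSystemInstanceList | non_green_instances
```

An empty result with exit status 0 means every listed instance reported GREEN; any printed line names a host to investigate first.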




Background WP Not Available                                                                                                  

Reason
Check in SM50 that work processes are available on all application servers, and make sure there are free work processes on each of them.
If work processes (mostly background) are not available, follow the steps below.

Action
Use AL08 to find the user who has the maximum number of sessions.
Find the active jobs of the user whose jobs are using most of the background WPs.

Resolution
Ask the user to cancel his/her jobs or, better, kill the jobs before the issue escalates to a P1.
Cancel the job or log the user off the system.
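
If the list of active jobs or sessions has been saved to a text file, the "who is using the most" question can be answered with a quick tally. The sketch below assumes a hypothetical export with one user name per line; neither the file format nor the function name `top_users` comes from SAP.

```shell
# Count entries per user and print the busiest users first.
# Input: one user name per line on stdin (hypothetical export format).
# Output: "USER COUNT" lines, highest count first.
top_users() {
    sort | uniq -c | sort -k1,1nr -k2 | awk '{ print $2 " " $1 }'
}

# Intended use (users.txt is a hypothetical export):
#   top_users < users.txt | head -n 5
```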

SAP Notes
2098461 - PRIV process management
2537149 - Work Process in PRIV Mode are not ended based on max_wp_runtime

MVCC Blocker                                                                                                                            

     a)      Multi Version Concurrency Control (MVCC) is a concept that ensures transactional data consistency by isolating transactions that access the same data at the same time.
     b)      SAP HANA uses multi version concurrency control to make schedules serializable. Each user connected to the database sees a snapshot of the database at a particular instant in time. Changes made by a writer are not seen by other users of the database until the changes have been completed (in database terms: until the transaction has been committed).

Action: To resolve this situation, perform the tasks below.
     a)   Find the connection that is causing it using the MVCC Blocker Connection script.
     b)   Find statements that may be blocking garbage collection using the MVCC Blocker Script.
     c)   Find transactions that may be blocking garbage collection using the MVCC Blocker Transaction script.
     d)   Once they are identified, kill the session and the transaction.

Resolution
a)    Monitor MVCC alerts carefully; they are already configured on the HANA database as part of daily monitoring.
b)    Avoid parallel jobs/processes in which different processes use the same table.
c)    Cancel the job/session/transaction from the HANA database as described below.
You stop a session using the following SQL syntax:
ALTER SYSTEM CANCEL SESSION '<connection_id>'
Keep in mind that the CANCEL operation does not take effect on the spot. It can take some time before the session stops.
[DISCONNECT SESSION]
You stop a session by disconnecting it using the following SQL syntax:
ALTER SYSTEM DISCONNECT SESSION '<connection_id>'
Internally this is a two-step procedure: the current transaction is cancelled and rolled back first, then the session is disconnected.

Important note: execute DISCONNECT SESSION with care, and only if CANCEL SESSION did not succeed.
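
To avoid typos when a session has to be stopped under pressure, the two statements can be generated by a small helper. This is a sketch under assumptions: the function name `session_sql` is invented, and it assumes connection IDs are plain numbers, so anything else is rejected before a statement is built.

```shell
# Build the HANA statement for stopping a session.
# $1: CANCEL or DISCONNECT, $2: numeric connection id.
# Prints the statement on stdout; returns 1 on invalid input.
session_sql() {
    action="$1"
    conn_id="$2"
    case "$action" in
        CANCEL|DISCONNECT) ;;
        *) echo "unknown action: $action" >&2; return 1 ;;
    esac
    case "$conn_id" in
        # Reject empty ids and ids containing any non-digit character.
        ''|*[!0-9]*) echo "connection id must be numeric: $conn_id" >&2; return 1 ;;
    esac
    echo "ALTER SYSTEM $action SESSION '$conn_id'"
}

# Intended use (MYKEY is a hypothetical hdbuserstore key, not executed here):
#   hdbsql -U MYKEY "$(session_sql CANCEL 203456)"
```

Following the note above, try CANCEL first and only generate the DISCONNECT statement if the session is still present after a reasonable wait.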

How to get session details using SAP HANA Studio?
Performance tab --> Sessions

Killing a transaction
hdbcons 'transaction c <transaction_id>' (run as <sid>adm)

SAP Notes
       2081856 - How to handle HANA Alert 59: 'Percentage of transactions blocked'
       2169283 - FAQ: SAP HANA Garbage Collection

BW data load stuck due to ECC job failure                                                                                              

Reason

The BW load is stuck due to the dump "Internal session terminated with a runtime error DBSQL_SQL_ERROR (see ST22)" in the ECC system.
This happens because of a deadlock on a table. Avoid running multiple jobs at the same time that use the same table.

A database deadlock occurs when two processes lock each other's resources and are therefore unable to proceed. This problem can only be solved by terminating one of the two transactions. The database terminates one of the transactions more or less at random.

Resolution

a)    Check SM58 in ECC and clear the failed RFCs after confirmation from the application team.
b)    Menu --> Edit offers two execute options. The first, Execute LUW (F6), processes a single LUW.
c)    The second option is "Execute LUWs": select it, enter the date and destination, select the statuses, and execute in the background.

SAP Notes

2063206 - SM58: Transaction RFC: Log manual actions