Sunday, July 22, 2007

How we solved a (ORA-02049 Timeout: Distributed Transaction Waiting for Lock) on our Apps Customized module

We have a customized Point of Sale module that is integrated with our Apps standard CRM and financial modules; we faced a serious issue on this customized module that is when users are trying to sale through this module they receive an ORA-02049 Timeout: Distributed Transaction Waiting for Lock, which require them to keep trying until they make the sale. This error used to show on daily basis on the peak hours only but we could not tell what the cause of it, simple search of the error on metalink return note 1018919.102 that advices that we should increases the distributed_lock_timeout value in the INIT.ORA file the default value was 60 seconds so we increased it to 300 seconds even though we don’t have any distributed transactions on the system all the transactions were local. We restart the issue and the problem became worse because now the end users have to wait for 5 minutes (300 seconds) before they receive the error message (ORA-02049) and because of that we had to set the value back to 60 seconds.

After that we tried to trace the error using different event trace levels but with no luck we were not able to determine what is causing the error.

We thought that it’s a database bug and oracle advised us to upgrade the database from 9.2.0.5 to 9.2.0.7 we did that but still the issue is there.

After a month of investigation/tracing and snapshot of when the problem is happing we managed to find out what was causing the problem. It was a bitmap index that was built on the table we were trying to insert data on.

When an end user was trying to sale without committing his transaction for some reason and at the same time another end user tries to sale he will receive the error message and a lock on the table happened and the error pops-up.

We solved the issue by dropping the bitmap index and creating a normal b-tree index even though the column has only three distinct values.

7 comments:

Anonymous said...

Interesting Post.

Sam
appsdbablog.com

fhasweh said...

thanks sam for your support

Arunkumar said...

how did u figure out it was an index problem? trace file/explain plan/statspack etc?

fhasweh said...

hi Arun,
yes it was trace files/statspack and testing senriose

fadi

Anonymous said...

Was this problem appearing while 'select for update' was executing?

fhasweh said...

so sorry for being late in my reply, the answer is yes 'select for update' with the bitmap index was casing problems
fadi

Anonymous said...

This is a cool discovery. Even Oracle Support cannot figure this out.
We have similar problem and Oracle support every time ask us to increase distributed_lock_timeout...it is 300 seconds now.

We use remote database only for quering and these queries are executed in few seconds.