Saturday, November 17, 2012

503 Service Unavailable : Oracle Grid Control

Recently I discovered that Grid Control was not working. The error displayed was:

503 service unavailable servlet error service is not initialized correctly.

I bounced all the services from services.msc (our server runs on Windows).

However, the error was still there. I searched Metalink and found Note ID 418159.1.

On reading the note and diagnosing further, I found that the file emoms.properties had been nullified (it was empty). This file is used by the OMS when its services start.

As per the note, this happens when the disk fills up and the server is restarted.

No backup was available, so I followed the note and created a new file.

After creating the file, the services started nicely... :-)

Friday, September 21, 2012

High ENQ:SQ contention in database

This happened recently when, all of a sudden, the database stopped responding.

The only option left was to bounce the DB. Since it is a RAC setup, we shut down one of the instances first, and fortunately things came back to normal after it was restarted.

Later, when I analyzed the AWR report for the first instance of the database, I found enq: SQ contention among the top events.
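
If the hang is still in progress, the same wait can also be looked at live instead of only afterwards in AWR. A minimal sketch, assuming the event is reported under its usual name 'enq: SQ - contention' and that you can query gv$session:

```sql
-- Count sessions currently waiting on the sequence enqueue, per RAC instance.
SELECT inst_id, event, COUNT(*) AS waiting_sessions
FROM   gv$session
WHERE  event = 'enq: SQ - contention'
AND    state = 'WAITING'
GROUP  BY inst_id, event
ORDER  BY waiting_sessions DESC;
```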

This was new to me, so I checked the top SQL section of the AWR report.

All the top queries there were accessing one particular sequence. When I checked, this sequence was newly created and its cache size was very small: 10.

So I increased the cache size to 100 initially, and so far things are working fine.
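
The check and the change can be sketched as follows. The owner and sequence names (APP_OWNER, ORDER_SEQ) are hypothetical placeholders, not the real ones from this incident:

```sql
-- Check the current cache size (APP_OWNER and ORDER_SEQ are hypothetical names).
SELECT sequence_owner, sequence_name, cache_size, order_flag
FROM   dba_sequences
WHERE  sequence_owner = 'APP_OWNER'
AND    sequence_name  = 'ORDER_SEQ';

-- Raise the cache from 10 to 100.
ALTER SEQUENCE app_owner.order_seq CACHE 100;
```

A larger cache means each instance goes back to the data dictionary for a new range of values less often, which is where the SQ enqueue is taken.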

I think the problem was caused by this particular sequence, and it should not occur again.


I will try to dig into this further...

Wednesday, July 25, 2012

ORA-00600 error while querying a table

This happened recently in one of our setups. A user querying a table for some data was getting ORA-00600 [4000] ..., and the issue came to us for resolution. This was the first time I had come across an issue like this related to table corruption, so to diagnose further we first ran a count(*) on the table.

1. Select count(*) from table;

ORA-00600: internal error code, arguments:[4000]...

Next we did an export of the table, and it went fine.

2. In this step we ran VALIDATE STRUCTURE with the CASCADE option.

SQL> analyze table table_name validate structure cascade;

ORA-00600: internal error code, arguments:[4000]...

This again failed with the same error.

3. Now, to check whether it was index corruption or table corruption, we ran VALIDATE STRUCTURE again, this time without CASCADE, and it ran fine without any errors.

SQL> analyze table table_name validate structure;

Table Analyzed.

4. From this we concluded that the table was intact and the corruption was in the indexes. Further diagnosis pointed to a single index, and that was the primary key index.

5. An online rebuild of the index was done.

SQL> alter index index_name rebuild online parallel 10;

This solved the problem and the error was gone.
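
The narrowing-down in step 4 can be sketched like this. The owner, table and index names (APP_OWNER, ORDERS, ORDERS_PK) are hypothetical placeholders:

```sql
-- List the indexes on the affected table (APP_OWNER / ORDERS are hypothetical).
SELECT owner, index_name, uniqueness, status
FROM   dba_indexes
WHERE  table_owner = 'APP_OWNER'
AND    table_name  = 'ORDERS';

-- Validate each index on its own; the one that raises the error is the suspect.
ANALYZE INDEX app_owner.orders_pk VALIDATE STRUCTURE;
```

Checking the indexes one at a time avoids rebuilding indexes that are actually healthy.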

Wednesday, July 13, 2011

RMAN-20035

I received this error today while configuring RMAN on one of the databases.

RMAN-20035: Invalid high RECID error

In this case Note 273446.1 from Oracle Support was helpful.

It suggests unregistering the database using the DBMS_RCVCAT.UNREGISTERDATABASE procedure and then registering the database again.

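This is what I did, and the problem is resolved as of now.
The following commands, run as the recovery catalog owner, are helpful (replace <DBID> and <DB_KEY> with the values for your database; DB_KEY comes from the first query):

select * from rc_database where dbid = <DBID>;

exec dbms_rcvcat.unregisterdatabase(<DB_KEY>, <DBID>);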

Monday, July 11, 2011

ORA-00600: internal error code, arguments: [4194],[],[]

I have very little experience in recovering databases. However, I have recently recovered some big databases, and in one of them I saw this error:
ORA-00600: internal error code, arguments: [4194],
Doing block recovery for file 5 block 18791
Block recovery from logseq 4237, block 76 to scn 6058257035817
This kept appearing in the alert log.
After searching for a while, the solution I came across was to drop the undo tablespace and recreate it. That is what I did, and the error was resolved.
Hope this helps.
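
A rough sketch of what dropping and recreating the undo tablespace can look like. It assumes the database is open, uses an spfile and is a single instance. The tablespace names, datafile path and size are hypothetical, and the exact steps in a given case may differ:

```sql
-- 1. Create a replacement undo tablespace (name, path and size are hypothetical).
CREATE UNDO TABLESPACE undotbs2
  DATAFILE '/u01/oradata/ORCL/undotbs2_01.dbf' SIZE 2G AUTOEXTEND ON;

-- 2. Point the instance at the new undo tablespace.
ALTER SYSTEM SET undo_tablespace = 'UNDOTBS2' SCOPE = BOTH;

-- 3. Drop the old one once no active transactions still use it.
DROP TABLESPACE undotbs1 INCLUDING CONTENTS AND DATAFILES;
```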

Tuesday, April 12, 2011

Jumbo frames in Oracle RAC

Recently we implemented jumbo frames in our RAC environment. I had read a lot about jumbo frames and how they can help in a cluster-based environment, but had never actually experienced them in a real setup.
In one of our setups, after a recent migration (basically a hardware migration, an upgrade to more advanced servers), the database was slow and there were many cluster-related wait events. The main ones were "gc cr request" and "gc cr congested"; both showed up in large numbers when we queried the gv$session_wait view.
There was no quick fix for these events. We tuned a few problematic SQL statements, which reduced them to some extent, but most of the time the average CR block receive time was higher than normal, and so was the average current block receive time.
Finally, after a few days, we implemented jumbo frames in our RAC environment. Right after that, the average CR block receive time came down tremendously, and the gc cr request and gc cr congested events became far less frequent.
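
A query along these lines shows how many sessions are sitting on those events. This is a sketch rather than the exact query we ran, and the LIKE pattern is deliberately loose so that it catches all "gc cr" waits:

```sql
-- Sessions currently waiting on global cache CR events, per instance.
SELECT inst_id, event, COUNT(*) AS waiting_sessions
FROM   gv$session_wait
WHERE  event LIKE 'gc cr%'
GROUP  BY inst_id, event
ORDER  BY waiting_sessions DESC;
```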
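
For reference, the average CR block receive time can be estimated from gv$sysstat. A minimal sketch, assuming the statistics 'gc cr block receive time' (reported in centiseconds, hence the factor of 10) and 'gc cr blocks received' as they appear in 10g and later:

```sql
-- Average CR block receive time in milliseconds, per instance.
-- Values are cumulative since instance startup, so compare before/after snapshots.
SELECT t.inst_id,
       ROUND(10 * t.value / NULLIF(b.value, 0), 2) AS avg_cr_block_receive_ms
FROM   gv$sysstat t
JOIN   gv$sysstat b ON b.inst_id = t.inst_id
WHERE  t.name = 'gc cr block receive time'
AND    b.name = 'gc cr blocks received';
```

The same query with 'gc current block receive time' and 'gc current blocks received' gives the figure for current blocks.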

So I would say jumbo frames are very useful, although they are not recommended by the support people.