HOTSOS!

So, yeah, kinda let this die for a while. Hope to post some recent work soon. Just a quick note that I’m at HOTSOS13 and looking forward to a few days of stuffing my brain! I’m on twitter as @tylervanvierzen if you happen to be there too.

OGG-00360 Oracle GoldenGate 11.2: EXT TRAIL/FILE cannot be used with REPLICAT GROUP.

I assisted with a GoldenGate 11.1 -> 11.2 upgrade today. It was pretty hairy — a lot of small but important changes.  One thing I wanted to toss out here, because it cost me a few hours, is that there is a deprecation/error handling change missing from the GoldenGate 11.2 documentation. In previous versions, whether it paid attention to it or not, for REPLICAT we were able to have the EXTTRAIL command listed in both the initial setup (ie. using an obey file) AND in the PRM file. As of 11.2, if you have it listed in the PRM file, it gives you an error that is really difficult to understand (at least when you are under the mistaken impression that it HAS to be in the replicat’s prm file … clearly it does not).  (Also submitted this to Oracle Support as a suggested metalink note, since this OGG error is not searchable on their site.)

The error message is:

2012-05-18 09:52:32 ERROR OGG-00360 Oracle GoldenGate Delivery for Oracle, rohcsd41.prm: EXT TRAIL/FILE cannot be used with REPLICAT GROUP.

and the fix is:

Remove or comment out any EXTTRAIL commands in the PRM file. EXTTRAIL should be set with an earlier configuration command or in the obey file, not in the PRM file. (I’m assuming this has always been the case and in 11.2 they just decided to start throwing an error about this.)

Example from the .prm file:

— **** THIS IS DEPRECATED IN 11.2 PRM FILES FOR REPLICAT PRM FILES!!! ****
–EXTTRAIL ./dirdat/myserver14/HCS/rohcsd41/e1

Oracle GoldenGate tip: DDL INCLUDE/EXCLUDE and comments

Figured this out by trial and error, actually while I was compiling a convincing “you have a bug” log for Oracle Support prior to opening an SR.  🙂  The odds of hitting this are small, but it’s worth putting a post out there just in case.  Certainly drove us nuts for a while.

GoldenGate DDL include/exclude statements look something like this:

 DDL INCLUDE MAPPED &
 EXCLUDE ALL INSTR 'TRIGGER' &
 EXCLUDE ALL INSTR 'GRANT' &
 EXCLUDE ALL OPTYPE 'ALTER' &
 OBJTYPE 'SESSION'
;

My goal was to exclude all trigger operations.  I tried both OPTYPE/OBJTYPE (for ‘create’ and ‘trigger’) and INSTR in a one-table extract/pump/replicat setup, doing the DDL excludes in the replicat file.  It complete ignored whatever I did.  Finally, I removed a line leftover from testing, which was commented out as follows:

--DDL INCLUDE ALL &   <--- this line
 DDL INCLUDE MAPPED &
 EXCLUDE ALL INSTR 'TRIGGER' &
 EXCLUDE ALL INSTR 'GRANT' &
 EXCLUDE ALL OPTYPE 'ALTER' &
 OBJTYPE 'SESSION'
;

Well it turns out that in this particular formatting case, GoldenGate ignores your comment “–” and parses the “INCLUDE ALL” and then ignores the excludes.  After a bunch of testing to try and find something that would actually work to exclude TRIGGER operations, removing this comment:

--DDL INCLUDE ALL &

was the ‘magic’ fix.  Now triggers operations are correctly excluded when using INSTR. However, it still does not work to do the following:

DDL INCLUDE MAPPED &
 EXCLUDE ALL OPTYPE 'CREATE' & 
             OBJTYPE 'TRIGGER' 
;

CREATE TRIGGER DDL was still replicated with this configuration in the replicat.  But at least INSTR works.

Oracle GoldenGate v11 and RMAN-08137

This is worth a post, despite my dearth of time to post lately … just had an interesting one and found the perfect blog article to resolve my issue, so definitely need to send a shoutout to the author, Andy Black.

In a nutshell, v11 of GG now uses the same sort of “delete only when applied” method that logicals do.  v10 does not.  If you create an extract and don’t delete it, it thinks the archive logs are still needed  and won’t delete them when you do an rman backup.  I did have the extract enabled for a bit, I think that is also needed to create this situation, not just an unused extract.

This of course can provide no end of confusion if you have a primary database with both dataguard and goldengate running, and you don’t know GG v11 made this change.  I was so confused as to why my logicals weren’t telling the primary that they were long done with these logs.  Turns out GG was the holdup.

Andy provides more than enough info to fix the problem, and I also left a comment there about a specific situation and how to fix it (quick ref – make sure to dblogin before you delete an extract, or it will leave a record in db_capture and you’ll end up with RMAN-08137 issues).

Gotta love Oracle / DBA work, learn something new every day.

UPDATE 12/14: we had the same problem but in a restored non-prod copy of the database. Even after installing goldengate and trying to use ggsci to remove the extract, it wouldn’t budge from dba_capture. So I put in an oracle SR. They came back with this oracle article (1351352.1) and this DBMS to remove things manually from dba_capture – DBMS_CAPTURE_ADM.DROP_CAPTURE.  I used the following:

select 'exec DBMS_CAPTURE_ADM.DROP_CAPTURE ('''||capture_name||''');' from dba_capture;

then run whichever bits of the output need to be removed.

DBMS_SPM.EVOLVE_SQL_PLAN_BASELINE runs and evolves successfully, but errors when returning output (ORA-06502)

We recently have upgraded many of our databases to 11g.  One of our tools to make sure performance did not degrade was to implement SQL Plan Baselines for our top queries, as identified a few different ways via AWR.  That’s a long story, but one interesting short note is that after you have Baselines, you might want to evaluate them periodically to see if there is a better plan available.  Oracle provides DBMS_SPM.EVOLVE_SQL_PLAN_BASELINE so you can do this.

I chose to wrap my evolve in a shell script and call it from cron.  The script worked perfectly for just over a month, then one day all my log had in it was an error:

DECLARE
*
ERROR at line 1:
ORA-06502: PL/SQL: numeric or value error
ORA-06512: at line 13

I tried increasing my SET LONG statement – no dice.  Also, it was clear from the baselines themselves and the total run time before the error that the evolve was actually working, it was just the log/text output at the end that was failing.  In fact, the DBMS_SPM.EVOLVE_SQL_PLAN_BASELINE wasn’t failing at all.  It was actually DBMS_OUTPUT.PUT_LINE that was failing.

Finally I read that DBMS_OUTPUT.PUT_LINE has a 32767 character limit for a single line.   Since I could see that my output had grown slowly but steadily each day, and on the day it started failing it had grown beyond this limit, this was the answer.  This is not exactly a bug in the DBMS_SPM package, however since this output should be expected to be long, one wonders why a better output method wasn’t developed.  Oh well – on to the solution.

Many  smart people have posted about how to chunk up a clob to get around this limit and print it to the screen/sqlplus.  I happened to like and use this one (modified to use put_line instead of writing to a file).

My working evolve script — leaving out the surrounding shell stuff which you’d want to customize for your environment anyway — looks like this:

set serveroutput on
SET LONG 1000000
DECLARE
  results  CLOB;
  r_len    NUMBER;
  putlimit constant binary_integer := 32767;
  r_buffer VARCHAR2(32767);
  r_pos    PLS_INTEGER := 1;
BEGIN
  results := DBMS_SPM.EVOLVE_SQL_PLAN_BASELINE (
   time_limit   => 180,
   verify       => 'YES',
   commit       => 'YES'
    );
  dbms_output.put_line('finished with the evolve here, no errors so far.');
  r_len := dbms_lob.getlength(results);
  dbms_output.put_line('length of results var: '||r_len);
  while r_pos < r_len loop
    r_buffer := dbms_lob.substr(results, putlimit, r_pos);
    exit when r_buffer is null;
    DBMS_OUTPUT.PUT_LINE(r_buffer);
    r_pos := r_pos + LEAST(LENGTH(r_buffer)+1,putlimit);
  end loop;
END;
/