Newsgroup: comp.databases.oracle.server

#1
08-16-2005, 01:23 AM
Dennis G Allard
Combining Data Guard with clustered redo logs for high performance standby

Hello everyone.

In Data Guard, setting LogXptMode = SYNC incurs
a performance penalty on the primary database.

That is because there is latency involved in
writing to the standby redo log.

It should be possible to use an active/passive
cluster solution whereby the redo logs are kept
on an external storage array. When the primary
system fails, the standby would take over the
primary server's redo logs by mounting the redo
log disk partition.

I realize that one can use RAC to achieve HA
via a clustered disk array. However, methinks
that a less expensive solution for HA could be
obtained by combining Data Guard with the above
idea of using external storage just to store the
redo logs (and control files?). In this way, one
should be able to operate Data Guard with a
LogXptMode set to ARCH. At time of fail over,
one would need to point the Standby database
server to the redo logs that had been previously
mounted by the primary server.

I have not yet combed the Data Guard documentation
to attempt to figure out if I can make use of its
facilities for transferring archive redo logs and
initiating fail over yet somehow inform the standby
to mount the redo logs on the external storage
prior to activation.

Has someone out there attempted this feat?

Thanks,
Dennis

#2
08-16-2005, 01:56 AM
Mark Bole
Re: Combining Data Guard with clustered redo logs for high performance standby

Dennis G Allard wrote:
> Hello everyone.
>
> In Data Guard, setting LogXptMode = SYNC incurs
> a performance penalty on the primary database.
>


You mean Data Guard Broker, not Data Guard.

> That is because there is latency involved in
> writing to the standby redo log.
>
> It should be possible to use an active/passive
> cluster solution whereby the redo logs are kept
> on an external storage array. When the primary
> system fails, the standby would take over the
> primary server's redo logs by mounting the redo
> log disk partition.


You are confusing online redo logs, standby redo logs, and archived redo
logs.


>
> I realize that one can use RAC to achieve HA
> via a clustered disk array. However, methinks
> that a less expensive solution for HA could be
> obtained by combining Data Guard with the above
> idea of using external storage just to store the
> redo logs (and control files?). In this way, one
> should be able to operate Data Guard with a
> LogXptMode set to ARCH. At time of fail over,
> one would need to point the Standby database
> server to the redo logs that had been previously
> mounted by the primary server.


Are you willing for your online redo logs to be a single point of
failure, or not? Once you answer that question, you can proceed.

>
> I have not yet combed the Data Guard documentation
> to attempt to figure out if I can make use of its
> facilities for transferring archive redo logs and
> initiating fail over yet somehow inform the standby
> to mount the redo logs on the external storage
> prior to activation.
>
> Has someone out there attempted this feat?
>
> Thanks,
> Dennis


-Mark Bole



#3
08-16-2005, 06:47 AM
Dennis G Allard
Re: Combining Data Guard with clustered redo logs for high performance standby

Mark Bole wrote:
> Dennis G Allard wrote:
>
>> Hello everyone.
>>
>> In Data Guard, setting LogXptMode = SYNC incurs
>> a performance penalty on the primary database.
>>

>
> You mean Data Guard Broker, not Data Guard.
>
>> That is because there is latency involved in
>> writing to the standby redo log.
>>
>> It should be possible to use an active/passive
>> cluster solution whereby the redo logs are kept
>> on an external storage array. When the primary
>> system fails, the standby would take over the
>> primary server's redo logs by mounting the redo
>> log disk partition.

>
>
> You are confusing online redo logs, standby redo logs, and archived redo
> logs.


I slightly misspoke -- the clustered disk will have the primary
server redo logs, the primary server archived redo logs (that have
not yet been shipped to the standby), and the primary server
control file.

The Standby will receive copies of the archived redo logs
periodically and use them to maintain the Standby database.

At time of failure of the Primary server the Standby will need
to mount the clustered disk and obtain the Primary server
redo logs and archived redo logs that had not been received
prior to failure.

I implemented this once but used an NFS mount for the 'clustered'
disk. It is highly inadvisable to use an NFS mount for redo logs
so I plan to use a true external active/passive cluster instead
this time around.

Here is some more definition of what I plan to do...

There will be a 'Primary database server' and a 'Standby database
server'. If the Primary server fails, the Standby server will
be 'activated' to replace the Primary server and begin acting
as the Primary server (with due notification and/or IP
modifications so that clients see it as the Primary server).

I will use a physically shared disk array in an active/passive
configuration to store files that are used by the Primary server
before it fails and are mounted and accessed by the Standby after
the Primary server fails. I.e., the shared disk is only used by
the Primary server while the Primary server is operating. But
after the Primary server fails, the Standby will mount the
shared disk to obtain the latest redo logs.

There are three kinds of files on the external disk array:

primary server redo logs
primary server archived redo logs
primary server control files (not sure I need these on the cluster)

A process will send archived redo logs to the standby
at regular intervals. The standby will have its own
local control file. The state diagram for the standby
during the time it is acting as a Standby (pre-activation)
is as follows (the commands shown are current as of Oracle 8.1.7
and may need to be updated for Oracle 10g):


> Standby Database State Diagram (Standby mode)
>
>   Shutdown
>      --[ startup nomount
>          pfile=/u01/app/oracle/admin/<SID>/pfile/initSTANDBY.ora ]-->  Started
>
>   Started
>      --[ alter database mount standby database; ]-->  Mounted (not open)
>
>   Mounted (not open)
>      --[ <receive primary site archived logs>,
>          recover automatic standby database ]-->  Recovered (through most recent archived log)
>
>   Recovered
>      --[ cancel ]-->  Mounted (not open)
>
>   Recovered
>      --[ alter database open read only; ]-->  Open (read only)
>
>   Mounted (not open)
>      --[ alter database open read only; ]-->  Open (read only)
>
>   Mounted (not open)
>      --[ shutdown immediate ]-->  Shutdown
>
>   Open (read only)
>      --[ shutdown abort ]-->  Shutdown



The Standby control files and initSTANDBY.ora are created by the
following procedure:

> 1. Back up the primary database data files via either cold or hot backup.
>
> 2. Create a standby control file
> (ALTER DATABASE CREATE STANDBY CONTROLFILE AS 'filename')
>
> 3. Archive the current online redo logs of the primary database
> (ALTER SYSTEM ARCHIVE LOG CURRENT)
>
> 4. Transfer the standby control file, archived log files and backed-up
> data files to the standby database.



In the above diagram, nodes represent states that the Standby database
can be in and arrows are labeled by the database operation used to
cause a transition to the state the arrow points to.

The process <receive primary site archived logs> can be implemented
via scp or other remote copy program over a TCP connection. (I'm
hoping Data Guard [Broker] can help with that.)

Normally, in the above diagram, the Standby oscillates between the
states 'Mounted' and 'Recovered'. After new archived redo logs
arrive from the Primary database, the Standby is transitioned
to the 'Recovered' state (my terminology) by the 'recover
automatic standby database' command, which applies every available
archived redo log that the Standby has not yet applied (that is, logs
whose changes lie beyond the SCN recorded in the Standby control file).
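
A minimal sketch of the state diagram as a transition table, in Python purely for illustration. The state names are shorthand for the diagram's nodes, the command strings are the 8.1.7-era commands quoted in this post (the pfile argument and trailing semicolons are dropped for brevity), and `step` and `run` are hypothetical helper names, not part of any Oracle tooling.

```python
# Transition table for the Standby (pre-activation) state diagram.
# State names are shorthand for the diagram's nodes; command strings are
# the 8.1.7-era commands quoted in the post, not verified against 10g.
TRANSITIONS = {
    ("Shutdown", "startup nomount"): "Started",
    ("Started", "alter database mount standby database"): "Mounted",
    ("Mounted", "recover automatic standby database"): "Recovered",
    ("Recovered", "cancel"): "Mounted",
    ("Mounted", "alter database open read only"): "Open (read only)",
    ("Recovered", "alter database open read only"): "Open (read only)",
    ("Mounted", "shutdown immediate"): "Shutdown",
    ("Open (read only)", "shutdown abort"): "Shutdown",
}


def step(state, command):
    """Return the next state, or raise if the diagram has no such arrow."""
    try:
        return TRANSITIONS[(state, command)]
    except KeyError:
        raise ValueError(f"no transition from {state!r} on {command!r}") from None


def run(commands, state="Shutdown"):
    """Apply commands in order and return the list of states visited."""
    visited = [state]
    for command in commands:
        state = step(state, command)
        visited.append(state)
    return visited


if __name__ == "__main__":
    # Normal steady state: mount once, then alternate recover/cancel as
    # each batch of archived logs arrives.
    print(run([
        "startup nomount",
        "alter database mount standby database",
        "recover automatic standby database",
        "cancel",
        "recover automatic standby database",
    ]))
```

An illegal command, such as `cancel` while the database is shut down, raises ValueError, which gives a cheap way to sanity-check a scripted sequence against the diagram before touching a real database.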

Now, what happens when the Primary server fails?

The fail over process causes the Standby Server to perform
the following:

> .
> . -- Primary site failure detected
> .
> started (mounted)
> |
> | obtain primary site online redo logs, control files and
> | any remaining archived logs (not sure I need the control files)
> |
> | recover automatic standby database;
> |
> Recovered (through most recent archived log)
> |
> | cancel
> |
> Mounted (not open, recovered through all primary site archived logs)
> |
> | shutdown immediate
> |
> Shutdown
> |
> | <copy production control files, production online redo logs
> | to /u01/oradata/<SID>/ -- OK to do copy prior to above shutdown>
> |
> | <Ensure production parameter file is in admin/pfile/init<SID>.ora>
> |
> Shutdown (still)
> |
> | startup mount pfile=/u01/app/oracle/admin/<SID>/pfile/init<SID>.ora
> |
> |
> Mounted
> |
> | recover database
> |
> |
> Recovered
> |
> | alter database open
> |
> |
> Open (normal read/write operation)
> |
> | alter system archive log all [ Doc says do this 'after opening'
> | but Appendix C - Post Open Script does it before opening :-( ]
> |
> Continue using system as a primary site database
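
The failover sequence above can be written down as an ordered checklist. This is only a sketch in Python under stated assumptions: the commands and paths are the ones quoted in this post, the function name `failover_plan` and the "sql"/"manual" labels are hypothetical, and nothing here talks to a database.

```python
def failover_plan(sid):
    """Return the ordered failover steps from the sequence above.

    Each step is (kind, text): "sql" for SQL*Plus commands quoted in the
    post, "manual" for actions the post describes only in words.
    """
    pfile = f"/u01/app/oracle/admin/{sid}/pfile/init{sid}.ora"
    return [
        ("manual", "obtain primary online redo logs, control files and "
                   "any remaining archived logs"),
        ("sql", "recover automatic standby database;"),
        ("sql", "cancel"),
        ("sql", "shutdown immediate"),
        ("manual", "copy production control files and online redo logs "
                   f"to /u01/oradata/{sid}/"),
        ("manual", f"ensure production parameter file is at {pfile}"),
        ("sql", f"startup mount pfile={pfile}"),
        ("sql", "recover database"),
        ("sql", "alter database open"),
        ("sql", "alter system archive log all"),
    ]


if __name__ == "__main__":
    # "PROD" is a made-up SID used only to show the substitution.
    for number, (kind, text) in enumerate(failover_plan("PROD"), start=1):
        print(f"{number:2d}. [{kind}] {text}")
```

Keeping the manual steps in the same list as the SQL commands makes it obvious which parts of the procedure would still need scripting (for example, mounting the shared partition) before the failover could be automated.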






>> I realize that one can use RAC to achieve HA
>> via a clustered disk array. However, methinks
>> that a less expensive solution for HA could be
>> obtained by combining Data Guard with the above
>> idea of using external storage just to store the
>> redo logs (and control files?). In this way, one
>> should be able to operate Data Guard with a
>> LogXptMode set to ARCH. At time of fail over,
>> one would need to point the Standby database
>> server to the redo logs that had been previously
>> mounted by the primary server.

>
>
> Are you willing for your online redo logs to be a single point of
> failure, or not? Once you answer that question, you can proceed.


I will either use Oracle mirroring or Hardware mirroring
of the online redo logs. In that sense, they will not be a
single point of failure.

The external disk array we have has dual storage processors,
dual Fibre Channel ports, and hardware RAID.

It is true that if we mirror enough information to internal
disks on the Standby, we can achieve a further level of
reliability. But my goal is to have *zero* impact on
transaction performance of the Primary server. My reading
of the Data Guard documentation is that it is not able to
provide recoverability to the last committed transaction without
some loss of performance (LogXptMode = SYNC).

Cheers,
Dennis


>> I have not yet combed the Data Guard documentation
>> to attempt to figure out if I can make use of its
>> facilities for transferring archive redo logs and
>> initiating fail over yet somehow inform the standby
>> to mount the redo logs on the external storage
>> prior to activation.
>>
>> Has someone out there attempted this feat?
>>
>> Thanks,
>> Dennis

>
>
> -Mark Bole



#4
08-16-2005, 06:52 AM
Dennis G Allard
Re: Combining Data Guard with clustered redo logs for high performance standby

(I just can't stand that in year 2005 we still have
mail clients such as Thunderbird that are unable to
correctly wrap or NOT wrap plain ascii text).

(so, my apologies on behalf of the computer software
community for the poor formatting of my ASCII art in
my previous post, which is entirely out of my control
as far as I can tell)

Dennis
#5
08-16-2005, 05:40 PM
Mark Bole
Re: Combining Data Guard with clustered redo logs for high performance standby

Dennis G Allard wrote:

> Mark Bole wrote:
>
>> Dennis G Allard wrote:
>>

[...]
> I slightly misspoke -- the clustered disk will have the primary
> server redo logs, the primary server archived redo logs (that have
> not yet been shipped to the standby), and the primary server
> control file.
>
> The Standby will receive copies of the archived redo logs
> periodically and use them to maintain the Standby database.
>
> At time of failure of the Primary server the Standby will need
> to mount the clustered disk and obtain the Primary server
> redo logs and archived redo logs that had not been received
> prior to failure.
>

[...]
>
>>> I have not yet combed the Data Guard documentation
>>> to attempt to figure out if I can make use of its
>>> facilities for transferring archive redo logs and
>>> initiating fail over yet somehow inform the standby
>>> to mount the redo logs on the external storage
>>> prior to activation.
>>>

[...]

I think I understand what you are trying to do -- recover the physical
standby with all available archived redo logs, then replace the standby
control file with the primary control file, copy the primary online redo
logs, and start up the database and let it perform automatic instance
recovery as if it were the primary. I doubt it will work, but there's
no harm in testing.

Step back and consider your two main goals: you want guaranteed zero
loss of committed transactions, and you don't want any performance
penalty for copying those transactions to a second location in real
time. I don't think it is possible in this case to get "something for
nothing". Have you measured the actual impact of the performance
penalty? Have you measured the actual impact of losing, say, ten
minutes worth of transactions once every three years?
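
To put rough numbers on that last question, here is a back-of-the-envelope sketch in Python. The figures (a ten-minute lag, one failure every three years) are the ones from the paragraph above; the function names are hypothetical, and the idea that a failure is equally likely at any instant within the lag window is an added assumption, not something measured.

```python
def worst_case_loss_per_year(lag_minutes, years_between_failures):
    """Upper bound on minutes of committed work lost, amortized per year.

    Assumes every failure loses the full lag window.
    """
    return lag_minutes / years_between_failures


def expected_loss_per_year(lag_minutes, years_between_failures):
    """Expected minutes lost per year under a uniform-failure assumption.

    If the failure instant is uniform over the lag window, the unshipped
    redo at the moment of failure averages half the window.
    """
    return (lag_minutes / 2) / years_between_failures


if __name__ == "__main__":
    lag, interval = 10, 3  # figures taken from the paragraph above
    print(f"worst case: {worst_case_loss_per_year(lag, interval):.2f} min/year")
    print(f"expected:   {expected_loss_per_year(lag, interval):.2f} min/year")
```

Under these assumptions the exposure is about 3.3 minutes of transactions per year at worst and about 1.7 on average, which is the figure to weigh against a measured SYNC penalty.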

If you trust your external storage array, why not put your whole
database on the external array, since you are already putting the
control files, online, and archived redo logs there? Do you know that
having a physical standby requires another full Oracle license (false
claims to the contrary notwithstanding)? Do you know that Oracle
recommends never backing up online redo log files, which is very similar
to what you are trying to do? Why not just set your archive lag target
to 10 minutes and maintain the physical standby in a geographically
separate location, which will take care of many, many more scenarios
than the few you are trying to address, and is supported too? If you
look at it from a business point of view, rather than technical, you'll
be better off in my opinion.

-Mark Bole

#6
08-16-2005, 07:38 PM
Dennis G Allard
Re: Combining Data Guard with clustered redo logs for high performance standby

Mark Bole wrote:
> Dennis G Allard wrote:
>
>> Mark Bole wrote:
>>
>>> Dennis G Allard wrote:
>>> [...]


> I think I understand what you are trying to do -- recover the physical
> standby with all available archived redo logs, then replace the standby
> control file with the primary control file, copy the primary online redo
> logs, and start up the database and let it perform automatic instance
> recovery as if it were the primary. I doubt it will work, but there's
> no harm in testing.


Actually, not 'copy' the redo logs. Instead, mount the physical
partition containing the redo logs (etc.) on the Standby so that
it now sees those files as its own. That is the essence of an
active/passive cluster. What used to be files local to the Primary
server become local files on the Standby after the Primary goes
off line and the Standby mounts the partitions.

> Step back and consider your two main goals: you want guaranteed zero
> loss of committed transactions, and you don't want any performance
> penalty for copying those transactions to a second location in real
> time. I don't think it is possible in this case to get "something for
> nothing". Have you measured the actual impact of the performance
> penalty? Have you measured the actual impact of losing, say, ten
> minutes worth of transactions once every three years?


In our case, I do agree that we can 'afford' to lose some transactions
if it is very rare. But I also believe that the new clustering
technologies (such as Red Hat Linux Cluster Suite) will make it
possible to have one's cake and eat it too!

I have scoured the web and USENET and Oracle docs. I am now convinced
that this form of active/passive cluster database failover is not
in common use. However, I see no reason that it cannot be implemented.

>
> If you trust your external storage array, why not put your whole
> database on the external array, since you are already putting the
> control files, online, and archived redo logs there? Do you know that


As a matter of fact, I have decided to do just that! Because then I
could use the active/passive failover technique by simply installing
Oracle on a backup server but keep it turned off. Fail over would
cause the backup server to mount the external disks and bring Oracle
up (and, for that matter, take over the IP of the failed primary
server).

> having a physical standby requires another full Oracle license (false
> claims to the contrary notwithstanding)? Do you know that Oracle


I just attended Linux Expo in San Francisco last week. Oracle reps
stated that they have a 'ten day rule' -- as long as you don't use
the standby server database more than ten days in the year, there is
no license fee.

> recommends never backing up online redo log files, which is very similar
> to what you are trying to do? Why not just set your archive lag target


Again, I'm not copying redo logs. I'm merely remounting them to a
different CPU box. I agree there is no point in backing up redo
logs (as opposed to archived redo logs, which, of course, one should
back up, going back at least to the last hot or cold backup of the database).

> to 10 minutes and maintain the physical standby in a geographically
> separate location, which will take care of many, many more scenarios
> than the few you are trying to address, and is supported too? If you


My only issue with Data Guard is the potential performance hit. It seems
you agree that using LogXptMode = SYNC is expensive but ASYNC or ARCH
modes are tolerable. Is that correct?

> look at it from a business point of view, rather than technical, you'll
> be better off in my opinion.


I agree. My current plan is to use a Standby (ten day rule) in ASYNC
mode. If I can set the lag to ten minutes, fine (I have to dig into the
Data Guard docs now to see how that is done). If it is possible to
dynamically change from ASYNC to SYNC modes, I would run the system in
SYNC mode for most daily operations but in certain circumstances when
we have higher database use (the nature of our application is such that
we get spikes in usage) I would turn down the volume and use a higher
lag time.

Do you know if I can dynamically modify the lag time?

thanks for your help,
Dennis

>
> -Mark Bole
>

#7
08-16-2005, 09:19 PM
Mark Bole
Re: Combining Data Guard with clustered redo logs for high performance standby



Dennis G Allard wrote:

[...]
>>
>> If you trust your external storage array, why not put your whole
>> database on the external array, since you are already putting the
>> control files, online, and archived redo logs there? Do you know that

>
>
> As a matter of fact, I have decided to do just that! Because then I
> could use the active/passive failover technique by simply installing
> Oracle on a backup server but keep it turned off. Fail over would
> cause the backup server to mount the external disks and bring Oracle
> up (and, for that matter, take over the IP of the failed primary
> server).
>


This is exactly what products like HP-UX Serviceguard and Veritas VCS
have provided for many years.

>> having a physical standby requires another full Oracle license (false
>> claims to the contrary notwithstanding)? Do you know that Oracle

>
>
> I just attended Linux Expo in San Francisco last week. Oracle reps
> stated that they have a 'ten day rule' -- as long as you don't use
> the standby server database more than ten days in the year, there is
> no license fee.


Read the following document, the section on "Backup/Failover/Standby".

http://www.oracle.com/corporate/pricing/sig.pdf

The confusion arises because of sloppy use of the words "standby" vs.
"failover".

To summarize: if you are applying redo to a separate copy of the
database (meaning you have to mount the database), it is a standby and
requires a license for the server that runs it. If you are simply
moving the primary database from one node to another via mounting and
unmounting shared storage, it is a "failover" and the ten day rule applies.


[...]
>
> My only issue with Data Guard is the potential performance hit. It seems
> you agree that using LogXptMode = SYNC is expensive but ASYNC or ARCH
> modes are tolerable. Is that correct?
>


No, because I have never used Data Guard "maximum protection" mode, but
if I did, I would measure the performance "hit" first to see how bad it
was before concerning myself with steps to address it.


-Mark Bole



#8
08-17-2005, 02:06 AM
IANAL_VISTA
Re: Combining Data Guard with clustered redo logs for high performance standby

Dennis G Allard <allard@oceanpark.com> wrote in
news:430240AE.9050505@oceanpark.com:

> Mark Bole wrote:
>> Dennis G Allard wrote:
>>
>>> Mark Bole wrote:
>>>
>>>> Dennis G Allard wrote:
>>>> [...]

>
>> I think I understand what you are trying to do -- recover the
>> physical standby with all available archived redo logs, then replace
>> the standby control file with the primary control file, copy the
>> primary online redo logs, and start up the database and let it
>> perform automatic instance recovery as if it were the primary. I
>> doubt it will work, but there's no harm in testing.

>
> Actually, not 'copy' the redo logs. Instead, mount the physical
> partition containing the redo logs (etc.) on the Standby so that
> it now sees those files as its own. That is the essence of an
> active/passive cluster. What used to be files local to the Primary
> server become local files on the Standby after the Primary goes
> off line and the Standby mounts the partitions.
>
>> Step back and consider your two main goals: you want guaranteed zero
>> loss of committed transactions, and you don't want any performance
>> penalty for copying those transactions to a second location in real
>> time. I don't think it is possible in this case to get "something
>> for nothing". Have you measured the actual impact of the performance
>> penalty? Have you measured the actual impact of losing, say, ten
>> minutes worth of transactions once every three years?

>
> In our case, I do agree that we can 'afford' to lose some transactions
> if it is very rare. But I also believe that the new clustering
> technologies (such as Red Hat Linux Cluster Suite) will make it
> possible to have one's cake and eat it too!
>
> I have scoured the web and USENET and Oracle docs. I am now convinced
> that this form of active/passive cluster database failover is not
> in common use. However, I see no reason that it cannot be
> implemented.
>
>>
>> If you trust your external storage array, why not put your whole
>> database on the external array, since you are already putting the
>> control files, online, and archived redo logs there? Do you know
>> that

>
> As a matter of fact, I have decided to do just that! Because then I
> could use the active/passive failover technique by simply installing
> Oracle on a backup server but keep it turned off. Fail over would
> cause the backup server to mount the external disks and bring Oracle
> up (and, for that matter, take over the IP of the failed primary
> server).
>
>> having a physical standby requires another full Oracle license (false
>> claims to the contrary notwithstanding)? Do you know that Oracle

>
> I just attended Linux Expo in San Francisco last week. Oracle reps
> stated that they have a 'ten day rule' -- as long as you don't use
> the standby server database more than ten days in the year, there is
> no license fee.
>
>> recommends never backing up online redo log files, which is very
>> similar to what you are trying to do? Why not just set your archive
>> lag target

>
> Again, I'm not copying redo logs. I'm merely remounting them to a
> different CPU box. I agree there is no point in backing up redo
> logs (as opposed to archived redo logs, which, of course, one should
> back up, going back at least to the last hot or cold backup of the database).
>


The mount point for redo logfiles mentioned immediately above could
be a nasty single point of failure that could bring down both DBs.