Hello everyone.

In Data Guard, setting LogXptMode = SYNC incurs a performance penalty on the primary database. That is because there is latency involved in writing to the standby redo log.

It should be possible to use an active/passive cluster solution whereby the redo logs are kept on an external storage array. When the primary system fails, the standby would take over the primary server's redo logs by mounting the redo log disk partition.

I realize that one can use RAC to achieve HA via a clustered disk array. However, me thinks that a less expensive solution for HA could be obtained by combining Data Guard with the above idea of using external storage just to store the redo logs (and control files?). In this way, one should be able to operate Data Guard with a LogXptMode set to ARCH. At time of fail over, one would need to point the Standby database server to the redo logs that had been previously mounted by the primary server.

I have not yet combed the Data Guard documentation to attempt to figure out if I can make use of its facilities for transferring archive redo logs and initiating fail over yet somehow inform the standby to mount the redo logs on the external storage prior to activation.

Has someone out there attempted this feat?

Thanks,
Dennis
Dennis G Allard wrote:
> Hello everyone.
>
> In Data Guard, setting LogXptMode = SYNC incurs
> a performance penalty on the primary database.
>

You mean Data Guard Broker, not Data Guard.

> That is because there is latency involved in
> writing to the standby redo log.
>
> It should be possible to use an active/passive
> cluster solution whereby the redo logs are kept
> on an external storage array. When the primary
> system fails, the standby would take over the
> primary server's redo logs by mounting the redo
> log disk partition.

You are confusing online redo logs, standby redo logs, and archived redo logs.

>
> I realize that one can use RAC to achieve HA
> via a clustered disk array. However, me thinks
> that a less expensive solution for HA could be
> obtained by combining Data Guard with the above
> idea of using external storage just to store the
> redo logs (and control files?). In this way, one
> should be able to operate Data Guard with a
> LogXptMode set to ARCH. At time of fail over,
> one would need to point the Standby database
> server to the redo logs that had been previously
> mounted by the primary server.

Are you willing for your online redo logs to be a single point of failure, or not? Once you answer that question, you can proceed.

>
> I have not yet combed the Data Guard documentation
> to attempt to figure out if I can make use of its
> facilities for transferring archive redo logs and
> initiating fail over yet somehow inform the standby
> to mount the redo logs on the external storage
> prior to activation.
>
> Has someone out there attempted this feat?
>
> Thanks,
> Dennis

-Mark Bole
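
To make the distinction between the three kinds of log concrete, the queries below list each kind separately. This is only a sketch, assuming an Oracle 9i or later database and a session allowed to read the V$ views; the column selection is illustrative, and V$STANDBY_LOG returns no rows unless standby redo logs have actually been created.

```sql
-- Online redo logs: the groups the instance writes committed changes to.
SELECT group#, sequence#, status, archived FROM v$log;

-- Files behind each group; TYPE distinguishes ONLINE from STANDBY members.
SELECT group#, type, member FROM v$logfile;

-- Standby redo logs: separate groups that receive redo shipped from the primary.
SELECT group#, sequence#, status, archived FROM v$standby_log;

-- Archived redo logs: copies of filled online logs, the unit shipped in ARCH mode.
SELECT sequence#, name, applied FROM v$archived_log ORDER BY sequence#;
```

Only the first kind lives exclusively on the primary, which is why the original proposal hinges on making those files reachable from the standby host.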
Mark Bole wrote:
> Dennis G Allard wrote:
>
>> Hello everyone.
>>
>> In Data Guard, setting LogXptMode = SYNC incurs
>> a performance penalty on the primary database.
>>
>
> You mean Data Guard Broker, not Data Guard.
>
>> That is because there is latency involved in
>> writing to the standby redo log.
>>
>> It should be possible to use an active/passive
>> cluster solution whereby the redo logs are kept
>> on an external storage array. When the primary
>> system fails, the standby would take over the
>> primary server's redo logs by mounting the redo
>> log disk partition.
>
>
> You are confusing online redo logs, standby redo logs, and archived redo
> logs.

I slightly misspoke -- the clustered disk will have the primary server redo logs, the primary server archived redo logs (that have not yet been shipped to the standby), and the primary server control file.

The Standby will receive copies of the archived redo logs periodically and use them to maintain the Standby database.

At time of failure of the Primary server the Standby will need to mount the clustered disk and obtain the Primary server redo logs and archived redo logs that had not been received prior to failure.

I implemented this once but used an NFS mount for the 'clustered' disk. It is highly inadvisable to use an NFS mount for redo logs so I plan to use a true external active/passive cluster instead this time around.

Here is some more definition of what I plan to do...

There will be a 'Primary database server' and a 'Standby database server'. If the Primary server fails, the Standby server will be 'activated' to replace the Primary server and begin acting as the Primary server (with due notification and/or IP modifications so that clients see it as the Primary server).

I will use a physically shared disk array in an active/passive configuration to store files that are used by the Primary server before it fails and are mounted and accessed by the Standby after the Primary server fails.
I.e., the shared disk is only used by the Primary server while the Primary server is operating. But after the Primary server fails, the Standby will mount the shared disk to obtain the latest redo logs.

There are three kinds of files on external disk array:

    primary server redo logs
    primary server archived redo logs
    primary server control files (not sure I need these on the cluster)

A process will send archived redo logs to the standby at regular intervals. The standby will have its own local control file.

The state diagram for the standby during the time it is acting as a Standby (pre-activation) is as follows (the commands shown are current as of Oracle 8.1.7 and may need to be updated for Oracle 10g):

>           Standby Database State Diagram (Standby mode)
>
> +-------+--------> Shutdown
> |       ^             |
> |       |             |
> |       |             | startup nomount
> |       |             |   pfile=/u01/app/oracle/admin/<SID>/pfile/initSTANDBY.ora
> |       |             |
> |       |          Started
> |       |             |
> |       |             |
> |       | shutdown    | alter database mount standby database;
> |       | immediate   |
> |       |             |
> |       +<-------- Mounted (not open) ----------------------------+
> |       +-------->    |                                           |
> |       |             |                                           |
> |       |             | <receive primary site archived logs>      |
> |       |             |                                           |
> |       | cancel      | recover automatic standby database        |
> |       |             |                                           |
> |       |             |                                           | alter
> |       +--------- Recovered (through most recent archived log)   | database
> |                     |                                           | open
> |                     |                                           | read only;
> |                     | alter database open read only;            |
> |                     |                                           |
> |                     |                                           |
> +----------------- Open (read only) <-----------------------------+
>   shutdown abort

The Standby control files and initSTANDBY.ora are created by the following procedure:

> 1. Back up the primary database data files via either cold or hot backup.
>
> 2. Create a standby control file
>    (ALTER DATABASE CREATE STANDBY CONTROLFILE AS 'filename')
>
> 3. Archive the current online redo logs of the primary database
>    (ALTER SYSTEM ARCHIVE LOG CURRENT)
>
> 4. Transfer the standby control file, archived log files and backed-up
>    data files to the standby database.
In the above diagram, nodes represent states that the Standby database can be in and arrows are labeled by the database operation used to cause a transition to the state the arrow points to.

The process <receive primary site archived logs> can be implemented via scp or other remote copy program over a TCP connection. (I'm hoping Data Guard [Broker] can help with that.)

Normally, in the above diagram, the Standby oscillates between the states 'Mounted' and 'Recovered'. After new archived redo logs arrive from the Primary database, the Standby is transitioned to the 'Recovered' state (my terminology) by the 'recover automatic standby database' command, which consumes all available archive redo logs that have an SCN less than the current Standby control state.

Now, what happens when the Primary server fails? The fail over process causes the Standby Server to perform the following

>     .
>     .  -- Primary site failure detected
>     .
>  started (mounted)
>     |
>     |  obtain primary site online redo logs, control files and
>     |  any remaining archived logs (not sure I need the control files)
>     |
>     |  recover automatic standby database;
>     |
>  Recovered (through most recent archived log)
>     |
>     |  cancel
>     |
>  Mounted (not open, recovered through all primary site archived logs)
>     |
>     |  shutdown immediate
>     |
>  Shutdown
>     |
>     |  <copy production control files, production online redo logs
>     |   to /u01/oradata/<SID>/ -- OK to do copy prior to above shutdown>
>     |
>     |  <Assure production parameter file in admin/pfile/init<SID>.ora>
>     |
>  Shutdown (still)
>     |
>     |  startup mount pfile=/u01/app/oracle/admin/<SID>/pfile/init<SID>.ora
>     |
>     |
>  Mounted
>     |
>     |  recover database
>     |
>     |
>  Recovered
>     |
>     |  alter database open
>     |
>     |
>  Open (normal read/write operation)
>     |
>     |  alter system archive log all  [ Doc says do this 'after opening'
>     |     but Appendix C - Post Open Script does it before opening :-( ]
>     |
>  Continue using system as a primary site database

>> I realize that one can use RAC to achieve HA
>> via a clustered disk array. However, me thinks
>> that a less expensive solution for HA could be
>> obtained by combining Data Guard with the above
>> idea of using external storage just to store the
>> redo logs (and control files?). In this way, one
>> should be able to operate Data Guard with a
>> LogXptMode set to ARCH. At time of fail over,
>> one would need to point the Standby database
>> server to the redo logs that had been previously
>> mounted by the primary server.
>
>
> Are you willing for your online redo logs to be a single point of
> failure, or not? Once you answer that question, you can proceed.

I will either use Oracle mirroring or Hardware mirroring of the online redo logs. In that sense, they will not be a single point of failure. The external disk array we have has dual storage processors, dual fiber channel ports, and hardware RAID.

It is true that if we mirror enough information to internal disks on the Standby, we can achieve a further level of reliability. But my goal is to have *zero* impact on transaction performance of the Primary server. My reading of the Data Guard documentation is that it is not able to provide recoverability to the last committed transaction without some loss of performance (LogXptMode = SYNC).

Cheers,
Dennis

>> I have not yet combed the Data Guard documentation
>> to attempt to figure out if I can make use of its
>> facilities for transferring archive redo logs and
>> initiating fail over yet somehow inform the standby
>> to mount the redo logs on the external storage
>> prior to activation.
>>
>> Has someone out there attempted this feat?
>>
>> Thanks,
>> Dennis
>
>
> -Mark Bole
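
For readers who find the state diagram in the post above hard to follow, its steady-state path can be written out as a linear SQL*Plus session. This is only a sketch assembled from the commands quoted in the post (Oracle 8.1.7 era, so later releases may differ); `/tmp/standby.ctl` is a hypothetical path chosen for illustration, `<SID>` is the post's own placeholder, and copying files between hosts is assumed to happen outside the session.

```sql
-- On the primary: steps 2 and 3 of the setup procedure.
-- '/tmp/standby.ctl' is a hypothetical file name, not one from the post.
ALTER DATABASE CREATE STANDBY CONTROLFILE AS '/tmp/standby.ctl';
ALTER SYSTEM ARCHIVE LOG CURRENT;

-- On the standby, once the control file, archived logs and datafile
-- backups have been copied over (Shutdown -> Started -> Mounted).
STARTUP NOMOUNT PFILE=/u01/app/oracle/admin/<SID>/pfile/initSTANDBY.ora
ALTER DATABASE MOUNT STANDBY DATABASE;

-- Repeat whenever a new batch of archived logs has arrived
-- (Mounted -> Recovered in the post's terminology).
RECOVER AUTOMATIC STANDBY DATABASE;

-- Optional: expose the standby for queries (Recovered -> Open read only).
ALTER DATABASE OPEN READ ONLY;
```

The failover path in the second diagram is deliberately left out here, since it depends on the shared-disk remount that the rest of the thread is debating.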
(I just can't stand that in year 2005 we still have mail clients such as Thunderbird that are unable to correctly wrap or NOT wrap plain ascii text).

(so, my apologies on behalf of the computer software community for the poor formatting of my ASCII art in my previous post, which is entirely out of my control as far as I can tell)

Dennis
Dennis G Allard wrote:
> Mark Bole wrote:
>
>> Dennis G Allard wrote:
>>
[...]
> I slightly misspoke -- the clustered disk will have the primary
> server redo logs, the primary server archived redo logs (that have
> not yet been shipped to the standby), and the primary server
> control file.
>
> The Standby will receive copies of the archived redo logs
> periodically and use them to maintain the Standby database.
>
> At time of failure of the Primary server the Standby will need
> to mount the clustered disk and obtain the Primary server
> redo logs and archived redo logs that had not been received
> prior to failure.
>
[...]
>
>>> I have not yet combed the Data Guard documentation
>>> to attempt to figure out if I can make use of its
>>> facilities for transferring archive redo logs and
>>> initiating fail over yet somehow inform the standby
>>> to mount the redo logs on the external storage
>>> prior to activation.
>>>
[...]

I think I understand what you are trying to do -- recover the physical standby with all available archived redo logs, then replace the standby control file with the primary control file, copy the primary online redo logs, and start up the database and let it perform automatic instance recovery as if it were the primary. I doubt it will work, but there's no harm in testing.

Step back and consider your two main goals: you want guaranteed zero loss of committed transactions, and you don't want any performance penalty for copying those transactions to a second location in real time. I don't think it is possible in this case to get "something for nothing". Have you measured the actual impact of the performance penalty? Have you measured the actual impact of losing, say, ten minutes worth of transactions once every three years?

If you trust your external storage array, why not put your whole database on the external array, since you are already putting the control files, online, and archived redo logs there? Do you know that having a physical standby requires another full Oracle license (false claims to the contrary notwithstanding)? Do you know that Oracle recommends never backing up online redo log files, which is very similar to what you are trying to do? Why not just set your archive lag target to 10 minutes and maintain the physical standby in a geographically separate location, which will take care of many, many more scenarios than the few you are trying to address, and is supported too? If you look at it from a business point of view, rather than technical, you'll be better off in my opinion.

-Mark Bole
Mark Bole wrote:
> Dennis G Allard wrote:
>
>> Mark Bole wrote:
>>
>>> Dennis G Allard wrote:
>>>
[...]

> I think I understand what you are trying to do -- recover the physical
> standby with all available archived redo logs, then replace the standby
> control file with the primary control file, copy the primary online redo
> logs, and start up the database and let it perform automatic instance
> recovery as if it were the primary. I doubt it will work, but there's
> no harm in testing.

Actually, not 'copy' the redo logs. Instead, mount the physical partition containing the redo logs (etc.) on the Standby so that it now sees those files as its own. That is the essence of an active/passive cluster. What used to be files local to the Primary server become local files on the Standby after the Primary goes off line and the Standby mounts the partitions.

> Step back and consider your two main goals: you want guaranteed zero
> loss of committed transactions, and you don't want any performance
> penalty for copying those transactions to a second location in real
> time. I don't think it is possible in this case to get "something for
> nothing". Have you measured the actual impact of the performance
> penalty? Have you measured the actual impact of losing, say, ten
> minutes worth of transactions once every three years?

In our case, I do agree that we can 'afford' to lose some transactions if it is very rare. But I also believe that the new clustering technologies (such as Red Hat Linux Cluster Suite) will make it possible to have one's cake and eat it too!

I have scoured the web and USENET and Oracle docs. I am now convinced that this form of active/passive cluster database failover is not in common use. However, I see no reason that it cannot be implemented.

>
> If you trust your external storage array, why not put your whole
> database on the external array, since you are already putting the
> control files, online, and archived redo logs there? Do you know that

As a matter of fact, I have decided to do just that! Because then I could use the active/passive failover technique by simply installing Oracle on a backup server but keep it turned off. Fail over would cause the backup server to mount the external disks and bring Oracle up (and, for that matter, take over the IP of the failed primary server).

> having a physical standby requires another full Oracle license (false
> claims to the contrary notwithstanding)? Do you know that Oracle

I just attended Linux Expo in San Francisco last week. Oracle reps stated that they have a 'ten day rule' -- as long as you don't use the standby server database more than ten days in the year, there is no license fee.

> recommends never backing up online redo log files, which is very similar
> to what you are trying to do? Why not just set your archive lag target

Again, I'm not copying redo logs. I'm merely remounting them to a different CPU box. I agree there is no point in backing up redo logs (as opposed to archived redo logs, which, of course, one should backup back to at least the last hot or cold backup of the database).

> to 10 minutes and maintain the physical standby in a geographically
> separate location, which will take care of many, many more scenarios
> than the few you are trying to address, and is supported too? If you

My only issue with Data Guard is the potential performance hit. It seems you agree that using LogXptMode = SYNC is expensive but ASYNC or ARCH modes are tolerable. Is that correct?

> look at it from a business point of view, rather than technical, you'll
> be better off in my opinion.

I agree. My current plan is to use a Standby (ten day rule) in ASYNC mode. If I can set the lag to ten minutes, fine (I have to dig into the Data Guard docs now to see how that is done).

If it is possible to dynamically change from ASYNC to SYNC modes, I would run the system in SYNC mode for most daily operations but in certain circumstances when we have higher database use (the nature of our application is such that we get spikes in usage) I would turn down the volume and use a higher lag time. Do you know if I can dynamically modify the lag time?

thanks for your help,
Dennis

>
> -Mark Bole
>
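
On the lag-time question: the "archive lag target" mentioned earlier in the thread corresponds to the ARCHIVE_LAG_TARGET initialization parameter, which is expressed in seconds and can be changed with ALTER SYSTEM while the instance is running. A minimal sketch, assuming an Oracle 9i or later primary; the 600 and 1800 values are illustrative choices, not recommendations from the thread.

```sql
-- Force a log switch (and so an archived log to ship) at least
-- every 10 minutes on the primary.
ALTER SYSTEM SET archive_lag_target = 600;

-- SQL*Plus command to confirm the value currently in effect.
SHOW PARAMETER archive_lag_target

-- During a usage spike, accept a longer exposure window instead.
ALTER SYSTEM SET archive_lag_target = 1800;

-- Setting the parameter to 0 turns the time-based switch off again.
ALTER SYSTEM SET archive_lag_target = 0;
```

This bounds how much redo can sit unshipped in ARCH mode; it does not by itself change the transport mode between ASYNC and SYNC.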
Dennis G Allard wrote:
[...]
>>
>> If you trust your external storage array, why not put your whole
>> database on the external array, since you are already putting the
>> control files, online, and archived redo logs there? Do you know that
>
>
> As a matter of fact, I have decided to do just that! Because then I
> could use the active/passive failover technique by simply installing
> Oracle on a backup server but keep it turned off. Fail over would
> cause the backup server to mount the external disks and bring Oracle
> up (and, for that matter, take over the IP of the failed primary
> server).
>

This is exactly what products like HP/UX Service Guard and Veritas VCS have provided for many years.

>> having a physical standby requires another full Oracle license (false
>> claims to the contrary notwithstanding)? Do you know that Oracle
>
>
> I just attended Linux Expo in San Francisco last week. Oracle reps
> stated that they have a 'ten day rule' -- as long as you don't use
> the standby server database more than ten days in the year, there is
> no license fee.

Read the following document, the section on "Backup/Failover/Standby".

http://www.oracle.com/corporate/pricing/sig.pdf

The confusion arises because of sloppy use of the words "standby" vs. "failover". To summarize: if you are applying redo to a separate copy of the database (meaning you have to mount the database), it is a standby and requires a license for the server that runs it. If you are simply moving the primary database from one node to another via mounting and unmounting shared storage, it is a "failover" and the ten day rule applies.

[...]
>
> My only issue with Data Guard is the potential performance hit. It seems
> you agree that using LogXptMode = SYNC is expensive but ASYNC or ARCH
> modes are tolerable. Is that correct?
>

No, because I have never used Data Guard "maximum protection" mode, but if I did, I would measure the performance "hit" first to see how bad it was before concerning myself with steps to address it.

-Mark Bole
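
One possible starting point for the measurement suggested above is to record which protection mode the primary is really running in and how long sessions wait on commit, then compare the figures across transport settings under the same workload. A rough sketch, assuming an Oracle 9i Release 2 or later database; picking the 'log file sync' wait event as the yardstick is an assumption of this example, not something stated in the thread.

```sql
-- Configured mode versus the level currently being achieved.
SELECT protection_mode, protection_level FROM v$database;

-- Cumulative commit waits since instance startup; TIME_WAITED is in
-- hundredths of a second. Sample before and after a test run and
-- compare the difference per wait.
SELECT event, total_waits, time_waited
FROM   v$system_event
WHERE  event = 'log file sync';
```

Because the counters are cumulative, only the difference between two samples taken around the same workload says anything about the cost of a given setting.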
Dennis G Allard <allard@oceanpark.com> wrote in news:430240AE.9050505@oceanpark.com:

> Mark Bole wrote:
>> Dennis G Allard wrote:
>>
>>> Mark Bole wrote:
>>>
>>>> Dennis G Allard wrote:
>>>>
[...]
>
>> I think I understand what you are trying to do -- recover the
>> physical standby with all available archived redo logs, then replace
>> the standby control file with the primary control file, copy the
>> primary online redo logs, and start up the database and let it
>> perform automatic instance recovery as if it were the primary. I
>> doubt it will work, but there's no harm in testing.
>
> Actually, not 'copy' the redo logs. Instead, mount the physical
> partition containing the redo logs (etc.) on the Standby so that
> it now sees those files as its own. That is the essence of an
> active/passive cluster. What used to be files local to the Primary
> server become local files on the Standby after the Primary goes
> off line and the Standby mounts the partitions.
>
>> Step back and consider your two main goals: you want guaranteed zero
>> loss of committed transactions, and you don't want any performance
>> penalty for copying those transactions to a second location in real
>> time. I don't think it is possible in this case to get "something
>> for nothing". Have you measured the actual impact of the performance
>> penalty? Have you measured the actual impact of losing, say, ten
>> minutes worth of transactions once every three years?
>
> In our case, I do agree that we can 'afford' to lose some transactions
> if it is very rare. But I also believe that the new clustering
> technologies (such as Red Hat Linux Cluster Suite) will make it
> possible to have one's cake and eat it too!
>
> I have scoured the web and USENET and Oracle docs. I am now convinced
> that this form of active/passive cluster database failover is not
> in common use. However, I see no reason that it cannot be
> implemented.
>
>>
>> If you trust your external storage array, why not put your whole
>> database on the external array, since you are already putting the
>> control files, online, and archived redo logs there? Do you know
>> that
>
> As a matter of fact, I have decided to do just that! Because then I
> could use the active/passive failover technique by simply installing
> Oracle on a backup server but keep it turned off. Fail over would
> cause the backup server to mount the external disks and bring Oracle
> up (and, for that matter, take over the IP of the failed primary
> server).
>
>> having a physical standby requires another full Oracle license (false
>> claims to the contrary notwithstanding)? Do you know that Oracle
>
> I just attended Linux Expo in San Francisco last week. Oracle reps
> stated that they have a 'ten day rule' -- as long as you don't use
> the standby server database more than ten days in the year, there is
> no license fee.
>
>> recommends never backing up online redo log files, which is very
>> similar to what you are trying to do? Why not just set your archive
>> lag target
>
> Again, I'm not copying redo logs. I'm merely remounting them to a
> different CPU box. I agree there is no point in backing up redo
> logs (as opposed to archived redo logs, which, of course, one should
> backup back to at least the last hot or cold backup of the database).
>

The mount point for redo logfiles mentioned immediately above could be a nasty single point of failure that could bring down both DBs.