DG的日常管理操作
<p>[TOC]</p>
<h2>DG的日常管理操作</h2>
<p>2021/10/9 15:39:26</p>
<h3>DG的启动与关闭</h3>
<ul>
<li>
<p>查看关闭启动过程alert日志变化</p>
<pre><code>SQL> shutdown immediate
Database closed.
Database dismounted.
ORACLE instance shut down.
## 主库停用很多进程
## 备库断开连接
SQL> alter database recover managed standby database cancel;
Database altered.
## 取消同步,这里有说错误日志的路径
2021-10-09T11:28:15.043582+08:00
alter database recover managed standby database cancel
2021-10-09T11:28:15.083866+08:00
MRP0 (PID:5347): MRP0: Background Media Recovery cancelled with status 16037
2021-10-09T11:28:15.084122+08:00
Errors in file /u01/app/oracle/diag/rdbms/dgsorcl/dgsorcl/trace/dgsorcl_mrp0_5347.trc:
ORA-16037: user requested cancel of managed recovery operation
MRP0 (PID:5347): Managed Standby Recovery not using Real Time Apply
Recovery interrupted!
stopping change tracking
2021-10-09T11:28:15.484313+08:00
Errors in file /u01/app/oracle/diag/rdbms/dgsorcl/dgsorcl/trace/dgsorcl_mrp0_5347.trc:
ORA-16037: user requested cancel of managed recovery operation
2021-10-09T11:28:15.491674+08:00
Background Media Recovery process shutdown (dgsorcl)
2021-10-09T11:28:16.083531+08:00
Managed Standby Recovery Canceled (dgsorcl)
Completed: alter database recover managed standby database cancel
## 主备断开后,备库还是在不停尝试连接
Fatal NI connect error 12514, connecting to:
(DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP)(HOST=172.18.31.202)(PORT=1521)))(CONNECT_DATA=(SERVICE_NAME=dgorcl)(INSTANCE_NAME=dgorcl)(CID=(PROGRAM=oracle)(HOST=standby)(USER=oracle))))
VERSION INFORMATION:
TNS for Linux: Version 19.0.0.0.0 - Production
TCP/IP NT Protocol Adapter for Linux: Version 19.0.0.0.0 - Production
Version 19.3.0.0.0
Time: 09-OCT-2021 11:39:26
Tracing not turned on.
Tns error struct:
ns main err code: 12564
TNS-12564: TNS:connection refused
ns secondary err code: 0
nt main err code: 0
nt secondary err code: 0
nt OS err code: 0
SQL> shutdown immediate
Database closed.
Database dismounted.
ORACLE instance shut down.
## 关闭很多进程和broker</code></pre>
</li>
<li>查看启动过程alert日志变化<br />
不写了,写多也不好</li>
</ul>
<h3>查看日志的切换</h3>
<ul>
<li>查看当前日志变化过程
<pre><code>## 还没归档日志124,当前日志是125,要归档的下一个日志是125,上一归档的日志是123
SQL> archive log list
Database log mode Archive Mode
Automatic archival Enabled
Archive destination /u01/app/oracle/product/19.3.0/dbhome_1/dbs/arch
Oldest online log sequence 123
Next log sequence to archive 125
Current log sequence 125
## 主库:线程1升级到序号125
Thread 1 advanced to log sequence 125 (LGWR switch)
Current log# 2 seq# 125 mem# 0: /u01/app/oracle/oradata/DGORCL/redo02.log
2021-10-09T10:55:52.041672+08:00
## 备库:124已经完成同步,等待125归档
2021-10-09T10:55:57.140524+08:00
rfs (PID:18819): Archived Log entry 22 added for B-1084564721.T-1.S-124 ID 0x6b458e6f LAD:2
rfs (PID:18819): No SRLs created for T-1
2021-10-09T10:55:57.170782+08:00
rfs (PID:18819): Opened log for T-1.S-125 dbid 1799699567 branch 1084564721
2021-10-09T10:55:57.916887+08:00
MRP0 (PID:5347): Media Recovery Log /u01/app/oracle/product/19.3.0/dbhome_1/dbs/arch1_124_1084564721.arc
Resize operation completed for file# 3, old size 952320K, new size 962560K
2021-10-09T10:56:09.760769+08:00
MRP0 (PID:5347): Media Recovery Waiting for T-1.S-125 (in transit)
## 执行日志切换观察归档文件的变化
SQL> alter system switch logfile;</code></pre></li>
</ul>
<h3>备库以读写方式打开做测试</h3>
<ul>
<li>开启闪回</li>
<li>切换日志</li>
<li>禁止同步</li>
<li>设置闪回点</li>
<li>取消同步</li>
<li>打开为读写</li>
<li>测试</li>
<li>闪回</li>
<li>开户同步</li>
</ul>
<h3>日志传送状态监控</h3>
<ul>
<li>主要通过如下四个视图查看
<pre><code>v$archive_dest_status -- 归档路径状态
v$archived_log
v$log_history
v$log</code></pre></li>
</ul>
<h2>DG相应的视图</h2>
<ul>
<li>日常巡检脚本</li>
</ul>
<pre><code>select name,open_mode,database_role,db_unique_name from v$database;--数据库属性
select dest_name,status,destination,target,archiver,schedule,reopen_secs,delay_mins,max_connections,valid_type,valid_role,db_unique_name from v$archive_dest;--归档描述
select thread#,group#,sequence#,status,archived,first_change#,first_time,last_change#,last_time from v$standby_log;--哪个线程在流动
SELECT thread#,low_sequence#,high_sequence# from v$archive_gap;--是否有脑裂
SELECT NAME,VALUE,datum_time FROM v$dataguard_stats;--日志应用信息
select * from v$archive_dest_status;--各路径信息
SELECT * FROM v$dataguard_status;--DG错误信息日志
SELECT * FROM v$managed_standby;--进程状态</code></pre>
<h2>角色互换</h2>
<h3>Switchover</h3>
<h4>主库端检验数据库可切换状态</h4>
<ul>
<li>
<p>to standby 表示可以正常切换,sessions active 表示会话活动</p>
<pre><code>## 主库上查询的
SQL> select switchover_status from v$database;
SWITCHOVER_STATUS
TO STANDBY</code></pre>
</li>
<li>
<p>主库执行切换为备库,如果如下命令成功执行,主库即转为备库</p>
<pre><code>## 如果switchover_status=to standby
alter database commit to switchover to physical standby;
## 如果switchover_status=session active
alter database commit to switchover to physical standby with session shutdown;
## 可以查看实例的状态
SQL> select status from v$instance;
执行前:STATUS=OPEN
执行后旧主数据库是关闭状态,视频中的旧主数据库没关是started状态。</code></pre>
</li>
<li>
<p>重启旧的主数据库,查看身份(这样重启有点问题,参考DG启动与关闭)</p>
<pre><code>shutdown immediate;
startup mount;
SQL> select database_role from v$database;
DATABASE_ROLE
PHYSICAL STANDBY</code></pre>
</li>
<li>
<p>在备库验证可切换状态</p>
<pre><code>SQL> select switchover_status from v$database;
SWITCHOVER_STATUS
TO PRIMARY</code></pre>
</li>
<li>
<p>备库执行切换为主库,如果如下命令成功执行,备库即转为主库</p>
<pre><code>## 如果switchover_status=to primary
alter database commit to switchover to primary;
## 如果switchover_status=session active
alter database commit to switchover to primary with session shutdown;
## 执行的alert日志
Online log /u01/app/oracle/oradata/DGSORCL/redo01.log: Thread 1 Group 1 was previously cleared
Online log /u01/app/oracle/oradata/DGSORCL/redo02.log: Thread 1 Group 2 was previously cleared
Online log /u01/app/oracle/oradata/DGSORCL/redo03.log: Thread 1 Group 3 was previously cleared
Standby became primary SCN: 4561293
Switchover: Complete - Database mounted as primary
TMI: kcv_commit_to_so_to_primary Switchover from physical END 2021-10-09 20:21:59.658249
Completed: alter database commit to switchover to primary
## 查看执行成功后角色,备库切换为主库并没有关闭
SQL> select database_role from v$database;
DATABASE_ROLE
PRIMARY
## 当前状态是mount,重新打开也是读写模式的
SQL> select status from v$instance;
STATUS
MOUNTED
## 这里考虑先把数据库改为读写模式再open
SQL> alter database open;
Database altered.
SQL> select open_mode from v$database;
OPEN_MODE
READ WRITE</code></pre>
</li>
<li>
<p>旧主库现备库,启动接收日志,启动数据库,模式自动就变成read only(如果在前面参照DG启动部分启动,就不需要这步了)</p>
<pre><code>SQL> alter database recover managed standby database disconnect from session;
Database altered.
SQL> alter database open;
Database altered.
SQL> select open_mode from v$database;
OPEN_MODE
READ ONLY</code></pre>
</li>
<li>
<p>后面就是启动问题、测试切换效果问题了</p>
<pre><code>alter system switch logfile;</code></pre>
</li>
<li>安装时没有在备库设置log_archive_dest_2这个参数,导致切换失败</li>
</ul>
<h4>第二次操作的步骤</h4>
<ul>
<li>
<p>旧主库(现备库)</p>
<pre><code>SQL> select switchover_status from v$database;
SWITCHOVER_STATUS
TO STANDBY
SQL> alter database commit to switchover to physical standby;
Database altered.
SQL> select status from v$instance;##切换完后这个实例是不可用状态
ERROR at line 1:
ORA-01034: ORACLE not available
Process ID: 5006
Session ID: 34 Serial number: 35680</code></pre>
</li>
<li>
<p>切换结束后打开数据库</p>
<pre><code>SQL> startup nomount
ORACLE instance started.
Total System Global Area 1660940992 bytes
Fixed Size 8897216 bytes
Variable Size 973078528 bytes
Database Buffers 671088640 bytes
Redo Buffers 7876608 bytes
SQL> select status from V$instance;
STATUS
STARTED
SQL> alter database mount standby database;
Database altered.
SQL> alter database open;
Database altered.
SQL> select status from v$instance;
STATUS
OPEN
SQL> select open_mode from v$database;
OPEN_MODE
READ ONLY
## alter database mount standby database;\alter database open;做了如下的事情。
## log_archive_dest_2被清空是个问题,当再次切换其为主库就有问题,需要重新设置这个值
SQL> ALTER SYSTEM SET LOG_ARCHIVE_DEST_2='SERVICE=dgorcl NOAFFIRM ASYNC VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE) DB_UNIQUE_NAME=dgorcl';
ALTER SYSTEM SET log_archive_dest_2='' SCOPE=BOTH;
2021-10-10T09:34:13.885915+08:00
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE DISCONNECT NODELAY
## 如上open的操作,需要重新应用日志
SQL> alter database recover managed standby database cancel;
Database altered.
SQL> alter database recover managed standby database disconnect from session;
Database altered.</code></pre>
</li>
<li>旧备库(现主库)
<pre><code>SQL> select switchover_status from v$database;
SWITCHOVER_STATUS
TO PRIMARY
SQL> alter database commit to switchover to primary;
Database altered.
SQL> select database_role from v$database;
DATABASE_ROLE
PRIMARY
SQL> select open_mode from v$database;##切换完成后数据库是mount状态
OPEN_MODE
MOUNTED
SQL> alter database open read write;##如果直接打开是保持原来的read only模式
Database altered.
SQL> select open_mode from v$database;
OPEN_MODE
READ WRITE</code></pre></li>
</ul>
<h3>failover</h3>
<ul>
<li>
<p>操作之前需要拍快照,因为failover后DG结构就破坏了</p>
</li>
<li>
<p>要考虑的问题<br />
arvhive gap<br />
归档日志<br />
REDO 文件</p>
</li>
<li>
<p>记住主备库的归档路径和重做路径(我装的环境的是一样的),平时也要记录</p>
<pre><code>/u01/app/oracle/product/19.3.0/dbhome_1/dbs/arch1_172_1084564721.arc
/u01/app/oracle/oradata/DGORCL/redo02.log</code></pre>
</li>
<li>
<p>查看最新归档\存放路径</p>
<pre><code>SELECT * FROM v$archived_log ;
SHOW parameter log_archive_dest_1</code></pre>
</li>
<li>
<p>查看最新重做日志\路径</p>
<pre><code>SELECT group#,sequence#,archived,status,first_time FROM v$LOG;
SQL> SELECT * FROM v$logfile ;
/u01/app/oracle/oradata/DGORCL
[oracle@master DGORCL]$ pwd
/u01/app/oracle/oradata/DGORCL
[oracle@master DGORCL]$ ll redo0*
-rw-r-----. 1 oracle oinstall 209715712 Oct 10 14:07 redo01.log
-rw-r-----. 1 oracle oinstall 209715712 Oct 10 14:12 redo02.log
-rw-r-----. 1 oracle oinstall 209715712 Oct 10 12:19 redo03.log</code></pre>
</li>
<li>
<p>备库上应用到的归档</p>
<pre><code>SELECT sequence#,NAME,creator,registrar,archived,applied FROM v$archived_log WHERE dest_id=1 ORDER BY sequence#;</code></pre>
</li>
<li>
<p>现在主库已经不能用了,查看备库已经应用的归档168,归档路径存放的归档</p>
<pre><code>SQL> SELECT * FROM (SELECT sequence#,NAME,creator,registrar,archived,applied FROM v$archived_log WHERE dest_id=1 ORDER BY sequence# DESC) WHERE ROWNUM=1 ;
SEQUENCE#NAMECREATOR REGISTR ARC APPLIED
168 /u01/app/oracle/product/19.3.0/dbhome_1/dbs/arch1_168_1084564721.arc FGRD RFS YES YES
## 对比主备库/u01/app/oracle/product/19.3.0/dbhome_1/dbs 路径下归档日志区别
/u01/app/oracle/product/19.3.0/dbhome_1/dbs/arch1_180_1084564721.arc</code></pre>
</li>
<li>
<p>恢复重做日志</p>
<pre><code>## 恢复主库路径下最新的日志
[oracle@master DGORCL]$ ll re*
-rw-r-----. 1 oracle oinstall 209715712 Oct 10 14:53 redo01.log
-rw-r-----. 1 oracle oinstall 209715712 Oct 10 14:42 redo02.log
-rw-r-----. 1 oracle oinstall 209715712 Oct 10 14:43 redo03.log</code></pre>
</li>
<li>复制操作是:复制文件,注册归档,恢复重做
<pre><code>[oracle@master DGORCL]$ scp -r redo01.log oracle@172.18.31.203:/u01/failover/
[oracle@master dbs]$ cd /u01/app/oracle/product/19.3.0/dbhome_1/dbs
[oracle@master dbs]$ scp -r *[167,169]* oracle@172.18.31.203:/u01/failover/
## 由于归档日志已经传过来了,如下注册没用的,开启应用即可。
[oracle@master dbs]$ scp -r arch1_169_1084564721.arc oracle@172.18.31.203:/u01/failover/
[oracle@master dbs]$ scp -r arch1_170_1084564721.arc oracle@172.18.31.203:/u01/failover/
[oracle@master dbs]$ scp -r arch1_171_1084564721.arc oracle@172.18.31.203:/u01/failover/
[oracle@master dbs]$ scp -r arch1_172_1084564721.arc oracle@172.18.31.203:/u01/failover/
[oracle@master dbs]$ scp -r arch1_173_1084564721.arc oracle@172.18.31.203:/u01/failover/
[oracle@master dbs]$ scp -r arch1_174_1084564721.arc oracle@172.18.31.203:/u01/failover/
[oracle@master dbs]$ scp -r arch1_175_1084564721.arc oracle@172.18.31.203:/u01/failover/
[oracle@master dbs]$ scp -r arch1_176_1084564721.arc oracle@172.18.31.203:/u01/failover/
[oracle@master dbs]$ scp -r arch1_177_1084564721.arc oracle@172.18.31.203:/u01/failover/
[oracle@master dbs]$ scp -r arch1_178_1084564721.arc oracle@172.18.31.203:/u01/failover/
[oracle@master dbs]$ scp -r arch1_179_1084564721.arc oracle@172.18.31.203:/u01/failover/
[oracle@master dbs]$ scp -r arch1_180_1084564721.arc oracle@172.18.31.203:/u01/failover/
alter database register logfile '/arch/testdb/1_89_882457974.arc';
alter database recover managed standby database disconnect from session;
## 恢复重做日志
/u01/failover/redo01.log
alter database recover managed standby database cancel;
recover standby database until cancel;
输入路径/u01/failover/redo01.log
alter database activate standby database;
alter database open;
select name,open_mode,database_role,db_unique_name from v$database;--数据库属性</code></pre></li>
</ul>
<h4>实际操作</h4>
<ul>
<li>对比找出归档差异,注册</li>
<li>恢复最新的重做日志</li>
<li>激活数据库,打开数据库</li>
</ul>
<h2>DG broker</h2>
<ul>
<li>只学了一个命令,并且很难执行成功
DGMGRL> switchover to 备库(需要变成主库的库)</li>
</ul>
<h2>常见DG的故障</h2>
<h3>归档路径不正确</h3>
<ul>
<li>如下归档是否指向备用数据库
<pre><code>ALTER SYSTEM SET LOG_ARCHIVE_DEST_2='SERVICE=dgorcl NOAFFIRM ASYNC VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE) DB_UNIQUE_NAME=dgorcl';
ALTER SYSTEM SET LOG_ARCHIVE_DEST_STATE_2=ENABLE;</code></pre></li>
</ul>
<h3>日志无法传递到standby</h3>
<ul>
<li>standby服务器异常</li>
<li>standby相应的进程工作异常</li>
<li>standby监听服务器异常(未启动,实例未注册进去)</li>
<li>两库之间的网络有问题</li>
</ul>
<h3>归档日志的GAP</h3>
<h4>手动修复GAP</h4>
<ul>
<li>出现GAP后的修复方式(基于SCN号的增量补偿)
<pre><code>--手动解决日志缺失
--在standby库中使用以下查询语句确定缺失问题
SQL> select * from v$archive_gap;
1 85 89</code></pre></li>
<li>复制,传送失败的归档日志到standby库,手动注册
<pre><code>## 远程复制示例
scp oracle@172.18.31.202:/u01/archive/archive1/initdgsorcl.ora /u01/data/data1/initdgsorcl.ora </code></pre></li>
<li>在主库中执行以下SQL语句,查找在主库中的归档REDO日志
<pre><code>SQL> select name from v$archived_log where thread#=1 and dest_id=1 and sequence#>=84 and sequence#<90;
--拷贝日志到物理standby数据库,执行alter database register logfile SQL语句注册它们
SQL> alter database register logfile '/arch/testdb/1_85_882457974.arc';
SQL> alter database register logfile '/arch/testdb/1_86_882457974.arc';
SQL> alter database register logfile '/arch/testdb/1_87_882457974.arc';
SQL> alter database register logfile '/arch/testdb/1_88_882457974.arc';
SQL> alter database register logfile '/arch/testdb/1_89_882457974.arc';</code></pre></li>
</ul>
<h4>GAP无法自动处理</h4>
<ul>
<li>GAP都会自动处理,如果没有自动处理,可能是归档日志有问题</li>
<li>
<p>归档日志可能是太大了,添加参数使其压缩(主库上)</p>
<pre><code>alter system set log_archive_dest_2='service=dgsorcl async compression=enable';
alter system set log_archive_dest_state_2=enable;</code></pre>
</li>
<li>添加连接进程,并行执行传送
<pre><code>alter system set log_archive_dest_2='service=dgsorcl max_connections=3';
alter system set log_archive_dest_state_2=enable;</code></pre></li>
</ul>