Version history
Version | Revision date | Change type | Effective date |
1.0 | 2019/4/15 | | |
1.1 | 2019/7/30 | 1. Updated the failure count description 2. Updated the description of the start/stop sequence | 2019/7/30 |
Overview of SAP high availability environment maintenance
This document applies to SAP applications or SAP HANA ECS instances deployed on a SUSE HAE 12 cluster that require operations and maintenance work. It describes the pre- and post-processing steps for scenarios such as upgrading or downgrading ECS instance specifications, upgrading the SAP application or database, routine maintenance of the primary or secondary node, and an unexpected failover of a node.
For an SAP system managed by SUSE HAE, performing maintenance tasks on a cluster node may require stopping the resources running on that node, moving those resources, or shutting down or restarting the node. You may also need to temporarily take over control of resources in the cluster.
The scenarios below use SAP HANA high availability as an example; maintenance operations for SAP application high availability are similar.
This document is not a substitute for the standard SUSE and SAP installation and administration documentation. For more guidance on maintaining a high availability environment, see the official SUSE and SAP documentation.
For the SUSE HAE operation guide, see:
For the SAP HANA HSR configuration guide, see:
Common SAP HANA high availability maintenance scenarios
SUSE Pacemaker provides several options for different maintenance needs:
Setting the cluster to maintenance mode
Use the global cluster property maintenance-mode to put all resources into maintenance state at once. The cluster stops monitoring these resources.
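For example, cluster-wide maintenance mode is toggled with the maintenance-mode property (the same commands are used in sections 3.2 and 3.4 below):
# crm configure property maintenance-mode=true
# crm configure property maintenance-mode=false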
Setting a node to maintenance mode
Puts all resources running on the specified node into maintenance state at once. The cluster stops monitoring these resources.
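For example, node maintenance mode is toggled with the following commands (used again in section 5 below; the node name comes from this document's example cluster):
# crm node maintenance saphana-01
# crm node ready saphana-01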
Setting a node to standby mode
A node in standby mode can no longer run resources. All resources running on that node are moved away or stopped (if no other node is eligible to run them). In addition, all monitoring operations on that node are stopped (except for operations with role="Stopped").
Use this option if you need to stop one node in the cluster while continuing to provide the services running on the other node.
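For example, standby mode is toggled with the following commands (used again in section 4 below):
# crm node standby saphana-02
# crm node online saphana-02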
Setting a resource to maintenance mode
When a resource is set to this mode, no monitoring operations are triggered for it. Use this option if you need to manually adjust the service managed by the resource and do not want the cluster to run any monitoring operations on the resource during that time.
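For example, per-resource maintenance mode is toggled with the following commands (used again in sections 3.2 and 3.4 below):
# crm resource maintenance rsc_SAPHana_HDB true
# crm resource maintenance rsc_SAPHana_HDB false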
Setting a resource to unmanaged mode
Use the is-managed attribute to temporarily "release" a resource so that it is no longer managed by the cluster stack. This means you can manually adjust the service managed by the resource. However, the cluster continues to monitor the resource and to report any failures. If you want the cluster to also stop monitoring the resource, use the per-resource maintenance mode instead.
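A minimal sketch of toggling is-managed with the crmsh resource subcommands (these commands are not shown elsewhere in this document, so verify them against your crmsh version):
# crm resource unmanage rsc_SAPHana_HDB
# crm resource manage rsc_SAPHana_HDB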
1. Handling a primary node failure
When the primary node fails, HAE triggers a failover and the original secondary node, Node B, is promoted to primary. However, the original primary node, Node A, still carries the primary role, so after Node A is repaired and before the Pacemaker service is started on it, you must manually reconfigure HANA HSR and register Node A as the secondary.
In this example, the initial primary node is saphana-01 and the secondary node is saphana-02.
1.1 Check the normal status of SUSE HAE
Log on to either node and run the crm status command to check the normal status of HAE.
# crm status
Stack: corosync
Current DC: saphana-01 (version 1.1.16-4.8-77ea74d) - partition with quorum
Last updated: Mon Apr 15 14:33:22 2019
Last change: Mon Apr 15 14:33:19 2019 by root via crm_attribute on saphana-01
2 nodes configured
6 resources configured
Online: [ saphana-01 saphana-02 ]
Full list of resources:
rsc_sbd (stonith:external/sbd): Started saphana-01
rsc_vip (ocf::heartbeat:IPaddr2): Started saphana-01
Master/Slave Set: msl_SAPHana_HDB [rsc_SAPHana_HDB]
Masters: [ saphana-01 ]
Slaves: [ saphana-02 ]
Clone Set: cln_SAPHanaTopology_HDB [rsc_SAPHanaTopology_HDB]
Started: [ saphana-01 saphana-02 ]
After the primary node fails, HAE automatically promotes the secondary node to primary.
# crm status
Stack: corosync
Current DC: saphana-02 (version 1.1.16-4.8-77ea74d) - partition with quorum
Last updated: Mon Apr 15 14:40:43 2019
Last change: Mon Apr 15 14:40:41 2019 by root via crm_attribute on saphana-02
2 nodes configured
6 resources configured
Online: [ saphana-02 ]
OFFLINE: [ saphana-01 ]
Full list of resources:
rsc_sbd (stonith:external/sbd): Started saphana-02
rsc_vip (ocf::heartbeat:IPaddr2): Started saphana-02
Master/Slave Set: msl_SAPHana_HDB [rsc_SAPHana_HDB]
Masters: [ saphana-02 ]
Stopped: [ saphana-01 ]
Clone Set: cln_SAPHanaTopology_HDB [rsc_SAPHanaTopology_HDB]
Started: [ saphana-02 ]
Stopped: [ saphana-01 ]
1.2 Re-register HSR and repair the original primary node
Before reconfiguring HSR, be sure to confirm which node is primary and which is secondary. A wrong configuration may cause data to be overwritten or even lost.
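One way to confirm the current replication role of each node before re-registering is to query the replication state with the SAP HANA instance user (a suggested check, not part of the original procedure; the output format depends on your HANA revision):
h01adm@saphana-01:/usr/sap/H01/HDB00> hdbnsutil -sr_state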
Log on to the original primary node with the SAP HANA instance user and configure HSR.
h01adm@saphana-01:/usr/sap/H01/HDB00> hdbnsutil -sr_register --remoteHost=saphana-02 --remoteInstance=00 --replicationMode=syncmem --name=saphana-01 --operationMode=logreplay
adding site ...
checking for inactive nameserver ...
nameserver saphana-01:30001 not responding.
collecting information ...
updating local ini files ...
done.
1.3 Check the SBD status
If the status of a node slot is not "clear", set it to "clear".
# sbd -d /dev/vdc list
0 saphana-01 reset saphana-02
1 saphana-02 reset saphana-01
# sbd -d /dev/vdc message saphana-01 clear
# sbd -d /dev/vdc message saphana-02 clear
# sbd -d /dev/vdc list
0 saphana-01 clear saphana-01
1 saphana-02 clear saphana-01
1.4 Start the Pacemaker service
Run the following command to start the Pacemaker service. After the Pacemaker service starts, HAE automatically brings up the SAP HANA service.
# systemctl start pacemaker
At this point, the original secondary node has become the new primary node, and the current HAE status is as follows:
# crm status
Stack: corosync
Current DC: saphana-02 (version 1.1.16-4.8-77ea74d) - partition with quorum
Last updated: Mon Apr 15 15:10:58 2019
Last change: Mon Apr 15 15:09:56 2019 by root via crm_attribute on saphana-02
2 nodes configured
6 resources configured
Online: [ saphana-01 saphana-02 ]
Full list of resources:
rsc_sbd (stonith:external/sbd): Started saphana-02
rsc_vip (ocf::heartbeat:IPaddr2): Started saphana-02
Master/Slave Set: msl_SAPHana_HDB [rsc_SAPHana_HDB]
Masters: [ saphana-02 ]
Slaves: [ saphana-01 ]
Clone Set: cln_SAPHanaTopology_HDB [rsc_SAPHanaTopology_HDB]
Started: [ saphana-01 saphana-02 ]
1.5 Check the SAP HANA HSR status
Check with the built-in SAP HANA Python script
Log on to the primary node with the SAP HANA instance user and make sure that the Replication Status of all SAP HANA processes is ACTIVE.
saphana-02:~ # su - h01adm
h01adm@saphana-02:/usr/sap/H01/HDB00> cdpy
h01adm@saphana-02:/usr/sap/H01/HDB00/exe/python_support> python systemReplicationStatus.py
| Database | Host       | Port  | Service Name | Volume ID | Site ID | Site Name  | Secondary  | Secondary | Secondary | Secondary  | Secondary     | Replication | Replication | Replication    |
|          |            |       |              |           |         |            | Host       | Port      | Site ID   | Site Name  | Active Status | Mode        | Status      | Status Details |
| -------- | ---------- | ----- | ------------ | --------- | ------- | ---------- | ---------- | --------- | --------- | ---------- | ------------- | ----------- | ----------- | -------------- |
| SYSTEMDB | saphana-02 | 30001 | nameserver   | 1         | 2       | saphana-02 | saphana-01 | 30001     | 1         | saphana-01 | YES           | SYNCMEM     | ACTIVE      |                |
| H01      | saphana-02 | 30007 | xsengine     | 3         | 2       | saphana-02 | saphana-01 | 30007     | 1         | saphana-01 | YES           | SYNCMEM     | ACTIVE      |                |
| H01      | saphana-02 | 30003 | indexserver  | 2         | 2       | saphana-02 | saphana-01 | 30003     | 1         | saphana-01 | YES           | SYNCMEM     | ACTIVE      |                |

status system replication site "1": ACTIVE
overall system replication status: ACTIVE

Local System Replication State
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

mode: PRIMARY
site id: 2
site name: saphana-02
Check the replication status with the SAPHanaSR tools provided by SUSE, and make sure that the sync_state of the secondary node is SOK.
saphana-02:~ # SAPHanaSR-showAttr
Global cib-time
--------------------------------
global Mon Apr 15 15:17:12 2019

Hosts      clone_state lpa_h01_lpt node_state op_mode   remoteHost roles                            site       srmode  standby sync_state version                vhost
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
saphana-01 DEMOTED     30          online     logreplay saphana-02 4:S:master1:master:worker:master saphana-01 syncmem         SOK        2.00.020.00.1500920972 saphana-01
saphana-02 PROMOTED    1555312632  online     logreplay saphana-01 4:P:master1:master:worker:master saphana-02 syncmem off     PRIM       2.00.020.00.1500920972 saphana-02
1.6 (Optional) Reset the failure count
If a resource fails, it is restarted automatically, but every failure increases the resource's failure count. If migration-threshold is set for the resource, the node is no longer allowed to run the resource once the failure count reaches the threshold, so the failure count has to be cleaned up manually.
The command to clean up the failure count is as follows:
# crm resource cleanup [resource name] [node]
For example, after the rsc_SAPHana_HDB resource on node saphana-01 has been repaired, clean up this monitoring alert with the following command:
# crm resource cleanup rsc_SAPHana_HDB saphana-01
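If you want to inspect the failure count before cleaning it up, crmsh also offers a failcount subcommand (shown here as a suggestion; verify the syntax against your crmsh version):
# crm resource failcount rsc_SAPHana_HDB show saphana-01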
2. Handling a secondary node failure
When the secondary node fails, the primary node is not affected and no failover is triggered. After the secondary node recovers and the Pacemaker service is started, the SAP HANA service is brought up automatically; the primary/secondary roles do not change and no manual intervention is needed.
In this example, the initial primary node is saphana-02 and the secondary node is saphana-01.
2.1 Check the normal status of HAE
Log on to either node and run the crm status command to check the normal status of HAE.
# crm status
Stack: corosync
Current DC: saphana-02 (version 1.1.16-4.8-77ea74d) - partition with quorum
Last updated: Mon Apr 15 15:34:52 2019
Last change: Mon Apr 15 15:33:50 2019 by root via crm_attribute on saphana-02
2 nodes configured
6 resources configured
Online: [ saphana-01 saphana-02 ]
Full list of resources:
rsc_sbd (stonith:external/sbd): Started saphana-02
rsc_vip (ocf::heartbeat:IPaddr2): Started saphana-02
Master/Slave Set: msl_SAPHana_HDB [rsc_SAPHana_HDB]
Masters: [ saphana-02 ]
Slaves: [ saphana-01 ]
Clone Set: cln_SAPHanaTopology_HDB [rsc_SAPHanaTopology_HDB]
Started: [ saphana-01 saphana-02 ]
2.2 Restart Pacemaker
After the secondary node has recovered, first check the SBD status (see 1.3), then restart Pacemaker.
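For example, re-run the SBD check from section 1.3 (using the /dev/vdc device from this document's example) and clear the slot of the recovered node if it is not "clear" before starting Pacemaker:
# sbd -d /dev/vdc list
# sbd -d /dev/vdc message saphana-01 clear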
# systemctl start pacemaker
HSR keeps the original primary/secondary relationship, and the current HAE status is as follows:
# crm status
Stack: corosync
Current DC: saphana-02 (version 1.1.16-4.8-77ea74d) - partition with quorum
Last updated: Mon Apr 15 15:43:28 2019
Last change: Mon Apr 15 15:43:25 2019 by root via crm_attribute on saphana-01
2 nodes configured
6 resources configured
Online: [ saphana-01 saphana-02 ]
Full list of resources:
rsc_sbd (stonith:external/sbd): Started saphana-02
rsc_vip (ocf::heartbeat:IPaddr2): Started saphana-02
Master/Slave Set: msl_SAPHana_HDB [rsc_SAPHana_HDB]
Masters: [ saphana-02 ]
Slaves: [ saphana-01 ]
Clone Set: cln_SAPHanaTopology_HDB [rsc_SAPHanaTopology_HDB]
Started: [ saphana-01 saphana-02 ]
2.3 Check the SAP HANA HSR status
For details, see 1.5 Check the SAP HANA HSR status.
2.4 (Optional) Reset the failure count
For details, see 1.6 (Optional) Reset the failure count.
3. Shutting down both the primary and secondary nodes for maintenance
Set the cluster to maintenance mode, then shut down the secondary node followed by the primary node.
In this example, the initial primary node is saphana-02 and the secondary node is saphana-01.
3.1 Check the normal status of HAE
Log on to either node and run the crm status command to check the normal status of HAE.
# crm status
Stack: corosync
Current DC: saphana-02 (version 1.1.16-4.8-77ea74d) - partition with quorum
Last updated: Mon Apr 15 15:34:52 2019
Last change: Mon Apr 15 15:33:50 2019 by root via crm_attribute on saphana-02
2 nodes configured
6 resources configured
Online: [ saphana-01 saphana-02 ]
Full list of resources:
rsc_sbd (stonith:external/sbd): Started saphana-02
rsc_vip (ocf::heartbeat:IPaddr2): Started saphana-02
Master/Slave Set: msl_SAPHana_HDB [rsc_SAPHana_HDB]
Masters: [ saphana-02 ]
Slaves: [ saphana-01 ]
Clone Set: cln_SAPHanaTopology_HDB [rsc_SAPHanaTopology_HDB]
Started: [ saphana-01 saphana-02 ]
3.2 Set the cluster and the master/slave resource sets to maintenance mode
Log on to the primary node and set the cluster to maintenance mode.
# crm configure property maintenance-mode=true
Set the master/slave resource sets to maintenance mode. In this example, the resource sets are rsc_SAPHana_HDB and rsc_SAPHanaTopology_HDB.
# crm resource maintenance rsc_SAPHana_HDB true
Performing update of 'maintenance' on 'msl_SAPHana_HDB', the parent of 'rsc_SAPHana_HDB'
Set 'msl_SAPHana_HDB' option: id=msl_SAPHana_HDB-meta_attributes-maintenance name=maintenance=true
# crm resource maintenance rsc_SAPHanaTopology_HDB true
Performing update of 'maintenance' on 'cln_SAPHanaTopology_HDB', the parent of 'rsc_SAPHanaTopology_HDB'
Set 'cln_SAPHanaTopology_HDB' option: id=cln_SAPHanaTopology_HDB-meta_attributes-maintenance name=maintenance=true
The current HAE status is as follows:
# crm status
Stack: corosync
Current DC: saphana-02 (version 1.1.16-4.8-77ea74d) - partition with quorum
Last updated: Mon Apr 15 16:02:13 2019
Last change: Mon Apr 15 16:02:11 2019 by root via crm_resource on saphana-02
2 nodes configured
6 resources configured
*** Resource management is DISABLED ***
The cluster will not attempt to start, stop or recover services
Online: [ saphana-01 saphana-02 ]
Full list of resources:
rsc_sbd (stonith:external/sbd): Started saphana-02 (unmanaged)
rsc_vip (ocf::heartbeat:IPaddr2): Started saphana-02 (unmanaged)
Master/Slave Set: msl_SAPHana_HDB [rsc_SAPHana_HDB] (unmanaged)
rsc_SAPHana_HDB (ocf::suse:SAPHana): Slave saphana-01 (unmanaged)
rsc_SAPHana_HDB (ocf::suse:SAPHana): Master saphana-02 (unmanaged)
Clone Set: cln_SAPHanaTopology_HDB [rsc_SAPHanaTopology_HDB] (unmanaged)
rsc_SAPHanaTopology_HDB (ocf::suse:SAPHanaTopology): Started saphana-01 (unmanaged)
rsc_SAPHanaTopology_HDB (ocf::suse:SAPHanaTopology): Started saphana-02 (unmanaged)
3.3 Stop the SAP HANA services on the secondary and primary nodes and shut down the ECS instances
Log on to both nodes with the SAP HANA instance user. Stop the SAP HANA service on the secondary node first, then on the primary node.
saphana-01:~ # su - h01adm
h01adm@saphana-01:/usr/sap/H01/HDB00> HDB stop
hdbdaemon will wait maximal 300 seconds for NewDB services finishing.
Stopping instance using: /usr/sap/H01/SYS/exe/hdb/sapcontrol -prot NI_HTTP -nr 00 -function Stop 400
15.04.2019 16:46:42
Stop
OK
Waiting for stopped instance using: /usr/sap/H01/SYS/exe/hdb/sapcontrol -prot NI_HTTP -nr 00 -function WaitforStopped 600 2
15.04.2019 16:46:54
WaitforStopped
OK
hdbdaemon is stopped.
saphana-02:~ # su - h01adm
h01adm@saphana-02:/usr/sap/H01/HDB00> HDB stop
hdbdaemon will wait maximal 300 seconds for NewDB services finishing.
Stopping instance using: /usr/sap/H01/SYS/exe/hdb/sapcontrol -prot NI_HTTP -nr 00 -function Stop 400
15.04.2019 16:47:05
Stop
OK
Waiting for stopped instance using: /usr/sap/H01/SYS/exe/hdb/sapcontrol -prot NI_HTTP -nr 00 -function WaitforStopped 600 2
15.04.2019 16:47:35
WaitforStopped
OK
hdbdaemon is stopped.
3.4 Start the primary and secondary SAP HANA ECS instances and restore the cluster and resource sets to normal mode
Log on to the primary node and then the secondary node, and run the following command to start the Pacemaker service.
# systemctl start pacemaker
Restore the cluster and the resource sets to normal mode.
# crm configure property maintenance-mode=false
# crm resource maintenance rsc_SAPHana_HDB false
Performing update of 'maintenance' on 'msl_SAPHana_HDB', the parent of 'rsc_SAPHana_HDB'
Set 'msl_SAPHana_HDB' option: id=msl_SAPHana_HDB-meta_attributes-maintenance name=maintenance=false
# crm resource maintenance rsc_SAPHanaTopology_HDB false
Performing update of 'maintenance' on 'cln_SAPHanaTopology_HDB', the parent of 'rsc_SAPHanaTopology_HDB'
Set 'cln_SAPHanaTopology_HDB' option: id=cln_SAPHanaTopology_HDB-meta_attributes-maintenance name=maintenance=false
The SUSE HAE cluster automatically brings up the SAP HANA services on the primary and secondary nodes and keeps the original primary/secondary roles unchanged.
The current HAE status is as follows:
# crm status
Stack: corosync
Current DC: saphana-01 (version 1.1.16-4.8-77ea74d) - partition with quorum
Last updated: Mon Apr 15 16:56:49 2019
Last change: Mon Apr 15 16:56:43 2019 by root via crm_attribute on saphana-01
2 nodes configured
6 resources configured
Online: [ saphana-01 saphana-02 ]
Full list of resources:
rsc_sbd (stonith:external/sbd): Started saphana-01
rsc_vip (ocf::heartbeat:IPaddr2): Started saphana-02
Master/Slave Set: msl_SAPHana_HDB [rsc_SAPHana_HDB]
Masters: [ saphana-02 ]
Slaves: [ saphana-01 ]
Clone Set: cln_SAPHanaTopology_HDB [rsc_SAPHanaTopology_HDB]
Started: [ saphana-01 saphana-02 ]
3.5 Check the SAP HANA HSR status
For details, see 1.5 Check the SAP HANA HSR status.
3.6 (Optional) Reset the failure count
For details, see 1.6 (Optional) Reset the failure count.
4. Shutting down the primary node for maintenance
The primary node is set to standby mode, and the cluster triggers a failover.
In this example, the initial primary node is saphana-02 and the secondary node is saphana-01.
4.1 Check the normal status of SUSE HAE
Log on to either node and run the crm status command to check the normal status of HAE.
# crm status
Stack: corosync
Current DC: saphana-02 (version 1.1.16-4.8-77ea74d) - partition with quorum
Last updated: Mon Apr 15 15:34:52 2019
Last change: Mon Apr 15 15:33:50 2019 by root via crm_attribute on saphana-02
2 nodes configured
6 resources configured
Online: [ saphana-01 saphana-02 ]
Full list of resources:
rsc_sbd (stonith:external/sbd): Started saphana-02
rsc_vip (ocf::heartbeat:IPaddr2): Started saphana-02
Master/Slave Set: msl_SAPHana_HDB [rsc_SAPHana_HDB]
Masters: [ saphana-02 ]
Slaves: [ saphana-01 ]
Clone Set: cln_SAPHanaTopology_HDB [rsc_SAPHanaTopology_HDB]
Started: [ saphana-01 saphana-02 ]
4.2 Set the primary node to standby mode
In this example, the primary node is saphana-02.
# crm node standby saphana-02
The cluster stops SAP HANA on node saphana-02 and promotes SAP HANA on node saphana-01 to primary.
The current HAE status is as follows:
# crm status
Stack: corosync
Current DC: saphana-01 (version 1.1.16-4.8-77ea74d) - partition with quorum
Last updated: Mon Apr 15 17:07:56 2019
Last change: Mon Apr 15 17:07:38 2019 by root via crm_attribute on saphana-02
2 nodes configured
6 resources configured
Node saphana-02: standby
Online: [ saphana-01 ]
Full list of resources:
rsc_sbd (stonith:external/sbd): Started saphana-01
rsc_vip (ocf::heartbeat:IPaddr2): Started saphana-01
Clone Set: msl_SAPHana_HDB [rsc_SAPHana_HDB] (promotable)
Masters: [ saphana-01 ]
Stopped: [ saphana-02 ]
Clone Set: cln_SAPHanaTopology_HDB [rsc_SAPHanaTopology_HDB]
Started: [ saphana-01 ]
Stopped: [ saphana-02 ]
4.3 Shut down the ECS instance and perform the maintenance tasks
4.4 Start the maintained node and re-register HSR
Log on to the node that was under maintenance with the SAP HANA instance user (as in 1.2) and register HSR.
# hdbnsutil -sr_register --remoteHost=saphana-01 --remoteInstance=00 --replicationMode=syncmem --name=saphana-02 --operationMode=logreplay
4.5 Start the Pacemaker service and bring the standby node back online
# systemctl start pacemaker
# crm node online saphana-02
The SUSE HAE cluster automatically brings up the SAP HANA service on the secondary node.
The current HAE status is as follows:
# crm status
Stack: corosync
Current DC: saphana-02 (version 1.1.16-4.8-77ea74d) - partition with quorum
Last updated: Mon Apr 15 18:02:33 2019
Last change: Mon Apr 15 18:01:31 2019 by root via crm_attribute on saphana-02
2 nodes configured
6 resources configured
Online: [ saphana-01 saphana-02 ]
Full list of resources:
rsc_sbd (stonith:external/sbd): Started saphana-01
rsc_vip (ocf::heartbeat:IPaddr2): Started saphana-01
Master/Slave Set: msl_SAPHana_HDB [rsc_SAPHana_HDB]
Masters: [ saphana-01 ]
Slaves: [ saphana-02 ]
Clone Set: cln_SAPHanaTopology_HDB [rsc_SAPHanaTopology_HDB]
Started: [ saphana-01 saphana-02 ]
4.6 Check the SAP HANA HSR status
For details, see 1.5 Check the SAP HANA HSR status.
4.7 (Optional) Reset the failure count
For details, see 1.6 (Optional) Reset the failure count.
5. Shutting down the secondary node for maintenance
Set the secondary node to maintenance mode.
In this example, the initial primary node is saphana-02 and the secondary node is saphana-01.
5.1 Check the normal status of HAE
Log on to either node and run the crm status command to check the normal status of HAE.
# crm status
Stack: corosync
Current DC: saphana-02 (version 1.1.16-4.8-77ea74d) - partition with quorum
Last updated: Mon Apr 15 15:34:52 2019
Last change: Mon Apr 15 15:33:50 2019 by root via crm_attribute on saphana-02
2 nodes configured
6 resources configured
Online: [ saphana-01 saphana-02 ]
Full list of resources:
rsc_sbd (stonith:external/sbd): Started saphana-02
rsc_vip (ocf::heartbeat:IPaddr2): Started saphana-02
Master/Slave Set: msl_SAPHana_HDB [rsc_SAPHana_HDB]
Masters: [ saphana-02 ]
Slaves: [ saphana-01 ]
Clone Set: cln_SAPHanaTopology_HDB [rsc_SAPHanaTopology_HDB]
Started: [ saphana-01 saphana-02 ]
5.2 Set the secondary node to maintenance mode
# crm node maintenance saphana-01
After the setting takes effect, the HAE status is as follows:
Stack: corosync
Current DC: saphana-02 (version 1.1.16-4.8-77ea74d) - partition with quorum
Last updated: Mon Apr 15 18:18:10 2019
Last change: Mon Apr 15 18:17:49 2019 by root via crm_attribute on saphana-01
2 nodes configured
6 resources configured
Node saphana-01: maintenance
Online: [ saphana-02 ]
Full list of resources:
rsc_sbd (stonith:external/sbd): Started saphana-02
rsc_vip (ocf::heartbeat:IPaddr2): Started saphana-02
Master/Slave Set: msl_SAPHana_HDB [rsc_SAPHana_HDB]
rsc_SAPHana_HDB (ocf::suse:SAPHana): Slave saphana-01 (unmanaged)
Masters: [ saphana-02 ]
Clone Set: cln_SAPHanaTopology_HDB [rsc_SAPHanaTopology_HDB]
rsc_SAPHanaTopology_HDB (ocf::suse:SAPHanaTopology): Started saphana-01 (unmanaged)
Started: [ saphana-02 ]
5.3 Stop the SAP HANA service on the secondary node and shut down the ECS instance for maintenance
Log on to the secondary node with the SAP HANA instance user and stop the SAP HANA service.
saphana-01:~ # su - h01adm
h01adm@saphana-01:/usr/sap/H01/HDB00> HDB stop
hdbdaemon will wait maximal 300 seconds for NewDB services finishing.
Stopping instance using: /usr/sap/H01/SYS/exe/hdb/sapcontrol -prot NI_HTTP -nr 00 -function Stop 400
15.04.2019 16:47:05
Stop
OK
Waiting for stopped instance using: /usr/sap/H01/SYS/exe/hdb/sapcontrol -prot NI_HTTP -nr 00 -function WaitforStopped 600 2
15.04.2019 16:47:35
WaitforStopped
OK
hdbdaemon is stopped.
5.4 Start the secondary SAP HANA ECS instance and restore the node to normal mode
Log on to the secondary node and start the Pacemaker service.
# systemctl start pacemaker
Restore the secondary node to normal mode.
saphana-02:~ # crm node ready saphana-01
The SUSE HAE cluster automatically brings up the SAP HANA service on the secondary node and keeps the original primary/secondary roles unchanged.
The current HAE status is as follows:
# crm status
Stack: corosync
Current DC: saphana-02 (version 1.1.16-4.8-77ea74d) - partition with quorum
Last updated: Mon Apr 15 18:02:33 2019
Last change: Mon Apr 15 18:01:31 2019 by root via crm_attribute on saphana-02
2 nodes configured
6 resources configured
Online: [ saphana-01 saphana-02 ]
Full list of resources:
rsc_sbd (stonith:external/sbd): Started saphana-02
rsc_vip (ocf::heartbeat:IPaddr2): Started saphana-02
Master/Slave Set: msl_SAPHana_HDB [rsc_SAPHana_HDB]
Masters: [ saphana-02 ]
Slaves: [ saphana-01 ]
Clone Set: cln_SAPHanaTopology_HDB [rsc_SAPHanaTopology_HDB]
Started: [ saphana-01 saphana-02 ]
5.5 Check the SAP HANA HSR status
For details, see 1.5 Check the SAP HANA HSR status.
5.6 (Optional) Reset the failure count
For details, see 1.6 (Optional) Reset the failure count.