I recently came across a script from Oracle Support that checks ASM storage to see whether the loss of a disk or a cell can be tolerated. The script reports a PASS or FAIL status depending on whether a rebalance can complete after the loss of a disk or a cell (12 disks) in the Exadata storage.

A cell server failure is unlikely, but it can happen. I faced this almost a year ago in a production environment on a Half Rack (7 cell nodes), when we lost a cell node for 2-3 days. We had enough free space for the rebalance to complete, so we could tolerate the lost cell node, and none of the databases had any downtime.

 

The Oracle Support note is listed below; the script is attached to it.

 

Understanding ASM Capacity and Reservation of Free Space in Exadata (Doc ID 1551288.1)

 

Some key points

 

  • Keep the FREE_MB value in the ASM lsdg output above the Cell Required Mirror Free MB or Disk Required Mirror Free MB at all times, so that the corresponding usable space figure never goes negative.
  • Disk Required Mirror Free MB is the amount of space that should be reserved for disk failure coverage.
  • One Cell Required Mirror Free MB is the amount of space to reserve for single-cell failure coverage, regardless of redundancy type.
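As a rough illustration of the first point, the check can be sketched in Python. The function name is hypothetical, and the `>=` comparison is an assumption about how the reserve is meant to be read; it is not taken from the Oracle script. The sketch also assumes the cell reserve is at least as large as the disk reserve.

```python
def failure_coverage(free_mb: int, disk_required_mb: int, cell_required_mb: int) -> str:
    """Return the largest failure the disk group has reserved space for.

    Assumption: coverage holds when FREE_MB is at least the required
    mirror free MB for that failure type.
    """
    if free_mb >= cell_required_mb:
        return "cell"   # enough free space to rebalance after losing a cell
    if free_mb >= disk_required_mb:
        return "disk"   # enough for a disk loss only
    return "none"       # below both reserves


# DATA_EXAD figures from the script output in this post (first run)
print(failure_coverage(19_394_532, 1_636_279, 27_021_312))  # disk
```

With the DATA_EXAD figures from the second run (28,095,768 MB free) the same call returns `cell`.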

 

 

The lsdg and script output below shows the BEFORE and AFTER results, including the output to expect when a disk group fails a check.

 

BEFORE


State    Type    Rebal  Sector  Block       AU  Total_MB   Free_MB  Req_mir_free_MB  Usable_file_MB  Offline_disks  Voting_files  Name
MOUNTED  NORMAL  N         512   4096  4194304  54042624  20132832         18014208         1059312              0             N  DATA_EXAD/
MOUNTED  NORMAL  N         512   4096  4194304    894240    636448           298080          169184              0             Y  DBFS_DG/
MOUNTED  NORMAL  N         512   4096  4194304  13512384   7173544          4504128         1334708              0             N  RECO_EXAD/

 

AFTER


State    Type    Rebal  Sector  Block       AU  Total_MB   Free_MB  Req_mir_free_MB  Usable_file_MB  Offline_disks  Voting_files  Name
MOUNTED  NORMAL  N         512   4096  4194304  54042624  28095768         18014208         5040780              0             N  DATA_EXAD/
MOUNTED  NORMAL  N         512   4096  4194304    894240    636448           298080          169184              0             Y  DBFS_DG/
MOUNTED  NORMAL  N         512   4096  4194304  13512384   7173208          4504128         1334540              0             N  RECO_EXAD/
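For a NORMAL redundancy disk group, the Usable_file_MB column in lsdg is the free space left after the mirror reserve, halved to account for mirroring. A small Python sketch can parse the AFTER listing and confirm that relationship. The `parse_lsdg` helper is hypothetical, and it assumes whitespace-separated columns with no spaces inside any value.

```python
LSDG_AFTER = """\
State    Type    Rebal  Sector  Block       AU  Total_MB   Free_MB  Req_mir_free_MB  Usable_file_MB  Offline_disks  Voting_files  Name
MOUNTED  NORMAL  N         512   4096  4194304  54042624  28095768         18014208         5040780              0             N  DATA_EXAD/
MOUNTED  NORMAL  N         512   4096  4194304    894240    636448           298080          169184              0             Y  DBFS_DG/
MOUNTED  NORMAL  N         512   4096  4194304  13512384   7173208          4504128         1334540              0             N  RECO_EXAD/
"""


def parse_lsdg(text: str) -> list[dict[str, str]]:
    """Turn lsdg output into one dict per disk group, keyed by column name."""
    lines = [line for line in text.splitlines() if line.strip()]
    header = lines[0].split()
    return [dict(zip(header, line.split())) for line in lines[1:]]


def usable_file_mb_normal(free_mb: int, req_mir_free_mb: int) -> int:
    """Usable space for NORMAL redundancy: two copies of every extent."""
    return (free_mb - req_mir_free_mb) // 2


for row in parse_lsdg(LSDG_AFTER):
    derived = usable_file_mb_normal(int(row["Free_MB"]), int(row["Req_mir_free_MB"]))
    # Compare the derived value with the column that lsdg printed
    print(row["Name"], derived, derived == int(row["Usable_file_MB"]))
```

All three disk groups in the AFTER listing match the derived value.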

 

BEFORE


SQL> @check_asm.sql

------ DISK and CELL Failure Diskgroup Space Reserve Requirements  ------

This procedure determines how much space you need to survive a DISK or CELL failure. It also shows the usable space available when reserving space for disk or cell failure.

Please see MOS note 1551288.1 for more information.

.  .  .

Description of Derived Values:

Cell Required Mirror Free MB     : Free MB needed to permit successful rebalance after losing largest CELL in a DG

2 Cell Required Mirror Free MB   : Free MB needed to permit successful rebalance after losing 2 largest CELLs in high redundancy DG

Disk Required Mirror Free MB     : Free MB needed to rebalance after loss of single disk (normal redundancy DG) or double disk (high redundancy DG)

Disk Failure Usable File MB      : Usable space available after reserving space for disk failure (1 disk in normal or 2 disks in high redundancy DG) and accounting for mirroring

Cell Failure Usable File MB      : Usable space available after reserving space for 1 cell failure and accounting for mirroring

2 Cell Failure Usable File MB    : Usable space available after reserving space for 2 cell failures and accounting for mirroring in a HIGH redundancy DG

.  .  .

ASM Version: 11.2.0.2  - WARNING DISK FAILURE COVERAGE ESTIMATES HAVE NOT BEEN VERIFIED ON 11.2.0.2 !

.  .  .

-------------------------------------------------------------------------

DG Name:                                 DATA_EXAD

DG Type:                                    NORMAL

Num Disks:                                      36

Disk Size MB:                            1,501,184

.  .  .

DG Total MB:                            54,042,624

DG Used MB:                             34,648,092

DG Free MB:                             19,394,532

.  .  .

Cell Required Mirror Free MB:           27,021,312

.  .  .

Disk Required Mirror Free MB:            1,636,279

.  .  .

Disk Failure Usable File MB:             8,879,126

Cell Failure Usable File MB:            -3,813,390

.  .  .

Enough Free Space to Rebalance after loss of ONE disk: PASS

Enough Free Space to Rebalance after loss of ONE cell: FAIL

-------------------------------------------------------------------------

DG Name:                                   DBFS_DG

DG Type:                                    NORMAL

Num Disks:                                      30

Disk Size MB:                               29,808

.  .  .

DG Total MB:                               894,240

DG Used MB:                                257,792

DG Free MB:                                636,448

.  .  .

Cell Required Mirror Free MB:              447,120

.  .  .

Disk Required Mirror Free MB:               53,600

.  .  .

Disk Failure Usable File MB:               291,424

Cell Failure Usable File MB:                94,664

.  .  .

Enough Free Space to Rebalance after loss of ONE disk: PASS

Enough Free Space to Rebalance after loss of ONE cell: PASS

-------------------------------------------------------------------------

DG Name:                                 RECO_EXAD

DG Type:                                    NORMAL

Num Disks:                                      36

Disk Size MB:                              375,344

.  .  .

DG Total MB:                            13,512,384

DG Used MB:                              7,484,712

DG Free MB:                              6,027,672

.  .  .

Cell Required Mirror Free MB:            6,756,192

.  .  .

Disk Required Mirror Free MB:              423,896

.  .  .

Disk Failure Usable File MB:             2,801,888

Cell Failure Usable File MB:              -364,260

.  .  .

Enough Free Space to Rebalance after loss of ONE disk: PASS

Enough Free Space to Rebalance after loss of ONE cell: FAIL

.  .  .

Script completed.

 

PL/SQL procedure successfully completed.

 

SQL> exit
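The "Usable File MB" values in the report above can be reproduced from DG Free MB and the matching "Required Mirror Free MB" figure. The Python sketch below is inferred from the numbers printed in this post, not taken from the script's source: the function name is hypothetical, the rounding (floor division) is an assumption that happens to match the output, and the divisor of 3 for HIGH redundancy is an assumption that no disk group in this post exercises.

```python
# Number of copies ASM keeps of each extent, per redundancy type.
# NORMAL = 2 matches the output in this post; HIGH = 3 is an assumption.
MIRROR_COPIES = {"NORMAL": 2, "HIGH": 3}


def failure_usable_file_mb(free_mb: int, required_mirror_free_mb: int,
                           dg_type: str = "NORMAL") -> int:
    """Space left for files after holding back the reserve, net of mirroring."""
    return (free_mb - required_mirror_free_mb) // MIRROR_COPIES[dg_type]


# DATA_EXAD, first run: DG Free MB = 19,394,532
print(failure_usable_file_mb(19_394_532, 1_636_279))   # 8879126  (disk failure)
print(failure_usable_file_mb(19_394_532, 27_021_312))  # -3813390 (cell failure)
```

A negative result lines up with a FAIL verdict in the report: the disk group holds less free space than the reserve needed for that failure type.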

AFTER


 

 

SQL> @check_asm.sql

------ DISK and CELL Failure Diskgroup Space Reserve Requirements  ------

This procedure determines how much space you need to survive a DISK or CELL failure. It also shows the usable space available when reserving space for disk or cell failure.

Please see MOS note 1551288.1 for more information.

.  .  .

Description of Derived Values:

Cell Required Mirror Free MB     : Free MB needed to permit successful rebalance after losing largest CELL in a DG

2 Cell Required Mirror Free MB   : Free MB needed to permit successful rebalance after losing 2 largest CELLs in high redundancy DG

Disk Required Mirror Free MB     : Free MB needed to rebalance after loss of single disk (normal redundancy DG) or double disk (high redundancy DG)

Disk Failure Usable File MB      : Usable space available after reserving space for disk failure (1 disk in normal or 2 disks in high redundancy DG) and accounting for mirroring

Cell Failure Usable File MB      : Usable space available after reserving space for 1 cell failure and accounting for mirroring

2 Cell Failure Usable File MB    : Usable space available after reserving space for 2 cell failures and accounting for mirroring in a HIGH redundancy DG

.  .  .

ASM Version: 11.2.0.2  - WARNING DISK FAILURE COVERAGE ESTIMATES HAVE NOT BEEN VERIFIED ON 11.2.0.2 !

.  .  .

-------------------------------------------------------------------------

DG Name:                                 DATA_EXAD

DG Type:                                    NORMAL

Num Disks:                                      36

Disk Size MB:                            1,501,184

.  .  .

DG Total MB:                            54,042,624

DG Used MB:                             25,946,856

DG Free MB:                             28,095,768

.  .  .

Cell Required Mirror Free MB:           27,021,312

.  .  .

Disk Required Mirror Free MB:            1,636,279

.  .  .

Disk Failure Usable File MB:            13,229,744

Cell Failure Usable File MB:               537,228

.  .  .

Enough Free Space to Rebalance after loss of ONE disk: PASS

Enough Free Space to Rebalance after loss of ONE cell: PASS

-------------------------------------------------------------------------

DG Name:                                   DBFS_DG

DG Type:                                    NORMAL

Num Disks:                                      30

Disk Size MB:                               29,808

.  .  .

DG Total MB:                               894,240

DG Used MB:                                257,792

DG Free MB:                                636,448

.  .  .

Cell Required Mirror Free MB:              447,120

.  .  .

Disk Required Mirror Free MB:               53,600

.  .  .

Disk Failure Usable File MB:               291,424

Cell Failure Usable File MB:                94,664

.  .  .

Enough Free Space to Rebalance after loss of ONE disk: PASS

Enough Free Space to Rebalance after loss of ONE cell: PASS

-------------------------------------------------------------------------

DG Name:                                 RECO_EXAD

DG Type:                                    NORMAL

Num Disks:                                      36

Disk Size MB:                              375,344

.  .  .

DG Total MB:                            13,512,384

DG Used MB:                              6,339,176

DG Free MB:                              7,173,208

.  .  .

Cell Required Mirror Free MB:            6,756,192

.  .  .

Disk Required Mirror Free MB:              423,896

.  .  .

Disk Failure Usable File MB:             3,374,656

Cell Failure Usable File MB:               208,508

.  .  .

Enough Free Space to Rebalance after loss of ONE disk: PASS

Enough Free Space to Rebalance after loss of ONE cell: PASS

.  .  .

Script completed.

 

PL/SQL procedure successfully completed.
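To see why DATA_EXAD and RECO_EXAD moved from FAIL to PASS on the cell check, it helps to compare the raw shortfall in the first run with the space freed between the two runs. The Python sketch below uses only figures from the two reports above; the dictionary layout and the function name are hypothetical. All values are raw MB, that is, they include the mirror copies.

```python
# Figures copied from the two script runs in this post (raw MB)
BEFORE = {
    "DATA_EXAD": {"used": 34_648_092, "free": 19_394_532},
    "RECO_EXAD": {"used": 7_484_712, "free": 6_027_672},
}
AFTER = {
    "DATA_EXAD": {"used": 25_946_856, "free": 28_095_768},
    "RECO_EXAD": {"used": 6_339_176, "free": 7_173_208},
}
CELL_REQUIRED = {"DATA_EXAD": 27_021_312, "RECO_EXAD": 6_756_192}


def cleanup_summary(dg: str) -> tuple[int, int, int]:
    """Return (shortfall before, space freed, margin after) for one disk group."""
    shortfall = CELL_REQUIRED[dg] - BEFORE[dg]["free"]  # how far below the reserve
    freed = BEFORE[dg]["used"] - AFTER[dg]["used"]      # space released in between
    margin = AFTER[dg]["free"] - CELL_REQUIRED[dg]      # headroom above the reserve
    return shortfall, freed, margin


for dg in BEFORE:
    print(dg, cleanup_summary(dg))
# DATA_EXAD (7626780, 8701236, 1074456)
# RECO_EXAD (728520, 1145536, 417016)
```

In both disk groups the space freed exceeds the shortfall, and the difference is the margin now held above the cell reserve.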

 

