I worked on a production Exadata X2 full-stack upgrade this past weekend. This is an area Viscosity specializes in. We can upgrade the complete Exadata stack, and even ZFS Storage, so that all components are up to date with the latest firmware and patches. We suggest doing this at least every 6 months, or once a year if you can get the downtime.

Everything went very well in our upgrade, and I thought we were done and ready to go home after upgrading the firmware on the Cisco switch, which was one of the last items to upgrade. We applied the firmware update from a remote location and realized there was a problem when the switch did not come back online after a few minutes; it did not respond to ping or telnet either. We had no choice at this point but to get to the datacenter ASAP and check the switch in person, since the switch has no ILOM or KVM for remote management.

All Exadata systems contain a management network running on at least 1000BASE-T (also known as IEEE 802.3ab), a standard for gigabit (1Gb) Ethernet. All servers (database and storage) have a connection to this network, along with each server's Integrated Lights Out Management (ILOM) card. The InfiniBand switches also have connections to the management network, all of which is routed through the Cisco Catalyst 4948 switch, which has 48 Ethernet ports. On the X2-2, the KVM switch console and the PDUs also have optional connectivity to the management network and are connected to the switch.

 

When I got to the datacenter and looked at the Cisco switch from the back, I noticed an amber Status LED, which indicates a system fault. I also turned off the switch, unplugged both power cords from the front, and waited about a minute, as per Oracle Support's instructions for power cycling the switch. That did not bring the switch online either; telnet and ping still returned nothing.

[Image: Cisco 4948 status lights]

 

 

 

Then we thought of connecting to the switch's console port with a serial cable to see if we could get a command prompt and check the switch status. You cannot simply run an RJ45 Ethernet cable from your laptop to the switch's console port. You need a USB-to-serial adapter with an RJ45 console cable, as shown below.

[Image: USB-to-serial cable]

 

 

Once you connect to the console port, a serial device appears on your computer. Check its properties to find its COM port number and note it, since you will use it to connect directly to the switch.

Steps to connect to the Cisco switch from your computer:

  1. Find out which COM port your Prolific USB-to-serial cable is using on your laptop.
  2. Connect to the console port of the Cisco switch, not the management port. The image below shows the correct port circled; the port below it, labeled MGT, did not work.
  3. Use a terminal program such as PuTTY, enter your COM port, and set the baud rate. 9600 is usually the connection speed.
  4. Press Enter and you should see the prompt.
  5. Type enable and enter the switch password to get a privileged prompt.
  6. Issue the boot command to start up the switch.
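The terminal settings from the steps above can be written down once so they are not retyped under pressure. This is a minimal sketch in Python, assuming PuTTY is installed and using its `-serial` and `-sercfg` command-line options. The function name `putty_serial_command` and the port `COM3` are hypothetical placeholders, and the 8 data bits, no parity, 1 stop bit, no flow control settings are the customary Cisco console defaults rather than values recorded during this upgrade.

```python
from __future__ import annotations


def putty_serial_command(com_port: str, baud: int = 9600) -> list[str]:
    """Build a PuTTY command line for a serial console session.

    com_port is the port noted from the serial device properties.
    """
    if not com_port:
        raise ValueError("a COM port name is required")
    if baud <= 0:
        raise ValueError("baud rate must be positive")
    # Assumed framing: 8 data bits, no parity, 1 stop bit, no flow control.
    serial_config = f"{baud},8,n,1,N"
    return ["putty", "-serial", com_port, "-sercfg", serial_config]


if __name__ == "__main__":
    # COM3 is a placeholder; use the port number your laptop reports.
    print(" ".join(putty_serial_command("COM3")))
    # prints: putty -serial COM3 -sercfg 9600,8,n,1,N
```

The function only builds the command; it does not open the port, so it is safe to run on a machine with no cable attached.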

 

[Image: Cisco 4948 with the console port circled]

 

Once the Cisco switch came online, we were able to proceed and finally complete the upgrade with the new firmware. The new firmware allows ssh connectivity, and you can optionally disable telnet if your security guidelines do not allow it.

 


Recently I came across a script from Oracle Support that checks ASM storage to see whether a disk or cell failure can be tolerated. The script reports a PASS or FAIL status depending on whether rebalancing can occur after the loss of a disk or a cell (12 disks) in the Exadata storage. A cell server failure is unlikely, but it can happen. I personally faced this almost a year ago in a production environment with a Half Rack (7 cell nodes), when we lost a cell node for 2-3 days. We had enough free space for rebalancing to occur, so we could tolerate the lost cell node and there was no downtime for any of the databases.

 

The Oracle Support note is listed below and the script is also attached to it.

 

Understanding ASM Capacity and Reservation of Free Space in Exadata (Doc ID 1551288.1)

 

Some key points

 

  • Keep the FREE_MB column in the ASM lsdg output above the Cell Required Mirror Free MB or Disk Required Mirror Free MB at all times; the difference between them should never go negative.
  • Disk Required Mirror Free MB is the amount of space that should be reserved for disk failure coverage.
  • One Cell Required Mirror Free MB is the amount of space to reserve for single cell failure coverage, regardless of redundancy type.
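The relationship between free space and the reserve can be checked by hand. The sketch below is in Python and is not the Oracle script: the formula, (free MB minus required mirror free MB) divided by the number of mirror copies, is inferred from the DATA_EXAD figures in the script output in this post, and the function names, the `mirror_copies=2` default for NORMAL redundancy, and the "free at or above the reserve means PASS" rule are assumptions.

```python
def usable_file_mb(free_mb: int, required_mirror_free_mb: int,
                   mirror_copies: int = 2) -> int:
    """Space left for new files after holding back the failure reserve.

    mirror_copies=2 is an assumption for a NORMAL redundancy disk group.
    """
    return (free_mb - required_mirror_free_mb) // mirror_copies


def rebalance_status(free_mb: int, required_mirror_free_mb: int) -> str:
    """Return PASS when free space covers the reserve (assumed rule)."""
    return "PASS" if free_mb >= required_mirror_free_mb else "FAIL"


if __name__ == "__main__":
    cell_reserve = 27_021_312  # Cell Required Mirror Free MB, DATA_EXAD
    for label, free in (("BEFORE", 19_394_532), ("AFTER", 28_095_768)):
        print(label,
              usable_file_mb(free, cell_reserve),
              rebalance_status(free, cell_reserve))
    # BEFORE -3813390 FAIL
    # AFTER 537228 PASS
```

Both results match the Cell Failure Usable File MB values and the ONE cell PASS/FAIL lines the script reports for DATA_EXAD. How the real script rounds odd values is not something this sketch claims to reproduce.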

 

 

The lsdg and script output below shows BEFORE and AFTER results, including what the script reports when a failure cannot be tolerated.

 

BEFORE


State    Type    Rebal  Sector  Block       AU  Total_MB   Free_MB  Req_mir_free_MB  Usable_file_MB  Offline_disks  Voting_files  Name
MOUNTED  NORMAL  N         512   4096  4194304  54042624  20132832         18014208         1059312              0             N  DATA_EXAD/
MOUNTED  NORMAL  N         512   4096  4194304    894240    636448           298080          169184              0             Y  DBFS_DG/
MOUNTED  NORMAL  N         512   4096  4194304  13512384   7173544          4504128         1334708              0             N  RECO_EXAD/

 

AFTER


State    Type    Rebal  Sector  Block       AU  Total_MB   Free_MB  Req_mir_free_MB  Usable_file_MB  Offline_disks  Voting_files  Name
MOUNTED  NORMAL  N         512   4096  4194304  54042624  28095768         18014208         5040780              0             N  DATA_EXAD/
MOUNTED  NORMAL  N         512   4096  4194304    894240    636448           298080          169184              0             Y  DBFS_DG/
MOUNTED  NORMAL  N         512   4096  4194304  13512384   7173208          4504128         1334540              0             N  RECO_EXAD/
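Rows like the ones above are easy to check programmatically. This is a small Python sketch that splits one lsdg row into named columns and reports how far Free_MB sits above Req_mir_free_MB. The column list is copied from the header shown in this post; the names `parse_lsdg_row` and `reserve_headroom_mb` are hypothetical, and the parser assumes a 13-column row with no spaces inside any value.

```python
LSDG_COLUMNS = [
    "State", "Type", "Rebal", "Sector", "Block", "AU", "Total_MB",
    "Free_MB", "Req_mir_free_MB", "Usable_file_MB", "Offline_disks",
    "Voting_files", "Name",
]
NUMERIC_COLUMNS = [
    "Sector", "Block", "AU", "Total_MB", "Free_MB",
    "Req_mir_free_MB", "Usable_file_MB", "Offline_disks",
]


def parse_lsdg_row(line: str) -> dict:
    """Split one whitespace-separated lsdg data row into named fields."""
    values = line.split()
    if len(values) != len(LSDG_COLUMNS):
        raise ValueError(f"expected {len(LSDG_COLUMNS)} columns, got {len(values)}")
    row = dict(zip(LSDG_COLUMNS, values))
    for key in NUMERIC_COLUMNS:
        row[key] = int(row[key])
    row["Name"] = row["Name"].rstrip("/")  # lsdg prints a trailing slash
    return row


def reserve_headroom_mb(row: dict) -> int:
    """Free_MB minus Req_mir_free_MB; negative means the reserve is eaten into."""
    return row["Free_MB"] - row["Req_mir_free_MB"]


if __name__ == "__main__":
    before = ("MOUNTED  NORMAL  N  512  4096  4194304  54042624  20132832  "
              "18014208  1059312  0  N  DATA_EXAD/")
    row = parse_lsdg_row(before)
    print(row["Name"], reserve_headroom_mb(row))
    # prints: DATA_EXAD 2118624
```

For the rows in this post, the headroom is exactly twice the Usable_file_MB column, which is consistent with two mirror copies in a NORMAL redundancy disk group.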

 

BEFORE


SQL> @check_asm.sql
------ DISK and CELL Failure Diskgroup Space Reserve Requirements  ------
This procedure determines how much space you need to survive a DISK or CELL
failure. It also shows the usable space
available when reserving space for disk or cell failure.
Please see MOS note 1551288.1 for more information.
.  .  .
Description of Derived Values:
Cell Required Mirror Free MB     : Free MB needed to permit successful rebalance
    after losing largest CELL in a DG
2 Cell Required Mirror Free MB   : Free MB needed to permit successful rebalance
    after losing 2 largest CELLs in high redundancy DG
Disk Required Mirror Free MB     : Free MB needed to rebalance after loss of
    single disk (normal redundancy DG) or double disk (high redundancy DG)
Disk Failure Usable File MB      : Usable space available after reserving space
    for disk failure (1 disk in normal or 2 disks in high redundancy DG) and
    accounting for mirroring
Cell Failure Usable File MB      : Usable space available after reserving space
    for 1 cell failure and accounting for mirroring
2 Cell Failure Usable File MB    : Usable space available after reserving space
    for 2 cell failures and accounting for mirroring in a HIGH redundancy DG
.  .  .
ASM Version: 11.2.0.2  - WARNING DISK FAILURE COVERAGE ESTIMATES HAVE NOT BEEN
VERIFIED ON 11.2.0.2 !
.  .  .
-------------------------------------------------------------------------
DG Name:                                 DATA_EXAD
DG Type:                                    NORMAL
Num Disks:                                      36
Disk Size MB:                            1,501,184
.  .  .
DG Total MB:                            54,042,624
DG Used MB:                             34,648,092
DG Free MB:                             19,394,532
.  .  .
Cell Required Mirror Free MB:           27,021,312
.  .  .
Disk Required Mirror Free MB:            1,636,279
.  .  .
Disk Failure Usable File MB:             8,879,126
Cell Failure Usable File MB:            -3,813,390
.  .  .
Enough Free Space to Rebalance after loss of ONE disk: PASS
Enough Free Space to Rebalance after loss of ONE cell: FAIL
-------------------------------------------------------------------------
DG Name:                                   DBFS_DG
DG Type:                                    NORMAL
Num Disks:                                      30
Disk Size MB:                               29,808
.  .  .
DG Total MB:                               894,240
DG Used MB:                                257,792
DG Free MB:                                636,448
.  .  .
Cell Required Mirror Free MB:              447,120
.  .  .
Disk Required Mirror Free MB:               53,600
.  .  .
Disk Failure Usable File MB:               291,424
Cell Failure Usable File MB:                94,664
.  .  .
Enough Free Space to Rebalance after loss of ONE disk: PASS
Enough Free Space to Rebalance after loss of ONE cell: PASS
-------------------------------------------------------------------------
DG Name:                                 RECO_EXAD
DG Type:                                    NORMAL
Num Disks:                                      36
Disk Size MB:                              375,344
.  .  .
DG Total MB:                            13,512,384
DG Used MB:                              7,484,712
DG Free MB:                              6,027,672
.  .  .
Cell Required Mirror Free MB:            6,756,192
.  .  .
Disk Required Mirror Free MB:              423,896
.  .  .
Disk Failure Usable File MB:             2,801,888
Cell Failure Usable File MB:              -364,260
.  .  .
Enough Free Space to Rebalance after loss of ONE disk: PASS
Enough Free Space to Rebalance after loss of ONE cell: FAIL
.  .  .
Script completed.

PL/SQL procedure successfully completed.

SQL> exit

AFTER


 

 

SQL> @check_asm.sql
------ DISK and CELL Failure Diskgroup Space Reserve Requirements  ------
This procedure determines how much space you need to survive a DISK or CELL
failure. It also shows the usable space
available when reserving space for disk or cell failure.
Please see MOS note 1551288.1 for more information.
.  .  .
Description of Derived Values:
Cell Required Mirror Free MB     : Free MB needed to permit successful rebalance
    after losing largest CELL in a DG
2 Cell Required Mirror Free MB   : Free MB needed to permit successful rebalance
    after losing 2 largest CELLs in high redundancy DG
Disk Required Mirror Free MB     : Free MB needed to rebalance after loss of
    single disk (normal redundancy DG) or double disk (high redundancy DG)
Disk Failure Usable File MB      : Usable space available after reserving space
    for disk failure (1 disk in normal or 2 disks in high redundancy DG) and
    accounting for mirroring
Cell Failure Usable File MB      : Usable space available after reserving space
    for 1 cell failure and accounting for mirroring
2 Cell Failure Usable File MB    : Usable space available after reserving space
    for 2 cell failures and accounting for mirroring in a HIGH redundancy DG
.  .  .
ASM Version: 11.2.0.2  - WARNING DISK FAILURE COVERAGE ESTIMATES HAVE NOT BEEN
VERIFIED ON 11.2.0.2 !
.  .  .
-------------------------------------------------------------------------
DG Name:                                 DATA_EXAD
DG Type:                                    NORMAL
Num Disks:                                      36
Disk Size MB:                            1,501,184
.  .  .
DG Total MB:                            54,042,624
DG Used MB:                             25,946,856
DG Free MB:                             28,095,768
.  .  .
Cell Required Mirror Free MB:           27,021,312
.  .  .
Disk Required Mirror Free MB:            1,636,279
.  .  .
Disk Failure Usable File MB:            13,229,744
Cell Failure Usable File MB:               537,228
.  .  .
Enough Free Space to Rebalance after loss of ONE disk: PASS
Enough Free Space to Rebalance after loss of ONE cell: PASS
-------------------------------------------------------------------------
DG Name:                                   DBFS_DG
DG Type:                                    NORMAL
Num Disks:                                      30
Disk Size MB:                               29,808
.  .  .
DG Total MB:                               894,240
DG Used MB:                                257,792
DG Free MB:                                636,448
.  .  .
Cell Required Mirror Free MB:              447,120
.  .  .
Disk Required Mirror Free MB:               53,600
.  .  .
Disk Failure Usable File MB:               291,424
Cell Failure Usable File MB:                94,664
.  .  .
Enough Free Space to Rebalance after loss of ONE disk: PASS
Enough Free Space to Rebalance after loss of ONE cell: PASS
-------------------------------------------------------------------------
DG Name:                                 RECO_EXAD
DG Type:                                    NORMAL
Num Disks:                                      36
Disk Size MB:                              375,344
.  .  .
DG Total MB:                            13,512,384
DG Used MB:                              6,339,176
DG Free MB:                              7,173,208
.  .  .
Cell Required Mirror Free MB:            6,756,192
.  .  .
Disk Required Mirror Free MB:              423,896
.  .  .
Disk Failure Usable File MB:             3,374,656
Cell Failure Usable File MB:               208,508
.  .  .
Enough Free Space to Rebalance after loss of ONE disk: PASS
Enough Free Space to Rebalance after loss of ONE cell: PASS
.  .  .
Script completed.

PL/SQL procedure successfully completed.

 



We got some insider information from Oracle product managers on possible features for the next generation of Exadata systems, the X4, which will hopefully be released later in 2013 or in 2014. Please note that this information may change by the time Oracle actually releases it.

  • Oracle X4-2 and X4-8, if Oracle keeps the same naming
  • Will now support Oracle VM (OVM)
  • X4-8 (4 CPUs only, due to NUMA constraints) and X4-2 (2 CPUs)
  • 10 to 12 cores per CPU (still not confirmed)
  • Up to 1TB of RAM
  • Oracle In-Memory DB option for 12c will run on Exadata X4

 


Steps to enable the bpdufilter on a Cisco 4948 Switch for outside connectivity for Exadata X2

By Nabil Nawaz, Viscosity NA.

Our company supports an Exadata X2 system at a managed hosting datacenter facility. One fine day, the Juniper switch that allows the Exadata system to communicate with the outside world stopped working. Eventually we found out that the hosting facility had enabled the bpdufilter on the Juniper switch, and in turn we needed to do the same on our Cisco switch.

Below is a diagram of the high-level layout of our setup in the datacenter.

[Image: Exadata switch layout diagram]

  • The Exadata X2 Database Machine connects first to the Cisco 4948 Switch.
  • The Cisco switch connects to the Juniper Switch provided by the hosting facility.
  • The Juniper switch is the gateway to outside internet traffic.

  

What is a BPDU filter?

Bridge Protocol Data Units, also known as BPDUs, play a fundamental part in a spanning-tree topology.

The Spanning Tree Protocol (STP) is a network protocol that ensures a loop-free topology for any bridged Ethernet local area network. The basic function of STP is to prevent bridge loops and the broadcast radiation that results from them. Spanning tree also allows a network design to include spare (redundant) links to provide automatic backup paths if an active link fails, without the danger of bridge loops, or the need for manual enabling/disabling of these backup links.

BPDUs are sent out by a switch to exchange information about bridge IDs and root path costs. Exchanged every 2 seconds by default, BPDUs allow switches to keep track of network changes and to decide when to block or forward ports to ensure a loop-free topology. Enabling a BPDU filter on a port effectively disables spanning tree on that port: the port no longer participates in STP, so loops may occur.

For more information on Spanning Tree Protocol, please refer to the Wikipedia or Cisco documentation links below.

http://en.wikipedia.org/wiki/Spanning_Tree_Protocol

http://www.cisco.com/en/US/docs/switches/lan/catalyst3560/software/release/12.2_55_se/configuration/guide/swstpopt.html#wp1002608

 

Commands to enable the BPDU filter:

  • Telnet to the Cisco switch.

$ telnet IPADDRESS

  • Enter privileged (enable) mode on the switch.

ciscoswitch-ip> enable

  • Enter configuration mode and select the interface.

ciscoswitch-ip# configure terminal
Enter configuration commands, one per line.  End with CNTL/Z.
ciscoswitch-ip(config)#interface GigabitEthernet1/48
ciscoswitch-ip(config-if)#

  • Enable the BPDU filter.

ciscoswitch-ip(config-if)# spanning-tree bpdufilter enable
ciscoswitch-ip(config-if)# end

  • Save the running configuration to the startup configuration, then reload the switch.

ciscoswitch-ip# copy running-config startup-config
Destination filename [startup-config]?
Building configuration…
Compressed configuration from 3889 bytes to 1546 bytes[OK]
ciscoswitch-ip#reload
Proceed with reload? [confirm]
Connection closed by foreign host

  • Verify the configuration and confirm that the BPDU filter is enabled. The interface section of the running configuration should include the bpdufilter line.

ciscoswitch-ip# show running-config
ciscoswitch-ip# show interfaces status
ciscoswitch-ip# show spanning-tree interface GigabitEthernet1/48 portfast

interface GigabitEthernet1/48
media-type rj45
spanning-tree bpdufilter enable
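The verification step can also be done on a saved copy of the configuration. The Python sketch below scans text captured from `show running-config` and reports whether a given interface stanza contains `spanning-tree bpdufilter enable`. The function name `interface_has_bpdufilter` and the sample text are illustrative assumptions, and the parser relies on IOS starting each stanza with an `interface` line and separating stanzas with `!`.

```python
def interface_has_bpdufilter(running_config: str, interface: str) -> bool:
    """Return True if the named interface stanza enables the BPDU filter."""
    in_stanza = False
    for raw_line in running_config.splitlines():
        line = raw_line.strip()
        if line.startswith("interface "):
            # A new stanza begins; track whether it is the one we want.
            in_stanza = (line == f"interface {interface}")
        elif line == "!":
            # "!" separates sections in an IOS configuration.
            in_stanza = False
        elif in_stanza and line == "spanning-tree bpdufilter enable":
            return True
    return False


if __name__ == "__main__":
    # Sample text modeled on the stanza shown in this post.
    sample = """\
interface GigabitEthernet1/47
 media-type rj45
!
interface GigabitEthernet1/48
 media-type rj45
 spanning-tree bpdufilter enable
!
"""
    print(interface_has_bpdufilter(sample, "GigabitEthernet1/48"))  # True
    print(interface_has_bpdufilter(sample, "GigabitEthernet1/47"))  # False
```

Working from saved text keeps the check independent of how you reach the switch, whether telnet, ssh, or the serial console.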


Steps to change the password on a Cisco Switch

By Nabil Nawaz, Viscosity NA

These steps were used to change the password on the Cisco switch of an Exadata X2.

  • Telnet to the Cisco switch (IP address of the switch).

$ telnet <IPADDRESS>

  • Enter privileged (enable) mode on the switch.

exapsw-ip> enable

  • Enter configuration mode and select the vty lines.

exapsw-ip# configure terminal
exapsw-ip(config)#line vty 0 15
exapsw-ip(config-line)#login

  • Change the password.

exapsw-ip(config-line)#password newpassword
exapsw-ip(config-line)#login
exapsw-ip(config-line)#end

  • Save the changes to the switch.

exapsw-ip#write memory
Building configuration…
Compressed configuration from 4001 bytes to 1608 bytes[OK]
exapsw-ip#

  • Try logging in again to verify the password change.
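If the same change has to be repeated on several switches, the command sequence can be generated rather than typed. This is a minimal Python sketch that only builds the list of commands from the steps above; it does not connect to a switch. The function name `vty_password_commands` is hypothetical, and the sketch issues `login` once after setting the password, which is an assumption about what the sequence needs.

```python
from __future__ import annotations


def vty_password_commands(new_password: str, first_line: int = 0,
                          last_line: int = 15) -> list[str]:
    """Build the command sequence for changing the vty login password."""
    if not new_password:
        raise ValueError("password must not be empty")
    if first_line < 0 or last_line < first_line:
        raise ValueError("invalid vty line range")
    return [
        "configure terminal",
        f"line vty {first_line} {last_line}",
        f"password {new_password}",
        "login",          # require the password at login on these lines
        "end",
        "write memory",   # save the running configuration
    ]


if __name__ == "__main__":
    # "newpassword" is the placeholder used in the steps above.
    for command in vty_password_commands("newpassword"):
        print(command)
```

Review the printed commands before pasting them into a session, since a mistyped password on the vty lines can lock out remote logins.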