About Nabil.Nawaz

I just worked on a production Exadata X2 full-stack upgrade this past weekend. This is something Viscosity specializes in. We can do a complete Exadata stack upgrade, including ZFS Storage, covering ALL components so that they are up to date with the latest firmware and patches. We suggest doing this at least every 6 months, or once a year if that is all the downtime you can get.

Everything went very well in our upgrade, and I thought we were done and ready to go home after upgrading the firmware on the Cisco switch, which was one of the last items to upgrade. We applied the firmware update from a remote location and realized there was a problem when the switch did not come online after a few minutes; there was no response to ping or telnet either. At that point we had no choice but to get to the Datacenter ASAP and check the switch in person, since the switch has no ILOM or KVM through which it can be managed.

All Exadata systems contain a management network running on at least 1000BASE-T (also known as IEEE 802.3ab), a standard for gigabit (1Gb) Ethernet. All servers (database and storage) have a connection to this network, along with each server’s Integrated Lights Out Management (ILOM) card. The InfiniBand switches also have connections to the management network, all of which is routed through the Cisco Catalyst 4948 switch, which has 48 Ethernet ports. The KVM switch console on the X2-2 and the PDUs also have optional connectivity to the management network and are connected to the same switch.

 

When I got to the Datacenter and looked at the Cisco switch from the back, I noticed an amber Status LED, which indicates a system fault. Following Oracle Support’s instructions to power cycle the switch, I turned it off, unplugged both power cords from the front of the switch, and waited about a minute before reconnecting. That did not bring the switch online either; telnet and ping still returned nothing.

[Image: Cisco-4948_status_lights]

 

 

 

Then we thought of connecting to the switch through its Console port with a serial cable, to see if we could get a command prompt and check the status. You cannot simply run an RJ45 Ethernet cable from your laptop to the switch’s Console port. You need a USB-to-serial-to-RJ45 cable, as shown below.

[Image: usb_to_serial]

 

 

Once you connect to the Console port, a serial device will appear on your computer. Check its properties to find the COM port number and note it down, since you will use it to connect directly to the switch.

Steps to connect to the Cisco Switch from your computer.

  1. Find out which COM port your Prolific USB-to-serial cable is using on your laptop.
  2. Connect to the Console port of the Cisco switch, not the Management port. In the image below I circled the correct port; the one below it, labeled MGT, did not work.
  3. Use a terminal program such as PuTTY, select your COM port, and set the baud rate. 9600 is usually the connection speed.
  4. Press Enter (Return) and you should see the prompt.
  5. Type enable and enter the switch password to get a privileged prompt.
  6. Now issue the boot command to start up the switch.

 

[Image: cisco_4948_circle_port]

 

Once the Cisco switch came online we were able to proceed and finally complete the upgrade. The new firmware allows ssh connectivity, and you can optionally disable telnet if you have strict security guidelines that do not allow it.

 


Recently I came across a script from Oracle Support that checks the ASM storage to see whether a disk or cell failure can be tolerated. The script reports a PASS or FAIL status depending on whether rebalancing can occur after the loss of a disk or a cell (12 disks) in the Exadata storage. A cell server failure is unlikely but can occur. I personally faced this almost a year ago in a Production environment with a Half Rack (7 cell nodes), when we lost a cell node for 2-3 days. We had enough free space for rebalancing to occur, so we could tolerate the lost cell node and there was no downtime for any of the databases.

 

The Oracle Support note is listed below and the script is also attached to it.

 

Understanding ASM Capacity and Reservation of Free Space in Exadata (Doc ID 1551288.1)

 

Some key points

 

  • Keep the FREE_MB column in the ASM lsdg output above the Cell Required Mirror Free MB or Disk Required Mirror Free MB value at all times; the difference between them should never go negative.
  • Disk Required Mirror Free MB is the amount of space that should be reserved for disk failure coverage.
  • One Cell Required Mirror Free MB is the amount of space to reserve for single cell failure coverage, regardless of redundancy type.

 

 

The lsdg and script output below shows BEFORE and AFTER results (the Used MB figures show that space was freed in between), including the FAIL status that is reported when a failure cannot be tolerated.

 

BEFORE

State    Type    Rebal  Sector  Block       AU  Total_MB   Free_MB  Req_mir_free_MB  Usable_file_MB  Offline_disks  Voting_files  Name
MOUNTED  NORMAL  N         512   4096  4194304  54042624  20132832         18014208         1059312              0             N  DATA_EXAD/
MOUNTED  NORMAL  N         512   4096  4194304    894240    636448           298080          169184              0             Y  DBFS_DG/
MOUNTED  NORMAL  N         512   4096  4194304  13512384   7173544          4504128         1334708              0             N  RECO_EXAD/

AFTER

State    Type    Rebal  Sector  Block       AU  Total_MB   Free_MB  Req_mir_free_MB  Usable_file_MB  Offline_disks  Voting_files  Name
MOUNTED  NORMAL  N         512   4096  4194304  54042624  28095768         18014208         5040780              0             N  DATA_EXAD/
MOUNTED  NORMAL  N         512   4096  4194304    894240    636448           298080          169184              0             Y  DBFS_DG/
MOUNTED  NORMAL  N         512   4096  4194304  13512384   7173208          4504128         1334540              0             N  RECO_EXAD/
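The Usable_file_MB column in the lsdg output can be reproduced from the two columns next to it. The sketch below assumes the usual relationship for NORMAL redundancy disk groups, Usable_file_MB = (Free_MB - Req_mir_free_MB) / 2, and checks it against the rows shown in this post. The function name and the row tuples are illustrative only; this is not part of the Oracle Support script.

```python
# Sketch: reproduce Usable_file_MB for NORMAL redundancy disk groups.
# Assumption: usable = (Free_MB - Req_mir_free_MB) / 2, because every extent
# is stored twice. The helper name is hypothetical.

def usable_file_mb(free_mb: int, req_mir_free_mb: int) -> int:
    """Space left for new files after reserving the required mirror free space."""
    return (free_mb - req_mir_free_mb) // 2

# (label, Free_MB, Req_mir_free_MB, Usable_file_MB) copied from the lsdg output.
rows = [
    ("DATA_EXAD before", 20132832, 18014208, 1059312),
    ("DATA_EXAD after", 28095768, 18014208, 5040780),
    ("DBFS_DG", 636448, 298080, 169184),
    ("RECO_EXAD before", 7173544, 4504128, 1334708),
    ("RECO_EXAD after", 7173208, 4504128, 1334540),
]

for label, free_mb, required_mb, reported_mb in rows:
    computed_mb = usable_file_mb(free_mb, required_mb)
    match = "matches" if computed_mb == reported_mb else "differs from"
    print(f"{label}: computed {computed_mb} {match} lsdg value {reported_mb}")
```

A negative result from this calculation means the disk group no longer holds the reserve that lsdg asks for, which is the situation the key points above warn against.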

 

BEFORE

SQL> @check_asm.sql
------ DISK and CELL Failure Diskgroup Space Reserve Requirements  ------
This procedure determines how much space you need to survive a DISK or CELL
failure. It also shows the usable space
available when reserving space for disk or cell failure.
Please see MOS note 1551288.1 for more information.
.  .  .
Description of Derived Values:
Cell Required Mirror Free MB     : Free MB needed to permit successful rebalance
after losing largest CELL in a DG
2 Cell Required Mirror Free MB   : Free MB needed to permit successful rebalance
after losing 2 largest CELLs in high redundancy DG
Disk Required Mirror Free MB     : Free MB needed to rebalance after loss of
single disk (normal redundancy DG) or double disk (high redundancy DG)
Disk Failure Usable File MB      : Usable space available after reserving space
for disk failure (1 disk in normal or 2 disks in high redundancy DG) and
accounting for mirroring
Cell Failure Usable File MB      : Usable space available after reserving space
for 1 cell failure and accounting for mirroring
2 Cell Failure Usable File MB    : Usable space available after reserving space
for 2 cell failures and accounting for mirroring in a HIGH redundancy DG
.  .  .
ASM Version: 11.2.0.2  - WARNING DISK FAILURE COVERAGE ESTIMATES HAVE NOT BEEN
VERIFIED ON 11.2.0.2 !
.  .  .
-------------------------------------------------------------------------
DG Name:                                 DATA_EXAD
DG Type:                                    NORMAL
Num Disks:                                      36
Disk Size MB:                            1,501,184
.  .  .
DG Total MB:                            54,042,624
DG Used MB:                             34,648,092
DG Free MB:                             19,394,532
.  .  .
Cell Required Mirror Free MB:           27,021,312
.  .  .
Disk Required Mirror Free MB:            1,636,279
.  .  .
Disk Failure Usable File MB:             8,879,126
Cell Failure Usable File MB:            -3,813,390
.  .  .
Enough Free Space to Rebalance after loss of ONE disk: PASS
Enough Free Space to Rebalance after loss of ONE cell: FAIL
-------------------------------------------------------------------------
DG Name:                                   DBFS_DG
DG Type:                                    NORMAL
Num Disks:                                      30
Disk Size MB:                               29,808
.  .  .
DG Total MB:                               894,240
DG Used MB:                                257,792
DG Free MB:                                636,448
.  .  .
Cell Required Mirror Free MB:              447,120
.  .  .
Disk Required Mirror Free MB:               53,600
.  .  .
Disk Failure Usable File MB:               291,424
Cell Failure Usable File MB:                94,664
.  .  .
Enough Free Space to Rebalance after loss of ONE disk: PASS
Enough Free Space to Rebalance after loss of ONE cell: PASS
-------------------------------------------------------------------------
DG Name:                                 RECO_EXAD
DG Type:                                    NORMAL
Num Disks:                                      36
Disk Size MB:                              375,344
.  .  .
DG Total MB:                            13,512,384
DG Used MB:                              7,484,712
DG Free MB:                              6,027,672
.  .  .
Cell Required Mirror Free MB:            6,756,192
.  .  .
Disk Required Mirror Free MB:              423,896
.  .  .
Disk Failure Usable File MB:             2,801,888
Cell Failure Usable File MB:              -364,260
.  .  .
Enough Free Space to Rebalance after loss of ONE disk: PASS
Enough Free Space to Rebalance after loss of ONE cell: FAIL
.  .  .
Script completed.

PL/SQL procedure successfully completed.

SQL> exit

AFTER

SQL> @check_asm.sql
------ DISK and CELL Failure Diskgroup Space Reserve Requirements  ------
This procedure determines how much space you need to survive a DISK or CELL
failure. It also shows the usable space
available when reserving space for disk or cell failure.
Please see MOS note 1551288.1 for more information.
.  .  .
Description of Derived Values:
Cell Required Mirror Free MB     : Free MB needed to permit successful rebalance
after losing largest CELL in a DG
2 Cell Required Mirror Free MB   : Free MB needed to permit successful rebalance
after losing 2 largest CELLs in high redundancy DG
Disk Required Mirror Free MB     : Free MB needed to rebalance after loss of
single disk (normal redundancy DG) or double disk (high redundancy DG)
Disk Failure Usable File MB      : Usable space available after reserving space
for disk failure (1 disk in normal or 2 disks in high redundancy DG) and
accounting for mirroring
Cell Failure Usable File MB      : Usable space available after reserving space
for 1 cell failure and accounting for mirroring
2 Cell Failure Usable File MB    : Usable space available after reserving space
for 2 cell failures and accounting for mirroring in a HIGH redundancy DG
.  .  .
ASM Version: 11.2.0.2  - WARNING DISK FAILURE COVERAGE ESTIMATES HAVE NOT BEEN
VERIFIED ON 11.2.0.2 !
.  .  .
-------------------------------------------------------------------------
DG Name:                                 DATA_EXAD
DG Type:                                    NORMAL
Num Disks:                                      36
Disk Size MB:                            1,501,184
.  .  .
DG Total MB:                            54,042,624
DG Used MB:                             25,946,856
DG Free MB:                             28,095,768
.  .  .
Cell Required Mirror Free MB:           27,021,312
.  .  .
Disk Required Mirror Free MB:            1,636,279
.  .  .
Disk Failure Usable File MB:            13,229,744
Cell Failure Usable File MB:               537,228
.  .  .
Enough Free Space to Rebalance after loss of ONE disk: PASS
Enough Free Space to Rebalance after loss of ONE cell: PASS
-------------------------------------------------------------------------
DG Name:                                   DBFS_DG
DG Type:                                    NORMAL
Num Disks:                                      30
Disk Size MB:                               29,808
.  .  .
DG Total MB:                               894,240
DG Used MB:                                257,792
DG Free MB:                                636,448
.  .  .
Cell Required Mirror Free MB:              447,120
.  .  .
Disk Required Mirror Free MB:               53,600
.  .  .
Disk Failure Usable File MB:               291,424
Cell Failure Usable File MB:                94,664
.  .  .
Enough Free Space to Rebalance after loss of ONE disk: PASS
Enough Free Space to Rebalance after loss of ONE cell: PASS
-------------------------------------------------------------------------
DG Name:                                 RECO_EXAD
DG Type:                                    NORMAL
Num Disks:                                      36
Disk Size MB:                              375,344
.  .  .
DG Total MB:                            13,512,384
DG Used MB:                              6,339,176
DG Free MB:                              7,173,208
.  .  .
Cell Required Mirror Free MB:            6,756,192
.  .  .
Disk Required Mirror Free MB:              423,896
.  .  .
Disk Failure Usable File MB:             3,374,656
Cell Failure Usable File MB:               208,508
.  .  .
Enough Free Space to Rebalance after loss of ONE disk: PASS
Enough Free Space to Rebalance after loss of ONE cell: PASS
.  .  .
Script completed.

PL/SQL procedure successfully completed.
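The PASS and FAIL lines in the script output are consistent with a plain comparison of DG Free MB against Cell Required Mirror Free MB. The sketch below is inferred from the figures printed in this post, not taken from the source of Oracle's script, and the function names are hypothetical. It also shows how much space had to be freed before the one-cell check could pass. How the real script treats exact equality is unknown; treating it as PASS is an assumption.

```python
# Sketch inferred from the numbers in the script output shown in this post.
# It is NOT the logic of Oracle's check script; the names are hypothetical.

def rebalance_check(free_mb: int, required_mirror_free_mb: int) -> str:
    """PASS when the free space covers the space reserved for the failure."""
    return "PASS" if free_mb >= required_mirror_free_mb else "FAIL"

def shortfall_mb(free_mb: int, required_mirror_free_mb: int) -> int:
    """MB that must be freed before the check would pass (0 if it already does)."""
    return max(0, required_mirror_free_mb - free_mb)

# (label, DG Free MB, Cell Required Mirror Free MB) from the BEFORE/AFTER output.
samples = [
    ("DATA_EXAD before", 19_394_532, 27_021_312),
    ("DATA_EXAD after", 28_095_768, 27_021_312),
    ("RECO_EXAD before", 6_027_672, 6_756_192),
    ("RECO_EXAD after", 7_173_208, 6_756_192),
]

for label, free_mb, cell_required_mb in samples:
    status = rebalance_check(free_mb, cell_required_mb)
    missing = shortfall_mb(free_mb, cell_required_mb)
    print(f"{label}: loss of ONE cell {status}, shortfall {missing:,} MB")
```

For DATA_EXAD the BEFORE shortfall works out to 7,626,780 MB, which is why the disk group only passed after roughly 8.7 million MB of used space had been released.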

 



We got some insider information from Oracle product managers on possible features for the next generation of Exadata, the X4 systems, which will hopefully be released later in 2013 or in 2014. Please note this information may change by the time Oracle actually releases the product.

  • Models will be the Oracle X4-2 and X4-8, if Oracle keeps the same naming
  • Will now support Oracle Virtual Machine (OVM)
  • X4-8 (4 CPUs only, due to NUMA constraints) and X4-2 (2 CPUs)
  • 10 to 12 cores per CPU (still not confirmed)
  • Up to 1TB of RAM
  • Oracle In-Memory DB option for 12c will run on Exadata X4

 


Oracle’s Flagship Product The SuperCluster M6-32

 

Another update from Oracle Open World 2013: Oracle has announced its flagship product, the SuperCluster M6-32. This latest version is Oracle's fastest in-memory system yet on the SPARC chipset. The hardware specs are very impressive, and you can combine both databases and applications on one box, which makes it perfect for consolidation.

[Image: oracle_supercluster]

 
  • Oracle’s fastest and most scalable engineered system
  • Oracle presents SuperCluster on SPARC as the fastest chipset on the market, beating IBM P Series
  • Ideal for running mission-critical databases and applications in memory and consolidating the largest workloads
  • Up to 32 TB of memory
  • 32 processors, 12 cores/CPU, 384 cores total
  • Can scale out and add more storage and servers
  • OLTP and data warehousing
  • Complex applications
  • Supports in-memory databases
  • Applications and databases can run together on the same system
  • Will support Solaris Containers or Zones

 


				

Finally, an In-Memory option in the Oracle RDBMS software. At this year's Oracle Open World 2013, Larry Ellison talked about an upcoming feature for Oracle Database 12c, the in-memory database option. It will allow simultaneous row-level storage (just as we have always stored data) and column-level storage (essentially an in-memory object structure), which will make non-selective, non-PK indexes unnecessary.

This new feature should be quite simple to implement; all we will need to do is set a new initialization parameter (inmemory_size) to an appropriate size based on the memory available on the server. During Larry's demo, the query performance improvement peaked at 1390X for several billion rows queried! This looks like it may give Exadata some competition, though that remains to be seen.

  • Feature can be used on traditional database servers
  • 100X faster queries: real-time analytics
  • Near-instantaneous query results
  • Works when querying an OLTP database or a data warehouse
  • 2X faster updates and deletes
  • Insert rows 3X to 4X faster
  • Join tables up to 10X faster
  • Data stored in BOTH row and column format
  • Less tuning and fewer indexes required
  • No SQL or application changes

 

 

 

 

 


Oracle Flashback Guaranteed Restore Point Misunderstanding

Nabil Nawaz, Viscosity NA

 

Recently I had a conversation with a DBA I work with about Guaranteed Restore Points (GRPs) in Oracle 11gR2. Their understanding was that when a GRP is created, regardless of whether the database flashback feature is on or off, Oracle keeps only the flashback logs needed to flash back (rewind) the database to the GRP's point in time. In other words, they believed the flashback logs would never grow in size for a GRP. The conversation came up when they had to wait several minutes to drop a GRP; they did not expect it to take so long, since only a limited amount of flashback logs should have needed to be dropped. In reality their understanding was not correct, and the supporting information below shows why.

For example, if you create a new GRP in your database, the view v$restore_point will be populated as follows. Note the STORAGE_SIZE column, which I rounded to megabytes (MB); in this case a new GRP is only 6MB.

GUARANTEE_FLASHBACK_DATABASE  STORAGE_SIZE(MB)  TIME             NAME
YES                           6                 8/16/2013 11:24  BEFORE_RELEASE

 

Let’s take a look at another GRP that has existed for some time in another database. Here I rounded the STORAGE_SIZE column to gigabytes (GB), and the GRP is 75GB in size. This shows that a GRP is small when it is created but grows as activity runs on the database; in fact, ALL flashback logs are retained from the time the GRP is created until it is finally dropped.

 

GUARANTEE_FLASHBACK_DATABASE  STORAGE_SIZE(GB)  TIME            NAME
YES                           75                7/2/2013 16:43  BEFORE_REL_1

 

In the database alert log you will see the following output when dropping a GRP with lots of activity since it was created. You will see the message “Deleted Oracle managed file” repeatedly.

Fri Aug 16 11:21:17 2013
Drop guaranteed restore point BEFORE_RELEASE
Deleted Oracle managed file +RECO_EXAD/odssit/flashback/log_1.5780.819737031
Deleted Oracle managed file +RECO_EXAD/odssit/flashback/log_2.42344.819737033
Deleted Oracle managed file +RECO_EXAD/odssit/flashback/log_3.5651.819737037
Deleted Oracle managed file +RECO_EXAD/odssit/flashback/log_4.5322.819737041
Deleted Oracle managed file +RECO_EXAD/odssit/flashback/log_5.51038.819737285

 

The Oracle documentation also states that all flashback logs will be kept to satisfy the restore point:

“If you enable Flashback Database and define one or more guaranteed restore points, then the database performs normal flashback logging. In this case, the recovery area retains the flashback logs required to flash back to any arbitrary time between the present and the earliest currently defined guaranteed restore point. Flashback logs are not deleted in response to space pressure if they are required to satisfy the guarantee.”


Steps to enable the bpdufilter on a Cisco 4948 Switch for outside connectivity for Exadata X2

By Nabil Nawaz, Viscosity NA.

We support an Exadata X2 system that is hosted at a managed Datacenter facility. One fine day the Juniper switch that allows the Exadata system to communicate with the outside world stopped working. Eventually we found out that the hosting facility had enabled the bpdufilter on the Juniper switch, and we needed to apply the same setup on our Cisco switch.

Below is a diagram of the high-level layout of our setup in the datacenter.

[Image: Exadata_switch]

  • The Exadata X2 Database Machine connects first to the Cisco 4948 Switch.
  • The Cisco switch connects to the Juniper Switch provided by the hosting facility.
  • Juniper Switch is the gateway to outside internet traffic.

  

What is a BPDU filter?

Bridge Protocol Data Units, also known as BPDUs, play a fundamental part in a spanning-tree topology.

The Spanning Tree Protocol (STP) is a network protocol that ensures a loop-free topology for any bridged Ethernet local area network. The basic function of STP is to prevent bridge loops and the broadcast radiation that results from them. Spanning tree also allows a network design to include spare (redundant) links to provide automatic backup paths if an active link fails, without the danger of bridge loops, or the need for manual enabling/disabling of these backup links.

BPDUs are sent out by a switch to exchange information about bridge IDs and root path costs. Exchanged every 2 seconds by default, BPDUs allow switches to keep track of network changes and to decide when to block or forward on a port to ensure a loop-free topology. A BPDU filter enabled on an interface stops the port from sending and processing BPDUs, which effectively disables spanning tree on that port. The port no longer participates in STP, so loops may occur.
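To make the role of BPDUs more concrete, here is a minimal sketch of the root bridge election that the exchanged bridge IDs drive: the bridge with the lowest bridge ID (priority first, then MAC address) becomes the root. The switch names, priorities, and MAC addresses are made up for illustration, and real STP also tracks root path cost and port states, which this sketch leaves out.

```python
from typing import List, NamedTuple, Tuple

class Bridge(NamedTuple):
    name: str
    priority: int  # 32768 is the STP default priority
    mac: str       # hypothetical MAC addresses, for illustration only

def bridge_id(bridge: Bridge) -> Tuple[int, int]:
    """Bridge ID compared as (priority, MAC); the lower value wins."""
    return (bridge.priority, int(bridge.mac.replace(":", ""), 16))

def elect_root(bridges: List[Bridge]) -> Bridge:
    """Pick the root among the bridges whose BPDUs are actually seen."""
    return min(bridges, key=bridge_id)

cisco = Bridge("cisco-4948", 32768, "00:1b:54:aa:00:10")
juniper = Bridge("juniper-uplink", 32768, "00:1b:54:aa:00:02")
core = Bridge("core-switch", 4096, "00:1b:54:ff:ff:ff")

# With BPDUs flowing, all bridges agree on one root.
print(elect_root([cisco, juniper, core]).name)

# With BPDUs filtered on the uplink, the Cisco switch hears no one else
# and simply considers itself the root of its own small topology.
print(elect_root([cisco]).name)
```

The second call is the situation a BPDU filter creates on purpose: the two sides stop influencing each other's spanning tree, which is also why nothing is left to detect a loop across that link.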

For more information on Spanning Tree Protocol, please refer to the Wikipedia or Cisco documentation links below.

http://en.wikipedia.org/wiki/Spanning_Tree_Protocol

http://www.cisco.com/en/US/docs/switches/lan/catalyst3560/software/release/12.2_55_se/configuration/guide/swstpopt.html#wp1002608

 

Commands to enable the BPDU filter

  • Telnet to the Cisco switch

$ telnet IPADDRESS

  • Enter privileged (enable) mode on the switch

ciscoswitch-ip> enable

  • Prepare to configure the switch

ciscoswitch-ip# configure terminal
Enter configuration commands, one per line.  End with CNTL/Z.
ciscoswitch-ip(config)# interface GigabitEthernet1/48
ciscoswitch-ip(config-if)#

  • Enable the BPDU filter

ciscoswitch-ip(config-if)# spanning-tree bpdufilter enable
ciscoswitch-ip(config-if)# end

  • Save the configuration to the startup configuration and reload the switch

ciscoswitch-ip# copy running-config startup-config
Destination filename [startup-config]?
Building configuration…
Compressed configuration from 3889 bytes to 1546 bytes[OK]
ciscoswitch-ip# reload
Proceed with reload? [confirm]
Connection closed by foreign host

  • Verify the configuration and that the BPDU filter is enabled

ciscoswitch-ip# show running-config
ciscoswitch-ip# show interfaces status
ciscoswitch-ip# show spanning-tree interface GigabitEthernet1/48 portfast

The running configuration should include the following for the interface:

interface GigabitEthernet1/48
media-type rj45
spanning-tree bpdufilter enable
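If the same change has to be applied to more than one uplink port, the command sequence from this post can be generated rather than typed by hand. The helper below is a hypothetical convenience, not a Cisco tool: it only builds the list of IOS commands used above for a given interface name and leaves it to you to paste them into the switch session after entering enable mode. It leaves out the reload step.

```python
from typing import List

def bpdufilter_commands(interface: str) -> List[str]:
    """Return the IOS commands from this post for one interface.

    Hypothetical helper: it builds text only and does not talk to a switch.
    """
    return [
        "configure terminal",
        f"interface {interface}",
        "spanning-tree bpdufilter enable",
        "end",
        "copy running-config startup-config",
    ]

# GigabitEthernet1/48 is the uplink port used in this post.
for command in bpdufilter_commands("GigabitEthernet1/48"):
    print(command)
```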


Steps to change the password on a Cisco Switch

By Nabil Nawaz, Viscosity NA

These steps were used to change the password on a Cisco Switch on Exadata X2.

  • Telnet to the Cisco switch (use the IP address of the switch)

$ telnet <IPADDRESS>

  • Enter privileged (enable) mode on the switch

exapsw-ip> enable

  • Prepare to configure the switch

exapsw-ip# configure terminal
exapsw-ip(config)# line vty 0 15
exapsw-ip(config-line)# login

  • Change the password

exapsw-ip(config-line)# password newpassword
exapsw-ip(config-line)# login
exapsw-ip(config-line)# end

  • Save the changes to the switch

exapsw-ip# write memory
Building configuration…
Compressed configuration from 4001 bytes to 1608 bytes[OK]
exapsw-ip#

  • Try logging in again to verify the password change