There cases where we need to ensure that large packet “address-ability” exists. This is needed to verify configuration for non standard packet sizes, i.e, MTU of 9000. For example if we are deploying a NAS or backup server across the network.

Setting the MTU can be done by editing the configuration script for the relevant interface in /etc/sysconfig/network-scripts/. In our example, we will use the eth1 interface, thus the file to edit would be ifcfg-eth1.

Add a line to specify the MTU, for example:
DEVICE=eth1
BOOTPROTO=static
ONBOOT=yes
IPADDR=192.168.20.2
NETMASK=255.255.255.0
MTU=9000

Assuming that MTU is set on the system, just do a ifdown eth1 followed by ifup eth1.
An ifconfig eth1 will tell if its set correctly

eth1 Link encap:Ethernet HWaddr 00:0F:EA:94:xx:xx
inet addr:192.168.20.2 Bcast:192.168.1.255 Mask:255.255.255.0
inet6 addr: fe80::20f:eaff:fe91:407/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:9000 Metric:1
RX packets:141567 errors:0 dropped:0 overruns:0 frame:0
TX packets:141306 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:101087512 (96.4 MiB) TX bytes:32695783 (31.1 MiB)
Interrupt:18 Base address:0xc000

To validate end-2-end MTU 9000 packet management

Execute the following on Linux systems:

ping -M do -s 8972 [destinationIP]
For example: ping datadomain.viscosityna.com -s 8972

The reason for the 8972 on Linux/Unix system, the ICMP/ping implementation doesn’t encapsulate the 28 byte ICMP (8) + TCP (20) (ping + standard transmission control protocol packet) header. Therefore, take in account : 9000 and subtract 28 = 8972.

[root@racnode01]# ping -s 8972 -M do datadomain.viscosityna.com
PING datadomain.viscosityna.com. (192.168.20.32) 8972(9000) bytes of data.
8980 bytes from racnode1.viscosityna.com. (192.168.20.2): icmp_seq=0 ttl=64 time=0.914 ms

To illustrate if proper MTU packet address-ability is not in place, I can set a larger packet size in the ping (8993). The packet gets fragmented you will see
“Packet needs to be fragmented by DF set”. In this example, the ping command uses ” -s” to set the packet size, and “-M do” sets the Do Not Fragment

[root@racnode01]# ping -s 8993 -M do datadomain.viscosityna.com
5 packets transmitted, 5 received, 0% packet loss, time 4003ms
rtt min/avg/max/mdev = 0.859/0.955/1.167/0.109 ms, pipe 2
PING datadomain.viscosityna.com. (192.168.20.32) 8993(9001) bytes of data.
From racnode1.viscosityna.com. (192.168.20.2) icmp_seq=0 Frag needed and DF set (mtu = 9000)

By adjusting the packet size, you can figure out what the mtu for the link is. This will represent the lowest mtu allowed by any device in the path, e.g., the switch, source or target node, target or anything else inbetween.

Finally, another way to verify the correct usage of the MTU size is the command ‘netstat -a -i -n’ (the column MTU size should be 9000 when you are performing tests on Jumbo Frames)


We are about to apply 12102 PSU1 (19392646) on a Standalone Cluster. This patch applies the patch to RDBMS home as part of opatchauto.
Here’s the output in case anyone wants to compare:

[root@ol64afd OPatch]# ./opatchauto apply -analyze /mnt/hgfs/12cGridSoftware/12102-PSU1/19392646 -ocmrf ocm/bin/ocm.rsp
OPatch Automation Tool
Copyright (c) 2014, Oracle Corporation. All rights reserved.

OPatchauto version : 12.1.0.1.5
OUI version : 12.1.0.2.0
Running from : /u01/app/oracle/product/12.1.0/grid

opatchauto log file: /u01/app/oracle/product/12.1.0/grid/cfgtoollogs/opatchauto/19392646/opatch_gi_2014-10-30_16-03-29_analyze.log

NOTE: opatchauto is running in ANALYZE mode. There will be no change to your system.

Parameter Validation: Successful

Grid Infrastructure home:
/u01/app/oracle/product/12.1.0/grid
RAC home(s):
/u01/app/oracle/product/12.1.0/database

Configuration Validation: Successful

Patch Location: /mnt/hgfs/12cGridSoftware/12102-PSU1/19392646
Grid Infrastructure Patch(es): 19303936 19392590 19392604
RAC Patch(es): 19303936 19392604

Patch Validation: Successful
Command “/u01/app/oracle/product/12.1.0/database/OPatch/opatch version -oh /u01/app/oracle/product/12.1.0/database -invPtrLoc /u01/app/oracle/product/12.1.0/grid/oraInst.loc -v2c 12.1.0.1.5” execution failed:
bash: /u01/app/oracle/product/12.1.0/database/OPatch/opatch: No such file or directory

For more details, please refer to the log file “/u01/app/oracle/product/12.1.0/grid/cfgtoollogs/opatchauto/19392646/opatch_gi_2014-10-30_16-03-29_analyze.debug.log”.

Apply Summary:

Following patch(es) failed to be analyzed:
GI Home: /u01/app/oracle/product/12.1.0/grid: 19303936, 19392590, 19392604
RAC Home: /u01/app/oracle/product/12.1.0/database: 19303936, 19392604

opatchauto failed with error code 2.
[root@ol64afd OPatch]#

Need to copy 12.1.0.2.1 OPatch into Grid_HOME and DB_HOME. Now re-run

[root@ol64afd OPatch]# ./opatchauto apply -analyze /mnt/hgfs/12cGridSoftware/12102-PSU1/19392646 -ocmrf ocm/bin/ocm.rsp
OPatch Automation Tool
Copyright (c) 2014, Oracle Corporation. All rights reserved.

OPatchauto version : 12.1.0.1.5
OUI version : 12.1.0.2.0
Running from : /u01/app/oracle/product/12.1.0/grid

opatchauto log file: /u01/app/oracle/product/12.1.0/grid/cfgtoollogs/opatchauto/19392646/opatch_gi_2014-10-30_16-05-43_analyze.log

NOTE: opatchauto is running in ANALYZE mode. There will be no change to your system.

Parameter Validation: Successful

Grid Infrastructure home:
/u01/app/oracle/product/12.1.0/grid
RAC home(s):
/u01/app/oracle/product/12.1.0/database

Configuration Validation: Successful

Patch Location: /mnt/hgfs/12cGridSoftware/12102-PSU1/19392646
Grid Infrastructure Patch(es): 19303936 19392590 19392604
RAC Patch(es): 19303936 19392604

Patch Validation: Successful

Analyzing patch(es) on “/u01/app/oracle/product/12.1.0/database” …
Patch “/mnt/hgfs/12cGridSoftware/12102-PSU1/19392646/19303936” successfully analyzed on “/u01/app/oracle/product/12.1.0/database” for apply.
Patch “/mnt/hgfs/12cGridSoftware/12102-PSU1/19392646/19392604” successfully analyzed on “/u01/app/oracle/product/12.1.0/database” for apply.

Analyzing patch(es) on “/u01/app/oracle/product/12.1.0/grid” …
Patch “/mnt/hgfs/12cGridSoftware/12102-PSU1/19392646/19303936” successfully analyzed on “/u01/app/oracle/product/12.1.0/grid” for apply.
Patch “/mnt/hgfs/12cGridSoftware/12102-PSU1/19392646/19392590” successfully analyzed on “/u01/app/oracle/product/12.1.0/grid” for apply.
Patch “/mnt/hgfs/12cGridSoftware/12102-PSU1/19392646/19392604” successfully analyzed on “/u01/app/oracle/product/12.1.0/grid” for apply.

Apply Summary:
Following patch(es) are successfully analyzed:
GI Home: /u01/app/oracle/product/12.1.0/grid: 19303936, 19392590, 19392604
RAC Home: /u01/app/oracle/product/12.1.0/database: 19303936, 19392604

opatchauto succeeded.
[root@ol64afd OPatch]# ./opatchauto apply /mnt/hgfs/12cGridSoftware/12102-PSU1/19392646 -ocmrf ocm/bin/ocm.rsp
OPatch Automation Tool
Copyright (c) 2014, Oracle Corporation. All rights reserved.

OPatchauto version : 12.1.0.1.5
OUI version : 12.1.0.2.0
Running from : /u01/app/oracle/product/12.1.0/grid

opatchauto log file: /u01/app/oracle/product/12.1.0/grid/cfgtoollogs/opatchauto/19392646/opatch_gi_2014-10-30_16-07-28_deploy.log

Parameter Validation: Successful

Grid Infrastructure home:
/u01/app/oracle/product/12.1.0/grid
RAC home(s):
/u01/app/oracle/product/12.1.0/database

Configuration Validation: Successful

Patch Location: /mnt/hgfs/12cGridSoftware/12102-PSU1/19392646
Grid Infrastructure Patch(es): 19303936 19392590 19392604
RAC Patch(es): 19303936 19392604

Patch Validation: Successful

Stopping RAC (/u01/app/oracle/product/12.1.0/database) … Successful
Following database(s) and/or service(s) were stopped and will be restarted later during the session: yoda

Applying patch(es) to “/u01/app/oracle/product/12.1.0/database” …
Patch “/mnt/hgfs/12cGridSoftware/12102-PSU1/19392646/19303936” successfully applied to “/u01/app/oracle/product/12.1.0/database”.
Patch “/mnt/hgfs/12cGridSoftware/12102-PSU1/19392646/19392604” successfully applied to “/u01/app/oracle/product/12.1.0/database”.

Stopping CRS … Successful

Applying patch(es) to “/u01/app/oracle/product/12.1.0/grid” …
Patch “/mnt/hgfs/12cGridSoftware/12102-PSU1/19392646/19303936” successfully applied to “/u01/app/oracle/product/12.1.0/grid”.
Patch “/mnt/hgfs/12cGridSoftware/12102-PSU1/19392646/19392590” successfully applied to “/u01/app/oracle/product/12.1.0/grid”.
Patch “/mnt/hgfs/12cGridSoftware/12102-PSU1/19392646/19392604” successfully applied to “/u01/app/oracle/product/12.1.0/grid”.

Starting CRS … Successful

Starting RAC (/u01/app/oracle/product/12.1.0/database) … Successful

Apply Summary:
Following patch(es) are successfully installed:
GI Home: /u01/app/oracle/product/12.1.0/grid: 19303936, 19392590, 19392604
RAC Home: /u01/app/oracle/product/12.1.0/database: 19303936, 19392604


Oracle Multitenant plugin
the file names in the multitenant manifest or in the full transportable database import command must match all files exactly for the operation to complete

There are some difficulties when OMF/ASM is used, files are copied, and a physical standby database is in place. There have been improvements made to the multitenant plugin operation on both the primary and standby environments, however at this time additional work must still be done on the standby database when a full transportable database import operation is performed.

RMAN has been enhanced so that, when copying files between databases it recognizes the GUID and acts accordingly when writing the files.

* If the clone/auxiliary instance being connected to for clone operations is a CDB root, the GUID of the RMAN target database is used to determine the directory structure to write the datafiles. Connect to the CDB root as the RMAN clone/auxiliary instance when the source database should be a 12c non-CDB or PDB that is going to be migrated and plugged into a remote CDB as a brand new PDB. This will ensure that the files copied by RMAN will be written to the GUID directory of source database for the migration.

* If the clone/auxiliary instance being connected to for clone operations is a PDB, the GUID of the auxiliary PDB will be used to determine the directory structure to write the datafiles. Connect to the destination PDB as the RMAN clone auxiliary instance when the source database is a 12c non-CDB or PDB that requires a cross platform full transportable database import and the data and files will be imported into an existing PDB. This will ensure the files copied by RMAN will be written to the GUID directory of the PDB target database for the migration.

Posted in ASM

With busy weeks of IOUG and other conferences coming up, we have little time to blog…. So, in the coming weeks, I’m just going to do some “baby” blogs; i.e., some quick tips and new features

Here’s a new 12c new feature that simplifies snapshotting databases

Snapshot Optimized Recovery

There’s many of you that take snapshot copies of database, either via server-side snapshot tools or using storage level snapshots. Usually this required a cold database or putting the database in hot-backup mode. However, there are downsides to both options

In Oracle 12c, third-party snapshots technologies that meet the following requirements can be taken without requiring the database to be placed in backup mode:

Database is crash consistent at the point of the snapshot.
Write ordering is preserved for each file within a snapshot.
Snapshot stores the time at which a snapshot is completed.

The new RECOVER SNAPSHOT TIME command is introduced to recover a snapshot to a consistent point, without any additional manual procedures for point-in-time recovery needs.
This command performs the recovery in a single step. Recovery can be either to the current time or to a point in time after the snapshot was taken

Though there is a bit upfront overhead; e.g.,additional redo logging and a complete database checkpoint.


It’s time for the the annual IOUG Collaborate Conference again, April 7-11 in Las Vegas at the Venetian and Sands Expo Center.

We have a line up of great tracks and speakers focused on Cloud Computing, and this is a mini-compilation of the sessions focused on Cloud Tracks.

Enjoy and look forward to meeting everyone at Collaborate (#C14LV).

Best Wishes,

Charles Kim and the Cloud SIG Team (George, Bert, Kai, Ron, Steve).


Oracle’s Application Express platform (APEX) has emerged as the most compelling RAD platform available in the Oracle ecosystem. It is easy to learn, provides broadly useful functionality and is free to develop and use for existing Oracle customers. However, it has never been positioned as a “serious” development platform for mission critical enterprise (and certainly not commercial) product development. After all, it’s free and it doesn’t support Java. How important a platform could it be whose language underpinnings are SQL and PL/SQL? Why would I trust a platform for critical product development that’s not encouraged or even considered for such things by its own sales force?

Yet development trends are telling a different story. Organizations across industry boundaries and of every size and profile increasingly rely on APEX’s predictable ROI to create not just internal one-off applications and utilities but full-scale, enterprise-critical and even commercial cloud-based offerings.

Come to this session and see how we are leveraging APEX with FOEX as the preferred enterprise development tool and how it can be used to create compelling applications across full spectrums of criticality and enterprise expectation.


I just worked on a production Exadata X2 Full stack upgrade this past weekend this is something Viscosity specializes in and are experts in this arena. We can actually do a Full Complete Exadata Stack and even ZFS Storage upgrade of ALL components and ensure they are up to date with the latest firmware and patches and actually suggest to do this at least every 6 months or once a year if you can get the downtime.

Everything went very well in our upgrade and I thought we were done and ready to home after upgrading the firmware on the Cisco switch which was one of the last items to upgrade. When we applied the firmware update from a remote location we realized there was a problem when the switch did not come online after a few minutes, also there was no response from a ping or telnet either. We had no choice at this point but to physically get to the Datacenter ASAP and check the status of the switch since there is no ILOM or KVM to control the management of the switch.

All Exadata systems contain a management network running on at least 1000BASE-T (also known as IEEE 802.3ab) is a standard for gigabit Ethernet or 1Gb Ethernet. All servers (database and storage) have a connection to this network, along with each server’s Integrated Lights Out Management (ILOM) card. The Infiniband switches also have connections to the management network, all of which is then routed through the Cisco Catalyst 4948 switch which has 48 ethernet ports. The KVM switch console on the X2-2 and PDUs also have optional connectivity to the management network and are also connected to the switch.

 

When I got to the Datacenter and looked at the Cisco Switch from the back I noticed an Amber colored light for the Status LED light which indicates there is a system fault. I also turned off and unplugged both power cords from the front of the switch and waited about a minute as per Oracle Support’s instructions to power cycle the switch and that also did not bring the switch online, a telnet and ping did not return anything.

Cisco-4948_status_lights

 

 

 

Then we thought of connecting to the switch via the Console port via a serial cable to see if we could manage the switch and get a command prompt to get the status. You cannot simply use an RJ45 ethernet cable to connect from your laptop to the switch’s Console port. You actually need to connect using a USB to Serial to ethernet cable as shown below.

usb_to_serial

 

 

Once you connect to the console port you will see a Serial device on your computer and check the properties to see its COM port number, please note this since it will be used to connect directly to the switch.

Steps to connect to the Cisco Switch from your computer.

  1. You need to find out which com port your prolific usb to serial cable is connected to on your laptop.
  2. Connect to the Console port of the Cisco switch not the Management Port, image shown below and I circled the correct port, the one below which is labeled MGT did not work.
  3. 9600 is usually your connect speed. Use a terminal program such as putty and note your com port and set the baud rate.
  4. Enter return and you should see the prompt.
  5. Type in enable to get a prompt and enter the switch password.
  6. Now issue the boot command to startup the switch.

 

cisco_4948_circle_port

 

Once the Cisco Switch came online we were able to proceed and complete the finally complete the upgrade with the new firmware which allows ssh connectivity and you can also optionally disable telnet if you have strict security guidelines which does not allow it.

 


My new favorite 12c Oracle Clusterware command is the 'crsctl stat res "resource name" -dependency'

What this command does, is to provide a dependency tree structure for resource the in question.  This will display startup (default) and shutdown dependencies.  

From this we can understand the pull-up, pushdown, weak, and hard dependencies between clusterware resources 


[oracle@rac02 ~]$ crsctl stat res ora.dagobah.db -dependency
================================================================================
Resource Start Dependencies
================================================================================
---------------------------------ora.dagobah.db---------------------------------
ora.dagobah.db(ora.database.type)->
| type:ora.listener.type[weak:type]
| | type:ora.cluster_vip_net1.type[hard:type,pullup:type]
| | | ora.net1.network(ora.network.type)[hard,pullup]
| | | ora.gns<Resource not found>[weak:global]
| type:ora.scan_listener.type[weak:type:global]
| | ora.scan1.vip(ora.scan_vip.type)[hard,pullup]
| | | ora.net1.network(ora.network.type)[hard,pullup:global]
| | | ora.gns<Resource not found>[weak:global]
| | | type:ora.scan_vip.type[dispersion:type:active]
| | type:ora.scan_listener.type[dispersion:type:active]
| ora.ons(ora.ons.type)[weak:uniform]
| | ora.net1.network(ora.network.type)[hard,pullup]
| ora.gns<Resource not found>[weak:global]
| ora.PDBDATA.dg(ora.diskgroup.type)[weak:global:uniform]
| | ora.asm(ora.asm.type)[hard,pullup:always]
| | | ora.LISTENER.lsnr(ora.listener.type)[weak]
| | | | type:ora.cluster_vip_net1.type[hard:type,pullup:type]
| | | | | ora.net1.network(ora.network.type)[hard,pullup]
| | | | | ora.gns<Resource not found>[weak:global]
| | | ora.ASMNET1LSNR_ASM.lsnr(ora.asm_listener.type)[hard,pullup]
| | | | ora.gns<Resource not found>[weak:global]
| ora.FRA.dg(ora.diskgroup.type)[hard:global:uniform,pullup:global]
| | ora.asm(ora.asm.type)[hard,pullup:always]
| | | ora.LISTENER.lsnr(ora.listener.type)[weak]
| | | | type:ora.cluster_vip_net1.type[hard:type,pullup:type]
| | | | | ora.net1.network(ora.network.type)[hard,pullup]
| | | | | ora.gns<Resource not found>[weak:global]
| | | ora.ASMNET1LSNR_ASM.lsnr(ora.asm_listener.type)[hard,pullup]
| | | | ora.gns<Resource not found>[weak:global]
--------------------------------------------------------------------------------

Now the same for shutdown (pushdown) dependencies

[oracle@rac02 ~]$ crsctl stat res ora.dagobah.db -dependency -stop
================================================================================
Resource Stop Dependencies
================================================================================
---------------------------------ora.dagobah.db---------------------------------
ora.dagobah.db(ora.database.type)->
| ora.dagobah.hoth.svc(ora.service.type)[hard:intermediate]
| ora.dagobah.r2d2.svc(ora.service.type)[hard:intermediate]
--------------------------------------------------------------------------------

Why is this command and output important?  Well, in cases where a particular resource doesn't come up, you may want to understand relationship with its dependents
The reason is, if you are creating your own resource dependencies using the CRS API (formally known as CLSCRS API).

<pre>CLSCRS is a set of C-based APIs for Oracle Clusterware. The CLSCRS APIs enable you to manage the operation of entities that are managed by Oracle Clusterware. These entities include resources, resource types, servers, and server pools. You can use the APIs to register user applications with Oracle Clusterware so that the clusterware can manage them and maintain high availability. Once an application is registered, you can manage, monitor and query the application's status.  The APIs allow you to use the callbacks for diagnostic logging.

</pre>

In this blog I want to illustrate the benefits of deploying PDB with RAC Services.  Although the key ingredient is the Service, RAC provides the final mile for scalability and availability.  In my mind I would not implement PDB w/o RAC

Anyways here we go…

The goal is to illustrate that Database [RAC] Services integration with PDBs provides seamless management and availability.

Initially, we have only the PDB$SEED. 

SQL> select * from v$pdbs;


CON_ID       DBID     CON_UID GUID                             NAME                           OPEN_MODE  RES OPEN_TIME                                                              CREATE_SCN TOTAL_SIZE
---------- ---------- ---------- -------------------------------- ------------------------------ ---------- --- --------------------------------------------------------------------------- ---------- ----------
2 4080865680 4080865680 F13EFFD958E24857E0430B2910ACF6FD PDB$SEED                       READ ONLY  NO  17-FEB-14 01.01.13.909 PM                                                 1720768  283115520

Let's create a PDB from the SEED (I have shown this from an earlier Blog post)

SQL> CREATE PLUGGABLE DATABASE pdbhansolo admin user hansolo identified by hansolo roles=(dba);

Pluggable database created.

Now we have the new PDB listed.

SQL> select * from v$pdbs;

CON_ID     DBID       CON_UID     GUID                            NAME                           OPEN_MODE  RES OPEN_TIME                                                              CREATE_SCN TOTAL_SIZE
---------- ---------- ---------- -------------------------------- ------------------------------ ----------- ----------------------------------------------------------------------- ---------- ----------
2          4080865680 4080865680 F13EFFD958E24857E0430B2910ACF6FD PDB$SEED                       READ ONLY  NO  17-FEB-14 01.01.13.909 PM                                                 1720768  283115520
         3 3403102439 3403102439 F2A023F791663F8DE0430B2910AC37F7 PDBHANSOLO                     MOUNTED        17-FEB-14 01.27.08.942 PM                                                 1846849          0

But notice that its in "MOUNTED" status. Even if I restart the whole CDB, the new PDB will not come up in OPEN READ WRITE mode. If we want to have the
PDB available on startup. Here's how we go about resolving it.

When we create or plug in a new PDB, a default Service gets created, as with previous versions, it is highly recommended not to connect to Service. Oracle took this one step forward and forced users to create a user generated Service. So let's associate a user Service with that PDB. Notice that there's a "-pdb" flag in the add service command.

$ srvctl add service -d dagobah -s hoth -pdb pdbhansolo


[oracle@rac02 ~]$ srvctl config service -d dagobah -verbose
Service name: Hoth
Service is enabled
Server pool: Dagobah
Cardinality: 1
Disconnect: false
Service role: PRIMARY
Management policy: AUTOMATIC
DTP transaction: false
AQ HA notifications: false
Global: false
Commit Outcome: false
Failover type:
Failover method:
TAF failover retries:
TAF failover delay:
Connection Load Balancing Goal: LONG
Runtime Load Balancing Goal: NONE
TAF policy specification: NONE
Edition:
----> Pluggable database name: pdbhansolo
Maximum lag time: ANY
SQL Translation Profile:
Retention: 86400 seconds
Replay Initiation Time: 300 seconds
Session State Consistency:
Preferred instances: Dagobah_1
Available instances: 

And the Service is registered with the listener

     
[oracle@rac02 ~]$ lsnrctl stat

LSNRCTL for Linux: Version 12.1.0.1.0 - Production on 17-FEB-2014 13:34:41

Copyright (c) 1991, 2013, Oracle.  All rights reserved.

Connecting to (ADDRESS=(PROTOCOL=tcp)(HOST=)(PORT=1521))
STATUS of the LISTENER
------------------------
Alias                     LISTENER
Version                   TNSLSNR for Linux: Version 12.1.0.1.0 - Production
Start Date                17-FEB-2014 12:59:46
Uptime                    0 days 0 hr. 34 min. 54 sec
Trace Level               off
Security                  ON: Local OS Authentication
SNMP                      OFF
Listener Parameter File   /u01/app/12.1.0/grid/network/admin/listener.ora
Listener Log File         /u01/app/oracle/diag/tnslsnr/rac02/listener/alert/log.xml
Listening Endpoints Summary...
  (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=LISTENER)))
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=172.16.41.11)(PORT=1521)))
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=172.16.41.21)(PORT=1521)))
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcps)(HOST=rac02.viscosityna.com)(PORT=5500))(Security=(my_wallet_directory=/u02/app/oracle/product/12.1.0/db/admin/Dagobah/xdb_wallet))(Presentation=HTTP)(Session=RAW))
Services Summary...
Service "+ASM" has 1 instance(s).
  Instance "+ASM3", status READY, has 2 handler(s) for this service...
Service "Dagobah" has 1 instance(s).
  Instance "Dagobah_1", status READY, has 1 handler(s) for this service...
Service "DagobahXDB" has 1 instance(s).
  Instance "Dagobah_1", status READY, has 1 handler(s) for this service...
---->Service "Hoth" has 1 instance(s).
  Instance "Dagobah_1", status READY, has 1 handler(s) for this service...
 Service "pdbhansolo" has 1 instance(s).
  Instance "Dagobah_1", status READY, has 1 handler(s) for this service...
Service "r2d2" has 1 instance(s).
  Instance "Dagobah_1", status READY, has 1 handler(s) for this service...
The command completed successfully

Now let's test this. I close the PDB and I also stopped the CDB (probably not necessary, but what the heck :-))

SQL> alter session set container=cdb$root;

Session altered.

SQL> alter pluggable database pdbhansolo close;

[oracle@rac02 ~]$ srvctl stop database -d dagobah

[oracle@rac02 ~]$ lsnrctl stat

LSNRCTL for Linux: Version 12.1.0.1.0 - Production on 18-FEB-2014 15:36:49

Copyright (c) 1991, 2013, Oracle.  All rights reserved.

Connecting to (ADDRESS=(PROTOCOL=tcp)(HOST=)(PORT=1521))
STATUS of the LISTENER
------------------------
Alias                     LISTENER
Version                   TNSLSNR for Linux: Version 12.1.0.1.0 - Production
Start Date                18-FEB-2014 12:57:30
Uptime                    0 days 2 hr. 39 min. 19 sec
Trace Level               off
Security                  ON: Local OS Authentication
SNMP                      OFF
Listener Parameter File   /u01/app/12.1.0/grid/network/admin/listener.ora
Listener Log File         /u01/app/oracle/diag/tnslsnr/rac02/listener/alert/log.xml
Listening Endpoints Summary...
  (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=LISTENER)))
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=172.16.41.11)(PORT=1521)))
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=172.16.41.21)(PORT=1521)))
Services Summary...
Service "+APX" has 1 instance(s).
  Instance "+APX3", status READY, has 1 handler(s) for this service...
Service "+ASM" has 1 instance(s).
  Instance "+ASM3", status READY, has 2 handler(s) for this service...
The command completed successfully


[oracle@rac02 ~]$ srvctl start database -d dagobah

[oracle@rac02 ~]$ lsnrctl stat

LSNRCTL for Linux: Version 12.1.0.1.0 - Production on 18-FEB-2014 15:37:39

Copyright (c) 1991, 2013, Oracle.  All rights reserved.

Connecting to (ADDRESS=(PROTOCOL=tcp)(HOST=)(PORT=1521))
STATUS of the LISTENER
------------------------
Alias                     LISTENER
Version                   TNSLSNR for Linux: Version 12.1.0.1.0 - Production
Start Date                18-FEB-2014 12:57:30
Uptime                    0 days 2 hr. 40 min. 9 sec
Trace Level               off
Security                  ON: Local OS Authentication
SNMP                      OFF
Listener Parameter File   /u01/app/12.1.0/grid/network/admin/listener.ora
Listener Log File         /u01/app/oracle/diag/tnslsnr/rac02/listener/alert/log.xml
Listening Endpoints Summary...
  (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=LISTENER)))
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=172.16.41.11)(PORT=1521)))
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=172.16.41.21)(PORT=1521)))
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcps)(HOST=rac02.viscosityna.com)(PORT=5500))(Security=(my_wallet_directory=/u02/app/oracle/product/12.1.0/db/admin/Dagobah/xdb_wallet))(Presentation=HTTP)(Session=RAW))
Services Summary...
Service "+APX" has 1 instance(s).
  Instance "+APX3", status READY, has 1 handler(s) for this service...
Service "+ASM" has 1 instance(s).
  Instance "+ASM3", status READY, has 2 handler(s) for this service...
Service "Dagobah" has 1 instance(s).
  Instance "Dagobah_1", status READY, has 1 handler(s) for this service...
Service "DagobahXDB" has 1 instance(s).
  Instance "Dagobah_1", status READY, has 1 handler(s) for this service...
---->Service "Hoth" has 1 instance(s).
  Instance "Dagobah_1", status READY, has 1 handler(s) for this service...
 Service "pdbhansolo" has 1 instance(s).
  Instance "Dagobah_1", status READY, has 1 handler(s) for this service...</strong>
Service "r2d2" has 1 instance(s).
  Instance "Dagobah_1", status READY, has 1 handler(s) for this service...
The command completed successfully
[oracle@rac02 ~]$

SQL> select NAME,OPEN_MODE from v$pdbs;

NAME                           OPEN_MODE
------------------------------ ----------
PDB$SEED                       READ ONLY
--> PDBHANSOLO                     READ WRITE

Now I can connect to this PDB using my lovely EZConnect string

sqlplus hansolo/hansolo@rac02/hoth

So let's see this relationship between the service (Hoth) and the PLuggable Database (pdbhansolo)

$crsctl stat res ora.dagobah.hoth.svc -p

<strong>NAME=ora.dagobah.hoth.svc</strong>
TYPE=ora.service.type
ACL=owner:oracle:rwx,pgrp:oinstall:rwx,other::r--
….
….
DELETE_TIMEOUT=60
DESCRIPTION=Oracle Service resource
GEN_SERVICE_NAME=Hoth
GLOBAL=false
GSM_FLAGS=0
HOSTING_MEMBERS=
INSTANCE_FAILOVER=1
INTERMEDIATE_TIMEOUT=0
LOAD=1
LOGGING_LEVEL=1
MANAGEMENT_POLICY=AUTOMATIC
MAX_LAG_TIME=ANY
MODIFY_TIMEOUT=60
NLS_LANG=
NOT_RESTARTING_TEMPLATE=
OFFLINE_CHECK_INTERVAL=0
PLACEMENT=restricted
<strong>PLUGGABLE_DATABASE=pdbhansolo
</strong>PROFILE_CHANGE_TEMPLATE=
RELOCATE_BY_DEPENDENCY=1

The key here is that RAC Service of PDBHANSOLO (non-default) becomes an important aspect of PDB auto-startup. Without the use the non-default service the PDB does not open 'read write' automatically. So where does RAC fit in here. Well if I have say a 6 node RAC cluster, I can have some PDSB-Services not started on certain nodes, this effectively prevents access to those PDB from certain nodes, thus I can have a certain pre-defined workload distribution. That's a topic for my next Blog and Oracle Users Group presentation 🙂


Recently I came across a script from Oracle Support written recently to check in the ASM storage to see if a disk or a cell failure/loss can be tolerated, the script will report a PASS or FAIL status depending on whether rebalancing can occur after the loss of a disk or cell(12 disks) in the Exadata Storage. The risk a cell server can fail is unlikely but could occur I personally faced this issue almost 1 year ago in a Production environment with a Half Rack(7 cell nodes) when we lost a cell node for almost 2-3 days however we had enough free space for rebalancing to occur and we could tolerate the lost cell node and there was no downtime to any of the databases.

 

The Oracle Support note is listed below and the script is also attached to it.

 

Understanding ASM Capacity and Reservation of Free Space in Exadata (Doc ID 1551288.1)

 

Some key points

 

  • Ensure that you keep FREE_MB column in the ASM lsdg output above the Cell Required Mirror Free MB or Disk Required Mirror Free MB at all times, this number should not go Negative.
  • Disk Required Mirror Free MB is the amount of space that should be reserved for disk failure coverage.
  • One Cell Required Mirror Free MB is the amount of space to reserve for single cell failure coverage, regardless of redundancy type.

 

 

Script output below with BEFORE/AFTER results and expected output that will be sent in case of a failure.

 

BEFORE


State    Type    Rebal  Sector  Block       AU  Total_MB   Free_MB  Req_mir_free_MB  Usable_file_MB  Offline_disks  Voting_files  Name

 

MOUNTED  NORMAL  N         512   4096  4194304  54042624  20132832         18014208         1059312              0             N  DATA_EXAD/

 

MOUNTED  NORMAL  N         512   4096  4194304    894240    636448           298080          169184              0             Y  DBFS_DG/

 

MOUNTED  NORMAL  N         512   4096  4194304  13512384   7173544          4504128         1334708              0             N  RECO_EXAD/

 

AFTER


State    Type    Rebal  Sector  Block       AU  Total_MB   Free_MB  Req_mir_free_MB  Usable_file_MB  Offline_disks  Voting_files  Name

MOUNTED  NORMAL  N         512   4096  4194304  54042624  28095768         18014208         5040780              0             N  DATA_EXAD/

MOUNTED  NORMAL  N         512   4096  4194304    894240    636448           298080          169184              0             Y  DBFS_DG/

MOUNTED  NORMAL  N         512   4096  4194304  13512384   7173208          4504128         1334540              0             N  RECO_EXAD/

 

BEFORE


SQL> @check_asm.sql

------ DISK and CELL Failure Diskgroup Space Reserve Requirements  ------

This procedure determines how much space you need to survive a DISK or CELL

failure. It also shows the usable space

available when reserving space for disk or cell failure.

Please see MOS note 1551288.1 for more information.

.  .  .

Description of Derived Values:

Cell Required Mirror Free MB     : Free MB needed to permit successful rebalance

after losing largest CELL in a DG

2 Cell Required Mirror Free MB   : Free MB needed to permit successful rebalance

after losing 2 largest CELLs in high redundancy DG

Disk Required Mirror Free MB     : Free MB needed to rebalance after loss of

single disk (normal redundancy DG) or double disk (high redundancy DG)

Disk Failure Usable File MB      : Usable space available after reserving space

for disk failure (1 disk in normal or 2 disks in high redundancy DG) and

accounting for mirroring

Cell Failure Usable File MB      : Usable space available after reserving space

for 1 cell failure and accounting for mirroring

2 Cell Failure Usable File MB    : Usable space available after reserving space

for 2 cell failures and accounting for mirroring in a HIGH redundancy DG

.  .  .

ASM Version: 11.2.0.2  - WARNING DISK FAILURE COVERAGE ESTIMATES HAVE NOT BEEN

VERIFIED ON 11.2.0.2 !

.  .  .

-------------------------------------------------------------------------

DG Name:                                 DATA_EXAD

DG Type:                                    NORMAL

Num Disks:                                      36

Disk Size MB:                            1,501,184

.  .  .

DG Total MB:                            54,042,624

DG Used MB:                             34,648,092

DG Free MB:                             19,394,532

.  .  .

Cell Required Mirror Free MB:           27,021,312

.  .  .

Disk Required Mirror Free MB:            1,636,279

.  .  .

Disk Failure Usable File MB:             8,879,126

Cell Failure Usable File MB:            -3,813,390

.  .  .

Enough Free Space to Rebalance after loss of ONE disk: PASS

Enough Free Space to Rebalance after loss of ONE cell: FAIL

-------------------------------------------------------------------------

DG Name:                                   DBFS_DG

DG Type:                                    NORMAL

Num Disks:                                      30

Disk Size MB:                               29,808

.  .  .

DG Total MB:                               894,240

DG Used MB:                                257,792

DG Free MB:                                636,448

.  .  .

Cell Required Mirror Free MB:              447,120

.  .  .

Disk Required Mirror Free MB:               53,600

.  .  .

Disk Failure Usable File MB:               291,424

Cell Failure Usable File MB:                94,664

.  .  .

Enough Free Space to Rebalance after loss of ONE disk: PASS

Enough Free Space to Rebalance after loss of ONE cell: PASS

-------------------------------------------------------------------------

DG Name:                                 RECO_EXAD

DG Type:                                    NORMAL

Num Disks:                                      36

Disk Size MB:                              375,344

.  .  .

DG Total MB:                            13,512,384

DG Used MB:                              7,484,712

DG Free MB:                              6,027,672

.  .  .

Cell Required Mirror Free MB:            6,756,192

.  .  .

Disk Required Mirror Free MB:              423,896

.  .  .

Disk Failure Usable File MB:             2,801,888

Cell Failure Usable File MB:              -364,260

.  .  .

Enough Free Space to Rebalance after loss of ONE disk: PASS

Enough Free Space to Rebalance after loss of ONE cell: FAIL

.  .  .

Script completed.

 

PL/SQL procedure successfully completed.

 

SQL> exit

AFTER


 

 

SQL> @check_asm.sql

------ DISK and CELL Failure Diskgroup Space Reserve Requirements  ------

This procedure determines how much space you need to survive a DISK or CELL

failure. It also shows the usable space

available when reserving space for disk or cell failure.

Please see MOS note 1551288.1 for more information.

.  .  .

Description of Derived Values:

Cell Required Mirror Free MB     : Free MB needed to permit successful rebalance

after losing largest CELL in a DG

2 Cell Required Mirror Free MB   : Free MB needed to permit successful rebalance

after losing 2 largest CELLs in high redundancy DG

Disk Required Mirror Free MB     : Free MB needed to rebalance after loss of

single disk (normal redundancy DG) or double disk (high redundancy DG)

Disk Failure Usable File MB      : Usable space available after reserving space

for disk failure (1 disk in normal or 2 disks in high redundancy DG) and

accounting for mirroring

Cell Failure Usable File MB      : Usable space available after reserving space

for 1 cell failure and accounting for mirroring

2 Cell Failure Usable File MB    : Usable space available after reserving space

for 2 cell failures and accounting for mirroring in a HIGH redundancy DG

.  .  .

ASM Version: 11.2.0.2  - WARNING DISK FAILURE COVERAGE ESTIMATES HAVE NOT BEEN

VERIFIED ON 11.2.0.2 !

.  .  .

-------------------------------------------------------------------------

DG Name:                                 DATA_EXAD

DG Type:                                    NORMAL

Num Disks:                                      36

Disk Size MB:                            1,501,184

.  .  .

DG Total MB:                            54,042,624

DG Used MB:                             25,946,856

DG Free MB:                             28,095,768

.  .  .

Cell Required Mirror Free MB:           27,021,312

.  .  .

Disk Required Mirror Free MB:            1,636,279

.  .  .

Disk Failure Usable File MB:            13,229,744

Cell Failure Usable File MB:               537,228

.  .  .

Enough Free Space to Rebalance after loss of ONE disk: PASS

Enough Free Space to Rebalance after loss of ONE cell: PASS

-------------------------------------------------------------------------

DG Name:                                   DBFS_DG

DG Type:                                    NORMAL

Num Disks:                                      30

Disk Size MB:                               29,808

.  .  .

DG Total MB:                               894,240

DG Used MB:                                257,792

DG Free MB:                                636,448

.  .  .

Cell Required Mirror Free MB:              447,120

.  .  .

Disk Required Mirror Free MB:               53,600

.  .  .

Disk Failure Usable File MB:               291,424

Cell Failure Usable File MB:                94,664

.  .  .

Enough Free Space to Rebalance after loss of ONE disk: PASS

Enough Free Space to Rebalance after loss of ONE cell: PASS

-------------------------------------------------------------------------

DG Name:                                 RECO_EXAD

DG Type:                                    NORMAL

Num Disks:                                      36

Disk Size MB:                              375,344

.  .  .

DG Total MB:                            13,512,384

DG Used MB:                              6,339,176

DG Free MB:                              7,173,208

.  .  .

Cell Required Mirror Free MB:            6,756,192

.  .  .

Disk Required Mirror Free MB:              423,896

.  .  .

Disk Failure Usable File MB:             3,374,656

Cell Failure Usable File MB:               208,508

.  .  .

Enough Free Space to Rebalance after loss of ONE disk: PASS

Enough Free Space to Rebalance after loss of ONE cell: PASS

.  .  .

Script completed.

 

PL/SQL procedure successfully completed.