===== Real Application Cluster - InfiniBand for the Interconnect =====
Configuration:
* Set up the InfiniBand drivers
* Test Oracle RAC with UDP
* Switch the Oracle kernel to RDS
==== Check kernel parameters ====
^Parameter^Value^
|net.ipv4.ip_local_port_range|1024 65000|
|net.core.rmem_default|262144|
|net.core.rmem_max|262144|
|net.core.wmem_default|262144|
|net.core.wmem_max|262144|
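The table values can be compared against the running kernel with a small shell function. This is only a sketch: the helper names `normalize_ws` and `check_kernel_param` are made up for illustration and are not part of the original setup, and it assumes that `sysctl -n` is available on the node.

```shell
# Hypothetical helpers (not from the original page).

# Collapse tabs and repeated blanks to single blanks, because sysctl
# separates multi-value parameters such as ip_local_port_range with tabs.
normalize_ws() {
    printf '%s' "$1" | tr -s '[:space:]' ' '
}

# Usage: check_kernel_param <name> <expected> [<actual>]
# If <actual> is omitted, the value is read from the running kernel.
check_kernel_param() {
    local name expected actual
    name=$1
    expected=$(normalize_ws "$2")
    if [ $# -ge 3 ]; then
        actual=$(normalize_ws "$3")
    else
        actual=$(normalize_ws "$(sysctl -n "$name" 2>/dev/null)")
    fi
    if [ "$expected" = "$actual" ]; then
        echo "OK   $name = $actual"
    else
        echo "DIFF $name = $actual (expected $expected)"
        return 1
    fi
}
```

Called as `check_kernel_param net.core.rmem_default 262144`, the function reads the live value; the optional third argument makes it possible to check a value copied from another node.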
==== Which card is installed? ====
$ lspci | grep Infini
...
47:00.0 InfiniBand: Mellanox Technologies MT25418 [ConnectX IB DDR, PCIe 2.0GT/s] (rev a0)
...
Drivers: \\
* http://www.mellanox.com/content/pages.php?pg=products_dyn&product_family=26&menu_section=34#tab-three
==== Set up the InfiniBand drivers ====
Download the driver from the SilverStorm website.\\
=== Installation ===
Oracle uses RDS as its native InfiniBand protocol. \\
RDS requires IP over IB, so both protocols must be selected during installation.
=== Configuration ===
Check the file ipoib.cfg (created by the installer).
=== Start / Stop ===
Check whether RDS is running after boot. \\
Start the RDS driver with "/etc/init.d/rds start"\\
\\
or with:\\
# iba_start {rds | ics_init | ipoib…}
# iba_stop {rds | ics_init | ipoib…}
Use lsmod to check whether the driver is loaded into the kernel:
lsmod | grep rds
Check the InfiniBand connection between the nodes:
#1. Check the list of connected hosts
[root@GPIrac1 ~]# ibnetdiscover -l
Ca : 0x0002c99999999996 ports 2 devid 0x634a vendid 0x2c9 "GPIrac3 HCA-1"
Ca : 0x0023788888888880 ports 2 devid 0x634a vendid 0x2c9 "GPIrac2 HCA-1"
Ca : 0x002377777777778 ports 2 devid 0x634a vendid 0x2c9 "GPIrac1 HCA-1"
Switch : 0x00227777777777a8 ports 32 devid 0xbd36 vendid 0x2c9 "Infiniscale-IV Mellanox Technologies"
#2. Get the GUIDs of the connected ports to ping
[root@GPIrac1 ~]# ibnetdiscover -p | grep GPI | awk '{print $11 " -> " $17 }' | grep GPI
0x0002c9999999999 -> 'GPIrac3
0x0002c8888888888 -> 'GPIrac2
0x0002c7777777777 -> 'GPIrac1
#3. Start the ping server on each target host in a new ssh session
[root@GPIrac3 ~]# ibping -S
and
[root@GPIrac2 ~]# ibping -S
and
[root@GPIrac1 ~]# ibping -S
#4. Ping the targets from all hosts
Example for host 1:
[root@GPIrac1 ~]# ibping -G 0x0002c9999999999
[root@GPIrac1 ~]# ibping -G 0x0002c8888888888
[root@GPIrac1 ~]# ibping -G 0x0002c7777777777
or
ibping -G 0x0002c9999999999 -f -c 10
-f: flood mode, -c 10: stop after 10 round trips
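Step 4 can be wrapped in a loop so that all GUIDs from step 2 are pinged in one go. The following is a minimal sketch: the function name `ping_all_guids` and the `IBPING` override variable are assumptions made for illustration. It does not evaluate the exit status of ibping, so the statistics printed for each GUID still have to be read by hand.

```shell
# Hypothetical helper (not from the original page).
# IBPING can be overridden, e.g. with "echo" for a dry run that only
# prints the arguments that would be passed to ibping.
IBPING=${IBPING:-ibping}

# Usage: ping_all_guids <guid> [<guid> ...]
ping_all_guids() {
    local guid
    for guid in "$@"; do
        echo "=== $guid ==="
        # -G: address the target by port GUID, -c 3: stop after 3 pings
        "$IBPING" -G "$guid" -c 3
    done
}
```

Example call on host 1, using the GUIDs listed in step 2: `ping_all_guids 0x0002c9999999999 0x0002c8888888888 0x0002c7777777777`. The ping server from step 3 must already be running on each target.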
==== Test Oracle RAC with UDP ====
Make sure the InfiniBand device is used as the interconnect:
oifcfg getif -global
oifcfg delif -global
oifcfg setif -global ib1/192.168.1.0:cluster_interconnect
==== Switch the Oracle kernel to RDS ====
Which protocol is currently used for the interconnect?\\
See the alert.log of the current database:\\
...
Cluster communication is configured to use the following interface(s) for this instance
10.10.10.118
cluster interconnect IPC version:Oracle UDP/IP (generic)
...
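The protocol line can also be pulled out of the alert log with grep. This is a sketch under assumptions: the function name `interconnect_protocol` is made up for illustration, and the path to the alert log has to be supplied by the caller because it depends on the installation.

```shell
# Hypothetical helper (not from the original page).
# Prints the interconnect protocol most recently reported in the given
# alert log, i.e. the text after "IPC version:" on the last matching line.
# Usage: interconnect_protocol <path-to-alert.log>
interconnect_protocol() {
    grep 'cluster interconnect IPC version' "$1" \
        | tail -n 1 \
        | sed 's/.*IPC version:[[:space:]]*//'
}
```

Running it once before and once after the relink shows whether the instance has picked up a different IPC library at its last startup.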
Stop the database.\\
!! In a cluster, also verify that all instances are really stopped!!! \\
otherwise: ***ORA-27550: Target ID protocol check failed. tid vers=1, type=1, remote instance number=3, local instance number=1***\\
\\
Enable RDS by relinking the Oracle kernel with the RDS library.
cd $ORACLE_HOME/rdbms/lib
make -f ins_rdbms.mk ipc_rds
[oracle@c7000rac1 lib]$ make -f ins_rdbms.mk ipc_rds
# ... rm -f $ORACLE_HOME/lib/libskgxp11.so
# ... cp $ORACLE_HOME/lib//libskgxpr.so /u01/app/oracle/product/11.2.0/dbhome_1/lib/libskgxp11.so
==== Issue with Oracle Linux 6.1 and Scientific Linux (SL) ====
Installation of the Mellanox drivers fails:
\\
\\
Problem:
./mlnxofedinstall
The 2.6.32-100.34.1.el6uek.x86_64 kernel is installed, but do not have drivers available.
Cannot continue.
**Solution**:\\
Install the Mellanox drivers manually\\
(The following steps are in English because they were written for an English-speaking customer.)\\
Procedure:
* Mount the ISO image of the Mellanox driver at /mnt/mellanox
* Create the directory /tmp/build_driver_mellanox
* Copy the source code from the src directory of the ISO image to /tmp/build_driver_mellanox
* Copy the conf file (from the failed installation!) to /tmp/build_driver_mellanox
* Start the kernel driver installation (./install.pl -c ofed.conf)
* Install the tools from the RPMS directory of the ISO image
Mount the InfiniBand setup ISO image with the following commands:
# mkdir /mnt/mellanox
# mount -o loop -t iso9660 /installfiles/06_mellanox/MLNX_OFED_LINUX-1.5.3-3.0.0-rhel5.7-x86_64.iso /mnt/mellanox
Build the RPMs for the unsupported kernel:
mkdir /tmp/build_driver_mellanox
cd /mnt/mellanox/src
cp MLNX_OFED_SRC-1.5.3-3.0.0.tgz /tmp/build_driver_mellanox
cd /tmp/build_driver_mellanox
tar zxvf MLNX_OFED_SRC-1.5.3-3.0.0.tgz
cd MLNX_OFED_SRC-1.5.3-3.0.0
Copy ofed.conf to the /tmp/build_driver_mellanox/MLNX_OFED_SRC-1.5.3-3.0.0 directory.\\
Uninstall the package scsi-target-utils:\\
yum remove scsi-target-utils
Start the build and installation of the kernel module.
./install.pl -c ofed.conf
Below is the list of OFED packages that you have chosen
(some may have been added by the installer due to package dependencies):
ofed-scripts
kernel-ib
kernel-ib-devel
kernel-mft
Uninstalling the previous version of OFED
Build ofed-scripts RPM
… …… ………… …… …..
Device (15b3:634a):
47:00.0 InfiniBand: Mellanox Technologies MT25418 [ConnectX VPI PCIe 2.0 2.5GT/s - IB DDR / 10GigE] (rev a0)
Link Width: 8x
PCI Link Speed: 2.5Gb/s
Installation finished successfully.
If you get the output "Installation finished successfully", the driver installation is complete. \\
Go back to the ISO image and install the Mellanox support tools from the RPMS directory:\\
cd /mnt/mellanox/RPMS/
yum --nogpgcheck install opensm-3.3.9.MLNX_20111006_e52d5fc-0.1.x86_64.rpm rds-tools-2.0.4-1.x86_64.rpm infiniband-diags-1.5.8.MLNX_20110906-0.1.x86_64.rpm ibutils2-2.0-0.34.g9d3133a.x86_64.rpm ibutils-1.5.7-0.1.g05a9d1a.x86_64.rpm opensm-libs-3.3.9.MLNX_20111006_e52d5fc-0.1.x86_64.rpm libibmad-1.3.7.MLNX_20110814-0.1.x86_64.rpm libibumad-1.3.7.MLNX_20110814-0.1.x86_64.rpm
Check in the InfiniBand configuration file that the RDS driver is loaded at startup:
$ cat /etc/infiniband/openib.conf | grep RDS
# Load RDS module
RDS_LOAD=yes
If not, set the parameter RDS_LOAD to yes on each node. If the parameter is missing entirely, add it.
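Because the setting has to be correct on every node, the check can be scripted. A minimal sketch, assuming the configuration file shown above; the function name `rds_load_enabled` is made up for illustration.

```shell
# Hypothetical helper (not from the original page).
# Succeeds only if the configuration file contains an active
# (uncommented) RDS_LOAD=yes line.
# Usage: rds_load_enabled [<path-to-openib.conf>]
rds_load_enabled() {
    local conf
    conf=${1:-/etc/infiniband/openib.conf}
    grep -Eq '^[[:space:]]*RDS_LOAD=yes[[:space:]]*$' "$conf"
}
```

On a node it can be used as `rds_load_enabled && echo "RDS is loaded at startup" || echo "RDS_LOAD is not set to yes"`. A commented-out or missing entry counts as not set.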
See also: http://www.hpcadvisorycouncil.com/events/2011/switzerland_workshop/pdf/Presentations/Day%201/2_InfiniBand_Training.pdf
==== Sources ====
* http://www.qlogic.com/SiteCollectionDocuments/Solutions/HP/hp_silverstorm_10g_rac_rds_v8_public.pdf
* http://www.servercare.com/assets/files/downloads/2008_121_RAC_11g_BEST_PRACTICES_AND_TUNING.doc
* http://www.oracle.com/technetwork/database/features/availability/s281216-tsien-131087.pdf
* http://www.openfabrics.org/archives/spring2010sonoma/Monday/8.30%20Tim%20Shetler%20Oracle/Sonoma_Workshop_2010_Oracle-final.pdf
* http://www.hpcuserforum.com/presentations/April2009Roanoke/OFAOPENFABRICSSpring2009OFA.ppt
* http://www.voltaire.com/Solutions/Database_Applications/oracle_10g_and_11g_real_application_clusters_rac
* http://www.voltaire.com/download/VOLT-OracleSolutionGuide-092108.pdf
* http://www.dell.com/downloads/global/power/ps2q07-20070279-Mahmood.pdf
* http://www.unyoug.com/uploads/files/20090313/UNYOUG_20090313_Leveraging_Infiniband_v2.ppt
* http://www.texmemsys.com/files/oracle_performance_tuning_with_ssd.pdf
* http://www.oracle.com/global/de/events/2007/locals/germany/odd_hochverfuegbarkeit/ORACLE_ODD_IT_Betrieb_RAC_ASM_Solbach.pdf