How to Build an Exadata Virtual Machine Step by Step: The Cell Node
http://www.dbaleet.org/how_to_build_an_exadata_simulator_step_by_step_1_build_a_cell_node
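Back in early April, overmars asked me how to build an Exadata virtual machine. I answered that I would write an article on the subject before May and asked him to watch my blog. I owe overmars an apology, because for personal reasons I missed that deadline. Anyway, I just hope it is not too late.
I know that many Oracle DBAs are interested in learning Exadata but have no environment in which to study or test it. Exadata is an Oracle engineered system that combines hardware and software, so a personal computer can never reproduce a real Exadata environment. The Exadata virtual machine described here is, frankly, a tiger drawn from a cat: it imitates the shape and nothing more. An Exadata virtual machine has long existed inside Oracle, but it is restricted to Oracle University and Oracle internal staff for training and study, and Oracle's internal site labels it "Internal Use Only, Strict Confidential". I have no intention of violating Oracle's policy, so I built everything from scratch.
Enough preamble. An Exadata virtual environment needs at least two virtual machines: one for the Cell node and one for the DB node.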
First, your host machine needs a reasonably high specification, and you need the following software and media:
- CPU: Intel Core i3 or better (or AMD Athlon II X4 or better); a Core i5 (AMD Phenom II X4) is recommended.
- Memory: at least 4 GB; 8 GB is recommended.
- Disk: at least 40 GB of free space; an SSD is even better.
- Virtualization software, installed. Oracle VirtualBox is recommended (https://www.virtualbox.org/).
- Oracle Linux 5.7 installation media. Download it from https://edelivery.oracle.com/ (registration is required before downloading and is free). The Oracle Linux 5.7 media is named V27570-01.zip; the extracted file is OracleLinux-R5-U7-Server-x86_64-dvd.iso.
- Exadata 11.2.3.2 Cell installation media. Download it from https://edelivery.oracle.com/ (registration is required before downloading and is free). The Exadata 11.2.3.2 Cell media is named V33693-01.zip; the extracted file is cellImageMaker_11.2.3.2.0_LINUX.X64_120713-1.x86_64.tar.
- Oracle Clusterware 11.2.0.3 and Oracle Database 11.2.0.3 installation media for Linux x86_64. The files are p10404530_112030_Linux-x86-64_1of7.zip, p10404530_112030_Linux-x86-64_2of7.zip, and p10404530_112030_Linux-x86-64_3of7.zip.
- The latest OPatch utility. Patch number 6880880: OPatch patch of version 11.2.0.3.4 for Oracle software releases 11.2.0.x (APRIL 2013).
- Exadata RDBMS Bundle Patch 17. Patch number 16474946.
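Before starting, it can save time to confirm that the downloads are actually in place. The following is a minimal sketch, assuming all media were copied into a single staging directory (the directory argument is my assumption, not part of the original setup). It checks only the files whose names are given in the list above; the OPatch and bundle patch archives are left out because only their patch numbers are listed.

```shell
# Report which installation media are present in a staging directory.
# The directory layout is an assumption of this sketch.
check_media() {
    dir=$1
    missing=0
    for f in \
        V27570-01.zip \
        V33693-01.zip \
        p10404530_112030_Linux-x86-64_1of7.zip \
        p10404530_112030_Linux-x86-64_2of7.zip \
        p10404530_112030_Linux-x86-64_3of7.zip
    do
        if [ -f "$dir/$f" ]; then
            echo "found:   $f"
        else
            echo "missing: $f"
            missing=$((missing + 1))
        fi
    done
    # Succeed only when every file was found.
    [ "$missing" -eq 0 ]
}
```

Call it as `check_media /path/to/downloads`; it returns a non-zero status if anything is missing.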
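With that in place, our Exadata journey can begin.
First, install Oracle Linux 5.7 in a virtual machine (Red Hat Enterprise Linux should also work in theory, but I have not tested it). 1 GB of memory is usually enough. The installation is simple; just make sure to select the software development packages such as gcc and aio. The graphical interface (GUI) is optional. I recommend a static IP address. My network configuration is as follows: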
[root@cell ~]# cat /etc/sysconfig/network-scripts/ifcfg-eth0
# Intel Corporation 82540EM Gigabit Ethernet Controller
DEVICE=eth0
BOOTPROTO=static
BROADCAST=192.168.56.255
HWADDR=08:00:27:B0:39:02
IPADDR=192.168.56.101
NETMASK=255.255.255.0
NETWORK=192.168.56.0
ONBOOT=yes
[root@cell ~]# cat /etc/hosts
# Do not remove the following line, or various programs
# that require network functionality will fail.
127.0.0.1 cell localhost.localdomain localhost
::1 localhost6.localdomain6 localhost6
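If you use a different address range, the NETWORK and BROADCAST values have to match your IPADDR and NETMASK. As a small sketch (the function name is mine), they can be derived with plain shell arithmetic: the network address is the bitwise AND of address and mask, and the broadcast address sets every host bit to 1.

```shell
# Derive NETWORK and BROADCAST from an IPv4 address and a dotted netmask.
net_and_bcast() {
    ip=$1
    mask=$2
    old_ifs=$IFS
    IFS=.
    # Split both values on dots: $1..$4 are address octets, $5..$8 mask octets.
    set -- $ip $mask
    IFS=$old_ifs
    echo "NETWORK=$(( $1 & $5 )).$(( $2 & $6 )).$(( $3 & $7 )).$(( $4 & $8 ))"
    echo "BROADCAST=$(( $1 | (255 - $5) )).$(( $2 | (255 - $6) )).$(( $3 | (255 - $7) )).$(( $4 | (255 - $8) ))"
}

net_and_bcast 192.168.56.101 255.255.255.0
```

For the address used in this article, the two lines printed agree with the NETWORK and BROADCAST entries in the ifcfg-eth0 file above.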
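Note: after installation, Oracle Linux boots the UEK (Unbreakable Enterprise Kernel) by default. If UEK is used, the cellsrv service will not start in the later steps. Edit the grub configuration so that the default boot kernel is the Red Hat compatible kernel: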
[root@cell ~]# vi /etc/grub.conf
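Change default=0 to default=1, then reboot.
Oracle Linux starts many services by default that we do not need. To save resources, I recommend stopping and disabling the following services.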
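The same edit can be scripted. This is a sketch under the assumption that the file uses the usual `default=N` line and `title` entries; the function names are hypothetical. It writes through the existing file rather than using `sed -i`, since /etc/grub.conf is normally a symbolic link on this release and in-place editing can replace the link with a regular file.

```shell
# List the kernel entries with their 0-based index, in file order.
list_grub_titles() {
    grep '^title' "$1" | awk '{ print NR - 1 ": " $0 }'
}

# Set the default= entry and keep a backup copy next to the file.
set_grub_default() {
    conf=$1
    index=$2
    cp -p "$conf" "$conf.bak"
    # Read from the backup and write through the original path.
    sed "s/^default=.*/default=$index/" "$conf.bak" > "$conf"
}
```

Run `list_grub_titles /etc/grub.conf` first to confirm that index 1 really is the Red Hat compatible kernel on your system, then `set_grub_default /etc/grub.conf 1`. After the reboot, `uname -r` shows which kernel is running.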
chkconfig --level 2345 auditd off && service auditd stop
chkconfig --level 2345 autofs off && service autofs stop
chkconfig --level 2345 avahi-daemon off && service avahi-daemon stop
chkconfig --level 2345 bluetooth off && service bluetooth stop
chkconfig --level 2345 cups off && service cups stop
chkconfig --level 2345 ip6tables off && service ip6tables stop
chkconfig --level 2345 iptables off && service iptables stop
chkconfig --level 2345 isdn off && service isdn stop
chkconfig --level 2345 kudzu off && service kudzu stop
chkconfig --level 2345 mcstrans off && service mcstrans stop
chkconfig --level 2345 netfs off && service netfs stop
chkconfig --level 2345 pcscd off && service pcscd stop
chkconfig --level 2345 restorecond off && service restorecond stop
chkconfig --level 2345 rhnsd off && service rhnsd stop
chkconfig --level 2345 sendmail off && service sendmail stop
chkconfig --level 2345 setroubleshoot off && service setroubleshoot stop
chkconfig --level 2345 smartd off && service smartd stop
chkconfig --level 2345 xinetd off && service xinetd stop
chkconfig --level 2345 yum-updatesd off && service yum-updatesd stop
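The same services can also be disabled interactively with ntsysv --level 2345, which presents a menu where you clear the services you do not need; reboot afterwards.
Next, upload the Exadata Cell image V33693-01.zip to the virtual machine and unzip it to get cellImageMaker_11.2.3.2.0_LINUX.X64_120713-1.x86_64.tar. Extracting that tar file produces the dl180 directory.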
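Typing one command per service invites copy-and-paste slips, so a loop may be safer. Here is a minimal sketch; the `DRY_RUN` switch is my own addition so the commands can be reviewed before anything is changed, and the names xinetd and setroubleshoot are written as the init scripts are normally named.

```shell
# Services to disable, taken from the list in this article.
SERVICES="auditd autofs avahi-daemon bluetooth cups ip6tables iptables isdn
kudzu mcstrans netfs pcscd restorecond rhnsd sendmail setroubleshoot smartd
xinetd yum-updatesd"

disable_services() {
    for svc in $SERVICES; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            # Only print what would be run.
            echo "chkconfig --level 2345 $svc off && service $svc stop"
        else
            chkconfig --level 2345 "$svc" off && service "$svc" stop
        fi
    done
}
```

Run `DRY_RUN=1 disable_services` to preview, then `disable_services` as root to apply.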
[root@cell ~]# unzip V33693-01.zip
Archive: V33693-01.zip
  inflating: README.txt
  inflating: cellImageMaker_11.2.3.2.0_LINUX.X64_120713-1.x86_64.tar
[root@cell ~]# tar -pxvf cellImageMaker_11.2.3.2.0_LINUX.X64_120713-1.x86_64.tar
dl180......
Find the cell.bin file under dl180/boot/cellbits. This .bin file is actually a zip archive, so we can extract it with unzip:
[root@cell ~]# unzip cell.bin
Archive: cell.bin
warning [cell.bin]: 6408 extra bytes at beginning or within zipfile
  (attempting to process anyway)
  inflating: cell-11.2.3.2.1_LINUX.X64_130109-1.x86_64.rpm
  inflating: jdk-1_5_0_15-linux-amd64.rpm
This yields cell-11.2.3.2.1_LINUX.X64_130109-1.x86_64.rpm and jdk-1_5_0_15-linux-amd64.rpm.
Install the JDK first:
[root@cell ~]# rpm -ivh jdk-1_5_0_15-linux-amd64.rpm
Then install cell:
[root@cell ~]# rpm -ivh cell-11.2.3.2.1_LINUX.X64_130109-1.x86_64.rpm
The installation fails, reporting a dependency on the LWP package. With a yum repository configured, install LWP directly with yum:
[root@cell ~]# yum install perl-libwww-perl
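Installing cell again fails with another error: a prerequisite is not met, but the message does not say which one. The only way to find out is to dump the package's check scripts and read through them: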
[root@cell ~]# rpm --scripts -qp cell-11.2.3.2.1_LINUX.X64_130109-1.x86_64.rpm >>diag.log
Opening diag.log quickly shows that the cause is the missing /var/log/oracle directory, so create it by hand and set its permissions to 775.
[root@cell ~]# mkdir -p /var/log/oracle
[root@cell ~]# chmod -R 775 /var/log/oracle
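As a side note, the two commands can be folded into one with `install -d`, which creates the directory and applies the mode to it in a single step. A small sketch (the wrapper function is mine):

```shell
# Create a directory, including missing parents, with the given mode.
ensure_dir() {
    install -d -m "$2" "$1"
}
```

Here that would be `ensure_dir /var/log/oracle 775`, run as root.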
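This time the cell installation completes without errors.
The next step is to create the simulated hard disks and flash disks inside the cell virtual machine: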
[root@cell ~]# mkdir -p /opt/oracle/cell/disks/raw
[root@cell ~]# cd /opt/oracle/cell/disks/raw
[root@cell raw]# vi dd.sh
[root@cell raw]# cat dd.sh
dd if=/dev/zero of=disk01 bs=1M count=1000
dd if=/dev/zero of=disk02 bs=1M count=1000
dd if=/dev/zero of=disk03 bs=1M count=1000
dd if=/dev/zero of=disk04 bs=1M count=1000
dd if=/dev/zero of=disk05 bs=1M count=1000
dd if=/dev/zero of=disk06 bs=1M count=1000
dd if=/dev/zero of=disk07 bs=1M count=1000
dd if=/dev/zero of=disk08 bs=1M count=1000
dd if=/dev/zero of=disk09 bs=1M count=1000
dd if=/dev/zero of=disk10 bs=1M count=1000
dd if=/dev/zero of=disk11 bs=1M count=1000
dd if=/dev/zero of=disk12 bs=1M count=1000
dd if=/dev/zero of=FLASH01 bs=1M count=1000
dd if=/dev/zero of=FLASH02 bs=1M count=1000
dd if=/dev/zero of=FLASH03 bs=1M count=1000
dd if=/dev/zero of=FLASH04 bs=1M count=1000
Run dd.sh to create the disks: 12 hard disks of about 1 GB each (1000 MiB), and 4 flash disks of the same size.
[root@cell raw]# chmod 660 *
[root@cell raw]# ls -ltr
total 16400068
-rw-rw---- 1 root root        692 May 16 16:24 dd.sh
-rw-rw---- 1 root root 1048576000 May 16 16:24 disk01
-rw-rw---- 1 root root 1048576000 May 16 16:24 disk02
-rw-rw---- 1 root root 1048576000 May 16 16:24 disk03
-rw-rw---- 1 root root 1048576000 May 16 16:24 disk04
-rw-rw---- 1 root root 1048576000 May 16 16:25 disk05
-rw-rw---- 1 root root 1048576000 May 16 16:25 disk06
-rw-rw---- 1 root root 1048576000 May 16 16:25 disk07
-rw-rw---- 1 root root 1048576000 May 16 16:26 disk08
-rw-rw---- 1 root root 1048576000 May 16 16:26 disk09
-rw-rw---- 1 root root 1048576000 May 16 16:27 disk10
-rw-rw---- 1 root root 1048576000 May 16 16:27 disk11
-rw-rw---- 1 root root 1048576000 May 16 16:27 disk12
-rw-rw---- 1 root root 1048576000 May 16 16:27 FLASH01
-rw-rw---- 1 root root 1048576000 May 16 16:27 FLASH02
-rw-rw---- 1 root root 1048576000 May 16 16:27 FLASH03
-rw-rw---- 1 root root 1048576000 May 16 16:28 FLASH04
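The sixteen dd commands differ only in the file name, so they can be collapsed into two loops. The following is a sketch, not the script I actually ran; the function name and its two parameters (target directory and size in MiB) are my own, introduced so the size can be changed in one place.

```shell
# Create 12 hard-disk files and 4 flash-disk files of size_mb MiB each.
make_disks() {
    dir=$1
    size_mb=$2
    mkdir -p "$dir"
    i=1
    while [ "$i" -le 12 ]; do
        dd if=/dev/zero of="$dir/$(printf 'disk%02d' "$i")" bs=1M count="$size_mb" 2>/dev/null
        i=$((i + 1))
    done
    i=1
    while [ "$i" -le 4 ]; do
        dd if=/dev/zero of="$dir/$(printf 'FLASH%02d' "$i")" bs=1M count="$size_mb" 2>/dev/null
        i=$((i + 1))
    done
    # Same permissions as in the listing above.
    chmod 660 "$dir"/disk?? "$dir"/FLASH??
}
```

With the values used in this article the call is `make_disks /opt/oracle/cell/disks/raw 1000`, which needs roughly 16 GB of free space.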
Then delete the dd script, switch to the celladmin user, and restart the celld services.
[root@cell ~]# su - celladmin
[celladmin@cell ~]$ cellcli -e alter cell restart services all
The cellsrv service fails to start. The file /opt/oracle/cell11.2.3.2.1_LINUX.X64_130109/log/diag/asm/cell/cell/trace/alert.log contains errors like the following:
CELLSRV version=11.2.3.2.1,label=OSS_11.2.3.2.1_LINUX.X64_130109,Wed_Jan__9_06:09:48_PST_2013
Non critical error DIA-48913 caught while writing to trace file "/opt/oracle/cell11.2.3.2.1_LINUX.X64_130109/log/diag/asm/cell/cell/trace/svtrc_2244_0.trc"
Error message: DIA-48913: Writing into trace file failed, file size limit [0] reached
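Judging from the error, the maximum number of open files appears to be too low, so the operating system limit needs to be raised.
Append the line fs.file-max = 65536 to the end of /etc/sysctl.conf, then reload the settings: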
[root@cell ~]# sysctl -p
net.ipv4.ip_forward = 0
net.ipv4.conf.default.rp_filter = 2
net.ipv4.conf.default.accept_source_route = 0
kernel.sysrq = 0
kernel.core_uses_pid = 1
net.ipv4.tcp_syncookies = 1
kernel.msgmnb = 65536
kernel.msgmax = 65536
kernel.shmmax = 68719476736
kernel.shmall = 4294967296
fs.file-max = 65536
Append two lines to the end of /etc/security/limits.conf:
* soft nofile 65536
* hard nofile 65536
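If you rebuild the virtual machine more than once, it is easy to end up with the same line appended several times. A small sketch of an idempotent append follows; the helper name is hypothetical. It adds a line only when an identical line is not already in the file.

```shell
# Append a line to a file unless an identical line is already present.
ensure_line() {
    file=$1
    line=$2
    # -x matches whole lines only, -F treats the text literally (no regex).
    grep -qxF -e "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}
```

For the settings in this article, as root:

    ensure_line /etc/sysctl.conf 'fs.file-max = 65536'
    ensure_line /etc/security/limits.conf '* soft nofile 65536'
    ensure_line /etc/security/limits.conf '* hard nofile 65536'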
Then log out and log back in, switch to celladmin, and use ulimit -a to check that the change has taken effect:
[root@cell ~]# ulimit -a
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 11999
max locked memory       (kbytes, -l) 32
max memory size         (kbytes, -m) unlimited
open files                      (-n) 65536
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 10240
cpu time               (seconds, -t) unlimited
max user processes              (-u) 11999
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited
Start all the cell services again:
[celladmin@cell ~]$ cellcli -e alter cell restart services all
This time the cellsrv, ms, and rs services on the cell all start normally.
Next, the network interface information needs to be added to cellinit.ora:
[celladmin@cell ~]$ cellcli -e create cell cell1 interconnect1=eth0
After the command succeeds, cellinit.ora contains a new line similar to ipaddress1=192.168.56.101/24.
[root@cell config]# cat /opt/oracle/cell/cellsrv/deploy/config/cellinit.ora
#CELL Initialization Parameters
version=0.0
DEPLOYED=TRUE
HTTP_PORT=8888
RMI_PORT=23791
SSL_PORT=23943
JMS_PORT=9127
BMC_SNMP_PORT=162
ipaddress1=192.168.56.101/24
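To check a single setting without reading the whole file, the value can be pulled out with sed. This is only a convenience sketch; the function name is mine, and it assumes the plain KEY=VALUE layout shown above.

```shell
# Print the value of KEY from a KEY=VALUE file (first match only).
# Comment lines start with '#', so they never match "^KEY=".
get_param() {
    sed -n "s/^$2=//p" "$1" | head -n 1
}
```

For example, `get_param /opt/oracle/cell/cellsrv/deploy/config/cellinit.ora ipaddress1` should print the address and prefix length that were recorded for eth0.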
Next, create the celldisks, griddisks, flashcache, and flashlog:
[celladmin@cell ~]$ cellcli
CellCLI: Release 11.2.3.2.1 - Production on Thu May 16 23:11:41 CST 2013
Copyright (c) 2007, 2012, Oracle. All rights reserved.
Cell Efficiency Ratio: 1

CellCLI> alter cell restart services all
Stopping the RS, CELLSRV, and MS services...
The SHUTDOWN of services was successful.
Starting the RS, CELLSRV, and MS services...
Getting the state of RS services... running
Starting CELLSRV services...
The STARTUP of CELLSRV services was successful.
Starting MS services...
The STARTUP of MS services was successful.

CellCLI> create celldisk all
CellDisk FD_00_cell1 successfully created
CellDisk FD_01_cell1 successfully created
CellDisk FD_02_cell1 successfully created
CellDisk FD_03_cell1 successfully created
CellDisk CD_disk01_cell1 successfully created
CellDisk CD_disk02_cell1 successfully created
CellDisk CD_disk03_cell1 successfully created
CellDisk CD_disk04_cell1 successfully created
CellDisk CD_disk05_cell1 successfully created
CellDisk CD_disk06_cell1 successfully created
CellDisk CD_disk07_cell1 successfully created
CellDisk CD_disk08_cell1 successfully created
CellDisk CD_disk09_cell1 successfully created
CellDisk CD_disk10_cell1 successfully created
CellDisk CD_disk11_cell1 successfully created
CellDisk CD_disk12_cell1 successfully created

CellCLI> create flashcache all size=2G
Flash cache cell1_FLASHCACHE successfully created

CellCLI> create flashlog all
Flash log cell1_FLASHLOG successfully created

CellCLI> list flashcache detail
  name:               cell1_FLASHCACHE
  cellDisk:           FD_00_cell1,FD_03_cell1,FD_02_cell1,FD_01_cell1
  creationTime:       2013-05-16T17:11:57+08:00
  degradedCelldisks:
  effectiveCacheSize: 2G
  id:                 33020341-ba55-4b35-9b3a-4030b5085475
  size:               2G
  status:             normal

CellCLI> list flashlog detail
  name:               cell1_FLASHLOG
  cellDisk:           FD_01_cell1,FD_03_cell1,FD_02_cell1,FD_00_cell1
  creationTime:       2013-05-16T17:12:10+08:00
  degradedCelldisks:
  effectiveSize:      512M
  efficiency:         100.0
  id:                 f10e1ac7-5e3f-4c1e-8f3b-8e9ab19fffeb
  size:               512M
  status:             normal

CellCLI> list cell
  cell1  online

CellCLI> list celldisk
  CD_disk01_cell1  normal
  CD_disk02_cell1  normal
  CD_disk03_cell1  normal
  CD_disk04_cell1  normal
  CD_disk05_cell1  normal
  CD_disk06_cell1  normal
  CD_disk07_cell1  normal
  CD_disk08_cell1  normal
  CD_disk09_cell1  normal
  CD_disk10_cell1  normal
  CD_disk11_cell1  normal
  CD_disk12_cell1  normal
  FD_00_cell1      normal
  FD_01_cell1      normal
  FD_02_cell1      normal
  FD_03_cell1      normal

CellCLI> list griddisk
  data_CD_disk01_cell1  active
  data_CD_disk02_cell1  active
  data_CD_disk03_cell1  active
  data_CD_disk04_cell1  active
  data_CD_disk05_cell1  active
  data_CD_disk06_cell1  active
  data_CD_disk07_cell1  active
  data_CD_disk08_cell1  active
  data_CD_disk09_cell1  active
  data_CD_disk10_cell1  active
  data_CD_disk11_cell1  active
  data_CD_disk12_cell1  active

CellCLI> list celldisk CD_disk01_cell1 detail
  name:            CD_disk01_cell1
  comment:
  creationTime:    2013-05-16T16:40:29+08:00
  deviceName:      /opt/oracle/cell11.2.3.2.1_LINUX.X64_130109/disks/raw/disk01
  devicePartition: /opt/oracle/cell11.2.3.2.1_LINUX.X64_130109/disks/raw/disk01
  diskType:        HardDisk
  errorCount:      0
  freeSpace:       0
  id:              ecc913eb-5f74-4ad6-9d05-f811af986921
  interleaving:    none
  lun:             /opt/oracle/cell11.2.3.2.1_LINUX.X64_130109/disks/raw/disk01
  physicalDisk:    /opt/oracle/cell11.2.3.2.1_LINUX.X64_130109/disks/raw/disk01
  raidLevel:       "RAID 0"
  size:            992M
  status:          normal

CellCLI> list celldisk FD_00_cell1 detail
  name:            FD_00_cell1
  comment:
  creationTime:    2013-05-16T16:40:25+08:00
  deviceName:      /opt/oracle/cell11.2.3.2.1_LINUX.X64_130109/disks/raw/FLASH01
  devicePartition: /opt/oracle/cell11.2.3.2.1_LINUX.X64_130109/disks/raw/FLASH01
  diskType:        FlashDisk
  errorCount:      0
  freeSpace:       304M
  freeSpaceMap:    offset=688M,size=304M
  id:              c9488ae4-d3b9-4aa2-a4e5-d3539e44b417
  interleaving:    none
  lun:             /opt/oracle/cell11.2.3.2.1_LINUX.X64_130109/disks/raw/FLASH01
  physicalDisk:    /opt/oracle/cell11.2.3.2.1_LINUX.X64_130109/disks/raw/FLASH01
  raidLevel:       "RAID 0"
  size:            992M
  status:          normal

CellCLI> list griddisk data_CD_disk01_cell1 detail
  name:            data_CD_disk01_cell1
  availableTo:
  cachingPolicy:   default
  cellDisk:        CD_disk01_cell1
  comment:
  creationTime:    2013-05-16T16:49:51+08:00
  diskType:        HardDisk
  errorCount:      0
  id:              7a36bc8a-1611-474d-85fc-fa730e73176d
  offset:          48M
  size:            944M
  status:          active
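The sizes in this output can be cross-checked with a little arithmetic. The sketch below rests on two assumptions of mine that the output suggests but does not state: the flash cache and flash log are spread evenly over the four flash celldisks, and each celldisk reserves the same 48M that appears as the griddisk offset.

```shell
celldisk_mb=992      # size reported by "list celldisk ... detail"
flash_disks=4        # FD_00 .. FD_03
flashcache_mb=2048   # create flashcache all size=2G
flashlog_mb=512      # size reported by "list flashlog detail"
reserved_mb=48       # assumption: per-celldisk overhead, equal to the griddisk offset

cache_per_disk=$((flashcache_mb / flash_disks))
log_per_disk=$((flashlog_mb / flash_disks))
used_mb=$((reserved_mb + cache_per_disk + log_per_disk))
free_mb=$((celldisk_mb - used_mb))
griddisk_mb=$((celldisk_mb - reserved_mb))

echo "flash celldisk: used=${used_mb}M free=${free_mb}M"
echo "hard-disk celldisk: griddisk=${griddisk_mb}M"
```

Under those assumptions the used and free figures line up with freeSpaceMap (offset=688M, size=304M) on FD_00_cell1, and the griddisk figure with the 944M size of data_CD_disk01_cell1.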
At this point the cell node virtual machine is essentially complete.