Overview

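The DB2 pureScale feature involves several products, including GPFS, RSCT, and TSA, so it is quite complex, and a small slip during setup can lead to many errors. This article describes in detail how to build a pureScale cluster on Linux inside VMware virtual machines. The cluster has two nodes, node01 and node02, each hosting one member and one CF.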

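Throughout the test, the commands typed are shown in blue. A leading # means the command is run as root, and a leading $ means it is run as the instance user. Points that need emphasis are marked in red.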


Test environment

Windows 7
FlashFXP
SecureCRT
DB2 10.5FP8
SUSE Linux 11.4
VMware 10.0.1


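Test steps


1. Install two SUSE systems

Install two SUSE virtual machines in VMware. The installation steps are identical for both. Points to note:


1.1 Choose a custom installation


1.2 Disk and memory size

When choosing the disk and memory size, use at least 40 GB of disk and at least 2560 MB of memory. (I first set the memory to 2000 MB, and starting the CF kept failing at the end.)



Accept the defaults for everything else, but check that the default network mode is NAT.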


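2. Change the hostname and the hosts file (on both machines)


2.1 Find the IP address:

After installation, both virtual machines start automatically. Log in, right-click an empty area of the desktop, choose Open in Terminal, and run "ifconfig -a":


Once you know the IP address, most of the remaining steps no longer need to be done in the virtual machine console. You can connect to the virtual machines from Windows 7 with SecureCRT or PuTTY.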


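2.2 Change the hostname

Add the following line at the beginning of /etc/rc.d/boot.localnet:
export HOSTNAME=node01

On the other machine, set the hostname to node02. The change takes effect after a reboot, so there is no need to see the result right away.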


2.3 Edit /etc/hosts as follows:

Here 192.168.187.148 and 192.168.187.149 are the IP addresses of the two machines.

# cat /etc/hosts

127.0.0.1       localhost
192.168.187.148 node01

192.168.187.149 node02
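
Because the same entries have to be added on both machines, a small helper can make the edit repeatable. The following is only a sketch: the function name add_host_entry and its optional third argument (a file path used for trying it out safely) are illustrative and not part of the original procedure.

```shell
# Append "IP NAME" to a hosts file unless NAME is already mapped there.
# The third argument is a hypothetical override; it defaults to /etc/hosts.
add_host_entry() {
    ip=$1
    name=$2
    file=${3:-/etc/hosts}
    # Scan non-comment lines; fields 2..NF are host names and aliases.
    if awk -v n="$name" '
        !/^[[:space:]]*#/ { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
        END { exit !found }' "$file"; then
        return 0    # already present, nothing to do
    fi
    printf '%s %s\n' "$ip" "$name" >> "$file"
}
```

Run as root on each machine, for example: add_host_entry 192.168.187.148 node01 and add_host_entry 192.168.187.149 node02. Running it a second time leaves the file unchanged.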


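3. Install the required packages (on both machines)

Before installing the packages, put the installation "disc" into the virtual drive: right-click the virtual machine name and choose Settings.



Then run the following commands: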
# zypper install pam-32bit 
# zypper install glibc-locale-32bit 
# zypper install iscsitarget

A few environment variables and files also need to be set up:
# echo "export DB2USENONIB=TRUE" >> /etc/profile.local
# cp -v /usr/src/linux-3.0.101-63-obj/x86_64/default/include/generated/autoconf.h /lib/modules/3.0.101-63-default/build/include/linux
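
The cp command above hard-codes kernel 3.0.101-63. On a different kernel level the two paths would change, so they can be derived from uname -r instead. This is a sketch that assumes the same SUSE directory layout as the command above (a release string of the form VERSION-FLAVOR, such as 3.0.101-63-default); the function name autoconf_paths is illustrative.

```shell
# Print the source file and destination directory for the autoconf.h copy.
# Assumption: the release looks like "3.0.101-63-default" and the kernel
# object tree lives under /usr/src/linux-VERSION-obj/x86_64/FLAVOR.
autoconf_paths() {
    release=${1:-$(uname -r)}
    flavor=${release##*-}     # text after the last dash, e.g. "default"
    version=${release%-*}     # text before the last dash, e.g. "3.0.101-63"
    printf '%s\n' "/usr/src/linux-${version}-obj/x86_64/${flavor}/include/generated/autoconf.h"
    printf '%s\n' "/lib/modules/${release}/build/include/linux"
}
```

The two printed lines can then be passed to cp -v as source and destination. Check that both paths exist before copying, since the layout is an assumption.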

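4. Add disks (on both machines)

pureScale keeps its data on shared storage, so we need a shared disk. The plan is to add a disk on node01 and then share it with node02.



After that, accept the defaults. I used a size of 40 GB.
A disk must also be added on node02. It is never used; its only purpose is to keep the device names the same on both sides, so a size of 0.01 GB is enough.

Once both disks are added, reboot both virtual machines.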

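5. Share the disk over iSCSI

The previous step added the disk. This step turns it into a shared disk that both node01 and node02 can access.


5.1 Check the disk state before sharing

Before sharing, check the disks on each node. Each node has two disks, sda and sdb: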

node01:~ # fdisk -l

Disk /dev/sdb: 42.9 GB, 42949672960 bytes
255 heads, 63 sectors/track, 5221 cylinders, total 83886080 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

Disk /dev/sdb doesn't contain a valid partition table

Disk /dev/sda: 42.9 GB, 42949672960 bytes
255 heads, 63 sectors/track, 5221 cylinders, total 83886080 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x000a4042

  Device Boot      Start         End      Blocks   Id  System
/dev/sda1            2048     4208639     2103296   82  Linux swap / Solaris
/dev/sda2   *     4208640    83886079    39838720   83  Linux

node02:~ # fdisk -l

Disk /dev/sda: 42.9 GB, 42949672960 bytes
255 heads, 63 sectors/track, 5221 cylinders, total 83886080 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x0007a70b

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1            2048     4208639     2103296   82  Linux swap / Solaris
/dev/sda2   *     4208640    83886079    39838720   83  Linux

Disk /dev/sdb: 10 MB, 10485760 bytes
64 heads, 32 sectors/track, 10 cylinders, total 20480 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

Disk /dev/sdb doesn't contain a valid partition table

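5.2 Configure the iSCSI target (on node01 only)

The goal is to make /dev/sdb, the disk newly added on node01, the target. This has to be configured inside the virtual machine: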











5.3 Configure the iSCSI initiator on node01:




Enter the IP address of node01




True means the connection is established

Check again, and a new disk /dev/sdc has appeared:
node01:~ # fdisk -l

Disk /dev/sdb: 42.9 GB, 42949672960 bytes
255 heads, 63 sectors/track, 5221 cylinders, total 83886080 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

Disk /dev/sdb doesn't contain a valid partition table

Disk /dev/sda: 42.9 GB, 42949672960 bytes
255 heads, 63 sectors/track, 5221 cylinders, total 83886080 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x000a4042

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1            2048     4208639     2103296   82  Linux swap / Solaris
/dev/sda2   *     4208640    83886079    39838720   83  Linux

Disk /dev/sdc: 42.9 GB, 42949672960 bytes
64 heads, 32 sectors/track, 40960 cylinders, total 83886080 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

Disk /dev/sdc doesn't contain a valid partition table

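5.4 Configure the iSCSI initiator on node02 in the same way

In the step that asks for an IP address, still enter the IP address of node01. After configuration, fdisk -l shows the following: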

node02:~ # fdisk -l

Disk /dev/sda: 42.9 GB, 42949672960 bytes
255 heads, 63 sectors/track, 5221 cylinders, total 83886080 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x0007a70b

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1            2048     4208639     2103296   82  Linux swap / Solaris
/dev/sda2   *     4208640    83886079    39838720   83  Linux

Disk /dev/sdb: 10 MB, 10485760 bytes
64 heads, 32 sectors/track, 10 cylinders, total 20480 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

Disk /dev/sdb doesn't contain a valid partition table

Disk /dev/sdc: 42.9 GB, 42949672960 bytes
64 heads, 32 sectors/track, 40960 cylinders, total 83886080 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

Disk /dev/sdc doesn't contain a valid partition table
node02:~ # 

After this step, both machines have an additional disk, /dev/sdc. In fact, both are sharing /dev/sdb on node01.
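
The long fdisk -l listings can be reduced to one line per disk, which makes it easier to compare the two nodes. The sketch below assumes the fdisk output format shown above ("Disk /dev/sdX: SIZE UNIT, N bytes"); the function name list_disks is illustrative.

```shell
# Read "fdisk -l" output on stdin and print "DEVICE BYTES" for each disk.
# Only header lines of the form "Disk /dev/sdX: ..." are matched, so the
# "Disk identifier" and "doesn't contain a valid partition table" lines
# are skipped.
list_disks() {
    awk '/^Disk \/dev\/[^ ]+:/ {
        dev = $2
        sub(/:$/, "", dev)   # drop the trailing colon from the device name
        print dev, $5        # field 5 is the size in bytes
    }'
}
```

Running fdisk -l | list_disks as root on each node should list /dev/sdc with 42949672960 bytes on both, matching the size of /dev/sdb on node01.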

6. Create the DB2 users and groups (on both machines)

# groupadd -g 999 db2iadm  
# groupadd -g 998 db2fadm
# groupadd -g 997 dasadm 
# useradd -u 1004 -g db2iadm -m -d /home/db2inst1 db2inst1   
# useradd -u 1003 -g db2fadm -m -d /home/db2fend db2fend
# useradd -u 1002 -g dasadm -m -d /home/dasusr dasusr 
# passwd db2inst1 
# passwd db2fend

# passwd dasusr  


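7. Configure SSH trust (on both machines)

Both the root user and the db2inst1 user need to be configured. For how to set up SSH trust, search online or see
http://blog.csdn.net/qingsong3333/article/details/73695895

In the home directory, create a ".ssh" directory if it does not already exist.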

root:
# cd $HOME
# mkdir -p .ssh
# ssh-keygen
# cd .ssh
# touch authorized_keys
# cat id_rsa.pub >> authorized_keys
# chmod 600 authorized_keys

# su - db2inst1
$ mkdir -p .ssh
$ ssh-keygen
$ cd .ssh
$ touch authorized_keys
$ cat id_rsa.pub >> authorized_keys
$ chmod 600 authorized_keys

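Then append the contents of each machine's id_rsa.pub to the authorized_keys file of the corresponding user on the other machine (root to root, db2inst1 to db2inst1).

On both machines, test as both root and db2inst1. If none of the following commands asks for a password, the setup succeeded: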
# ssh node01 date
# ssh node02 date

# su - db2inst1
$ ssh node01 date
$ ssh node02 date
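
The manual check above can be wrapped in a loop that fails instead of waiting at a password prompt. This is a sketch: the function name check_ssh_trust and the SSH_CMD variable (a stand-in for the ssh binary, useful for a dry run) are illustrative. BatchMode and ConnectTimeout are standard OpenSSH client options.

```shell
# Try a passwordless "ssh HOST date" for every host given as an argument.
# SSH_CMD is a hypothetical override; it defaults to the real ssh client.
check_ssh_trust() {
    rc=0
    for host in "$@"; do
        # BatchMode=yes makes ssh fail rather than prompt for a password.
        if ${SSH_CMD:-ssh} -o BatchMode=yes -o ConnectTimeout=5 "$host" date >/dev/null 2>&1; then
            echo "$host: ok"
        else
            echo "$host: FAILED"
            rc=1
        fi
    done
    return $rc
}
```

Run check_ssh_trust node01 node02 as root and again as db2inst1 on both machines. A host that has never been contacted before may also be reported as FAILED until its host key has been accepted once interactively.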


8. Install DB2 (on both machines)

You can upload the installation package to the virtual machines with FlashFXP:
# tar -zxvf v10.5fp8_linuxx64_server_t.tar.gz
#  ./server_t/db2_install
DBI1324W  Support of the db2_install command is deprecated.

 
Default directory for installation of products - /opt/ibm/db2/V10.5

***********************************************************
Install into default directory (/opt/ibm/db2/V10.5) ? [yes/no] 
yes
 

Specify one of the following keywords to install DB2 products.

  SERVER 
  CONSV 
  EXP 
  CLIENT 
  RTCL 
 
Enter "help" to redisplay product names.

Enter "quit" to exit.

***********************************************************
SERVER
***********************************************************
Do you want to install the DB2 pureScale Feature? [yes/no] 
yes 
DB2 installation is being initialized.

 Total number of tasks to be performed: 53 
Total estimated time for all tasks to be performed: 2183 second(s) 

Task #1 start
Description: Checking license agreement acceptance 
Estimated time 1 second(s) 
Task #1 end 

.


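9. Configure GPFS (on node01 only)

This can be done with the db2cluster command or with the GPFS commands alone. (db2cluster in fact calls the GPFS commands.)

9.1 Create the GPFS cluster:

Create a file named gpfs.nodes with the following contents: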
# cat  /tmp/gpfs.nodes
node01:quorum-manager
node02:quorum-manager

# /usr/lpp/mmfs/bin/mmcrcluster -p node01 -s node02 -n /tmp/gpfs.nodes -r /usr/bin/ssh -R /usr/bin/scp
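
The node file used above has one "HOST:quorum-manager" line per node, so it can be generated rather than typed. The helper below is a sketch; the function name write_gpfs_nodes is illustrative, and it gives every node the same quorum-manager designation as in this setup.

```shell
# Write a GPFS node file with one "HOST:quorum-manager" line per host.
# Usage: write_gpfs_nodes FILE HOST...
write_gpfs_nodes() {
    file=$1
    shift
    : > "$file"                       # create or truncate the file
    for node in "$@"; do
        printf '%s:quorum-manager\n' "$node" >> "$file"
    done
}
```

For this cluster the call would be write_gpfs_nodes /tmp/gpfs.nodes node01 node02, run before mmcrcluster.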


9.2 Add the license:

# /usr/lpp/mmfs/bin/mmchlicense server --accept -N  node01,node02


9.3 Change the configuration parameters:

# /usr/lpp/mmfs/bin/mmchconfig maxFilesToCache=20000
# /usr/lpp/mmfs/bin/mmchconfig usePersistentReserve=yes
# /usr/lpp/mmfs/bin/mmchconfig verifyGpfsReady=yes
# /usr/lpp/mmfs/bin/mmchconfig totalPingTimeout=75
# /usr/lpp/mmfs/bin/mmlscluster
GPFS cluster information
========================
  GPFS cluster name:         node01
  GPFS cluster id:           2620703579963216106
  GPFS UID domain:           node01
  Remote shell command:      /usr/bin/ssh
  Remote file copy command:  /usr/bin/scp
  Repository type:           CCR

 Node  Daemon node name  IP address       Admin node name  Designation
-----------------------------------------------------------------------
   1   node01            192.168.187.148  node01           quorum-manager
   2   node02            192.168.187.149  node02           quorum-manager

9.4 Start the GPFS daemons

Start GPFS and check its state. After a short wait, a state of active means everything is fine:

# /usr/lpp/mmfs/bin/mmstartup -a

# /usr/lpp/mmfs/bin/mmgetstate -a
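
Instead of rerunning mmgetstate by hand, the wait can be scripted. The output of mmgetstate -a is not shown in this article, so the sketch below assumes a table whose data rows are "node number, node name, GPFS state"; the function name gpfs_all_active is illustrative.

```shell
# Read "mmgetstate -a" output on stdin; succeed only if every node row
# reports the state "active".
# Assumption: data rows start with a numeric node number and the state
# is the third column.
gpfs_all_active() {
    awk '$1 ~ /^[0-9]+$/ { n++; if ($3 != "active") bad = 1 }
         END { if (n == 0 || bad) exit 1 }'
}
```

A polling loop could then look like: until /usr/lpp/mmfs/bin/mmgetstate -a | gpfs_all_active; do sleep 5; done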


9.5 Create the GPFS file system

Create a file named gpfs.disks with the following contents:
# cat  /tmp/gpfs.disks
%nsd:
 device=/dev/sdc
 nsd=qsmiao
 usage=dataAndMetadata
 
# /usr/lpp/mmfs/bin/mmcrnsd -F /tmp/gpfs.disks -v yes

# /usr/lpp/mmfs/bin/mmlsnsd

 File system   Disk name    NSD servers                                    
---------------------------------------------------------------------------
 (free disk)   qsmiao         (directly attached)      

# /usr/lpp/mmfs/bin/mmcrfs /gpfs20170623 gpfsdev qsmiao -B 1024K -m 1 -M 2 -r 1 -R 2

Here /gpfs20170623 is the name chosen for the mount directory, and gpfsdev is the name chosen for the GPFS device.
  
# cat /etc/fstab

# /usr/lpp/mmfs/bin/mmmount all -a
Thu Jun 22 06:50:14 EDT 2017: mmmount: Mounting file systems ...

9.6 The new file system is visible on both node01 and node02

node01:~ # df
Filesystem     1K-blocks     Used Available Use% Mounted on
/dev/sda2       39213504 11834268  25387300  32% /
udev             1403056      132   1402924   1% /dev
tmpfs            1403056      808   1402248   1% /dev/shm
/dev/gpfsdev    41943040   478208  41464832   2% /gpfs20170623
node01:~ # echo "I'm writing to shared filesystem" > /gpfs20170623/hello.txt

node02:~ # df
Filesystem     1K-blocks     Used Available Use% Mounted on
/dev/sda2       39213504 11834016  25387552  32% /
udev             1403056      132   1402924   1% /dev
tmpfs            1403056      804   1402252   1% /dev/shm
/dev/gpfsdev    41943040 41943040         0 100% /gpfs20170623
node02:~ # cat  /gpfs20170623/hello.txt
I'm writing to shared filesystem
  
At this point the GPFS shared file system has been created, and node01 and node02 can access it concurrently.

10. Create the instance (on node01 only)


# /opt/ibm/db2/V10.5/instance/db2icrt -cf node01 -cfnet node01 -m node01 -mnet node01 -instance_shared_dir /gpfs20170623 -tbdev 192.168.187.2 -u db2fend db2inst1

The command above creates one member and one CF on node01. For -tbdev, the gateway address is sufficient; see item 2 of the Q&A at the end for how to find it.

# su - db2inst1
$ db2set DB2_SD_ALLOW_SLOW_NETWORK=ON
$ db2licm -a db2aese_u.lic 
$ db2start
06/23/2017 13:46:38     0   0   SQL1063N  DB2START processing was successful.
SQL1063N  DB2START processing was successful.
$ db2instance -list
ID        TYPE             STATE                HOME_HOST               CURRENT_HOST       <..omitted..>
--        ----             -----                ---------               ------------       <..omitted..>
0       MEMBER           STARTED                   node01                     node01       <..omitted..>
128     CF               PRIMARY                   node01                     node01       <..omitted..>

HOSTNAME                   STATE                INSTANCE_STOPPED        ALERT
--------                   -----                ----------------        -----
  node01                  ACTIVE                              NO           NO
$ su - root

11. Add the other member (on node01 only)

#   /opt/ibm/db2/V10.5/instance/db2iupdt -d -add -m node02 -mnet node02 db2inst1

12. Add the other CF (on node01 only):

# su - db2inst1
$ db2stop
06/23/2017 14:10:27     1   0   SQL1032N  No start database manager command was issued.
06/23/2017 14:10:55     0   0   SQL1064N  DB2STOP processing was successful.
SQL6033W  Stop command processing was attempted on "2" node(s).  "1" node(s) were successfully stopped.  "1" node(s) were already stopped.  "0" node(s) could not be stopped.
$ su - root
#  /opt/ibm/db2/V10.5/instance/db2iupdt -d -add -cf node02 -cfnet node02 db2inst1

13. Start the instance (on node01 only):

# su - db2inst1
$ db2start
06/23/2017 14:31:53     1   0   SQL1063N  DB2START processing was successful.
06/23/2017 14:32:04     0   0   SQL1063N  DB2START processing was successful.
SQL1063N  DB2START processing was successful.
$ db2instance -list                                                                      
ID        TYPE             STATE                HOME_HOST               CURRENT_HOST       <..omitted..>
--        ----             -----                ---------               ------------       <..omitted..>
0       MEMBER           STARTED                   node01                     node01       <..omitted..>
1       MEMBER           STARTED                   node02                     node02       <..omitted..>
128     CF               PRIMARY                   node01                     node01       <..omitted..>
129     CF                  PEER                   node02                     node02       <..omitted..>

HOSTNAME                   STATE                INSTANCE_STOPPED        ALERT
--------                   -----                ----------------        -----
  node02                  ACTIVE                              NO           NO
  node01                  ACTIVE                              NO           NO
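
The db2instance -list output can also be checked by a script, for example after a restart. The sketch below treats the cluster as healthy when every MEMBER row is STARTED and every CF row is PRIMARY or PEER, which matches the output above. That rule is an assumption chosen for this two-node setup, and the function name purescale_healthy is illustrative.

```shell
# Read "db2instance -list" output on stdin; succeed only if all members
# are STARTED and all CFs are PRIMARY or PEER.
# Column 2 is TYPE and column 3 is STATE, as in the listing above.
purescale_healthy() {
    awk '$2 == "MEMBER" { n++; if ($3 != "STARTED") bad = 1 }
         $2 == "CF"     { n++; if ($3 != "PRIMARY" && $3 != "PEER") bad = 1 }
         END { if (n == 0 || bad) exit 1 }'
}
```

As the instance user: db2instance -list | purescale_healthy && echo "cluster is up"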


14. Create a database (on node01 only)

$ db2 "create db sample"
$ db2 "connect to sample"  
$ db2 "create table t1(id int, address char(20))"

$ db2 "insert into t1 values(123, 'Beijing')" 


15. Verify (on node02 only)

# su - db2inst1
$ db2 "connect to sample"
$ db2 "list applications global"
Auth Id  Application    Appl.      Application Id                DB       # of
         Name           Handle                                   Name    Agents
-------- -------------- ---------- ----------------------------- -------- -----
DB2INST1 db2bp          75         *N0.db2inst1.170623184141     SAMPLE   1    
DB2INST1 db2bp          65591      *N1.db2inst1.170623184447     SAMPLE   1   

$ db2 "insert into t1 values(223,'NanJing')"
$ db2 "select * from t1"

ID          ADDRESS             
----------- --------------------
        123 Beijing             
        223 NanJing             

  2 record(s) selected.
  
$ db2 "force applications all"
$ db2stop

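Q&A

1.) In a DPF environment the instance directory is shared, but in pureScale it is not. You only created the instance on node01, so why does node02 also have its own instance and instance directory?
A: When a member or CF is added on node02, the instance is created on node02 automatically. This happens in steps 11 and 12.

2.) How do I find the gateway?
A: List the routing table. The row whose destination is 0.0.0.0 is the default gateway, here 192.168.187.2.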
#  netstat -rn
Kernel IP routing table
Destination     Gateway         Genmask         Flags   MSS Window  irtt Iface
0.0.0.0         192.168.187.2   0.0.0.0         UG        0 0          0 eth0
127.0.0.0       0.0.0.0         255.0.0.0       U         0 0          0 lo
169.254.0.0     0.0.0.0         255.255.0.0     U         0 0          0 eth0
192.168.187.0   0.0.0.0         255.255.255.0   U         0 0          0 eth0
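
To pick the gateway out of the routing table without reading it by eye, the first row with destination 0.0.0.0 can be filtered. This is a sketch that assumes the netstat -rn column layout shown above; the function name default_gateway is illustrative.

```shell
# Read "netstat -rn" output on stdin and print the gateway of the first
# route whose destination is 0.0.0.0 (the default route).
default_gateway() {
    awk '$1 == "0.0.0.0" { print $2; exit }'
}
```

For example, netstat -rn | default_gateway prints the address to pass to the -tbdev option of db2icrt.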

3.) Why, when db2stop runs in step 12, does one node report SQL1032N and the other SQL1064N?
A: The newly added member had not been started yet.

References

http://www.db2china.net/Article/31549
https://www.ibm.com/developerworks/cn/data/library/techarticle/dm-1207maoq/


 