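To improve Hive's availability and the stability of the cluster, this article converts an existing Hive deployment to high availability. If you have not set up Hive yet, please do that first and then come back to this article.

HiveServer2 High Availability

  1. Edit hive-site.xml and add the following properties: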
<property>
  <name>hive.server2.support.dynamic.service.discovery</name>
  <value>true</value>
</property>
<property>
  <name>hive.server2.zookeeper.namespace</name>
  <value>hiveserver2_zk</value>
</property>
<property>
  <name>hive.zookeeper.quorum</name>
  <value>hdp11,hdp12,hdp13</value>
</property>
<property>
  <name>hive.zookeeper.client.port</name>
  <value>2181</value>
</property>
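As a quick sanity check on the properties above, the following is a minimal sketch that parses a hive-site.xml document and lists any of the four service-discovery properties that are missing or empty. The helper names (`read_properties`, `missing_ha_properties`) and the `SAMPLE` document are made up for illustration; only the property names come from this article.

```python
import xml.etree.ElementTree as ET

# Property names taken from the hive-site.xml snippet in this article.
REQUIRED_HA_PROPERTIES = (
    "hive.server2.support.dynamic.service.discovery",
    "hive.server2.zookeeper.namespace",
    "hive.zookeeper.quorum",
    "hive.zookeeper.client.port",
)


def read_properties(xml_text):
    """Return {name: value} for every <property> element in the document."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        if name is not None:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def missing_ha_properties(xml_text):
    """List the HA properties that are absent or have an empty value."""
    props = read_properties(xml_text)
    return [name for name in REQUIRED_HA_PROPERTIES if not props.get(name)]


# Hypothetical, deliberately incomplete configuration used as a demo input.
SAMPLE = """<configuration>
  <property>
    <name>hive.server2.support.dynamic.service.discovery</name>
    <value>true</value>
  </property>
  <property>
    <name>hive.zookeeper.quorum</name>
    <value>hdp11,hdp12,hdp13</value>
  </property>
</configuration>"""

if __name__ == "__main__":
    for name in missing_ha_properties(SAMPLE):
        print("missing:", name)
```

To check a real file, read its contents and pass the text to `missing_ha_properties`; an empty list means all four properties are present.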
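  1. Copy the installed hive directory to hdp12 (the destination below assumes hdp12 uses the same /opt/bigdata path):
scp -r /opt/bigdata/hive hdp12:/opt/bigdata/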
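  1. On hdp12, edit hive-site.xml and set the Thrift bind host to hdp12:
  <property>
    <name>hive.server2.thrift.bind.host</name>
    <value>hdp12</value>
  </property>
  1. Restart HiveServer2 and the Metastore on both nodes:
hiveservices.sh start
  1. From the zk directory, run zkCli.sh to open the zk client, then run:
ls /hiveserver2_zk

Note: HiveServer2 starts slowly, so you need to wait a while before the instances show up.

The HiveServer2 log shows the following error: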

2021-03-01 19:44:15,923 WARN [main] server.HiveServer2 (HiveServer2.java:startHiveServer2(1064)) - Error starting HiveServer2 on attempt 1, will retry in 60000ms
java.lang.NoClassDefFoundError: org/apache/tez/dag/api/TezConfiguration
  at org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolSession$AbstractTriggerValidator.startTriggerValidator(TezSessionPoolSession.java:74)
  at org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolManager.initTriggers(TezSessionPoolManager.java:207)
  at org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolManager.startPool(TezSessionPoolManager.java:114)
  at org.apache.hive.service.server.HiveServer2.initAndStartTezSessionPoolManager(HiveServer2.java:839)
  at org.apache.hive.service.server.HiveServer2.startOrReconnectTezSessions(HiveServer2.java:822)
  at org.apache.hive.service.server.HiveServer2.start(HiveServer2.java:745)
  at org.apache.hive.service.server.HiveServer2.startHiveServer2(HiveServer2.java:1037)
  at org.apache.hive.service.server.HiveServer2.access$1600(HiveServer2.java:140)
  at org.apache.hive.service.server.HiveServer2$StartOptionExecutor.execute(HiveServer2.java:1305)
  at org.apache.hive.service.server.HiveServer2.main(HiveServer2.java:1149)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  at java.lang.reflect.Method.invoke(Method.java:498)
  at org.apache.hadoop.util.RunJar.run(RunJar.java:318)
  at org.apache.hadoop.util.RunJar.main(RunJar.java:232)
Caused by: java.lang.ClassNotFoundException: org.apache.tez.dag.api.TezConfiguration
  at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
  at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
  at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
  at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
  ... 16 more

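This error is caused by the missing Tez engine. It does not affect normal use (as the log line shows, HiveServer2 simply retries after 60000 ms), and it can be resolved by installing Tez.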

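  1. The high-availability setup is now complete. You can connect in two ways, through beeline or through JDBC.
  • beeline

Type beeline in the console to open a session, then enter:

!connect jdbc:hive2://hdp11,hdp12/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2_zk along root

In ZooKeeper discovery mode the host list in the URL names ZooKeeper nodes, not HiveServer2 instances. The trailing along and root are the username and password.
  • JDBC

...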
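For JDBC clients, the connection string has the same shape as the one passed to beeline. The following is a minimal sketch that assembles such a URL from a list of ZooKeeper hosts and a namespace. The function name `build_hs2_zk_url` is hypothetical; the URL parameters `serviceDiscoveryMode` and `zooKeeperNamespace` are the ones used in the beeline command above.

```python
def build_hs2_zk_url(zk_hosts, namespace, database=""):
    """Build a HiveServer2 JDBC URL that uses ZooKeeper service discovery.

    zk_hosts  -- ZooKeeper nodes, each written as "host" or "host:port"
    namespace -- value of hive.server2.zookeeper.namespace
    database  -- optional database name placed after the slash
    """
    if not zk_hosts:
        raise ValueError("at least one ZooKeeper host is required")
    return (
        "jdbc:hive2://" + ",".join(zk_hosts) + "/" + database
        + ";serviceDiscoveryMode=zooKeeper"
        + ";zooKeeperNamespace=" + namespace
    )


if __name__ == "__main__":
    # Same hosts and namespace as the beeline example in this article.
    print(build_hs2_zk_url(["hdp11", "hdp12"], "hiveserver2_zk"))
    # Variant that lists the whole quorum with explicit client ports.
    print(build_hs2_zk_url(
        ["hdp11:2181", "hdp12:2181", "hdp13:2181"], "hiveserver2_zk"))
```

Listing every quorum member with its port is a design choice rather than a requirement of this setup: it keeps the URL usable when any single ZooKeeper node is down.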

  1. Verify that HiveServer2 is highly available

On hdp12, kill the process listening on port 10000, that is, the HiveServer2 process on hdp12:

[along@hdp12 logs]$ netstat -ntpl |grep 10000
(Not all processes could be identified, non-owned process info
 will not be shown, you would have to be root to see it all.)
tcp6    0   0 :::10000        :::*          LISTEN   87776/java

Then open beeline on hdp12 and connect:

beeline> !connect jdbc:hive2://hdp11,hdp12/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2_zk

The connection succeeds:

Connected to: Apache Hive (version 3.1.2)
Driver: Hive JDBC (version 3.1.2)
Transaction isolation: TRANSACTION_REPEATABLE_READ
0: jdbc:hive2://hdp11,hdp12/> show databases;
+----------------------+
|    database_name     |
+----------------------+
| default              |

Now check the namespace in zk again; a single HiveServer2 instance is registered:

[zk: localhost:2181(CONNECTED) 35] ls /hiveserver2_zk
[serverUri=hdp12:10000;version=3.1.2;sequence=0000000017]
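Each entry under the namespace describes one registered HiveServer2 instance. The following is a minimal sketch that turns the `ls` output above into host and port pairs, for example to script the check. The `key=value;...` layout is inferred only from the single sample line shown in this article, and the function names are hypothetical.

```python
def parse_ls_output(line):
    """Split a zkCli `ls` result such as "[a, b]" into a list of znode names."""
    inner = line.strip()
    if inner.startswith("[") and inner.endswith("]"):
        inner = inner[1:-1]
    return [item.strip() for item in inner.split(",") if item.strip()]


def parse_hs2_znode(name):
    """Parse "serverUri=host:port;version=...;sequence=..." into a dict."""
    fields = dict(part.split("=", 1) for part in name.split(";") if part)
    host, _, port = fields["serverUri"].rpartition(":")
    return {
        "host": host,
        "port": int(port),
        "version": fields.get("version"),
        "sequence": fields.get("sequence"),
    }


# The sample line is copied from the zkCli output in this article.
SAMPLE_LS = "[serverUri=hdp12:10000;version=3.1.2;sequence=0000000017]"

if __name__ == "__main__":
    for znode in parse_ls_output(SAMPLE_LS):
        instance = parse_hs2_znode(znode)
        print(instance["host"], instance["port"])
```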

Metastore High Availability

  1. On both nodes, edit the Hive configuration file hive-site.xml:
  <property>
    <name>hive.metastore.uris</name>
    <value>thrift://hdp11:9083,thrift://hdp12:9083</value>
  </property>
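The value of hive.metastore.uris is a comma-separated list of Thrift endpoints. The following is a minimal sketch that splits such a value into host and port pairs and probes each one with a plain TCP connection. This is only a port check under the assumption that an open port means a running Metastore; it does not speak the Thrift protocol, and both function names are hypothetical.

```python
import socket
from urllib.parse import urlparse


def parse_metastore_uris(value):
    """Turn "thrift://h1:9083,thrift://h2:9083" into [(host, port), ...]."""
    endpoints = []
    for uri in value.split(","):
        parsed = urlparse(uri.strip())
        if parsed.scheme != "thrift" or not parsed.hostname or parsed.port is None:
            raise ValueError("not a thrift://host:port URI: %r" % uri)
        endpoints.append((parsed.hostname, parsed.port))
    return endpoints


def is_listening(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Value copied from the hive-site.xml snippet in this article.
    uris = "thrift://hdp11:9083,thrift://hdp12:9083"
    for host, port in parse_metastore_uris(uris):
        state = "up" if is_listening(host, port) else "down"
        print(host, port, state)
```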
  1. Run hiveservices.sh to restart Hive's HiveServer2 and Metastore services. The status output below reports both services as running normally:
[along@hdp11 /]$ hiveservices.sh status
Metastore服务运行正常
HiveServer2服务运行正常
  1. On hdp11, connect with beeline through the zk namespace and run any query:
[along@hdp11 logs]$ beeline
Beeline version 3.1.2 by Apache Hive
beeline> !connect jdbc:hive2://hdp11,hdp12/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2_zk

Run a query:

0: jdbc:hive2://hdp11,hdp12/> show tables;
+-----------------------------+
|          tab_name           |
+-----------------------------+
| dwd_action_log              |
| dwd_dim_activity_info       |
| dwd_dim_base_province       |

  1. On hdp11, kill the Metastore service, then run the query again:
[along@hdp11 /]$ ps -ef | grep metastore
along   3751   1 1 17:21 pts/1  00:00:09 /opt/bigdata/jdk1.8.0_212/bin/java -Dproc_jar -Dproc_metastore
……
/opt/bigdata/hive/lib/hive-metastore-3.1.2.jar org.apache.hadoop.hive.metastore.HiveMetaStore
along   5031  1367 0 17:33 pts/0  00:00:00 grep --color=auto metastore
[along@hdp11 /]$ kill -9 3751
[along@hdp11 /]$ hiveservices.sh status
Metastore服务运行异常
HiveServer2服务运行正常

The status script now reports the Metastore on hdp11 as abnormal while HiveServer2 is still running normally. Running the query again still returns results, which completes the high-availability verification:

0: jdbc:hive2://hdp11,hdp12/> show tables;
+-----------------------------+
|          tab_name           |
+-----------------------------+
| dwd_action_log              |
| dwd_dim_activity_info       |
| dwd_dim_base_province       |
