Setting Up a Hive Cluster on a Hadoop Cluster
Introduction to Hive
Hive is a data-warehouse tool built on top of Hadoop for data extraction, transformation, and loading (ETL). It provides a mechanism to store, query, and analyze large-scale data kept in Hadoop: Hive maps structured data files onto database tables, offers SQL query support, and translates SQL statements into MapReduce jobs for execution. Its main advantage is a low learning curve — SQL-like statements yield quick MapReduce-based statistics without writing dedicated MapReduce programs. This makes Hive well suited to statistical analysis over a data warehouse.
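As a concrete illustration, a word count that would otherwise require a hand-written MapReduce job can be expressed in a few lines of HiveQL. This is a sketch only; the `docs` table and its single STRING column `line` are hypothetical:

```shell
# Hypothetical example: word count in HiveQL instead of a custom MapReduce job.
# Assumes a table `docs` with one STRING column named `line`.
hive -e "
  SELECT word, count(1) AS cnt
  FROM (SELECT explode(split(line, ' ')) AS word FROM docs) t
  GROUP BY word;
"
```

Hive compiles the query into one or more MapReduce stages automatically, which is exactly the convenience described above.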
Deployment Environment
Hive is deployed on an existing HA (high-availability) Hadoop cluster:
- OS: CentOS Linux release 7.5.1804 (Core)
- Hadoop: hadoop-2.7.3
- Zookeeper: zookeeper-3.4.10
- JDK: jdk1.8.0_171
- Hive: apache-hive-2.3.9
Software Preparation
wget https://mirrors.cnnic.cn/apache/hive/hive-2.3.9/apache-hive-2.3.9-bin.tar.gz
wget https://dev.mysql.com/get/Downloads/Connector-J/mysql-connector-java-5.1.44.tar.gz
Cluster Deployment Plan
Host | Address | System user | Software |
---|---|---|---|
hadoop-3 | 192.168.10.53 | hadoop | hive |
hadoop-4 | 192.168.10.54 | hadoop | hive |
hadoop-5 | 192.168.10.55 | hadoop | hive |
Hive Cluster Setup
- Extract the Hive package (downloaded above) to the target directory
[hadoop@hadoop-3 software]$ cd /data/software/
[hadoop@hadoop-3 software]$ tar -zxf apache-hive-2.3.9-bin.tar.gz -C /data/
[hadoop@hadoop-3 software]$ cd /data/
[hadoop@hadoop-3 data]$ mv apache-hive-2.3.9-bin hive
- Configure environment variables
[hadoop@hadoop-3 software]$ cat /etc/profile.d/hadoop.sh
export JAVA_HOME=/data/java/jdk1.8.0_171
export JRE_HOME=/data/java/jdk1.8.0_171/jre
export CLASSPATH=./:/data/java/jdk1.8.0_171/lib:/data/java/jdk1.8.0_171/jre/lib
export HADOOP_HOME=/data/hadoop
export ZOOKEEPER_HOME=/data/zookeeper
export HBASE_HOME=/data/hbase
export HIVE_HOME=/data/hive
export PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/root/bin:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$ZOOKEEPER_HOME/bin:$HBASE_HOME/bin:$HIVE_HOME/bin
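After editing the profile script, the new variables can be loaded into the current shell and sanity-checked. A sketch, assuming the paths laid out above:

```shell
# Reload the profile in the current shell and confirm the tools resolve
source /etc/profile.d/hadoop.sh
echo "JAVA_HOME=$JAVA_HOME"
which hive       # should resolve to /data/hive/bin/hive
hive --version   # should report Hive 2.3.9
```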
- Edit the hive-env.sh file
[hadoop@hadoop-3 software]$ cd /data/hive/conf/
[hadoop@hadoop-3 conf]$ cp -a hive-env.sh.template hive-env.sh
[hadoop@hadoop-3 conf]$ grep -n 'HADOOP_HOME' hive-env.sh
47:# Set HADOOP_HOME to point to a specific hadoop install directory
49:HADOOP_HOME=/data/hadoop
- Edit the hive-site.xml file (create it from the template, then adjust the following properties)
[hadoop@hadoop-3 conf]$ cp -a hive-default.xml.template hive-site.xml
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://mysql-1:3306/hive?useSSL=false</value>
<description>
JDBC connect string for a JDBC metastore.
To use SSL to encrypt/authenticate the connection, provide database-specific SSL flag in the connection URL.
For example, jdbc:postgresql://myhost/db?ssl=true for postgres database.
</description>
</property>
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
<description>Driver class name for a JDBC metastore</description>
</property>
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>hive</value>
<description>Username to use against metastore database</description>
</property>
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>hive</value>
<description>password to use against metastore database</description>
</property>
<!-- Scratch/temporary directory settings -->
<property>
<name>hive.exec.local.scratchdir</name>
<value>/data/hive/iotmp</value>
<description>Local scratch space for Hive jobs</description>
</property>
<property>
<name>hive.downloaded.resources.dir</name>
<value>/data/hive/iotmp</value>
<description>Temporary local directory for added resources in the remote file system.</description>
</property>
<!-- Authorization settings -->
<property>
<name>hive.security.authorization.enabled</name>
<value>true</value>
<description>Enable authorization</description>
</property>
<property>
<name>hive.server2.enable.doAs</name>
<value>false</value>
<description>Impersonation (run queries as the connecting user); defaults to true, disabled here for SQL-standard authorization</description>
</property>
<property>
<name>hive.users.in.admin.role</name>
<value>admin</value>
<description>Users granted the admin role; multiple users may be listed</description>
</property>
<property>
<name>hive.security.authorization.manager</name>
<value>org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdHiveAuthorizerFactory</value>
<description>Authorization manager class</description>
</property>
<property>
<name>hive.security.authenticator.manager</name>
<!-- <value>org.apache.hadoop.hive.ql.security.HadoopDefaultAuthenticator</value>-->
<value>org.apache.hadoop.hive.ql.security.SessionStateUserAuthenticator</value>
<description>Authenticator manager class</description>
</property>
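One caveat when hive-site.xml is generated from hive-default.xml.template: the template contains `${system:java.io.tmpdir}` and `${system:user.name}` placeholders that Hive 2.x cannot expand at startup. A common workaround, sketched here under the assumption that the file still contains those placeholders, is to point them at the scratch directory and system user configured above:

```shell
# Replace the unexpandable template placeholders in hive-site.xml
# with the local scratch directory and the hadoop system user.
HIVE_SITE=/data/hive/conf/hive-site.xml
sed -i -e 's|\${system:java.io.tmpdir}|/data/hive/iotmp|g' \
       -e 's|\${system:user.name}|hadoop|g' "$HIVE_SITE"
```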
- Create the scratch directory
[hadoop@hadoop-3 ~]$ mkdir /data/hive/iotmp
- Edit the hive-config.sh file
[hadoop@hadoop-3 ~]$ grep -n 'export' /data/hive/bin/hive-config.sh
20:export JAVA_HOME=/data/java/jdk1.8.0_171
21:export HIVE_HOME=/data/hive
22:export HADOOP_HOME=/data/hadoop
- Change the log file location
[hadoop@hadoop-3 software]$ cd /data/hive/conf/
[hadoop@hadoop-3 conf]$ cp -a hive-log4j2.properties.template hive-log4j2.properties
[hadoop@hadoop-3 conf]$ grep -n 'hive.log.dir' hive-log4j2.properties
25:property.hive.log.dir = /data/hive/logs
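The property above only names the directory; it may be safest to create it up front as well (and again on the other nodes after the copy step below), a minimal sketch:

```shell
# Ensure the configured log directory exists before first start
mkdir -p /data/hive/logs
```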
- Place the mysql-connector-java-5.1.44-bin.jar file in the /data/hive/lib directory
[hadoop@hadoop-3 software]$ cd /data/software/
[hadoop@hadoop-3 software]$ tar -zxf mysql-connector-java-5.1.44.tar.gz
[hadoop@hadoop-3 software]$ mv mysql-connector-java-5.1.44/mysql-connector-java-5.1.44-bin.jar /data/hive/lib/
- Copy the configured Hive directory to the other nodes
[hadoop@hadoop-3 conf]$ scp -r /data/hive/ hadoop@hadoop-4:/data/
[hadoop@hadoop-3 conf]$ scp -r /data/hive/ hadoop@hadoop-5:/data/
- On the slave nodes, edit hive-site.xml to point at the remote metastore
<property>
<name>hive.metastore.uris</name>
<value>thrift://hadoop-3:9083</value>
<description>Thrift URI for the remote metastore. Used by metastore client to connect to remote metastore.</description>
</property>
- Create the database in MySQL and grant privileges to the hive user
[root@mysql-1 ~]# mysql -uroot -p
Enter password:
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 3
Server version: 5.7.30-log MySQL Community Server (GPL)
Copyright (c) 2000, 2020, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql> create database hive;
Query OK, 1 row affected (0.01 sec)
mysql> grant all privileges on hive.* to hive@'%' identified by 'hive';
Query OK, 0 rows affected, 1 warning (0.00 sec)
mysql> flush privileges;
Query OK, 0 rows affected (0.00 sec)
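Before initializing the schema, it is worth confirming that the hive user can actually reach the database from a Hive node. A sketch; hostname and credentials follow the hive-site.xml settings above:

```shell
# Quick remote-login check from a Hive node; the newly created
# `hive` database should appear in the output if the grant worked
mysql -uhive -phive -h mysql-1 -e 'show databases;'
```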
- Initialize the Hive metastore schema
[hadoop@hadoop-3 conf]$ schematool -initSchema -dbType mysql -verbose
- Start the metastore service
[hadoop@hadoop-3 ~]$ cd /data/hive/bin
[hadoop@hadoop-3 bin]$ nohup hive --service metastore &
[1] 3483
[hadoop@hadoop-3 bin]$ nohup: ignoring input and appending output to 'nohup.out'
[hadoop@hadoop-3 bin]$ netstat -lntp | grep 9083
(Not all processes could be identified, non-owned process info
will not be shown, you would have to be root to see it all.)
tcp 0 0 0.0.0.0:9083 0.0.0.0:* LISTEN 3483/java
- Start the hiveserver2 service
[hadoop@hadoop-4 ~]$ nohup hive --service hiveserver2 &
[1] 2974
[hadoop@hadoop-4 ~]$ nohup: ignoring input and appending output to 'nohup.out'
[hadoop@hadoop-4 ~]$ netstat -lntp | grep 1000
(Not all processes could be identified, non-owned process info
will not be shown, you would have to be root to see it all.)
tcp 0 0 0.0.0.0:10000 0.0.0.0:* LISTEN 2974/java
tcp 0 0 0.0.0.0:10002 0.0.0.0:* LISTEN 2974/java
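Of the two listening ports, 10000 serves Thrift/JDBC clients and 10002 is the HiveServer2 web UI, which gives a quick liveness check before trying beeline. A sketch:

```shell
# An HTTP 200 from the web UI suggests HiveServer2 is up
curl -s -o /dev/null -w '%{http_code}\n' http://hadoop-4:10002/
```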
- Connection test
[hadoop@hadoop-3 bin]$ beeline
beeline> !connect jdbc:hive2://192.168.10.54:10000
Connecting to jdbc:hive2://192.168.10.54:10000
Enter username for jdbc:hive2://192.168.10.54:10000: root
Enter password for jdbc:hive2://192.168.10.54:10000:
Connected to: Apache Hive (version 2.3.9)
Driver: Hive JDBC (version 2.3.9)
Transaction isolation: TRANSACTION_REPEATABLE_READ
0: jdbc:hive2://192.168.10.54:10000> show databases;
+----------------+
| database_name |
+----------------+
| default |
| xiaoming |
+----------------+
2 rows selected (1.863 seconds)
0: jdbc:hive2://192.168.10.54:10000> set role admin;
No rows affected (0.094 seconds)
0: jdbc:hive2://192.168.10.54:10000> show roles;
+-----------+
| role |
+-----------+
| admin |
| public |
| xiaoming |
+-----------+
3 rows selected (0.053 seconds)
0: jdbc:hive2://192.168.10.54:10000>
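With SQL-standard authorization enabled as configured above, the admin session can also create roles and grant them to users. A hedged sketch: the `analyst` role name is illustrative, and `admin` is the user listed in hive.users.in.admin.role:

```shell
# Illustrative only: create a role and grant it to a user via beeline
beeline -u jdbc:hive2://192.168.10.54:10000 -n admin -e "
set role admin;
create role analyst;
grant role analyst to user xiaoming;
show role grant user xiaoming;
"
```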