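Environment needed before installing Kylin:
hadoop
hive
zookeeper (HBase depends on ZooKeeper; I run a standalone ZooKeeper rather than the one bundled with HBase)
hbase
jdk
spark (optional)
Download the Kylin package from the official site, unpack it, and set the environment variables.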
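
A minimal sketch of a pre-flight check for the environment variables, written in POSIX sh. The variable names passed in the last lines (JAVA_HOME, HADOOP_HOME, HIVE_HOME, HBASE_HOME, KYLIN_HOME) are assumptions about how the components were installed; adjust them to match your own setup.

```shell
# Print the names of the given environment variables that are unset or empty,
# separated by spaces. Prints an empty line when nothing is missing.
missing_vars() {
  missing=""
  for name in "$@"; do
    # Look up the value of the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      missing="${missing:+$missing }$name"
    fi
  done
  printf '%s\n' "$missing"
}

# Assumed variable names; not taken from the Kylin documentation.
not_set=$(missing_vars JAVA_HOME HADOOP_HOME HIVE_HOME HBASE_HOME KYLIN_HOME)
if [ -n "$not_set" ]; then
  echo "Not set: $not_set"
else
  echo "All variables are set"
fi
```

Running this before ./kylin.sh start gives a quick hint about which component still needs its environment variable exported.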

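Note: to run the official sample, the physical machine needs at least 16 GB of RAM and the VM hosting the master node needs at least 4 GB. Otherwise HBase keeps crashing, and other clusters are affected as well (for example, I hit a Hadoop DataNode startup failure, among many other problems). This applies whether Kylin runs as a cluster or as a single node. Also keep an eye on log file size: during testing the logs grew very quickly, to an estimated 20 GB within about half an hour, so it is best to test on a real server.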
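
Since the logs can grow this fast, a small size check may help during testing. The following is a sketch only: the fallback path /opt/kylin and the 5120 MiB limit are hypothetical values chosen for illustration.

```shell
# Print the size of a directory in MiB (rounded down), using POSIX "du -sk".
dir_size_mb() {
  kb=$(du -sk "$1" | awk '{print $1}')
  echo $((kb / 1024))
}

# Succeed (exit status 0) when directory $1 holds at least $2 MiB.
logs_over_limit() {
  [ "$(dir_size_mb "$1")" -ge "$2" ]
}

# Hypothetical defaults: adjust the path and the limit to your installation.
LOG_DIR="${KYLIN_HOME:-/opt/kylin}/logs"
LIMIT_MB=5120
if [ -d "$LOG_DIR" ] && logs_over_limit "$LOG_DIR" "$LIMIT_MB"; then
  echo "WARNING: $LOG_DIR exceeds ${LIMIT_MB} MiB"
fi
```

Run from cron or a watch loop, this gives an early warning before the disk fills up.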

Start:

./kylin.sh start

Web UI
hostname:7070/kylin, default username ADMIN, password KYLIN

For the detailed setup steps, see the official site: http://kylin.apache.org/cn/

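The most important file is kylin.properties. For the kylin.server.mode=xx setting, the Kylin master node uses mode all and the other nodes use mode query; this is the only difference between them.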
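
To confirm which mode a node is actually configured with, the value can be read straight from the properties file. This is a sketch: the helper name get_prop and the fallback path /opt/kylin are hypothetical, and it assumes the file lives at conf/kylin.properties under the Kylin home directory.

```shell
# Print the value of key $2 from properties file $1.
# Commented-out lines are ignored; if the key appears more than once,
# the last occurrence is printed.
get_prop() {
  # Escape dots so that "kylin.server.mode" is matched literally.
  key=$(printf '%s' "$2" | sed 's/\./\\./g')
  sed -n "s/^[[:space:]]*${key}[[:space:]]*=[[:space:]]*//p" "$1" | tail -n 1
}

# Hypothetical default location; adjust to your installation.
CONF="${KYLIN_HOME:-/opt/kylin}/conf/kylin.properties"
if [ -f "$CONF" ]; then
  echo "kylin.server.mode = $(get_prop "$CONF" kylin.server.mode)"
fi
```

On the master node this should report all, and on the remaining nodes query.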

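While integrating hbase (1.3.1) with hive (1.2.1) I ran into a version mismatch. Switching HBase to 1.2.1 did not fix it, so I reinstalled 1.3.1, after which Phoenix could no longer connect to HBase. Next, Kylin started but its web UI was unreachable. The log files under logs showed: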

Caused by: org.apache.hadoop.hbase.TableExistsException: kylin_metadata_acl
	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
	at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
	at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
	at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:95)
	at org.apache.hadoop.hbase.util.ForeignExceptionUtil.toIOException(ForeignExceptionUtil.java:45)
	at org.apache.hadoop.hbase.client.HBaseAdmin$ProcedureFuture.convertResult(HBaseAdmin.java:4713)
	at org.apache.hadoop.hbase.client.HBaseAdmin$ProcedureFuture.waitProcedureResult(HBaseAdmin.java:4671)
	at org.apache.hadoop.hbase.client.HBaseAdmin$ProcedureFuture.get(HBaseAdmin.java:4604)
	at org.apache.hadoop.hbase.client.HBaseAdmin.createTable(HBaseAdmin.java:679)
	at org.apache.hadoop.hbase.client.HBaseAdmin.createTable(HBaseAdmin.java:609)
	at org.apache.kylin.storage.hbase.HBaseConnection.createHTableIfNeeded(HBaseConnection.java:294)
	at org.apache.kylin.storage.hbase.HBaseConnection.createHTableIfNeeded(HBaseConnection.java:265)
	at org.apache.kylin.rest.security.RealAclHBaseStorage.prepareHBaseTable(RealAclHBaseStorage.java:49)
	at org.apache.kylin.rest.security.MockAclHBaseStorage.prepareHBaseTable(MockAclHBaseStorage.java:53)
	at org.apache.kylin.rest.service.AclService.init(AclService.java:121)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:497)
	at org.springframework.beans.factory.annotation.InitDestroyAnnotationBeanPostProcessor$LifecycleElement.invoke(InitDestroyAnnotationBeanPostProcessor.java:344)
	at org.springframework.beans.factory.annotation.InitDestroyAnnotationBeanPostProcessor$LifecycleMetadata.invokeInitMethods(InitDestroyAnnotationBeanPostProcessor.java:295)
	at org.springframework.beans.factory.annotation.InitDestroyAnnotationBeanPostProcessor.postProcessBeforeInitialization(InitDestroyAnnotationBeanPostProcessor.java:130)
	... 125 more

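The error above says the table already exists. Kylin creates its metadata tables automatically on first start. Because we run a standalone ZooKeeper, go to the ZooKeeper install directory, run ./zkCli.sh under bin, and list the HBase tables with ls /hbase/table. There were three tables whose names start with kylin: kylin_metadata_acl, kylin_metadata, and kylin_metadata_user. Deleting only one of them had no effect. After deleting all three and restarting HBase and Kylin, the web UI was reachable again. The Phoenix problem was solved in a similar way (by deleting its system tables).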
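
The post does not say which zkCli command was used for the deletion. As a sketch under that caveat, the helper below only prints the commands for review so they can be pasted into the zkCli.sh prompt. It assumes rmr, the recursive delete in ZooKeeper 3.4.x; newer releases (3.5 and later) provide deleteall instead. The function name zk_delete_cmds is hypothetical.

```shell
# Print one zkCli delete command per table znode under /hbase/table.
# $1 is the zkCli verb to use (for example rmr or deleteall),
# the remaining arguments are table names.
zk_delete_cmds() {
  verb=$1
  shift
  for table in "$@"; do
    printf '%s /hbase/table/%s\n' "$verb" "$table"
  done
}

# The three tables named in the text above.
zk_delete_cmds rmr kylin_metadata kylin_metadata_acl kylin_metadata_user
```

Printing instead of executing keeps the destructive step manual, which seems prudent when removing znodes that HBase relies on.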

Where resources allow, give the jobs (for example MR) as much memory as possible, for example:

<!-- reduce -->
<property>
<name>mapreduce.reduce.memory.mb</name>
<value>3096</value>
<description>Memory required by each Reduce task</description>
</property>
<property>
<name>mapreduce.reduce.java.opts</name>
<value>-Xmx3096m</value>
<description>JVM heap size for reduce tasks</description>
</property>

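Otherwise the cube build keeps failing and the job has to be resumed over and over (Resume). When running the sample that ships with Kylin (sample.sh), a failure to connect to Hadoop also shows up (拒绝连接 in the log means "Connection refused"):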

Exception: java.net.ConnectException: Call From node-02/192.168.8.129 to 0.0.0.0:10020 failed on connection exception: java.net.ConnectException: 拒绝连接; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
java.net.ConnectException: Call From node-02/192.168.8.129 to 0.0.0.0:10020 failed on connection exception: java.net.ConnectException: 拒绝连接; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused

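Looking up the MR job ID in the 8088 UI shows the job state as success, yet Kylin reports it as failed. For now the only option is to run it again (a very painful process).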
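
One possible explanation, which the post does not confirm: 10020 is the default IPC port of the MapReduce JobHistory Server (mapreduce.jobhistory.address defaults to 0.0.0.0:10020), so a job can succeed in YARN while a client that then queries the history server gets "Connection refused" if that server is not running. The sketch below only extracts the target address from such a log line and flags the port; the helper name call_target is hypothetical.

```shell
# Read log lines on stdin and print the "host:port" target of each
# Hadoop "Call From ... to ... failed" message.
call_target() {
  sed -n 's/.*Call From [^ ]* to \([^ ]*:[0-9][0-9]*\) failed.*/\1/p'
}

line='java.net.ConnectException: Call From node-02/192.168.8.129 to 0.0.0.0:10020 failed on connection exception'
target=$(printf '%s\n' "$line" | call_target)

case "$target" in
  *:10020) echo "$target is the default JobHistory Server port; check that the history server is running" ;;
  *)       echo "connection target: $target" ;;
esac
```

If this turns out to be the cause, on Hadoop 2.x the history server is normally started with mr-jobhistory-daemon.sh start historyserver, and mapreduce.jobhistory.address in mapred-site.xml can point clients at the host where it runs. Treat this as something to verify on your cluster rather than a confirmed fix.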

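While jobs were running, one node of the HBase cluster went down, and later the whole cluster went down. Kylin itself was still reachable, but the cube build jobs could no longer be queried. I have not solved this yet. The error output was: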

2019-03-21 13:45:44,739 ERROR [pool-7-thread-1] manager.ExecutableManager:209 : error get All Job Ids
org.apache.kylin.job.exception.PersistentException: org.apache.hadoop.hbase.DoNotRetryIOException: hconnection-0x10fe55ee closed
	at org.apache.kylin.job.dao.ExecutableDao.getJobIds(ExecutableDao.java:149)
	at org.apache.kylin.job.manager.ExecutableManager.getAllJobIds(ExecutableManager.java:207)
	at org.apache.kylin.job.impl.threadpool.DefaultScheduler$FetcherRunner.run(DefaultScheduler.java:85)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.hadoop.hbase.DoNotRetryIOException: hconnection-0x10fe55ee closed
	at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion(ConnectionManager.java:1174)
	at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.relocateRegion(ConnectionManager.java:1154)
	at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegionInMeta(ConnectionManager.java:1359)
	at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion(ConnectionManager.java:1183)
	at org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.getRegionLocations(RpcRetryingCallerWithReadReplicas.java:305)
	at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:156)
	at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:60)
	at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:212)
	at org.apache.hadoop.hbase.client.ClientScanner.call(ClientScanner.java:314)
	at org.apache.hadoop.hbase.client.ClientScanner.nextScanner(ClientScanner.java:289)
	at org.apache.hadoop.hbase.client.ClientScanner.initializeScannerInConstruction(ClientScanner.java:164)
	at org.apache.hadoop.hbase.client.ClientScanner.<init>(ClientScanner.java:159)
	at org.apache.hadoop.hbase.client.HTable.getScanner(HTable.java:796)
	at org.apache.kylin.storage.hbase.HBaseResourceStore.visitFolder(HBaseResourceStore.java:137)
	at org.apache.kylin.storage.hbase.HBaseResourceStore.listResourcesImpl(HBaseResourceStore.java:107)
	at org.apache.kylin.common.persistence.ResourceStore.listResources(ResourceStore.java:121)
	at org.apache.kylin.job.dao.ExecutableDao.getJobIds(ExecutableDao.java:138)
	... 9 more

My guess is that this is caused by insufficient memory.
