视图:
1、视图是一个逻辑表,本身不存储数据。数据来源于它依赖的物理表
2、视图只是简化复杂的查询,相当于一个sql数据集,对于一个大表来说,简化查询sql,并不能提高查询效率。反而增加资源开销,建议频繁查询的数据落地

1.视图如何创建
hive> create view if not exists test.test_view as 
    > select 
    >   apptypeid,
    >   accid,
    >   dt
    > from test.test1
    > 
    > union all
    > select 
    >   apptypeid,
    >   accid,
    >   dt
    > from test.test2;
OK
Time taken: 0.968 seconds
查看视图表:
hive> show create table test.test_view;
OK
CREATE VIEW `test.test_view` AS select 
  `test1`.`apptypeid`,
  `test1`.`accid`,
  `test1`.`dt`
from `test`.`test1` 

union all
select 
  `test2`.`apptypeid`,
  `test2`.`accid`,
  `test2`.`dt`
from `test`.`test2`
Time taken: 0.028 seconds, Fetched: 12 row(s)

注:1.视图依赖物理表test1、test2,每次查询视图表,都会从test1/test2中取数据,如果是频繁查询或者使用的地方比较多,建议数据直接落地会比较好
2.视图表里面可以编写复杂的统计逻辑,这里只做了简单的union all

下面我们来看看执行计划,验证查询视图表是否是查询物理表的数据:
hive> explain select count(1) from test.test_view where dt='20210610';
OK
STAGE DEPENDENCIES:
  Stage-1 is a root stage
  Stage-0 depends on stages: Stage-1

STAGE PLANS:
  Stage: Stage-1
    Map Reduce
      Map Operator Tree:
          TableScan
            alias: test1
            properties:
              insideView TRUE
            Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
            Select Operator
              Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
              Union
                Statistics: Num rows: 2 Data size: 3815370 Basic stats: COMPLETE Column stats: NONE
                Select Operator
                  Statistics: Num rows: 2 Data size: 3815370 Basic stats: COMPLETE Column stats: NONE
                  Group By Operator
                    aggregations: count(1)
                    mode: hash
                    outputColumnNames: _col0
                    Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: NONE
                    Reduce Output Operator
                      sort order: 
                      Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: NONE
                      value expressions: _col0 (type: bigint)
          TableScan
            alias: test2
            properties:
              insideView TRUE
            Statistics: Num rows: 1 Data size: 3815370 Basic stats: PARTIAL Column stats: NONE
            Select Operator
              Statistics: Num rows: 1 Data size: 3815370 Basic stats: PARTIAL Column stats: NONE
              Union
                Statistics: Num rows: 2 Data size: 3815370 Basic stats: COMPLETE Column stats: NONE
                Select Operator
                  Statistics: Num rows: 2 Data size: 3815370 Basic stats: COMPLETE Column stats: NONE
                  Group By Operator
                    aggregations: count(1)
                    mode: hash
                    outputColumnNames: _col0
                    Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: NONE
                    Reduce Output Operator
                      sort order: 
                      Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: NONE
                      value expressions: _col0 (type: bigint)
      Reduce Operator Tree:
        Group By Operator
          aggregations: count(VALUE._col0)
          mode: mergepartial
          outputColumnNames: _col0
          Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: NONE
          File Output Operator
            compressed: false
            Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: NONE
            table:
                input format: org.apache.hadoop.mapred.SequenceFileInputFormat
                output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
                serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe

  Stage: Stage-0
    Fetch Operator
      limit: -1
      Processor Tree:
        ListSink

Time taken: 0.885 seconds, Fetched: 68 row(s)

从执行计划可以验证我们的判断,统计test.test_view表的条数时,是分别从test1和test2表中做count操作的

扩展:

2.删除视图:
drop view test.test_view; 
3.更新视图:
create or replace view test.test_view2 as
select 
  apptypeid,
  accid,
  dt
from test.test3

union all
select 
  apptypeid,
  accid,
  dt
from test.test4;
Logo

为开发者提供学习成长、分享交流、生态实践、资源工具等服务,帮助开发者快速成长。

更多推荐