Computing these figures in SQL saves a trip to Excel, where you would need a pivot table or formulas, which is more cumbersome.

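Within-group averages and sums can be written directly in SQL with window functions: partition by inside over() defines the groups, and the aggregate function outside it produces the value. Note: order by cnt rows between unbounded preceding and unbounded following means that, within each group (grouped by province), the rows are sorted by cnt in ascending order and every row of the group is included in the window frame. Because the frame covers the whole group, the ordering does not change the result of avg or sum, so this form returns the same values as using partition by alone.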
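The effect of the frame clause is easiest to see side by side. The sketch below is a minimal illustration, not the original setup: it uses Python's built-in sqlite3 module as a stand-in engine (window functions need SQLite 3.25 or later), and the table name daily and its three rows are made up for the demo. It compares a sum with no order by, a sum with order by and the default frame, and a sum with order by plus the explicit full frame.

```python
import sqlite3

# Hypothetical toy data: one group "A" with three daily counts (illustrative values).
rows = [
    ("2021-04-01", "A", 10),
    ("2021-04-02", "A", 30),
    ("2021-04-03", "A", 20),
]

conn = sqlite3.connect(":memory:")
conn.execute("create table daily (dt text, pro_name text, cnt integer)")
conn.executemany("insert into daily values (?, ?, ?)", rows)

query = """
select
    dt,
    cnt,
    -- no order by: the frame is the whole partition
    sum(cnt) over (partition by pro_name) as sum_all,
    -- order by with the default frame: a running total in cnt order
    sum(cnt) over (partition by pro_name order by cnt) as sum_running,
    -- order by with an explicit full frame: the whole partition again
    sum(cnt) over (
        partition by pro_name
        order by cnt
        rows between unbounded preceding and unbounded following
    ) as sum_full_frame
from daily
order by dt
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

With order by and no frame clause, the default frame ends at the current row, so sum_running is a running total rather than the group total. That is why the explicit unbounded following is needed whenever order by appears in a window that should cover the whole group.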

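For example, suppose we want the three-day average, the three-day sum, and each day's share of the total for the three provinces Hubei, Jiangxi, and Hunan over 04-01 through 04-03: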

select
    dt,
    pro_name,
    cnt,
    -- average within each group
    avg(cnt) over(partition by pro_name) as avg_cnt,
    -- sum within each group
    sum(cnt) over(partition by pro_name) as sum_cnt,
    -- share of the group total
    round(cnt / sum(cnt) over(partition by pro_name), 4) as rate
from
    (
        select
            dt,
            pro_name,
            count(distinct uid) as cnt
        from
            table_1
        where
            dt >= '2021-04-01'
            and dt <= '2021-04-03'
            and pro_name in ('湖北省', '江西省', '湖南省')
        group by
            dt,
            pro_name
    ) t1
order by
    pro_name,
    dt;

The same query, written with the explicit order by and frame clause described above:

select
    dt,
    pro_name,
    cnt,
    -- average within each group
    avg(cnt) over(
        partition by pro_name
        order by cnt
        rows between unbounded preceding and unbounded following
    ) as avg_cnt,
    -- sum within each group
    sum(cnt) over(
        partition by pro_name
        order by cnt
        rows between unbounded preceding and unbounded following
    ) as sum_cnt,
    -- share of the group total
    round(
        cnt / sum(cnt) over(
            partition by pro_name
            order by cnt
            rows between unbounded preceding and unbounded following
        ),
        4
    ) as rate
from
    (
        select
            dt,
            pro_name,
            count(distinct uid) as cnt
        from
            table_1
        where
            dt >= '2021-04-01'
            and dt <= '2021-04-03'
            and pro_name in ('湖北省', '江西省', '湖南省')
        group by
            dt,
            pro_name
    ) t1
order by
    pro_name,
    dt;

Result: for a province whose three daily counts are 396, 427, and 295, sum_cnt = 396+427+295 = 1118 and avg_cnt = (396+427+295)/3 = 372.67.
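The arithmetic above can be checked by running the window expressions on just those three counts. This is a rough sketch under stated assumptions: it again uses Python's sqlite3 as a stand-in engine, the table t1 here is a hand-built copy of the subquery's output, and the province label and the date assigned to each count are placeholders, since the text does not say which province or day each number belongs to.

```python
import sqlite3

# Assumption: the three counts are one province's rows. The province label and
# the date attached to each count are placeholders, not taken from the source.
t1_rows = [
    ("2021-04-01", "province_x", 396),
    ("2021-04-02", "province_x", 427),
    ("2021-04-03", "province_x", 295),
]

conn = sqlite3.connect(":memory:")
conn.execute("create table t1 (dt text, pro_name text, cnt integer)")
conn.executemany("insert into t1 values (?, ?, ?)", t1_rows)

check = conn.execute("""
select
    dt,
    cnt,
    round(avg(cnt) over (partition by pro_name), 2) as avg_cnt,
    sum(cnt) over (partition by pro_name) as sum_cnt,
    -- SQLite truncates integer / integer, so multiply by 1.0 first
    round(cnt * 1.0 / sum(cnt) over (partition by pro_name), 4) as rate
from t1
order by dt
""").fetchall()

for row in check:
    print(row)
```

The multiplication by 1.0 is specific to engines that truncate integer division, such as SQLite. Whether the original query needs it depends on the engine it runs on, which the text does not state.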
