准备的数据,现在要求将b, c两列中空白的值使用前面的值补充。
补充完的结果需要如下:
第一步:
确定数据是否有行号,如果没有,则自己增加一列行号,如准备数据的a列。如果已经有,此步可忽略。
第二步:
增加两列b1,c1, 将空值设置为0,非空值设置为1。
select a,b,c,
case when b = '' or b is null then 0 else 1 end b1,
case when c = '' or c is null then 0 else 1 end c1
from xy_ods.ods_test_fill
执行结果如下:
第三步:
使用开窗函数sum将b1,c1列按照从上到下顺序累加。
with tmp_a as (
select a,b,c,
case when b = '' or b is null then 0 else 1 end b1,
case when c = '' or c is null then 0 else 1 end c1
from xy_ods.ods_test_fill
)
select a,b,c,
sum(b1) over(partition by '1' order by a) as cnt_b,
sum(c1) over(partition by '1' order by a) as cnt_c
from tmp_a
执行结果如下:
这一步骤的目的是将空值与上一个非空值放到一个组里面。(非常重要)
第四步:
使用开窗函数按照分组求最大值
with tmp_a as (
select a,b,c,
case when b = '' or b is null then 0 else 1 end b1,
case when c = '' or c is null then 0 else 1 end c1
from xy_ods.ods_test_fill
),
tmp_b as (
select a,b,c,
sum(b1) over(partition by '1' order by a) as cnt_b,
sum(c1) over(partition by '1' order by a) as cnt_c
from tmp_a
)
select a,b,c,
max(b) over(partition by cnt_b order by a) as b2,
max(c) over(partition by cnt_c order by a) as c2
from tmp_b;
显示结果如下:
b2,c2就是需要的结果。