输入表:描述了每个id存在的时段
idstartdateenddate
41727527252017-07-312017-08-02
41727789592017-08-012017-08-02
41727799682017-07-312017-08-01
41727819862017-08-092017-08-10
41729454442017-08-032017-08-04
41730140562017-08-042017-08-06
需求描述:要id和date一一对应的表,这样可以方便的求出每个日期上有哪些id存在。注意,对于每个id,enddate上不算这个id存在。
with testdata as ( select 4172752725 as id, '2017-07-31' as startdate,
'2017-08-02' as enddate union all select 4172778959 as id, '2017-08-01' as
startdate, '2017-08-02' as enddate union all select 4172779968 as id,
'2017-07-31' as startdate, '2017-08-01' as enddate union all select 4172781986
as id, '2017-08-09' as startdate, '2017-08-10' as enddate union all select
4172945444 as id, '2017-08-03' as startdate, '2017-08-04' as enddate union all
select 4173014056 as id, '2017-08-04' as startdate, '2017-08-06' as enddate )
select z.id ,date_add(z.startdate,z.pos) as checkindate from ( select id
,startdate ,posexplode(split(repeat('d',datediff(enddate,startdate) - 1),'d'))
from testdata ) z order by z.id asc ;
输出表:
idcheckindate
41727527252017-07-31
41727527252017-08-01
41727789592017-08-01
41727799682017-07-31
41727819862017-08-09
41729454442017-08-03
41730140562017-08-04
41730140562017-08-05
输出一些中间结果以帮助理解:
select id ,startdate ,enddate ,datediff(enddate,startdate)
,datediff(enddate,startdate)-1 ,repeat('d',datediff(enddate,startdate) - 1)
,split(repeat('d',datediff(enddate,startdate) - 1),'d')
,posexplode(split(repeat('d',datediff(enddate,startdate) - 1),'d')) from
testdata order by id asc
idstartdateenddatedatediffdatediff-1repeatsplitposcol
41727527252017-07-312017-08-0221d["",""]0
41727527252017-07-312017-08-0221d["",""]1
41727789592017-08-012017-08-0210 [""]0
41727799682017-07-312017-08-0110 [""]0
41727819862017-08-092017-08-1010 [""]0
41729454442017-08-032017-08-0410 [""]0
41730140562017-08-042017-08-0621d["",""]0
41730140562017-08-042017-08-0621d["",""]1