How can I write iterative query on Oracle SQL? - Stack Overflow


I have the columns party_id and data_date. I want to sort the data_dates in ascending order for each party_id. I will always take the first data_date. After that, the data_date I select must be at least 30 days later than the previous one I selected.

For example, for party_id 12345, I have the following data_dates:

party_id data_date
12345 01.01.2023
12345 04.02.2023
12345 05.02.2023
12345 30.03.2023
12345 31.03.2023
12345 04.04.2023

For this party_id, the selected dates should be 01.01.2023, 04.02.2023, and 30.03.2023.
This is because:

  • I first selected 01.01.2023.
  • 04.02.2023 is 34 days after 01.01.2023, so I select it.
  • 05.02.2023 is only 1 day after 04.02.2023, so I do not select it.
  • 30.03.2023 is 54 days after the last selected date, 04.02.2023, so I select it as well.
  • I do not select 31.03.2023 or 04.04.2023, because they are only 1 and 5 days after the most recent selected date, 30.03.2023.
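
The selection rule above can be written down as a small reference sketch outside the database, which is handy for checking the output of any SQL solution. It is in Python rather than Oracle SQL, the function name pick_dates and the min_gap_days parameter are made up for this illustration, and "at least 30 days" is assumed to mean a gap of 30 days or more.

```python
from datetime import date, timedelta


def pick_dates(dates, min_gap_days=30):
    """Greedy selection for one party_id (hypothetical helper).

    Keeps the earliest date, then every date that is at least
    min_gap_days after the most recently kept date.
    """
    selected = []
    for d in sorted(dates):
        # Always keep the first date; afterwards compare with the last kept one.
        if not selected or d - selected[-1] >= timedelta(days=min_gap_days):
            selected.append(d)
    return selected


# Sample data for party_id 12345 from the question.
sample = [
    date(2023, 1, 1),
    date(2023, 2, 4),
    date(2023, 2, 5),
    date(2023, 3, 30),
    date(2023, 3, 31),
    date(2023, 4, 4),
]

print([d.isoformat() for d in pick_dates(sample)])
# ['2023-01-01', '2023-02-04', '2023-03-30']
```

Each decision depends on the previously kept date, not on the previous row, which is why a plain lag() comparison is not enough and the answers resort to recursion or the MODEL clause.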

I tried to do this in a single Oracle SQL query but could not get it to work. I need to do it in one step.

Asked Mar 21 at 12:17 by Baran; edited Mar 21 at 12:58 by Guillaume Outters.

3 Answers


Use a recursive query.
First, compute prev_date, the preceding data_date of the same party_id, for every row.
The next row in the desired sequence is the one whose data_date is at least 30 days after the last selected date while its prev_date is less than 30 days after it.
The anchor part of the recursion is the first row for each party_id.

See the example:

with dataRanges(PARTY_ID,DATA_DATE,prev_date) as(
   -- prev_date: preceding data_date of the same party (the row's own date for the first row)
   select t.*
     ,lag(data_date,1,data_date)over(partition by party_id order by data_date) prev_date
   from test t
)
,r(PARTY_ID,DATA_DATE,prev_date) as (
  -- anchor: earliest row of each party
  (  select t.PARTY_ID,t.DATA_DATE,t.prev_date
   from dataRanges t
   where t.data_date=(select min(data_date) from test t2 where t2.party_id=t.party_id)
  )
  union all
  -- recursive step: first row at least 30 days after the last selected one
 (select t.PARTY_ID,t.DATA_DATE,t.prev_date
  from dataRanges t  inner join  r on  t.party_id=r.party_id 
     and t.data_date>=(r.data_date+30) and  t.prev_date<(r.data_date+30)
 )  
)
select r.* from r;

With sample data

PARTY_ID DATA_DATE
12345 01-JAN-23
12345 04-FEB-23
12345 05-FEB-23
12345 30-MAR-23
12345 31-MAR-23
12345 04-APR-23

output is

PARTY_ID DATA_DATE PREV_DATE
12345 01-JAN-23 01-JAN-23
12345 04-FEB-23 01-JAN-23
12345 30-MAR-23 05-FEB-23

fiddle

You can use a recursive Common Table Expression to automate what you do manually.
And in that recursive part, the lag() window function will help you compare each row with the preceding one.

By eliminating entries too close to the previously chosen ones

/!\ Does not work on Oracle 11 (every date gets returned)

You start with all entries tagged as "hesitating" (it is not yet known whether each one starts a 30-day period),
then confirm as "firsts" the ones with no other date in the preceding 30 days,
confirm as "not firsts" those in the 30-day period after a just-confirmed "first",
then reevaluate whether this allows other "hesitating" entries to be confirmed as "firsts",
and so on.

with
    -- Index our entries:
    i as (select row_number() over (partition by party_id order by data_date) id, t.* from t),
    -- Know whose election as a "first" will make each entry masked for good.
    l(id, party_id, data_date, pass, kind) as
    (
        -- "pass" increments to keep hesitating entries queued for the next iteration
        -- "kind":
        --   1: first of a series
        --   0: don't know yet if first of a series or not
        --  -1: confirmed masked (follows a confirmed first)
        --  -2: first, but finished (has no more followers to evaluate)
        select i.*, 0 pass, 0 kind from i
            union all
        select id, party_id, data_date, pass + 1,
            case
                when kind = 0 then
                      case
                        -- Is the preceding entry more than 30 days before? We're a first!
                        when lag(data_date) over (partition by party_id order by data_date) is null then 1
                        when lag(data_date) over (partition by party_id order by data_date) < data_date - interval '30' day then 1
                        -- Else (if preceding is less than 30 days away), if said preceding entry is itself a first, we're marked to mask.
                        when lag(kind) over (partition by party_id order by data_date) = 1 then -1
                        -- Still not sure.
                        else 0
                    end
                -- If we are a first but have no more followers waiting for us, get away.
                when kind = 1 and coalesce(lead(kind) over (partition by party_id order by data_date), 1) <> 0 then -2
                else kind
            end
        from l
        where kind >= 0 -- Only work with confirmed firsts, and still hesitating ones.
        and pass < 99 -- In case I missed something...
    )
select party_id, data_date from i where (party_id, id) in (select party_id, id from l where kind = -2)
order by party_id, data_date;

Here is a demo for your 12345 party.

(this was slightly adapted from an answer for the same problem in PostgreSQL)

By jumping 30 days by 30 days

A more efficient way (still based on a recursive CTE and lag()) is, after each step that selects a date, to jump directly to "the next date after 30 days have passed".

This relies on range with interval, which is supported on Oracle 11g (and maybe before?).

with
  -- Identify unambiguous window starts: those with no predecessor in the 30 previous days.
  maybe as
  (
    select
      t.*,
      row_number() over (partition by party_id order by data_date) num, -- Will ease our reading of results.
      -- startpoint:
      -- - true: confirmed start of a new window
      -- - null: maybe, maybe not; decided later, depending on whether the previous data_date (at most 30 days before) has itself been eaten by an earlier window (which lets us be a new start) or not (then the previous one is a start and we are eaten by it).
      case when lag(data_date) over (partition by party_id order by data_date) >= data_date - interval '30' day then null else 1 end startpoint
    from t
  ),
  -- Continents: runs of data_date values in which consecutive dates are never more than 30 days apart.
  c as
  (
    select
      maybe.*,
      -- Identify it by the num of the unambiguous starting point.
      max(case when startpoint = 1 then num end) over (partition by party_id order by data_date) continent,
      -- Now attributes for *hypothetical* new island starts:
      -- for each data_date, choose its successor in case this one becomes an island start
      -- The successor is the first row from the same continent that is at least 30 days after this one.
      min(num) over (partition by party_id order by data_date range between interval '30' day following and unbounded following) successor,
      -- Number of rows which would belong to this 30 days window (in case the current row is a window start).
      count(1) over (partition by party_id order by data_date range between current row and interval '30' day following) n_included
    from maybe
  ),
  -- Now iterate starting from the continents,
  -- to see if we can determine islands within them.
  -- (each start of island "eats" the 30 following days, so the first row after 30 days can be elected as the start of a new island)
  i(party_id, data_date, num, startpoint, continent, successor, n_included) as
  (
    select * from c where startpoint = 1
    union all
    select nexti.party_id, nexti.data_date, nexti.num, nexti.startpoint, nexti.continent, nexti.successor, nexti.n_included -- Need to deploy the * on Oracle 11.
    -- Do not forget to filter on continent, as successor has been computed without this criterion (before we had determined islands).
    from i join c nexti on nexti.party_id = i.party_id and nexti.continent = i.continent
    where nexti.num = i.successor
  )
select * from i order by party_id, num;

This has been put in a fiddle.

The solution is a port of what I proposed on PostgreSQL.
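
The jumping idea can also be sketched outside the database, assuming the dates of one party are already sorted. This is a Python illustration, not a translation of the query above: the helper name pick_dates_by_jumping is hypothetical, and bisect_left stands in for the successor column (the first date at least 30 days after the current one).

```python
from bisect import bisect_left
from datetime import date, timedelta


def pick_dates_by_jumping(sorted_dates, min_gap_days=30):
    """Select window starts by jumping from each start to its successor.

    sorted_dates must be in ascending order (hypothetical helper).
    """
    selected = []
    i = 0
    while i < len(sorted_dates):
        current = sorted_dates[i]
        selected.append(current)
        # Successor: index of the first date >= current + min_gap_days,
        # searched only among the rows after the current one.
        threshold = current + timedelta(days=min_gap_days)
        i = bisect_left(sorted_dates, threshold, lo=i + 1)
    return selected


dates_12345 = [
    date(2023, 1, 1),
    date(2023, 2, 4),
    date(2023, 2, 5),
    date(2023, 3, 30),
    date(2023, 3, 31),
    date(2023, 4, 4),
]

print([d.isoformat() for d in pick_dates_by_jumping(dates_12345)])
# ['2023-01-01', '2023-02-04', '2023-03-30']
```

Only the selected rows are visited, so the number of iterations equals the number of windows rather than the number of rows.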

MODEL clause can do it:

with data(party_id, data_date)  as (
    select 12345, to_date('01.01.2023', 'dd.mm.yyyy') from dual union all
    select 12345, to_date('04.02.2023', 'dd.mm.yyyy') from dual union all
    select 12345, to_date('05.02.2023', 'dd.mm.yyyy') from dual union all
    select 12345, to_date('30.03.2023', 'dd.mm.yyyy') from dual union all
    select 12345, to_date('31.03.2023', 'dd.mm.yyyy') from dual union all
    select 12345, to_date('04.04.2023', 'dd.mm.yyyy') from dual -- union all
),
rdata as (
    select 
        row_number() over(partition by party_id order  by data_date) as id,
        party_id, data_date
    from data
)
select party_id, data_date from (
    select * from rdata
    model
        partition by (party_id)
        dimension by (id)
        measures( data_date as data_date, cast(null as date) as latest_date )
        rules
        (
            latest_date[any] = 
                nvl2(
                    latest_date[cv()-1],
                      case when data_date[cv()] < latest_date[cv()-1] + 30
                        then latest_date[cv()-1] else data_date[cv()] end
                    , data_date[cv()]
                ),      
            data_date[any] = 
                nvl2(latest_date[cv()-1],
                    case when
                        data_date[cv()] >= latest_date[cv()-1] + 30 then data_date[cv()] end,
                    data_date[cv()])
                    
        )
)
where data_date is not null
;
