BookmarkSubscribeRSS Feed
astolz0
Calcite | Level 5

Hi everyone,

attached you find a slightly larger sample which better explains what we are looking for (this test sample is very similar to the one posted before; however, we added four more id_groups).

Test.jpg

The following code by FrederikE gives us exactly what we want, but it is hard to implement with thousands of id_groups in our full sample:

PROC SQL;

     CREATE TABLE test2

     AS SELECT *

     FROM

          (SELECT *, id_group AS id_group_new FROM test)
     UNION

          (SELECT a.*, b.id_group_new FROM test AS a

          INNER JOIN

          (SELECT *,id AS bid, id_group AS id_group_new FROM test WHERE id_group eq 10) AS b

          ON     b.date_consensus               >     a.date_forecast         

          AND    b.date_consensus_minus90d     <=     a.date_forecast_confirmation

          AND    b.date_consensus_minus90d     >=     a.date_forecast

          )

     UNION

          (SELECT c.*, d.id_group_new FROM test AS c

          INNER JOIN

          (SELECT *, id_group AS id_group_new FROM test WHERE id_group eq 100) AS d

          ON     d.date_consensus               >     c.date_forecast         

          AND    d.date_consensus_minus90d     <=     c.date_forecast_confirmation

          AND    d.date_consensus_minus90d     >=     c.date_forecast

          )

     UNION

          (SELECT e.*, f.id_group_new FROM test AS e

          INNER JOIN

          (SELECT *, id_group AS id_group_new FROM test WHERE id_group eq 1000) AS f

          ON     f.date_consensus               >     e.date_forecast         

          AND    f.date_consensus_minus90d     <=     e.date_forecast_confirmation

          AND    f.date_consensus_minus90d     >=     e.date_forecast

          )

     UNION

          (SELECT g.*, h.id_group_new FROM test AS g

          INNER JOIN

          (SELECT *, id_group AS id_group_new FROM test WHERE id_group eq 10000) AS h

          ON     h.date_consensus               >     g.date_forecast         

          AND    h.date_consensus_minus90d     <=     g.date_forecast_confirmation

          AND    h.date_consensus_minus90d     >=     g.date_forecast

          )
     UNION
          (SELECT i.*, j.id_group_new FROM test AS i

          INNER JOIN

          (SELECT *, id_group AS id_group_new FROM test WHERE id_group eq 100000) AS j

          ON      j.date_consensus              >     i.date_forecast         

          AND     j.date_consensus_minus90d    <=     i.date_forecast_confirmation

          AND     j.date_consensus_minus90d    >=     i.date_forecast

          )

     UNION

          (SELECT k.*, l.id_group_new FROM test AS k

          INNER JOIN

          (SELECT *, id_group AS id_group_new FROM test WHERE id_group eq 1000000) AS l

          ON      l.date_consensus              >     k.date_forecast         

          AND     l.date_consensus_minus90d    <=     k.date_forecast_confirmation

          AND     l.date_consensus_minus90d    >=     k.date_forecast

          )

         ;

     QUIT;


     PROC SORT DATA=test2;

          BY id_group_new date_forecast;

     RUN;

     DATA test2;

          RETAIN   

          id id_group_new date_consensus date_forecast date_consensus_minus90d 

          date_forecast_confirmation eps_forecast;

          SET test2;

          RENAME id_group = id_group_old;

     RUN;

--> RESULT:

Test2.jpg

    

Here, the code works fine as id = 1 is NOT merged to id_group = 1000000 because it does not fulfill

                     

     l.date_consensus_minus90d <= k.date_forecast_confirmation

     --> 22/01/1998 > 16/01/1998

@Patrick: unfortunately, your code does not seem to account for that restriction?!

Regarding the thousands of id_groups in our full sample, we cannot implement the code above as it requires to manually type in all different id_group numbers.

Is there a possibility to combine the PROC SQL approach above with a loop?

Something like:

(we know this code does not work as it would overwrite test2 each time the loop runs; neither does the do command itself run; but hopefully you get the idea?!)

%MACRO loop;

PROC SQL;

     %DO i = 1 %TO 1000000;

     CREATE TABLE test2

     AS SELECT *

     FROM

          (SELECT *, id_group AS id_group_new FROM test)
     UNION

          (SELECT a.*, b.id_group_new FROM test AS a

          INNER JOIN

          (SELECT *,id AS bid, id_group AS id_group_new FROM test WHERE id_group eq &i.) AS b

          ON     b.date_consensus               >     a.date_forecast         

          AND    b.date_consensus_minus90d     <=     a.date_forecast_confirmation

          AND    b.date_consensus_minus90d     >=     a.date_forecast

          )

     %END;

;

QUIT;

%MEND;

Thank you all very much!

Regards,

Alex and Niklas

         

astolz0
Calcite | Level 5

Any ideas?

Thank you!

Mit
Calcite | Level 5 Mit
Calcite | Level 5

The logic you are using is different than what you want.

PROCSQL;    There should be a space in between     PROC SQL;
 
CREATE TABLE test2          
 
AS SELECT a.*

  FROM test a          
 
LEFT JOIN test b        
       ON       a.date_consensus                    >      b.date_forecast    
    

       AND     a.date_forecast_minus90d <= b.date_forecast_confirmation

       AND     a.date_forecast_minus90d      >=    b.date_forecast

  GROUP BY identgroup;

QUIT;

Please check the logic.

1.      resulting date_consensus ( 07/10/1997) not greater than   date_forecast ( 07/10/1997)

2.     date_forecast_minus90d (09/07/1997)   le     date_forecast_confirmation (16/01/1998)  -This is correct

3.     resulting date_forecast_minus90d (09/07/1997) not greater than date_forecast (07/10/1997)

Also there is no by group as identgroup in your data.

Please post that data in usable format so that the time is saved.

astolz0
Calcite | Level 5

Hi Mit,

that was just a typo in here. Of course, there is a space in PROC SORT and I meant id_group not ident_group.

Thanks!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 18 replies
  • 3518 views
  • 0 likes
  • 6 in conversation