Hi everyone, attached you find a slightly larger sample which better explains what we are looking for (this test sample is very similar to the one posted before; however, we added four more id_groups). The following code by FrederikE gives us exactly what we want, but it is hard to implement with thousands of id_groups in our full sample: PROC SQL; CREATE TABLE test2 AS SELECT * FROM (SELECT *, id_group AS id_group_new FROM test) UNION (SELECT a.*, b.id_group_new FROM test AS a INNER JOIN (SELECT *,id AS bid, id_group AS id_group_new FROM test WHERE id_group eq 10) AS b ON b.date_consensus > a.date_forecast AND b.date_consensus_minus90d <= a.date_forecast_confirmation AND b.date_consensus_minus90d >= a.date_forecast ) UNION (SELECT c.*, d.id_group_new FROM test AS c INNER JOIN (SELECT *, id_group AS id_group_new FROM test WHERE id_group eq 100) AS d ON d.date_consensus > c.date_forecast AND d.date_consensus_minus90d <= c.date_forecast_confirmation AND d.date_consensus_minus90d >= c.date_forecast ) UNION (SELECT e.*, f.id_group_new FROM test AS e INNER JOIN (SELECT *, id_group AS id_group_new FROM test WHERE id_group eq 1000) AS f ON f.date_consensus > e.date_forecast AND f.date_consensus_minus90d <= e.date_forecast_confirmation AND f.date_consensus_minus90d >= e.date_forecast ) UNION (SELECT g.*, h.id_group_new FROM test AS g INNER JOIN (SELECT *, id_group AS id_group_new FROM test WHERE id_group eq 10000) AS h ON h.date_consensus > g.date_forecast AND h.date_consensus_minus90d <= g.date_forecast_confirmation AND h.date_consensus_minus90d >= g.date_forecast ) UNION (SELECT i.*, j.id_group_new FROM test AS i INNER JOIN (SELECT *, id_group AS id_group_new FROM test WHERE id_group eq 100000) AS j ON j.date_consensus > i.date_forecast AND j.date_consensus_minus90d <= i.date_forecast_confirmation AND j.date_consensus_minus90d >= i.date_forecast ) UNION (SELECT k.*, l.id_group_new FROM test AS k INNER JOIN (SELECT *, id_group AS id_group_new FROM test WHERE id_group eq 1000000) AS l ON l.date_consensus > k.date_forecast AND l.date_consensus_minus90d <= k.date_forecast_confirmation AND l.date_consensus_minus90d >= k.date_forecast ) ; QUIT; PROC SORT DATA=test2; BY id_group_new date_forecast; RUN; DATA test2; RETAIN id id_group_new date_consensus date_forecast date_consensus_minus90d date_forecast_confirmation eps_forecast; SET test2; RENAME id_group = id_group_old; RUN; --> RESULT: Here, the code works fine as id = 1 is NOT merged to id_group = 1000000 because it does not fulfill l.date_consensus_minus90d <= k.date_forecast_confirmation --> 22/01/1998 > 16/01/1998 @Patrick: unfortunately, your code does not seem to account for that restriction?! Regarding the thousands of id_groups in our full sample, we cannot implement the code above as it requires to manually type in all different id_group numbers. Is there a possibility to combine the PROC SQL approach above with a loop? Something like: (we know this code does not work as it would overwrite test2 each time the loop runs; neither does the do command itself run; but hopefully you get the idea?!) %MACRO loop; PROC SQL; %DO i = 1 %TO 1000000; CREATE TABLE test2 AS SELECT * FROM (SELECT *, id_group AS id_group_new FROM test) UNION (SELECT a.*, b.id_group_new FROM test AS a INNER JOIN (SELECT *,id AS bid, id_group AS id_group_new FROM test WHERE id_group eq &i.) AS b ON b.date_consensus > a.date_forecast AND b.date_consensus_minus90d <= a.date_forecast_confirmation AND b.date_consensus_minus90d >= a.date_forecast ) %END; ; QUIT; %MEND; Thank you all very much! Regards, Alex and Niklas
... View more