About astolz0

astolz0 · ‎10-25-2013

Any ideas? Thank you!

astolz0 · ‎10-22-2013

Hi everyone,

attached you will find a slightly larger sample that better explains what we are looking for (this test sample is very similar to the one posted before; however, we added four more id_groups). The following code by FrederikE gives us exactly what we want, but it is hard to implement with thousands of id_groups in our full sample:

    PROC SQL;
    CREATE TABLE test2 AS
    SELECT * FROM (SELECT *, id_group AS id_group_new FROM test)
    UNION
    (SELECT a.*, b.id_group_new
     FROM test AS a
     INNER JOIN (SELECT *, id AS bid, id_group AS id_group_new FROM test WHERE id_group eq 10) AS b
     ON b.date_consensus > a.date_forecast
     AND b.date_consensus_minus90d <= a.date_forecast_confirmation
     AND b.date_consensus_minus90d >= a.date_forecast
    )
    UNION
    (SELECT c.*, d.id_group_new
     FROM test AS c
     INNER JOIN (SELECT *, id_group AS id_group_new FROM test WHERE id_group eq 100) AS d
     ON d.date_consensus > c.date_forecast
     AND d.date_consensus_minus90d <= c.date_forecast_confirmation
     AND d.date_consensus_minus90d >= c.date_forecast
    )
    UNION
    (SELECT e.*, f.id_group_new
     FROM test AS e
     INNER JOIN (SELECT *, id_group AS id_group_new FROM test WHERE id_group eq 1000) AS f
     ON f.date_consensus > e.date_forecast
     AND f.date_consensus_minus90d <= e.date_forecast_confirmation
     AND f.date_consensus_minus90d >= e.date_forecast
    )
    UNION
    (SELECT g.*, h.id_group_new
     FROM test AS g
     INNER JOIN (SELECT *, id_group AS id_group_new FROM test WHERE id_group eq 10000) AS h
     ON h.date_consensus > g.date_forecast
     AND h.date_consensus_minus90d <= g.date_forecast_confirmation
     AND h.date_consensus_minus90d >= g.date_forecast
    )
    UNION
    (SELECT i.*, j.id_group_new
     FROM test AS i
     INNER JOIN (SELECT *, id_group AS id_group_new FROM test WHERE id_group eq 100000) AS j
     ON j.date_consensus > i.date_forecast
     AND j.date_consensus_minus90d <= i.date_forecast_confirmation
     AND j.date_consensus_minus90d >= i.date_forecast
    )
    UNION
    (SELECT k.*, l.id_group_new
     FROM test AS k
     INNER JOIN (SELECT *, id_group AS id_group_new FROM test WHERE id_group eq 1000000) AS l
     ON l.date_consensus > k.date_forecast
     AND l.date_consensus_minus90d <= k.date_forecast_confirmation
     AND l.date_consensus_minus90d >= k.date_forecast
    )
    ;
    QUIT;

    PROC SORT DATA=test2;
    BY id_group_new date_forecast;
    RUN;

    DATA test2;
    RETAIN id id_group_new date_consensus date_forecast date_consensus_minus90d date_forecast_confirmation eps_forecast;
    SET test2;
    RENAME id_group = id_group_old;
    RUN;

--> RESULT: Here, the code works fine, as id = 1 is NOT merged to id_group = 1000000 because it does not fulfill l.date_consensus_minus90d <= k.date_forecast_confirmation --> 22/01/1998 > 16/01/1998

@Patrick: unfortunately, your code does not seem to account for that restriction?!

Regarding the thousands of id_groups in our full sample: we cannot implement the code above, as it requires manually typing in all the different id_group numbers. Is it possible to combine the PROC SQL approach above with a loop? Something like the following (we know this code does not work, as it would overwrite test2 each time the loop runs, and the %DO statement itself does not run either; but hopefully you get the idea?!):

    %MACRO loop;
    PROC SQL;
    %DO i = 1 %TO 1000000;
    CREATE TABLE test2 AS
    SELECT * FROM (SELECT *, id_group AS id_group_new FROM test)
    UNION
    (SELECT a.*, b.id_group_new
     FROM test AS a
     INNER JOIN (SELECT *, id AS bid, id_group AS id_group_new FROM test WHERE id_group eq &i.) AS b
     ON b.date_consensus > a.date_forecast
     AND b.date_consensus_minus90d <= a.date_forecast_confirmation
     AND b.date_consensus_minus90d >= a.date_forecast
    )
    %END;
    ;
    QUIT;
    %MEND;

Thank you all very much!

Regards, Alex and Niklas
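One possible way to avoid typing each id_group by hand is to drop the WHERE id_group eq ... filter and let a single self-join cover all groups at once. The following is only a sketch, not run against the posters' data; it assumes the column names used in the code above and assumes that every id_group should be treated as a possible target group. Rows that match within their own group collapse into the base rows because UNION removes duplicates.

```sas
/* Sketch only: untested against the original TEST data set. */
PROC SQL;
  CREATE TABLE test2 AS
  /* Base rows: every observation keeps its own group. */
  SELECT a.*, a.id_group AS id_group_new
  FROM test AS a
  UNION
  /* Added rows: observation a is also assigned to the group of any */
  /* observation b whose three date conditions it fulfills.         */
  SELECT a.*, b.id_group AS id_group_new
  FROM test AS a
  INNER JOIN test AS b
    ON  b.date_consensus > a.date_forecast
    AND b.date_consensus_minus90d <= a.date_forecast_confirmation
    AND b.date_consensus_minus90d >= a.date_forecast
  ;
QUIT;
```

Because the join condition contains no equality, PROC SQL may have to compare a very large number of row pairs, so run time on a full sample with more than 100,000 id_groups could be considerable. The PROC SORT and RENAME steps from the code above would still apply to the resulting test2.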

astolz0 · ‎10-21-2013

Hi Patrick, thank you! I will respond to your post ASAP. We will try your approach on our full sample and see if it works. Regards, Alex

astolz0 · ‎10-21-2013

Hi Mit, that was just a typo in here. Of course, there is a space in PROC SORT and I meant id_group not ident_group. Thanks!

astolz0 · ‎10-18-2013

Thank you for posting the link. We'll work on it.

astolz0 · ‎10-17-2013

Hi Reeza, thanks for sharing your approach, too. Yes, it works. However, since we would like to run this code on our FULL sample, I think we will not be able to implement your approach, as it requires knowing which ids should be merged to which id_group; i.e., we would have to manually check which ids fulfill our 3 conditions mentioned above for each id_group (... and our FULL sample consists of more than 100,000 id_groups). Regards, Alex

astolz0 · ‎10-17-2013

Hi Fredrik, yeah, that's it (almost)! Thank you very much! However, what shall we do if we have like 100,000 id_groups in our FULL sample? Is it possible to implement some kind of loop/macro?

astolz0 · ‎10-17-2013

Hi FrederikE, thanks for your effort. Unfortunately, your code results in the following table: Regards, Alex

astolz0 · ‎10-17-2013

Oh, sorry. I forgot to attach it. Thanks!

astolz0 · ‎10-17-2013

Hi,

we have the following problem: We would like to add certain observations from the data set TEST to the same data set TEST (thereby creating a new data set TEST2), where the observations to be added fulfill three conditions. In addition, we would like this merge to be performed several times, i.e., once for every group of observations classified by id_group.

Maybe that sounds a bit complicated and/or is not explained very well, so please take a look at the pictures below, which should give a better explanation by showing the initial data set and the one we would like to create...

The initial data set TEST looks like:

The data set we would like to create should look like:

Our preliminary code looks like (not working):

    PROC SQL;
    CREATE TABLE test2 AS
    SELECT a.*
    FROM test a
    LEFT JOIN test b
    ON a.date_consensus > b.date_forecast
    AND a.date_forecast_minus90d <= b.date_forecast_confirmation
    AND a.date_forecast_minus90d >= b.date_forecast
    GROUP BY id_group;
    QUIT;

I.e., SAS should do the following:

Match no observation to id_group = 1, because the three conditions are not fulfilled by any of the 10 observations in test b.

Match observation with id = 1 to id_group = 10, as it is the only observation that fulfills:
    a.date_consensus (26.01.1998) > b.date_forecast (07.10.1997)
    a.date_forecast_minus90d (28.10.1997) <= b.date_forecast_confirmation (16.01.1998)
    a.date_forecast_minus90d (28.10.1997) >= b.date_forecast (07.10.1997)

Match observation with id = 1 to id_group = 100, as it is the only observation that fulfills:
    a.date_consensus (27.01.1998) > b.date_forecast (07.10.1997)
    a.date_forecast_minus90d (29.10.1997) <= b.date_forecast_confirmation (16.01.1998)
    a.date_forecast_minus90d (29.10.1997) >= b.date_forecast (07.10.1997)

Any help or comment is more than appreciated (sorry, we are SAS beginners).

Thank you very much, Alex and Niklas
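The three conditions can be checked by hand for the id_group = 10 example quoted above. The DATA step below is a small sketch that uses only the dates given in the post; the variable names minus90d and all_three are hypothetical and chosen for illustration, and it assumes the "minus90d" column is simply the consensus date minus 90 days.

```sas
/* Sketch: evaluate the three matching conditions for one pair of rows. */
DATA _null_;
  /* Group row (id_group = 10), dates as quoted in the post */
  date_consensus = '26JAN1998'd;
  minus90d       = date_consensus - 90;   /* assumed definition of the minus90d column */

  /* Candidate observation (id = 1), dates as quoted in the post */
  date_forecast              = '07OCT1997'd;
  date_forecast_confirmation = '16JAN1998'd;

  /* A SAS logical expression evaluates to 1 when all three conditions hold, else 0 */
  all_three = (date_consensus > date_forecast)
          AND (minus90d <= date_forecast_confirmation)
          AND (minus90d >= date_forecast);

  /* Write the derived date and the check result to the log */
  PUT minus90d= DDMMYY10. all_three=;
RUN;
```

Subtracting 90 days from 26.01.1998 gives 28.10.1997, which agrees with the value quoted in the post, so all three conditions hold for this pair.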

Online Status	Offline
Date Last Visited	‎09-01-2015 07:11 AM

Re: MERGE by group possible?

MERGE by group possible?