Deps
Calcite | Level 5

Hi, I need to merge two datasets by a unique id and then by nearest date with values carried forward.

 

Code is as follows:

 

data DATAONE;
infile datalines dlm=' ' truncover;

input id dateone :date9. ;
format dateone date9.;
datalines;
1 27OCT22
1 27OCT22
1 27OCT22
1 28OCT22
1 29OCT22
1 30OCT22
1 31OCT22
1 01NOV22
1 02NOV22
1 03NOV22
;
run;

proc print;run;

data DATATWO;
infile datalines dlm=' ' truncover;
input id datetwo :date9. num ;
format datetwo date9. ;
datalines;
1 26OCT22 20
1 28OCT22 18
1 03NOV22 19
1 11NOV22 22
;
run;

proc print;run;


Want: DATATHREE
(DATATHREE is DATAONE merged with DATATWO by unique id, matching each dateone to the latest datetwo where datetwo <= dateone, so that the values of datetwo and num are carried forward. The desired DATATHREE is shown below.)

Obs    id      dateone      datetwo    num

  1     1    27OCT2022    26OCT2022     20
  2     1    27OCT2022    26OCT2022     20
  3     1    27OCT2022    26OCT2022     20
  4     1    28OCT2022    28OCT2022     18
  5     1    29OCT2022    28OCT2022     18
  6     1    30OCT2022    28OCT2022     18
  7     1    31OCT2022    28OCT2022     18
  8     1    01NOV2022    28OCT2022     18
  9     1    02NOV2022    28OCT2022     18
 10     1    03NOV2022    03NOV2022     19

ACCEPTED SOLUTION
PaigeMiller
Diamond | Level 26

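This works, but I would not recommend it for large data sets, because the join pairs every DATAONE row with every DATATWO row for the same id dated on or before it, and only then filters. For large data another approach would be needed, perhaps hash objects, but I will leave that to others.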

 

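proc sql;
    /* Pair each dataone row with all datatwo rows for the same id dated on or
       before dateone, then keep the latest such row per (id, dateone) group.
       SAS remerges the MAX() summary with the detail rows to do this. */
    create table want as
    select a.*, b.datetwo, b.num
    from dataone as a, datatwo as b
    where a.id = b.id and b.datetwo - a.dateone <= 0
    group by a.id, a.dateone
    having (b.datetwo - a.dateone) = max(b.datetwo - a.dateone);
quit;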

 

--
Paige Miller
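
For readers wondering what the hash-object route mentioned above might look like, here is one possible sketch. It is not from either poster. It assumes DATATWO fits in memory, and the names want_hash, d2, n2 and rc are made up for the illustration. DATATWO is loaded into a hash keyed by id that allows several rows per key. For each DATAONE row, the step walks the entries for that id and keeps the latest datetwo that is on or before dateone.

```sas
data want_hash;
  if _n_ = 1 then do;
    /* Define the host variables d2 and n2 in the PDV without reading data. */
    if 0 then set datatwo(rename=(datetwo=d2 num=n2));
    /* One hash entry per DATATWO row; multidata allows many rows per id. */
    declare hash h(dataset: 'datatwo(rename=(datetwo=d2 num=n2))',
                   multidata: 'y');
    h.defineKey('id');
    h.defineData('d2', 'n2');
    h.defineDone();
  end;

  set dataone;
  length datetwo num 8;
  format datetwo date9.;
  call missing(datetwo, num);

  /* Walk every DATATWO entry for this id, keeping the latest date that
     does not exceed dateone. */
  rc = h.find();
  do while (rc = 0);
    if d2 <= dateone and (missing(datetwo) or d2 > datetwo) then do;
      datetwo = d2;
      num = n2;
    end;
    rc = h.find_next();
  end;

  drop rc d2 n2;
run;
```

Neither input has to be sorted for this version. A DATAONE row with no earlier DATATWO date is kept with missing datetwo and num, whereas the SQL inner join above drops such rows.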

Tom
Super User

Don't know any efficient way to find the NEAREST date.

But if you want the value from the current or most recent date then just interleave the datasets by ID and DATE and remember the most recent values.

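data want ;
  /* Interleave both tables on a common date variable. DATATWO is listed
     first so that, on a tied date, its row is read before the DATAONE row. */
  set 
      datatwo(in=in2 rename=(datetwo=dateone num=num2)) 
      dataone(in=in1)
  ;
  by id dateone;
  retain datetwo num ;      /* remember the most recent DATATWO values */
  format datetwo date9.;
  drop num2;
  /* Do not carry values over from the previous id. */
  if first.id then call missing(datetwo,num);
  if in2 then do;           /* DATATWO row: update the remembered values */
    datetwo=dateone;
    num=num2;
  end;
  if in1;                   /* output only the DATAONE rows */
run;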
Obs    id      dateone      datetwo    num

  1     1    27OCT2022    26OCT2022     20
  2     1    27OCT2022    26OCT2022     20
  3     1    27OCT2022    26OCT2022     20
  4     1    28OCT2022    28OCT2022     18
  5     1    29OCT2022    28OCT2022     18
  6     1    30OCT2022    28OCT2022     18
  7     1    31OCT2022    28OCT2022     18
  8     1    01NOV2022    28OCT2022     18
  9     1    02NOV2022    28OCT2022     18
 10     1    03NOV2022    03NOV2022     19
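
One prerequisite worth spelling out: a SET statement with BY expects every input to already be in BY order, and the step stops with an error if it is not. The sample data happen to be sorted. For real data, a sort along the following lines could be run first. This is a sketch that assumes it is acceptable to reorder the two tables in place.

```sas
/* Put both inputs in id/date order before the interleave step. */
proc sort data=dataone;
  by id dateone;
run;

proc sort data=datatwo;
  by id datetwo;   /* datetwo is renamed to dateone inside the SET statement */
run;
```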
