BookmarkSubscribeRSS Feed
Walternate
Obsidian | Level 7

Hi,

 

I have two datasets: Dataset 1 is at the person level and each person has a corresponding date. There are also a number of person-level categorical variables that I'd like to keep as is if possible. In Dataset 2 each person occurs multiple times and has multiple different dates. 

 

Dataset1:

ID1     Date1          var1     var2....etc.

1      10/31/2014     abc      def

2      9/7/2013        xyz       abc

 

Dataset2:

ID2      Date2

1        9/7/2012

1        2/4/2013

1       10/1/2014

2        9/7/2012

2       10/2/2014

 

I'm trying to merge the datasets such that from Dataset 2, I only bring in one record for each person--the record that has the Date2 value that remains true as of that person's Date1 in Dataset1. So for example, ID=1 has a Date1 value of 10/31/2014. I would want the latest Date2 value for ID=1 that occurs before Date1--in this case, 10/1/2014. 

 

Any help is much appreciated.

 

3 REPLIES 3
DartRodrigo
Lapis Lazuli | Level 10

Hi mate,

 

If you need simply join these tables where the dataset2  remains only the last dates of each id you can simply do this:

 

 

proc sort data=dataset1(rename=id1=id date1=date);
  by id date;
run;

proc sort data=dataset2(rename=id2=id date2=date);
  by id date;
run;

data dt1;
set dataset1;
by id date;
if last.date then flg = 1;
else flg = 0;
run;
data dt2;
set dataset2;
by id date;
if last.date then flg = 1;
else flg = 0;
run

data merge;
merge dt1(in=a)
dt2(in=b);
by id;
if a;
where flg = 1;
run;

Maybe this helps

 

ndp
Quartz | Level 8 ndp
Quartz | Level 8

Simple proc sql will do as well but you need to list all variables from dataset1 in group by statement:

 

proc sql noprint;
	create table ds3 as
	select a.*, max(b.date2) as date3 format=mmddyy10.
	from ds1 as a left join ds2 as b
	on a.id1=b.id2 and b.date2<=a.date1
	group by a.id1, a.date1, a.var1, a.var2;
quit;
Astounding
PROC Star

It's probably easier to reorder the records in DATASET2.  That way, you can take the first one that meets the conditions for comparing the dates:

 

proc sort data=dataset2;

   by id descending date2;

run;

 

data want;

merge dataset1 dataset2;

by id;

if date1 >= date2 then do;

   output;

   date1 = '01jan1800'd;

end;

run;

 

This assumes you will never have any dates that fall before January 1, 1800 (and assumes that I can still think straight after pondering this problem).  One of the keys is that DATE1 will be retained, and not re-read from DATASET1.

 

Good luck.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 3 replies
  • 854 views
  • 0 likes
  • 4 in conversation