BookmarkSubscribeRSS Feed
Fettah
Fluorite | Level 6

I have two datasets, current -- which is my main file containing the visit_dates I want to match on, and previous which contains other visit_dates (not necessarily always prior to the current dataset).

 

I want to create a dataset that only keeps the most recent last_date prior to the visit_date in the current dataset. If there are no prior visits, then last_date should be empty.

 

What is the best way to approach this problem?

 

data current;
infile cards;
input id visit_date;
attrib visit_date length=4 informat=date9. format=ddmmyy10.;
cards;
1 01nov14 
1 11nov14
2 22nov15 
2 24nov15
3 01nov16 
4 01nov16
;
run;

data previous;
infile cards;
input id last_date;
attrib last_date length=4 informat=date9. format=ddmmyy10.;
cards;
1 01jan14 
1 11feb14
1 02jan15
1 01nov14
2 01nov15 
2 22nov15
2 23nov15
3 01nov16 
4 12nov16
;
run;

data want;
infile cards;
input id visit_date last_date;
attrib visit_date last_date length=4 informat=date9. format=ddmmyy10.;
cards;
1 01nov14 11feb14
1 11nov14 01nov14
2 22nov15 01nov15
2 24nov15 23nov15
3 01nov16 .
4 01nov16 .
;
run;

 

 

3 REPLIES 3
AndyMayes
Calcite | Level 5

Hi Fettah

 

Here's one way to do it: first I did a Cartesian Join (work.step1) via PROC SQL, then use the SAS data step by group processing to get the records you want. The results dataset has a new variable result_date which is the variable you want (just use a drop and rename, to drop last_date and rename result_date to last_date), I left that out to make it clearer to see how it is working.

 

Andy

 

 

proc sql ;
create table work.step1 as
select
current.*,
previous.*
from
work.current as current,
work.previous as previous
where
previous.id=current.id
order by
current.id,
current.visit_date,
previous.last_date
;
quit ;

 

data results ;
retain result_date ;
attrib result_date length=4 informat=date9. format=ddmmyy10.;
set work.step1 ;
by id visit_date ;
if first.visit_date then
result_date=. ;
if visit_date > last_date then
result_date=last_date ;
if last.visit_date then
output ;
run ;

kiranv_
Rhodochrosite | Level 12

something like this

proc sql;
create table want as 
select  a.id, visit_date format=date7., max(last_date) as last_date format=date7.
from current a
left join
previous b
on a.id =b.id
and visit_date > last_date
group by 1,2;
novinosrin
Tourmaline | Level 20

SQL is by far the best approach i think however hash is fun

data current;
infile cards;
input id visit_date;
attrib visit_date length=4 informat=date9. format=ddmmyy10.;
cards;
1 01nov14 
1 11nov14
2 22nov15 
2 24nov15
3 01nov16 
4 01nov16
;
run;

data previous;
infile cards;
input id last_date;
attrib last_date length=4 informat=date9. format=ddmmyy10.;
cards;
1 01jan14 
1 11feb14
1 02jan15
1 01nov14
2 01nov15 
2 22nov15
2 23nov15
3 01nov16 
4 12nov16
;
run;
data want;
if _n_=1 then do;
if 0 then do; set current;set previous;end;
  declare hash h(dataset: 'previous',multidata:'y');
  h.definekey('id');
  h.definedata('last_date');
  h.definedone();
 end;
 set current;
 rc=h.find();
 do while(rc = 0);
  if visit_date>last_date then  _t=max( _t, last_date);
  rc = h.find_next();
 end;
 last_date=_t;
 drop rc _:;
 run;

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 3 replies
  • 483 views
  • 2 likes
  • 4 in conversation