Hi SAS community,
I am trying to use a date in one observation to reference a matching date in another observation, if the dates match then put the grade value into the first observation. If the date do not match then put a null in that column. It is hard to explain what I am trying to do so I created some sample data to help understand the concept further.
Data Test;
Input ID Grade SASDate Date1yr;
Informat SASDate date9. Date1yr date9.;
Format SASDate date9. Date1yr date9.;
Datalines;
1 4 01Jan2010 01Jan2011
2 3 01Feb2010 01Feb2011
3 6 01Jan2011 01Jan2012
4 5 01Mar2011 01Mar2012
5 1 01Jan2012 01Jan2013
;
Run;
(For some reason when I post this, ID 5 is not showing up and is put in ID 4)
Ideally what I would want is to look to see if the Date1yr time is the same as a SASDate on another observation.
In this example, the data would look like this when we are done.
ID Grade Grade1yr SASDate Date1yr
1 4 6 01Jan2010 01Jan2011
2 3 . 01Feb2010 01Feb2011
3 6 1 01Jan2011 01Jan2012
4 5 . 01Mar2011 01Mar2012
5 1 . 01Jan2012 01Jan2013
I would be happy to try to help clarify more if this does not make sense. Thanks for the help!
The code here should work. The question arises what would happen if there are more than 1 match of course.
proc sql; create table want as select a.*, b.grade as grade1yr from test a left join test b on a.sasdate=b.date1yr; quit;
The code here should work. The question arises what would happen if there are more than 1 match of course.
proc sql; create table want as select a.*, b.grade as grade1yr from test a left join test b on a.sasdate=b.date1yr; quit;
@RW9 Thanks this works great and is a very eloquent solution. I just needed to switch it to on a.date1yr =b.sasdate to get it into the format that I need. My actual data is set up so that for each ID, there will be only be one value for each month, but there could be the same month in another ID. This means that my data looks more like this in practice.
ID Grade Grade1yr SASDate Date1yr
1 4 6 01Jan2010 01Jan2011
1 3 . 01Feb2010 01Feb2011
1 6 . 01Jan2011 01Jan2012
2 5 1 01Jan2010 01Jan2011
2 1 . 01Jan2011 01Jan2012
Would I just need to use some kind of group by statement to get it so that it does it for each ID?
Data Test;
infile cards truncover;
Input ID Grade SASDate Date1yr;
Informat SASDate date9. Date1yr date9.;
Format SASDate date9. Date1yr date9.;
Datalines;
1 4 01Jan2010 01Jan2011
2 3 01Feb2010 01Feb2011
3 6 01Jan2011 01Jan2012
4 5 01Mar2011 01Mar2012
5 1 01Jan2012 01Jan2013
;
Run;
data want;
if _n_=1 then do;
if 0 then set test(keep=sasdate grade rename=(grade=Grade1yr));
dcl hash H (dataset:'test(keep=sasdate grade rename=(grade=Grade1yr))') ;
h.definekey ("sasdate") ;
h.definedata ("Grade1yr") ;
h.definedone () ;
end;
set test;
if h.find(key:Date1yr) ne 0 then call missing(Grade1yr);
run;
@novinosrin Thanks for the reply, that solution also works great! I have not worked with hash tables before so I am having difficulty following what is actually happening. I didn't realize when I originally posed the question that it would be a problem that there are multiples of the same dates in the dataset, but there are not multiples of the same date for each ID. Is there a way to apply this process to each ID.
Ex:
ID Grade Grade1yr SASDate Date1yr
1 4 6 01Jan2010 01Jan2011
1 3 . 01Feb2010 01Feb2011
1 6 . 01Jan2011 01Jan2012
2 5 1 01Jan2010 01Jan2011
2 1 . 01Jan2011 01Jan2012
a small tweak in @RW9 code is what you need. For ethical purpose, I ignored as it's appropriate for you to receive the updated version from him. Anyways
proc sql;
create table want as
select a.*,
b.grade as grade1yr
from test a
left join test b
on a.id =b.id and a.date1yr=b.sasdate
order by a.id, sasdate;
quit;
and the updated hash version:
data want;
if _n_=1 then do;
if 0 then set test(keep=id sasdate grade rename=(grade=Grade1yr));
dcl hash H (dataset:'test(keep=id sasdate grade rename=(grade=Grade1yr))') ;
h.definekey ('id',"sasdate") ;
h.definedata ("Grade1yr") ;
h.definedone () ;
end;
set test;
if h.find(key:id,key:Date1yr) ne 0 then call missing(Grade1yr);
run;
Thank you so much for both of these solutions. You have been so helpful and I appreciate everything that you have done!
Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.
Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.