Suppose I have two datasets with dates as follows:
Dates1:
Subject | Date1 |
---|---|
001 | 29AUG2013 |
002 | 21MAY2013 |
002 | 04JUL2013 |
002 | 20SEP2013 |
003 | 17JUN2013 |
004 | 16DEC2013 |
Dates2:
Subject | Date2 |
---|---|
001 | 08May2013 |
001 | 25JUN2013 |
001 | 13AUG2013 |
002 | 17JAN2013 |
002 | 09APR2013 |
002 | 21MAY2013 |
003 | 24JAN2013 |
003 | 18MAR2013 |
003 | 17JUN2013 |
004 | 17JUN2013 |
004 | 27JUN2013 |
004 | 19AUG2013 |
004 | 30SEP2013 |
004 | 11NOV2013 |
I need to create a new dataset that compares DATES1 and DATES2. If any of the dates in DATES1 are found in DATES2 for a given subject, then I want to flag that subject. In other words, for the above example, my output would look like this:
Subject | Match |
---|---|
001 | 0 |
002 | 1 |
003 | 1 |
004 | 0 |
I have been playing with this for a couple of days now, and I can't seem to find a solution. What would be the best way to merge these two datasets so I can compare Date1 and Date2? My main obstacle is the fact that there can be multiple entries for a single subject in both datasets, and I need to compare all dates in both datasets for a match. Any ideas?
you could use this:
data date1;
input sub date :anydtdte.;
datalines;
001 29AUG2013
002 21MAY2013
002 04JUL2013
002 20SEP2013
003 17JUN2013
004 16DEC2013
;
data date2;
input sub date :anydtdte.;
datalines;
001 08May2013
001 25JUN2013
001 13AUG2013
002 17JAN2013
002 09APR2013
002 21MAY2013
003 24JAN2013
003 18MAR2013
003 17JUN2013
004 17JUN2013
004 27JUN2013
004 19AUG2013
004 30SEP2013
004 11NOV2013
;
proc sql;
create table subCount as
select sub, sum(date in (select date from date2 where sub=d1.sub)) as match
from date1 as d1
group by sub;
select * from subCount;
quit;
PG
you could use this:
data date1;
input sub date :anydtdte.;
datalines;
001 29AUG2013
002 21MAY2013
002 04JUL2013
002 20SEP2013
003 17JUN2013
004 16DEC2013
;
data date2;
input sub date :anydtdte.;
datalines;
001 08May2013
001 25JUN2013
001 13AUG2013
002 17JAN2013
002 09APR2013
002 21MAY2013
003 24JAN2013
003 18MAR2013
003 17JUN2013
004 17JUN2013
004 27JUN2013
004 19AUG2013
004 30SEP2013
004 11NOV2013
;
proc sql;
create table subCount as
select sub, sum(date in (select date from date2 where sub=d1.sub)) as match
from date1 as d1
group by sub;
select * from subCount;
quit;
PG
I never would have thought of that on my own. Thank you so much!
You are welcome. You can reverse the role of date1 and date2 to get the match counts of table2 subjects. - PG
Call me old-fashioned ?? I probably would use a DATA step instead of SQL:
data want;
merge date1 (in=in1) date2 (in=in2 rename=(date2=date1));
by subject date1;
if first.subject then match=0;
if in1 and in2 then match=1;
retain match;
keep subject match;
if last.subject;
run;
FAIW - Replace match=1 with match+1 to get the count of matching dates for each subject and render the retain statement redundant. - PG
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.