data enroll;
input patient status :$12. dateassigned &:anydtdte.;
format date yymmdd10.;
datalines;
500-001 enrolled 01-jan-2019
500-002 enrolled 15-jan-2019
500-003 removed 23-Jan-2019
500-004 enrolled 05-feb-2019
500-005 enrolled 17-feb-2019
587-001 enrolled 20-feb-2019
587-002 enrolled 25-feb-2019
587-003 enrolled 03-mar-2019
594-001 enrolled 04-feb-2018
594-002 enrolled 09-feb-2018
648-001 enrolled 15-mar-2019
648-002 enrolled 22-mar-2019
648-003 enrolled 27-mar-2019
648-004 enrolled 30-mar-2019
;
data calendar;
input visitnumber patientID :$12. datetracked &:anydtdte.;
format date yymmdd10.;
datalines;
500 500-001-rdf 01-jan-2019
500 500-002-fgh 15-jan-2019
500 500-003-ehd 23-Jan-2019
500 500-004-ern 05-feb-2019
500 500-005-qmd 17-feb-2019
587 587-001-wcs 20-feb-2019
587 587-002-qlc 25-feb-2019
587 587-003-qhr 03-mar-2019
594 594-001-qwn 04-feb-2018
594 594-002-agj 09-feb-2018
648 648-001-wuf 15-mar-2019
648 648-002-qbf 22-mar-2019
648 648-003-olr 27-mar-2019
648 648-004-wmf 30-mar-2019
;
Hi all, I am trying to join these two these tables together. I want the columns to align where the column “patient” in table “enroll” matches the column “patientID” in table “calendar”. "patient" and "patientID" are similar but not equal. In addition, the column “dateassigned” in table “enroll” should match the column “datetracked” in table “calendar”. Since this is only a snippet of a larger dataset, the columns should align together. However, this may not be the case in the larger dataset, so I am trying to highlight where the columns from both tables do not align. Thank you in advance.
Alternatively try the merge statement as well as below
proc sort data=enroll;
by patient dateassigned;
run;
data calendar;
length patient $8.;
set calendar;
patient=substr(patientID,1,7);
dateassigned=datetracked;
run;
proc sort data=calendar;
by patient dateassigned;
run;
data want;
merge enroll(in=a) calendar(in=b);
by patient dateassigned;
if a and b;
run;
This is how you can join these unequal fields. You may adjust to inner join, left join, etc as needed.
proc sql;
create table want as
select
enrl.*
, cal.*
from
enroll enrl
, calendar cal
where
enrl.patient = substr(cal.patientID,1,7) and
enrl.dateassigned = cal.datetracked
;
quit;
I might use
where index(cal.patientID,enrl.patient,)>0 and
in case the length of enrl.patient (and possibly cal.patientID) varies more than is shown in the example.
Alternatively try the merge statement as well as below
proc sort data=enroll;
by patient dateassigned;
run;
data calendar;
length patient $8.;
set calendar;
patient=substr(patientID,1,7);
dateassigned=datetracked;
run;
proc sort data=calendar;
by patient dateassigned;
run;
data want;
merge enroll(in=a) calendar(in=b);
by patient dateassigned;
if a and b;
run;
Thank you for your reply. I am given the ERROR: Variable patient has been defined as both character and numeric.
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.
Find more tutorials on the SAS Users YouTube channel.