Please post sample data. Makes it much easier to provide a usable code answer.
As @PeterClemmensen recommends please provide samples of both input datasets (table1 and table2), a sample of what you want in your output dataset based on the inputs, and the SAS code you have currently.
For example:
/* Create sample Table 1 data */
data work.table1 ;
format
date1 date7.
time1 time5. ;
infile cards ;
input
subject1 $3.
date1 : date7.
time1 : time5. ;
output
work.table1 ;
cards ;
101 30Jan20 11:15
201 31Jan20 10:15
;
/* Create sample Table 2 data */
data work.table2 ;
format
date2 date7.
time2 time5. ;
infile cards ;
input
subject2 $3.
date2 : date7.
time2 : time5. ;
output
work.table2 ;
cards ;
101 30Jan20 11:15
101 30Jan20 11:15
201 31Jan20 10:15
201 01Feb20 10:15
;
/* Create sample want data */
/* This is the resulting dataset I want */
/* Based on the input datasets table1 and table 2 */
data work.want ;
format
date date7.
time time5. ;
infile cards ;
input
subject $3.
date : date7.
time : time5. ;
output
work.want ;
cards ;
101 30Jan20 11:15
201 31Jan20 10:15
201 01Feb20 10:15
;
/* My code to generate the want dataset from the input datasets table1 and table2 */
proc sql ;
create table
work.joined as
select
table1.*,
table2.*
from
table1,
table2
where
table1.subject1=table2.subject2 and
table1.date1=table2.date2 and
table1.time1=table2.time2
;
quit ;
If it is a self join, I assume you are joining on something like
a.patient_id=b.patient_id and
a.visit_time = b.visit_time and
a.visit_text ne b.visit_text
If you do not want the output both ways, just put change the "not equal" to "greater than" or "smaller than", e.g.
a.patient_id=b.patient_id and
a.visit_time = b.visit_time and
a.visit_text < b.visit_text
Of course, if you have more than 2 visits on the same time, you will still get too many combinations. In that case, it may be simpler with a datastep, e.g.
proc sort data=visits;
by patient_id visit_time;
run;
data duplicates;
set visits;
by patient_id visit_time;
if not(first.visit_time and last.visit_time);
run;
You can use first.variable/last.variable to flag duplicate records. Depending on the VISIT values, you can consider sorting order (eg. subjectid date time descending visit)
Eg:
data have;
input subjectid $ visit $ date:date9. time:time5.;
format date date9. time time5.;
cards;
101 Screen 30Jan2020 11:15
101 Day1 30Jan2020 11:15
101 Week6 15Mar2020 12:15
101 Week12 01May2020 12:00
;
proc print;run;
proc sort data=have;
by subjectid date time descending visit;
run;
data want;
set have;
by subjectid date time;
length flag $20;
if not first.time then flag= 'duplicate date';
proc print;run;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.