Please post sample data. Makes it much easier to provide a usable code answer.
As @PeterClemmensen recommends please provide samples of both input datasets (table1 and table2), a sample of what you want in your output dataset based on the inputs, and the SAS code you have currently.
For example:
/* Create sample Table 1 data */
data work.table1 ;
format
date1 date7.
time1 time5. ;
infile cards ;
input
subject1 $3.
date1 : date7.
time1 : time5. ;
output
work.table1 ;
cards ;
101 30Jan20 11:15
201 31Jan20 10:15
;
/* Create sample Table 2 data */
data work.table2 ;
format
date2 date7.
time2 time5. ;
infile cards ;
input
subject2 $3.
date2 : date7.
time2 : time5. ;
output
work.table2 ;
cards ;
101 30Jan20 11:15
101 30Jan20 11:15
201 31Jan20 10:15
201 01Feb20 10:15
;
/* Create sample want data */
/* This is the resulting dataset I want */
/* Based on the input datasets table1 and table 2 */
data work.want ;
format
date date7.
time time5. ;
infile cards ;
input
subject $3.
date : date7.
time : time5. ;
output
work.want ;
cards ;
101 30Jan20 11:15
201 31Jan20 10:15
201 01Feb20 10:15
;
/* My code to generate the want dataset from the input datasets table1 and table2 */
proc sql ;
create table
work.joined as
select
table1.*,
table2.*
from
table1,
table2
where
table1.subject1=table2.subject2 and
table1.date1=table2.date2 and
table1.time1=table2.time2
;
quit ;
If it is a self join, I assume you are joining on something like
a.patient_id=b.patient_id and
a.visit_time = b.visit_time and
a.visit_text ne b.visit_text
If you do not want the output both ways, just put change the "not equal" to "greater than" or "smaller than", e.g.
a.patient_id=b.patient_id and
a.visit_time = b.visit_time and
a.visit_text < b.visit_text
Of course, if you have more than 2 visits on the same time, you will still get too many combinations. In that case, it may be simpler with a datastep, e.g.
proc sort data=visits;
by patient_id visit_time;
run;
data duplicates;
set visits;
by patient_id visit_time;
if not(first.visit_time and last.visit_time);
run;
You can use first.variable/last.variable to flag duplicate records. Depending on the VISIT values, you can consider sorting order (eg. subjectid date time descending visit)
Eg:
data have;
input subjectid $ visit $ date:date9. time:time5.;
format date date9. time time5.;
cards;
101 Screen 30Jan2020 11:15
101 Day1 30Jan2020 11:15
101 Week6 15Mar2020 12:15
101 Week12 01May2020 12:00
;
proc print;run;
proc sort data=have;
by subjectid date time descending visit;
run;
data want;
set have;
by subjectid date time;
length flag $20;
if not first.time then flag= 'duplicate date';
proc print;run;
Registration is open! SAS is returning to Vegas for an AI and analytics experience like no other! Whether you're an executive, manager, end user or SAS partner, SAS Innovate is designed for everyone on your team. Register for just $495 by 12/31/2023.
If you are interested in speaking, there is still time to submit a session idea. More details are posted on the website.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.