BookmarkSubscribeRSS Feed
vidya3
Calcite | Level 5
Hi,

Kindly help as this is very urgent.

There are two fields to be considered, date and time. If they are both duplicates across any 2 visits in a subject, then it must fire in the final output.

The issue here is, I tried self join but many to many join is occurring. For instance, for subject 101, Screening visit from table 1 (date = 30Jan2020 and time=11:15) is getting matched with Day 1 visit from table 2 (date = 30Jan2020 and time=11:15) which is the 1st row in the final output, but I'm also getting another 2nd row where Day 1 visit from table 1 (date = 30Jan2020 and time=11:15) is getting matched with Screening visit from table 2 (date = 30Jan2020 and time=11:15) . I'm expecting just one record whereas I'm getting vice versa scenarios, how do I avoid the vice versa scenario here and get just 1 record in the output?

Thanks in advance. Kindly help.
5 REPLIES 5
PeterClemmensen
Tourmaline | Level 20

Please post sample data. Makes it much easier to provide a usable code answer.

AMSAS
SAS Super FREQ

As @PeterClemmensen recommends please provide samples of both input datasets (table1 and table2), a sample of what you want in your output dataset based on the inputs, and the SAS code you have currently.

For example:

/* Create sample Table 1 data */
data work.table1 ;
	format 
		date1 date7. 
		time1 time5. ;
	infile cards ;
	input 
		subject1      $3.
		date1    : date7.
		time1    : time5. ;
	output 
		work.table1 ;
cards ;
101 30Jan20 11:15
201 31Jan20 10:15
;

/* Create sample Table 2 data */
data work.table2 ;
	format 
		date2 date7. 
		time2 time5. ;
	infile cards ;
	input 
		subject2      $3.
		date2    : date7.
		time2    : time5. ;
	output 
		work.table2 ;
cards ;
101 30Jan20 11:15
101 30Jan20 11:15
201 31Jan20 10:15
201 01Feb20 10:15
;

/* Create sample want data                        */
/* This is the resulting dataset I want           */
/* Based on the input datasets table1 and table 2 */
data work.want ;
	format 
		date date7. 
		time time5. ;
	infile cards ;
	input 
		subject      $3.
		date    : date7.
		time    : time5. ;
	output 
		work.want ;
cards ;
101 30Jan20 11:15
201 31Jan20 10:15
201 01Feb20 10:15
;

/* My code to generate the want dataset from the input datasets table1 and table2 */
proc sql ;
	create table 
		work.joined as
	select 
		table1.*,
		table2.*
	from
		table1, 
		table2 
	where
		table1.subject1=table2.subject2 and 
		table1.date1=table2.date2 and 
		table1.time1=table2.time2 
	;
		
quit ;
s_lassen
Meteorite | Level 14

If it is a self join, I assume you are joining on something like

a.patient_id=b.patient_id and
a.visit_time = b.visit_time and 
a.visit_text ne b.visit_text

If you do not want the output both ways, just put change the "not equal" to "greater than" or "smaller than", e.g.

a.patient_id=b.patient_id and
a.visit_time = b.visit_time and 
a.visit_text < b.visit_text

 

s_lassen
Meteorite | Level 14

Of course, if you have more than 2 visits on the same time, you will still get too many combinations. In that case, it may be simpler with a datastep, e.g.

proc sort data=visits;
  by patient_id visit_time;
run;

data duplicates;
  set visits;
  by patient_id visit_time;
  if not(first.visit_time and last.visit_time);
run;
A_Kh
Lapis Lazuli | Level 10

You can use first.variable/last.variable to flag duplicate records. Depending on the VISIT values, you can consider sorting order (eg. subjectid date time descending visit)

Eg:

data have;
	input subjectid $ visit $ date:date9. time:time5.;
	format date date9. time time5.;
	cards;
101 Screen	30Jan2020 11:15
101 Day1 30Jan2020 11:15
101 Week6 15Mar2020 12:15
101 Week12 01May2020 12:00
;
proc print;run;

proc sort data=have;
	by subjectid date time descending visit;
run; 

data want;
	set have;
	by subjectid date time;
	length flag $20;
	if not first.time then flag= 'duplicate date';
proc print;run; 

Capture.PNG 

sas-innovate-wordmark-2025-midnight.png

Register Today!

Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.


Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 5 replies
  • 882 views
  • 0 likes
  • 5 in conversation