BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
djbateman
Lapis Lazuli | Level 10

I have 2 datasets that I need to merge, but I'm stuck in this situation because there is a repeat in the by variables group that I can't ignore.

 

I have a dosing table that I need to merge with an external lab table, but I only want to keep where the subject and visit are found in the dosing table.

 

So, for example below, there is dosing at Day -28 and Day 1, so I only want to keep those visits from EXT (external lab table).  That's easy enough.  The issue arises in the fact that at Day -28 there is only one drug administered while at Day 1 there are 2 drugs administered.  In addition to that, the external types of ECG and SP only have one record per visit while SWC has 2 records per visit (due to left and right laterality).

 

 

proc sql;
	create table DOSEF0 (SUBJID char(11), EVENT char(100), EVENTDT char(10), EXFTRT char(50), EXDTC char(16));
	insert into dosef0 (subjid, event, eventdt, exftrt, exdtc)
		values ('102-087-201','Day -28','2021-10-06','Elexacaftor/Tezacaftor/Ivacaftor','2021-10-06T09:41')
		values ('102-087-201','Day 1','2021-11-03','VX-121/Tezacaftor/Deutivacaftor','2021-11-03T15:00')
		values ('102-087-201','Day 1','2021-11-03','Elexacaftor/Tezacaftor/Ivacaftor','2021-11-03T15:25');

	create table EXT (SUBJID char(11), EVENT char(100), DATA_TYPE char(3), SWLAT char(5), EXTSTDTTM char(16), EXTENDTTM char(16));
	insert into ext (subjid, event, data_type, swlat, extstdttm, extendttm)
		values ('102-087-201','Day -28',	'ECG','',		'2021-10-06T09:06','')
		values ('102-087-201','Screening',	'ECG','',		'2021-09-14T09:34','')
		values ('102-087-201','Screening',	'SWC','LEFT',	'2021-09-14T09:55','2021-09-14T10:09')
		values ('102-087-201','Screening',	'SWC','RIGHT',	'2021-09-14T10:05','2021-09-14T10:20')
		values ('102-087-201','Screening',	'SP', '',		'2021-09-14T10:24','')
		values ('102-087-201','Day 1',		'ECG','',		'2021-11-03T09:15','')
		values ('102-087-201','Day 1',		'SP', '',		'2021-11-03T09:27','')
		values ('102-087-201','Day 1',		'SWC','LEFT',	'2021-11-03T09:40','2021-11-03T15:00')
		values ('102-087-201','Day 1',		'SWC','RIGHT',	'2021-11-03T09:47','2021-11-03T15:25')
		values ('102-087-201','Day 15',		'ECG','',		'2021-11-18T08:49','')
		values ('102-087-201','Day 15',		'SP', '',		'2021-11-18T09:02','')
		values ('102-087-201','Day 15',		'SWC','LEFT',	'2021-11-18T09:13','2021-11-18T09:31')
		values ('102-087-201','Day 15',		'SWC','RIGHT',	'2021-11-18T09:20','2021-11-18T09:45');
quit;

 

So, from the code above, how can I get the table below?  I have tried pulling in EXFTRT into EXT separately without success.  I have tried different cartesian products without success.  I'm just not sure how to go about this.

 
SUBJID EVENT EVENTDT EXFTRT EXDTC DATA_TYPE SWLAT EXSTDTTM EXTENDTTM
102-087-201 Day -28 2021-10-06 Elexacaftor/Tezacaftor/Ivacaftor 2021-10-06T09:41 ECG   2021-10-06T09:06  
102-087-201 Day 1 2021-11-03 VX-121/Tezacaftor/Deutivacaftor 2021-11-03T15:00 ECG   2021-11-03T09:15  
102-087-201 Day 1 2021-11-03 Elexacaftor/Tezacaftor/Ivacaftor 2021-11-03T15:25 ECG   2021-11-03T09:15  
102-087-201 Day 1 2021-11-03 VX-121/Tezacaftor/Deutivacaftor 2021-11-03T15:00 SP   2021-11-03T09:27  
102-087-201 Day 1 2021-11-03 Elexacaftor/Tezacaftor/Ivacaftor 2021-11-03T15:25 SP   2021-11-03T09:27  
102-087-201 Day 1 2021-11-03 VX-121/Tezacaftor/Deutivacaftor 2021-11-03T15:00 SWC Left 2021-11-03T09:40 2021-11-03T15:00
102-087-201 Day 1 2021-11-03 Elexacaftor/Tezacaftor/Ivacaftor 2021-11-03T15:25 SWC Right 2021-11-03T09:47 2021-11-03T15:25
102-087-201 Day 1 2021-11-03 VX-121/Tezacaftor/Deutivacaftor 2021-11-03T15:00 SWC Left 2021-11-03T09:40 2021-11-03T15:00
102-087-201 Day 1 2021-11-03 Elexacaftor/Tezacaftor/Ivacaftor 2021-11-03T15:25 SWC Right 2021-11-03T09:47 2021-11-03T15:25

 

1 ACCEPTED SOLUTION

Accepted Solutions
ballardw
Super User

Are your dates and datetime values really stored as character?

 

I think a simple left join like ( add an ORDER BY if you need a different order)

Proc sql;
   create table want as
   select a.subjid, a.event, a.eventdt, a.exftrt, a.exdtc
            , b.data_type, b.swlat, b.extstdttm, b.extendttm
   from dosef0 as a
           left join
           ext as b
           on a.subjid= b.subjid
             and a.event=b.event
   ;
quit;

@djbateman wrote:

I have 2 datasets that I need to merge, but I'm stuck in this situation because there is a repeat in the by variables group that I can't ignore.

 

I have a dosing table that I need to merge with an external lab table, but I only want to keep where the subject and visit are found in the dosing table.

 

So, for example below, there is dosing at Day -28 and Day 1, so I only want to keep those visits from EXT (external lab table).  That's easy enough.  The issue arises in the fact that at Day -28 there is only one drug administered while at Day 1 there are 2 drugs administered.  In addition to that, the external types of ECG and SP only have one record per visit while SWC has 2 records per visit (due to left and right laterality).

 

 

proc sql;
	create table DOSEF0 (SUBJID char(11), EVENT char(100), EVENTDT char(10), EXFTRT char(50), EXDTC char(16));
	insert into dosef0 (subjid, event, eventdt, exftrt, exdtc)
		values ('102-087-201','Day -28','2021-10-06','Elexacaftor/Tezacaftor/Ivacaftor','2021-10-06T09:41')
		values ('102-087-201','Day 1','2021-11-03','VX-121/Tezacaftor/Deutivacaftor','2021-11-03T15:00')
		values ('102-087-201','Day 1','2021-11-03','Elexacaftor/Tezacaftor/Ivacaftor','2021-11-03T15:25');

	create table EXT (SUBJID char(11), EVENT char(100), DATA_TYPE char(3), SWLAT char(5), EXTSTDTTM char(16), EXTENDTTM char(16));
	insert into ext (subjid, event, data_type, swlat, extstdttm, extendttm)
		values ('102-087-201','Day -28',	'ECG','',		'2021-10-06T09:06','')
		values ('102-087-201','Screening',	'ECG','',		'2021-09-14T09:34','')
		values ('102-087-201','Screening',	'SWC','LEFT',	'2021-09-14T09:55','2021-09-14T10:09')
		values ('102-087-201','Screening',	'SWC','RIGHT',	'2021-09-14T10:05','2021-09-14T10:20')
		values ('102-087-201','Screening',	'SP', '',		'2021-09-14T10:24','')
		values ('102-087-201','Day 1',		'ECG','',		'2021-11-03T09:15','')
		values ('102-087-201','Day 1',		'SP', '',		'2021-11-03T09:27','')
		values ('102-087-201','Day 1',		'SWC','LEFT',	'2021-11-03T09:40','2021-11-03T15:00')
		values ('102-087-201','Day 1',		'SWC','RIGHT',	'2021-11-03T09:47','2021-11-03T15:25')
		values ('102-087-201','Day 15',		'ECG','',		'2021-11-18T08:49','')
		values ('102-087-201','Day 15',		'SP', '',		'2021-11-18T09:02','')
		values ('102-087-201','Day 15',		'SWC','LEFT',	'2021-11-18T09:13','2021-11-18T09:31')
		values ('102-087-201','Day 15',		'SWC','RIGHT',	'2021-11-18T09:20','2021-11-18T09:45');
quit;

 

So, from the code above, how can I get the table below?  I have tried pulling in EXFTRT into EXT separately without success.  I have tried different cartesian products without success.  I'm just not sure how to go about this.

 
SUBJID EVENT EVENTDT EXFTRT EXDTC DATA_TYPE SWLAT EXSTDTTM EXTENDTTM
102-087-201 Day -28 2021-10-06 Elexacaftor/Tezacaftor/Ivacaftor 2021-10-06T09:41 ECG   2021-10-06T09:06  
102-087-201 Day 1 2021-11-03 VX-121/Tezacaftor/Deutivacaftor 2021-11-03T15:00 ECG   2021-11-03T09:15  
102-087-201 Day 1 2021-11-03 Elexacaftor/Tezacaftor/Ivacaftor 2021-11-03T15:25 ECG   2021-11-03T09:15  
102-087-201 Day 1 2021-11-03 VX-121/Tezacaftor/Deutivacaftor 2021-11-03T15:00 SP   2021-11-03T09:27  
102-087-201 Day 1 2021-11-03 Elexacaftor/Tezacaftor/Ivacaftor 2021-11-03T15:25 SP   2021-11-03T09:27  
102-087-201 Day 1 2021-11-03 VX-121/Tezacaftor/Deutivacaftor 2021-11-03T15:00 SWC Left 2021-11-03T09:40 2021-11-03T15:00
102-087-201 Day 1 2021-11-03 Elexacaftor/Tezacaftor/Ivacaftor 2021-11-03T15:25 SWC Right 2021-11-03T09:47 2021-11-03T15:25
102-087-201 Day 1 2021-11-03 VX-121/Tezacaftor/Deutivacaftor 2021-11-03T15:00 SWC Left 2021-11-03T09:40 2021-11-03T15:00
102-087-201 Day 1 2021-11-03 Elexacaftor/Tezacaftor/Ivacaftor 2021-11-03T15:25 SWC Right 2021-11-03T09:47 2021-11-03T15:25

 


 

 

View solution in original post

2 REPLIES 2
ballardw
Super User

Are your dates and datetime values really stored as character?

 

I think a simple left join like ( add an ORDER BY if you need a different order)

Proc sql;
   create table want as
   select a.subjid, a.event, a.eventdt, a.exftrt, a.exdtc
            , b.data_type, b.swlat, b.extstdttm, b.extendttm
   from dosef0 as a
           left join
           ext as b
           on a.subjid= b.subjid
             and a.event=b.event
   ;
quit;

@djbateman wrote:

I have 2 datasets that I need to merge, but I'm stuck in this situation because there is a repeat in the by variables group that I can't ignore.

 

I have a dosing table that I need to merge with an external lab table, but I only want to keep where the subject and visit are found in the dosing table.

 

So, for example below, there is dosing at Day -28 and Day 1, so I only want to keep those visits from EXT (external lab table).  That's easy enough.  The issue arises in the fact that at Day -28 there is only one drug administered while at Day 1 there are 2 drugs administered.  In addition to that, the external types of ECG and SP only have one record per visit while SWC has 2 records per visit (due to left and right laterality).

 

 

proc sql;
	create table DOSEF0 (SUBJID char(11), EVENT char(100), EVENTDT char(10), EXFTRT char(50), EXDTC char(16));
	insert into dosef0 (subjid, event, eventdt, exftrt, exdtc)
		values ('102-087-201','Day -28','2021-10-06','Elexacaftor/Tezacaftor/Ivacaftor','2021-10-06T09:41')
		values ('102-087-201','Day 1','2021-11-03','VX-121/Tezacaftor/Deutivacaftor','2021-11-03T15:00')
		values ('102-087-201','Day 1','2021-11-03','Elexacaftor/Tezacaftor/Ivacaftor','2021-11-03T15:25');

	create table EXT (SUBJID char(11), EVENT char(100), DATA_TYPE char(3), SWLAT char(5), EXTSTDTTM char(16), EXTENDTTM char(16));
	insert into ext (subjid, event, data_type, swlat, extstdttm, extendttm)
		values ('102-087-201','Day -28',	'ECG','',		'2021-10-06T09:06','')
		values ('102-087-201','Screening',	'ECG','',		'2021-09-14T09:34','')
		values ('102-087-201','Screening',	'SWC','LEFT',	'2021-09-14T09:55','2021-09-14T10:09')
		values ('102-087-201','Screening',	'SWC','RIGHT',	'2021-09-14T10:05','2021-09-14T10:20')
		values ('102-087-201','Screening',	'SP', '',		'2021-09-14T10:24','')
		values ('102-087-201','Day 1',		'ECG','',		'2021-11-03T09:15','')
		values ('102-087-201','Day 1',		'SP', '',		'2021-11-03T09:27','')
		values ('102-087-201','Day 1',		'SWC','LEFT',	'2021-11-03T09:40','2021-11-03T15:00')
		values ('102-087-201','Day 1',		'SWC','RIGHT',	'2021-11-03T09:47','2021-11-03T15:25')
		values ('102-087-201','Day 15',		'ECG','',		'2021-11-18T08:49','')
		values ('102-087-201','Day 15',		'SP', '',		'2021-11-18T09:02','')
		values ('102-087-201','Day 15',		'SWC','LEFT',	'2021-11-18T09:13','2021-11-18T09:31')
		values ('102-087-201','Day 15',		'SWC','RIGHT',	'2021-11-18T09:20','2021-11-18T09:45');
quit;

 

So, from the code above, how can I get the table below?  I have tried pulling in EXFTRT into EXT separately without success.  I have tried different cartesian products without success.  I'm just not sure how to go about this.

 
SUBJID EVENT EVENTDT EXFTRT EXDTC DATA_TYPE SWLAT EXSTDTTM EXTENDTTM
102-087-201 Day -28 2021-10-06 Elexacaftor/Tezacaftor/Ivacaftor 2021-10-06T09:41 ECG   2021-10-06T09:06  
102-087-201 Day 1 2021-11-03 VX-121/Tezacaftor/Deutivacaftor 2021-11-03T15:00 ECG   2021-11-03T09:15  
102-087-201 Day 1 2021-11-03 Elexacaftor/Tezacaftor/Ivacaftor 2021-11-03T15:25 ECG   2021-11-03T09:15  
102-087-201 Day 1 2021-11-03 VX-121/Tezacaftor/Deutivacaftor 2021-11-03T15:00 SP   2021-11-03T09:27  
102-087-201 Day 1 2021-11-03 Elexacaftor/Tezacaftor/Ivacaftor 2021-11-03T15:25 SP   2021-11-03T09:27  
102-087-201 Day 1 2021-11-03 VX-121/Tezacaftor/Deutivacaftor 2021-11-03T15:00 SWC Left 2021-11-03T09:40 2021-11-03T15:00
102-087-201 Day 1 2021-11-03 Elexacaftor/Tezacaftor/Ivacaftor 2021-11-03T15:25 SWC Right 2021-11-03T09:47 2021-11-03T15:25
102-087-201 Day 1 2021-11-03 VX-121/Tezacaftor/Deutivacaftor 2021-11-03T15:00 SWC Left 2021-11-03T09:40 2021-11-03T15:00
102-087-201 Day 1 2021-11-03 Elexacaftor/Tezacaftor/Ivacaftor 2021-11-03T15:25 SWC Right 2021-11-03T09:47 2021-11-03T15:25

 


 

 

djbateman
Lapis Lazuli | Level 10
Oh man! I swear I tried a left join. Maybe I just thought I did but never tested it because I couldn't imagine the result in my head. Thank you so much!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 2 replies
  • 1931 views
  • 0 likes
  • 2 in conversation