I have 2 datasets that I need to merge, but I'm stuck in this situation because there is a repeat in the by variables group that I can't ignore.
I have a dosing table that I need to merge with an external lab table, but I only want to keep where the subject and visit are found in the dosing table.
So, for example below, there is dosing at Day -28 and Day 1, so I only want to keep those visits from EXT (external lab table). That's easy enough. The issue arises in the fact that at Day -28 there is only one drug administered while at Day 1 there are 2 drugs administered. In addition to that, the external types of ECG and SP only have one record per visit while SWC has 2 records per visit (due to left and right laterality).
proc sql;
	create table DOSEF0 (SUBJID char(11), EVENT char(100), EVENTDT char(10), EXFTRT char(50), EXDTC char(16));
	insert into dosef0 (subjid, event, eventdt, exftrt, exdtc)
		values ('102-087-201','Day -28','2021-10-06','Elexacaftor/Tezacaftor/Ivacaftor','2021-10-06T09:41')
		values ('102-087-201','Day 1','2021-11-03','VX-121/Tezacaftor/Deutivacaftor','2021-11-03T15:00')
		values ('102-087-201','Day 1','2021-11-03','Elexacaftor/Tezacaftor/Ivacaftor','2021-11-03T15:25');
	create table EXT (SUBJID char(11), EVENT char(100), DATA_TYPE char(3), SWLAT char(5), EXTSTDTTM char(16), EXTENDTTM char(16));
	insert into ext (subjid, event, data_type, swlat, extstdttm, extendttm)
		values ('102-087-201','Day -28',	'ECG','',		'2021-10-06T09:06','')
		values ('102-087-201','Screening',	'ECG','',		'2021-09-14T09:34','')
		values ('102-087-201','Screening',	'SWC','LEFT',	'2021-09-14T09:55','2021-09-14T10:09')
		values ('102-087-201','Screening',	'SWC','RIGHT',	'2021-09-14T10:05','2021-09-14T10:20')
		values ('102-087-201','Screening',	'SP', '',		'2021-09-14T10:24','')
		values ('102-087-201','Day 1',		'ECG','',		'2021-11-03T09:15','')
		values ('102-087-201','Day 1',		'SP', '',		'2021-11-03T09:27','')
		values ('102-087-201','Day 1',		'SWC','LEFT',	'2021-11-03T09:40','2021-11-03T15:00')
		values ('102-087-201','Day 1',		'SWC','RIGHT',	'2021-11-03T09:47','2021-11-03T15:25')
		values ('102-087-201','Day 15',		'ECG','',		'2021-11-18T08:49','')
		values ('102-087-201','Day 15',		'SP', '',		'2021-11-18T09:02','')
		values ('102-087-201','Day 15',		'SWC','LEFT',	'2021-11-18T09:13','2021-11-18T09:31')
		values ('102-087-201','Day 15',		'SWC','RIGHT',	'2021-11-18T09:20','2021-11-18T09:45');
quit;
So, from the code above, how can I get the table below? I have tried pulling in EXFTRT into EXT separately without success. I have tried different cartesian products without success. I'm just not sure how to go about this.
| SUBJID | EVENT | EVENTDT | EXFTRT | EXDTC | DATA_TYPE | SWLAT | EXSTDTTM | EXTENDTTM | 
| 102-087-201 | Day -28 | 2021-10-06 | Elexacaftor/Tezacaftor/Ivacaftor | 2021-10-06T09:41 | ECG | 2021-10-06T09:06 | ||
| 102-087-201 | Day 1 | 2021-11-03 | VX-121/Tezacaftor/Deutivacaftor | 2021-11-03T15:00 | ECG | 2021-11-03T09:15 | ||
| 102-087-201 | Day 1 | 2021-11-03 | Elexacaftor/Tezacaftor/Ivacaftor | 2021-11-03T15:25 | ECG | 2021-11-03T09:15 | ||
| 102-087-201 | Day 1 | 2021-11-03 | VX-121/Tezacaftor/Deutivacaftor | 2021-11-03T15:00 | SP | 2021-11-03T09:27 | ||
| 102-087-201 | Day 1 | 2021-11-03 | Elexacaftor/Tezacaftor/Ivacaftor | 2021-11-03T15:25 | SP | 2021-11-03T09:27 | ||
| 102-087-201 | Day 1 | 2021-11-03 | VX-121/Tezacaftor/Deutivacaftor | 2021-11-03T15:00 | SWC | Left | 2021-11-03T09:40 | 2021-11-03T15:00 | 
| 102-087-201 | Day 1 | 2021-11-03 | Elexacaftor/Tezacaftor/Ivacaftor | 2021-11-03T15:25 | SWC | Right | 2021-11-03T09:47 | 2021-11-03T15:25 | 
| 102-087-201 | Day 1 | 2021-11-03 | VX-121/Tezacaftor/Deutivacaftor | 2021-11-03T15:00 | SWC | Left | 2021-11-03T09:40 | 2021-11-03T15:00 | 
| 102-087-201 | Day 1 | 2021-11-03 | Elexacaftor/Tezacaftor/Ivacaftor | 2021-11-03T15:25 | SWC | Right | 2021-11-03T09:47 | 2021-11-03T15:25 | 
Are your dates and datetime values really stored as character?
I think a simple left join like ( add an ORDER BY if you need a different order)
Proc sql;
   create table want as
   select a.subjid, a.event, a.eventdt, a.exftrt, a.exdtc
            , b.data_type, b.swlat, b.extstdttm, b.extendttm
   from dosef0 as a
           left join
           ext as b
           on a.subjid= b.subjid
             and a.event=b.event
   ;
quit;
@djbateman wrote:
I have 2 datasets that I need to merge, but I'm stuck in this situation because there is a repeat in the by variables group that I can't ignore.
I have a dosing table that I need to merge with an external lab table, but I only want to keep where the subject and visit are found in the dosing table.
So, for example below, there is dosing at Day -28 and Day 1, so I only want to keep those visits from EXT (external lab table). That's easy enough. The issue arises in the fact that at Day -28 there is only one drug administered while at Day 1 there are 2 drugs administered. In addition to that, the external types of ECG and SP only have one record per visit while SWC has 2 records per visit (due to left and right laterality).
proc sql; create table DOSEF0 (SUBJID char(11), EVENT char(100), EVENTDT char(10), EXFTRT char(50), EXDTC char(16)); insert into dosef0 (subjid, event, eventdt, exftrt, exdtc) values ('102-087-201','Day -28','2021-10-06','Elexacaftor/Tezacaftor/Ivacaftor','2021-10-06T09:41') values ('102-087-201','Day 1','2021-11-03','VX-121/Tezacaftor/Deutivacaftor','2021-11-03T15:00') values ('102-087-201','Day 1','2021-11-03','Elexacaftor/Tezacaftor/Ivacaftor','2021-11-03T15:25'); create table EXT (SUBJID char(11), EVENT char(100), DATA_TYPE char(3), SWLAT char(5), EXTSTDTTM char(16), EXTENDTTM char(16)); insert into ext (subjid, event, data_type, swlat, extstdttm, extendttm) values ('102-087-201','Day -28', 'ECG','', '2021-10-06T09:06','') values ('102-087-201','Screening', 'ECG','', '2021-09-14T09:34','') values ('102-087-201','Screening', 'SWC','LEFT', '2021-09-14T09:55','2021-09-14T10:09') values ('102-087-201','Screening', 'SWC','RIGHT', '2021-09-14T10:05','2021-09-14T10:20') values ('102-087-201','Screening', 'SP', '', '2021-09-14T10:24','') values ('102-087-201','Day 1', 'ECG','', '2021-11-03T09:15','') values ('102-087-201','Day 1', 'SP', '', '2021-11-03T09:27','') values ('102-087-201','Day 1', 'SWC','LEFT', '2021-11-03T09:40','2021-11-03T15:00') values ('102-087-201','Day 1', 'SWC','RIGHT', '2021-11-03T09:47','2021-11-03T15:25') values ('102-087-201','Day 15', 'ECG','', '2021-11-18T08:49','') values ('102-087-201','Day 15', 'SP', '', '2021-11-18T09:02','') values ('102-087-201','Day 15', 'SWC','LEFT', '2021-11-18T09:13','2021-11-18T09:31') values ('102-087-201','Day 15', 'SWC','RIGHT', '2021-11-18T09:20','2021-11-18T09:45'); quit;
So, from the code above, how can I get the table below? I have tried pulling in EXFTRT into EXT separately without success. I have tried different cartesian products without success. I'm just not sure how to go about this.
SUBJID EVENT EVENTDT EXFTRT EXDTC DATA_TYPE SWLAT EXSTDTTM EXTENDTTM 102-087-201 Day -28 2021-10-06 Elexacaftor/Tezacaftor/Ivacaftor 2021-10-06T09:41 ECG 2021-10-06T09:06 102-087-201 Day 1 2021-11-03 VX-121/Tezacaftor/Deutivacaftor 2021-11-03T15:00 ECG 2021-11-03T09:15 102-087-201 Day 1 2021-11-03 Elexacaftor/Tezacaftor/Ivacaftor 2021-11-03T15:25 ECG 2021-11-03T09:15 102-087-201 Day 1 2021-11-03 VX-121/Tezacaftor/Deutivacaftor 2021-11-03T15:00 SP 2021-11-03T09:27 102-087-201 Day 1 2021-11-03 Elexacaftor/Tezacaftor/Ivacaftor 2021-11-03T15:25 SP 2021-11-03T09:27 102-087-201 Day 1 2021-11-03 VX-121/Tezacaftor/Deutivacaftor 2021-11-03T15:00 SWC Left 2021-11-03T09:40 2021-11-03T15:00 102-087-201 Day 1 2021-11-03 Elexacaftor/Tezacaftor/Ivacaftor 2021-11-03T15:25 SWC Right 2021-11-03T09:47 2021-11-03T15:25 102-087-201 Day 1 2021-11-03 VX-121/Tezacaftor/Deutivacaftor 2021-11-03T15:00 SWC Left 2021-11-03T09:40 2021-11-03T15:00 102-087-201 Day 1 2021-11-03 Elexacaftor/Tezacaftor/Ivacaftor 2021-11-03T15:25 SWC Right 2021-11-03T09:47 2021-11-03T15:25 
Are your dates and datetime values really stored as character?
I think a simple left join like ( add an ORDER BY if you need a different order)
Proc sql;
   create table want as
   select a.subjid, a.event, a.eventdt, a.exftrt, a.exdtc
            , b.data_type, b.swlat, b.extstdttm, b.extendttm
   from dosef0 as a
           left join
           ext as b
           on a.subjid= b.subjid
             and a.event=b.event
   ;
quit;
@djbateman wrote:
I have 2 datasets that I need to merge, but I'm stuck in this situation because there is a repeat in the by variables group that I can't ignore.
I have a dosing table that I need to merge with an external lab table, but I only want to keep where the subject and visit are found in the dosing table.
So, for example below, there is dosing at Day -28 and Day 1, so I only want to keep those visits from EXT (external lab table). That's easy enough. The issue arises in the fact that at Day -28 there is only one drug administered while at Day 1 there are 2 drugs administered. In addition to that, the external types of ECG and SP only have one record per visit while SWC has 2 records per visit (due to left and right laterality).
proc sql; create table DOSEF0 (SUBJID char(11), EVENT char(100), EVENTDT char(10), EXFTRT char(50), EXDTC char(16)); insert into dosef0 (subjid, event, eventdt, exftrt, exdtc) values ('102-087-201','Day -28','2021-10-06','Elexacaftor/Tezacaftor/Ivacaftor','2021-10-06T09:41') values ('102-087-201','Day 1','2021-11-03','VX-121/Tezacaftor/Deutivacaftor','2021-11-03T15:00') values ('102-087-201','Day 1','2021-11-03','Elexacaftor/Tezacaftor/Ivacaftor','2021-11-03T15:25'); create table EXT (SUBJID char(11), EVENT char(100), DATA_TYPE char(3), SWLAT char(5), EXTSTDTTM char(16), EXTENDTTM char(16)); insert into ext (subjid, event, data_type, swlat, extstdttm, extendttm) values ('102-087-201','Day -28', 'ECG','', '2021-10-06T09:06','') values ('102-087-201','Screening', 'ECG','', '2021-09-14T09:34','') values ('102-087-201','Screening', 'SWC','LEFT', '2021-09-14T09:55','2021-09-14T10:09') values ('102-087-201','Screening', 'SWC','RIGHT', '2021-09-14T10:05','2021-09-14T10:20') values ('102-087-201','Screening', 'SP', '', '2021-09-14T10:24','') values ('102-087-201','Day 1', 'ECG','', '2021-11-03T09:15','') values ('102-087-201','Day 1', 'SP', '', '2021-11-03T09:27','') values ('102-087-201','Day 1', 'SWC','LEFT', '2021-11-03T09:40','2021-11-03T15:00') values ('102-087-201','Day 1', 'SWC','RIGHT', '2021-11-03T09:47','2021-11-03T15:25') values ('102-087-201','Day 15', 'ECG','', '2021-11-18T08:49','') values ('102-087-201','Day 15', 'SP', '', '2021-11-18T09:02','') values ('102-087-201','Day 15', 'SWC','LEFT', '2021-11-18T09:13','2021-11-18T09:31') values ('102-087-201','Day 15', 'SWC','RIGHT', '2021-11-18T09:20','2021-11-18T09:45'); quit;
So, from the code above, how can I get the table below? I have tried pulling in EXFTRT into EXT separately without success. I have tried different cartesian products without success. I'm just not sure how to go about this.
SUBJID EVENT EVENTDT EXFTRT EXDTC DATA_TYPE SWLAT EXSTDTTM EXTENDTTM 102-087-201 Day -28 2021-10-06 Elexacaftor/Tezacaftor/Ivacaftor 2021-10-06T09:41 ECG 2021-10-06T09:06 102-087-201 Day 1 2021-11-03 VX-121/Tezacaftor/Deutivacaftor 2021-11-03T15:00 ECG 2021-11-03T09:15 102-087-201 Day 1 2021-11-03 Elexacaftor/Tezacaftor/Ivacaftor 2021-11-03T15:25 ECG 2021-11-03T09:15 102-087-201 Day 1 2021-11-03 VX-121/Tezacaftor/Deutivacaftor 2021-11-03T15:00 SP 2021-11-03T09:27 102-087-201 Day 1 2021-11-03 Elexacaftor/Tezacaftor/Ivacaftor 2021-11-03T15:25 SP 2021-11-03T09:27 102-087-201 Day 1 2021-11-03 VX-121/Tezacaftor/Deutivacaftor 2021-11-03T15:00 SWC Left 2021-11-03T09:40 2021-11-03T15:00 102-087-201 Day 1 2021-11-03 Elexacaftor/Tezacaftor/Ivacaftor 2021-11-03T15:25 SWC Right 2021-11-03T09:47 2021-11-03T15:25 102-087-201 Day 1 2021-11-03 VX-121/Tezacaftor/Deutivacaftor 2021-11-03T15:00 SWC Left 2021-11-03T09:40 2021-11-03T15:00 102-087-201 Day 1 2021-11-03 Elexacaftor/Tezacaftor/Ivacaftor 2021-11-03T15:25 SWC Right 2021-11-03T09:47 2021-11-03T15:25 
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.
