I have 2 datasets that I need to merge, but I'm stuck in this situation because there is a repeat in the by variables group that I can't ignore.
I have a dosing table that I need to merge with an external lab table, but I only want to keep where the subject and visit are found in the dosing table.
So, for example below, there is dosing at Day -28 and Day 1, so I only want to keep those visits from EXT (external lab table). That's easy enough. The issue arises in the fact that at Day -28 there is only one drug administered while at Day 1 there are 2 drugs administered. In addition to that, the external types of ECG and SP only have one record per visit while SWC has 2 records per visit (due to left and right laterality).
proc sql;
create table DOSEF0 (SUBJID char(11), EVENT char(100), EVENTDT char(10), EXFTRT char(50), EXDTC char(16));
insert into dosef0 (subjid, event, eventdt, exftrt, exdtc)
values ('102-087-201','Day -28','2021-10-06','Elexacaftor/Tezacaftor/Ivacaftor','2021-10-06T09:41')
values ('102-087-201','Day 1','2021-11-03','VX-121/Tezacaftor/Deutivacaftor','2021-11-03T15:00')
values ('102-087-201','Day 1','2021-11-03','Elexacaftor/Tezacaftor/Ivacaftor','2021-11-03T15:25');
create table EXT (SUBJID char(11), EVENT char(100), DATA_TYPE char(3), SWLAT char(5), EXTSTDTTM char(16), EXTENDTTM char(16));
insert into ext (subjid, event, data_type, swlat, extstdttm, extendttm)
values ('102-087-201','Day -28', 'ECG','', '2021-10-06T09:06','')
values ('102-087-201','Screening', 'ECG','', '2021-09-14T09:34','')
values ('102-087-201','Screening', 'SWC','LEFT', '2021-09-14T09:55','2021-09-14T10:09')
values ('102-087-201','Screening', 'SWC','RIGHT', '2021-09-14T10:05','2021-09-14T10:20')
values ('102-087-201','Screening', 'SP', '', '2021-09-14T10:24','')
values ('102-087-201','Day 1', 'ECG','', '2021-11-03T09:15','')
values ('102-087-201','Day 1', 'SP', '', '2021-11-03T09:27','')
values ('102-087-201','Day 1', 'SWC','LEFT', '2021-11-03T09:40','2021-11-03T15:00')
values ('102-087-201','Day 1', 'SWC','RIGHT', '2021-11-03T09:47','2021-11-03T15:25')
values ('102-087-201','Day 15', 'ECG','', '2021-11-18T08:49','')
values ('102-087-201','Day 15', 'SP', '', '2021-11-18T09:02','')
values ('102-087-201','Day 15', 'SWC','LEFT', '2021-11-18T09:13','2021-11-18T09:31')
values ('102-087-201','Day 15', 'SWC','RIGHT', '2021-11-18T09:20','2021-11-18T09:45');
quit;
So, from the code above, how can I get the table below? I have tried pulling in EXFTRT into EXT separately without success. I have tried different cartesian products without success. I'm just not sure how to go about this.
SUBJID | EVENT | EVENTDT | EXFTRT | EXDTC | DATA_TYPE | SWLAT | EXSTDTTM | EXTENDTTM |
102-087-201 | Day -28 | 2021-10-06 | Elexacaftor/Tezacaftor/Ivacaftor | 2021-10-06T09:41 | ECG | 2021-10-06T09:06 | ||
102-087-201 | Day 1 | 2021-11-03 | VX-121/Tezacaftor/Deutivacaftor | 2021-11-03T15:00 | ECG | 2021-11-03T09:15 | ||
102-087-201 | Day 1 | 2021-11-03 | Elexacaftor/Tezacaftor/Ivacaftor | 2021-11-03T15:25 | ECG | 2021-11-03T09:15 | ||
102-087-201 | Day 1 | 2021-11-03 | VX-121/Tezacaftor/Deutivacaftor | 2021-11-03T15:00 | SP | 2021-11-03T09:27 | ||
102-087-201 | Day 1 | 2021-11-03 | Elexacaftor/Tezacaftor/Ivacaftor | 2021-11-03T15:25 | SP | 2021-11-03T09:27 | ||
102-087-201 | Day 1 | 2021-11-03 | VX-121/Tezacaftor/Deutivacaftor | 2021-11-03T15:00 | SWC | Left | 2021-11-03T09:40 | 2021-11-03T15:00 |
102-087-201 | Day 1 | 2021-11-03 | Elexacaftor/Tezacaftor/Ivacaftor | 2021-11-03T15:25 | SWC | Right | 2021-11-03T09:47 | 2021-11-03T15:25 |
102-087-201 | Day 1 | 2021-11-03 | VX-121/Tezacaftor/Deutivacaftor | 2021-11-03T15:00 | SWC | Left | 2021-11-03T09:40 | 2021-11-03T15:00 |
102-087-201 | Day 1 | 2021-11-03 | Elexacaftor/Tezacaftor/Ivacaftor | 2021-11-03T15:25 | SWC | Right | 2021-11-03T09:47 | 2021-11-03T15:25 |
Are your dates and datetime values really stored as character?
I think a simple left join like ( add an ORDER BY if you need a different order)
Proc sql; create table want as select a.subjid, a.event, a.eventdt, a.exftrt, a.exdtc , b.data_type, b.swlat, b.extstdttm, b.extendttm from dosef0 as a left join ext as b on a.subjid= b.subjid and a.event=b.event ; quit;
@djbateman wrote:
I have 2 datasets that I need to merge, but I'm stuck in this situation because there is a repeat in the by variables group that I can't ignore.
I have a dosing table that I need to merge with an external lab table, but I only want to keep where the subject and visit are found in the dosing table.
So, for example below, there is dosing at Day -28 and Day 1, so I only want to keep those visits from EXT (external lab table). That's easy enough. The issue arises in the fact that at Day -28 there is only one drug administered while at Day 1 there are 2 drugs administered. In addition to that, the external types of ECG and SP only have one record per visit while SWC has 2 records per visit (due to left and right laterality).
proc sql; create table DOSEF0 (SUBJID char(11), EVENT char(100), EVENTDT char(10), EXFTRT char(50), EXDTC char(16)); insert into dosef0 (subjid, event, eventdt, exftrt, exdtc) values ('102-087-201','Day -28','2021-10-06','Elexacaftor/Tezacaftor/Ivacaftor','2021-10-06T09:41') values ('102-087-201','Day 1','2021-11-03','VX-121/Tezacaftor/Deutivacaftor','2021-11-03T15:00') values ('102-087-201','Day 1','2021-11-03','Elexacaftor/Tezacaftor/Ivacaftor','2021-11-03T15:25'); create table EXT (SUBJID char(11), EVENT char(100), DATA_TYPE char(3), SWLAT char(5), EXTSTDTTM char(16), EXTENDTTM char(16)); insert into ext (subjid, event, data_type, swlat, extstdttm, extendttm) values ('102-087-201','Day -28', 'ECG','', '2021-10-06T09:06','') values ('102-087-201','Screening', 'ECG','', '2021-09-14T09:34','') values ('102-087-201','Screening', 'SWC','LEFT', '2021-09-14T09:55','2021-09-14T10:09') values ('102-087-201','Screening', 'SWC','RIGHT', '2021-09-14T10:05','2021-09-14T10:20') values ('102-087-201','Screening', 'SP', '', '2021-09-14T10:24','') values ('102-087-201','Day 1', 'ECG','', '2021-11-03T09:15','') values ('102-087-201','Day 1', 'SP', '', '2021-11-03T09:27','') values ('102-087-201','Day 1', 'SWC','LEFT', '2021-11-03T09:40','2021-11-03T15:00') values ('102-087-201','Day 1', 'SWC','RIGHT', '2021-11-03T09:47','2021-11-03T15:25') values ('102-087-201','Day 15', 'ECG','', '2021-11-18T08:49','') values ('102-087-201','Day 15', 'SP', '', '2021-11-18T09:02','') values ('102-087-201','Day 15', 'SWC','LEFT', '2021-11-18T09:13','2021-11-18T09:31') values ('102-087-201','Day 15', 'SWC','RIGHT', '2021-11-18T09:20','2021-11-18T09:45'); quit;
So, from the code above, how can I get the table below? I have tried pulling in EXFTRT into EXT separately without success. I have tried different cartesian products without success. I'm just not sure how to go about this.
SUBJID EVENT EVENTDT EXFTRT EXDTC DATA_TYPE SWLAT EXSTDTTM EXTENDTTM 102-087-201 Day -28 2021-10-06 Elexacaftor/Tezacaftor/Ivacaftor 2021-10-06T09:41 ECG 2021-10-06T09:06 102-087-201 Day 1 2021-11-03 VX-121/Tezacaftor/Deutivacaftor 2021-11-03T15:00 ECG 2021-11-03T09:15 102-087-201 Day 1 2021-11-03 Elexacaftor/Tezacaftor/Ivacaftor 2021-11-03T15:25 ECG 2021-11-03T09:15 102-087-201 Day 1 2021-11-03 VX-121/Tezacaftor/Deutivacaftor 2021-11-03T15:00 SP 2021-11-03T09:27 102-087-201 Day 1 2021-11-03 Elexacaftor/Tezacaftor/Ivacaftor 2021-11-03T15:25 SP 2021-11-03T09:27 102-087-201 Day 1 2021-11-03 VX-121/Tezacaftor/Deutivacaftor 2021-11-03T15:00 SWC Left 2021-11-03T09:40 2021-11-03T15:00 102-087-201 Day 1 2021-11-03 Elexacaftor/Tezacaftor/Ivacaftor 2021-11-03T15:25 SWC Right 2021-11-03T09:47 2021-11-03T15:25 102-087-201 Day 1 2021-11-03 VX-121/Tezacaftor/Deutivacaftor 2021-11-03T15:00 SWC Left 2021-11-03T09:40 2021-11-03T15:00 102-087-201 Day 1 2021-11-03 Elexacaftor/Tezacaftor/Ivacaftor 2021-11-03T15:25 SWC Right 2021-11-03T09:47 2021-11-03T15:25
Are your dates and datetime values really stored as character?
I think a simple left join like ( add an ORDER BY if you need a different order)
Proc sql; create table want as select a.subjid, a.event, a.eventdt, a.exftrt, a.exdtc , b.data_type, b.swlat, b.extstdttm, b.extendttm from dosef0 as a left join ext as b on a.subjid= b.subjid and a.event=b.event ; quit;
@djbateman wrote:
I have 2 datasets that I need to merge, but I'm stuck in this situation because there is a repeat in the by variables group that I can't ignore.
I have a dosing table that I need to merge with an external lab table, but I only want to keep where the subject and visit are found in the dosing table.
So, for example below, there is dosing at Day -28 and Day 1, so I only want to keep those visits from EXT (external lab table). That's easy enough. The issue arises in the fact that at Day -28 there is only one drug administered while at Day 1 there are 2 drugs administered. In addition to that, the external types of ECG and SP only have one record per visit while SWC has 2 records per visit (due to left and right laterality).
proc sql; create table DOSEF0 (SUBJID char(11), EVENT char(100), EVENTDT char(10), EXFTRT char(50), EXDTC char(16)); insert into dosef0 (subjid, event, eventdt, exftrt, exdtc) values ('102-087-201','Day -28','2021-10-06','Elexacaftor/Tezacaftor/Ivacaftor','2021-10-06T09:41') values ('102-087-201','Day 1','2021-11-03','VX-121/Tezacaftor/Deutivacaftor','2021-11-03T15:00') values ('102-087-201','Day 1','2021-11-03','Elexacaftor/Tezacaftor/Ivacaftor','2021-11-03T15:25'); create table EXT (SUBJID char(11), EVENT char(100), DATA_TYPE char(3), SWLAT char(5), EXTSTDTTM char(16), EXTENDTTM char(16)); insert into ext (subjid, event, data_type, swlat, extstdttm, extendttm) values ('102-087-201','Day -28', 'ECG','', '2021-10-06T09:06','') values ('102-087-201','Screening', 'ECG','', '2021-09-14T09:34','') values ('102-087-201','Screening', 'SWC','LEFT', '2021-09-14T09:55','2021-09-14T10:09') values ('102-087-201','Screening', 'SWC','RIGHT', '2021-09-14T10:05','2021-09-14T10:20') values ('102-087-201','Screening', 'SP', '', '2021-09-14T10:24','') values ('102-087-201','Day 1', 'ECG','', '2021-11-03T09:15','') values ('102-087-201','Day 1', 'SP', '', '2021-11-03T09:27','') values ('102-087-201','Day 1', 'SWC','LEFT', '2021-11-03T09:40','2021-11-03T15:00') values ('102-087-201','Day 1', 'SWC','RIGHT', '2021-11-03T09:47','2021-11-03T15:25') values ('102-087-201','Day 15', 'ECG','', '2021-11-18T08:49','') values ('102-087-201','Day 15', 'SP', '', '2021-11-18T09:02','') values ('102-087-201','Day 15', 'SWC','LEFT', '2021-11-18T09:13','2021-11-18T09:31') values ('102-087-201','Day 15', 'SWC','RIGHT', '2021-11-18T09:20','2021-11-18T09:45'); quit;
So, from the code above, how can I get the table below? I have tried pulling in EXFTRT into EXT separately without success. I have tried different cartesian products without success. I'm just not sure how to go about this.
SUBJID EVENT EVENTDT EXFTRT EXDTC DATA_TYPE SWLAT EXSTDTTM EXTENDTTM 102-087-201 Day -28 2021-10-06 Elexacaftor/Tezacaftor/Ivacaftor 2021-10-06T09:41 ECG 2021-10-06T09:06 102-087-201 Day 1 2021-11-03 VX-121/Tezacaftor/Deutivacaftor 2021-11-03T15:00 ECG 2021-11-03T09:15 102-087-201 Day 1 2021-11-03 Elexacaftor/Tezacaftor/Ivacaftor 2021-11-03T15:25 ECG 2021-11-03T09:15 102-087-201 Day 1 2021-11-03 VX-121/Tezacaftor/Deutivacaftor 2021-11-03T15:00 SP 2021-11-03T09:27 102-087-201 Day 1 2021-11-03 Elexacaftor/Tezacaftor/Ivacaftor 2021-11-03T15:25 SP 2021-11-03T09:27 102-087-201 Day 1 2021-11-03 VX-121/Tezacaftor/Deutivacaftor 2021-11-03T15:00 SWC Left 2021-11-03T09:40 2021-11-03T15:00 102-087-201 Day 1 2021-11-03 Elexacaftor/Tezacaftor/Ivacaftor 2021-11-03T15:25 SWC Right 2021-11-03T09:47 2021-11-03T15:25 102-087-201 Day 1 2021-11-03 VX-121/Tezacaftor/Deutivacaftor 2021-11-03T15:00 SWC Left 2021-11-03T09:40 2021-11-03T15:00 102-087-201 Day 1 2021-11-03 Elexacaftor/Tezacaftor/Ivacaftor 2021-11-03T15:25 SWC Right 2021-11-03T09:47 2021-11-03T15:25
Available on demand!
Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.