Hello I basically want to keep the visit/visitnum from have2 where ANL01FL='Y' and match with visit/visitnum from have1 but also retain the week 12/16 records where aval is missing. Any help with this matter would be greatly appreciated.
data have1;
input SUBJID $ PARAMCD $ ADT :$9. VISITNUM :8. VISIT :$12. AVAL :8. ;
infile datalines dlm = '|';
datalines;
1004|EASITOTA|27JUN2022|1|Screening|9.6
1004|EASITOTA|21JUL2022|2|Day 1 |16.3
1004|EASITOTA|03AUG2022|4|Week 2 |8.6
1004|EASITOTA|17AUG2022|5|Week 4 |8.7
1004|EASITOTA|14SEP2022|6|Week 8 |9.6
;
run;
data have2;
input SUBJID $ ADT :$9. VISITNUM :8. VISIT :$12. ANL01FL :$1. RTT1L28D :8. RTT1G28D :8. SWREL :8. SWNREL :8. T2CM :8. RTT1L14D :8. RTT1G14D :8. NCUNODEV :8.;
infile datalines dlm = '|';
datalines;
1004|21JUL2022|2|Day 1 |Y|0|0|0|0|0|0|0|1
1004|03AUG2022|4|Week 2 |Y|0|0|0|0|0|0|0|1
1004|17AUG2022|5|Week 4 |Y|0|0|0|0|0|0|0|1
1004|14SEP2022|6|Week 8 |Y|0|0|0|0|0|0|0|1
1004|13OCT2022|7|Week 12|Y|0|0|0|0|0|0|0|1
1004|10NOV2022|8|Week 16|Y|0|0|0|0|0|0|0|1
;
run;
data want;
input SUBJID $ PARAMCD $ ADT :$9. VISITNUM :8. VISIT :$12. RTT1L28D :8. RTT1G28D :8. SWREL :8. SWNREL :8. T2CM :8. RTT1L14D :8. RTT1G14D :8. NCUNODEV :8.;
infile datalines dlm = '|';
datalines;
1004|EASITOTA|21JUL2022|2|Day 1 |16.3|0|0|0|0|0|0|0|1
1004|EASITOTA|03AUG2022|4|Week 2 |8.6 |0|0|0|0|0|0|0|1
1004|EASITOTA|17AUG2022|5|Week 4 |8.7 |0|0|0|0|0|0|0|1
1004|EASITOTA|14SEP2022|6|Week 8 |9.6 |0|0|0|0|0|0|0|1
1004|EASITOTA| |7|Week 12 | |0|0|0|0|0|0|0|1
1004|EASITOTA| |8|Week 16 | |0|0|0|0|0|0|0|1
;
run;
Same approach, simply remove it from the Where and Drop statements like this
data have1;
input SUBJID $ PARAMCD $ ADT :$9. VISITNUM :8. VISIT :$12. AVAL :8. ;
infile datalines dlm = '|';
datalines;
1004|EASITOTA|21JUL2022|2|Day 1 |16.3
1004|EASITOTA|03AUG2022|4|Week 2 |8.6
1004|EASITOTA|17AUG2022|5|Week 4 |8.7
1004|EASITOTA|14SEP2022|6|Week 8 |9.6
;
run;
data have2;
input SUBJID $ ADT :$9. VISITNUM :8. VISIT :$12. RTT1L28D :8. RTT1G28D :8. SWREL :8. SWNREL :8. T2CM :8. RTT1L14D :8. RTT1G14D :8. NCUNODEV :8.;
infile datalines dlm = '|';
datalines;
1004|21JUL2022|2|Day 1 |0|0|0|0|0|0|0|1
1004|03AUG2022|4|Week 2 |0|0|0|0|0|0|0|1
1004|17AUG2022|5|Week 4 |0|0|0|0|0|0|0|1
1004|14SEP2022|6|Week 8 |0|0|0|0|0|0|0|1
1004|13OCT2022|7|Week 12|0|0|0|0|0|0|0|1
1004|10NOV2022|8|Week 16|0|0|0|0|0|0|0|1
;
run;
data want(drop = rc);
format SUBJID PARAMCD ADT VISITNUM VISIT AVAL RTT1L28D RTT1G28D SWREL SWNREL T2CM RTT1L14D RTT1G14D NCUNODEV;
if _N_ = 1 then do;
dcl hash h(dataset : 'have1');
h.definekey('SUBJID', 'VISITNUM');
h.definedata(all : 'Y');
h.definedone();
end;
set have2;
if 0 then set have1;
rc = h.find();
if rc then call missing(ADT, AVAL);
run;
None of the records have ANL01FL='Y'?
There is in the have2 dataset but not have1
No.
Seems to me you switched VISIT and ANL01FL in the have2 data, am I right?
So the have2 data should be
data have2;
input SUBJID $ ADT :$9. VISITNUM :8. VISIT :$12. ANL01FL :$1. RTT1L28D :8. RTT1G28D :8. SWREL :8. SWNREL :8. T2CM :8. RTT1L14D :8. RTT1G14D :8. NCUNODEV :8.;
infile datalines dlm = '|';
datalines;
1004|21JUL2022|2|Day 1 |Y|0|0|0|0|0|0|0|1
1004|03AUG2022|4|Week 2 |Y|0|0|0|0|0|0|0|1
1004|17AUG2022|5|Week 4 |Y|0|0|0|0|0|0|0|1
1004|14SEP2022|6|Week 8 |Y|0|0|0|0|0|0|0|1
1004|13OCT2022|7|Week 12|Y|0|0|0|0|0|0|0|1
1004|10NOV2022|8|Week 16|Y|0|0|0|0|0|0|0|1
;
run;
Sorry yes I did make that update shortly after I posted because I thought it would be easier
Ok.
I think this gives you what you want
data have1;
input SUBJID $ PARAMCD $ ADT :$9. VISITNUM :8. VISIT :$12. AVAL :8. ;
infile datalines dlm = '|';
datalines;
1004|EASITOTA|21JUL2022|2|Day 1 |16.3
1004|EASITOTA|03AUG2022|4|Week 2 |8.6
1004|EASITOTA|17AUG2022|5|Week 4 |8.7
1004|EASITOTA|14SEP2022|6|Week 8 |9.6
;
run;
data have2;
input SUBJID $ ADT :$9. VISITNUM :8. VISIT :$12. ANL01FL :$1. RTT1L28D :8. RTT1G28D :8. SWREL :8. SWNREL :8. T2CM :8. RTT1L14D :8. RTT1G14D :8. NCUNODEV :8.;
infile datalines dlm = '|';
datalines;
1004|21JUL2022|2|Day 1 |Y|0|0|0|0|0|0|0|1
1004|03AUG2022|4|Week 2 |Y|0|0|0|0|0|0|0|1
1004|17AUG2022|5|Week 4 |Y|0|0|0|0|0|0|0|1
1004|14SEP2022|6|Week 8 |Y|0|0|0|0|0|0|0|1
1004|13OCT2022|7|Week 12|Y|0|0|0|0|0|0|0|1
1004|10NOV2022|8|Week 16|Y|0|0|0|0|0|0|0|1
;
run;
data want(drop = rc ANL01FL);
format SUBJID PARAMCD ADT VISITNUM VISIT AVAL RTT1L28D RTT1G28D SWREL SWNREL T2CM RTT1L14D RTT1G14D NCUNODEV;
if _N_ = 1 then do;
dcl hash h(dataset : 'have1');
h.definekey('SUBJID', 'VISITNUM');
h.definedata(all : 'Y');
h.definedone();
end;
set have2;
where ANL01FL='Y';
if 0 then set have1;
rc = h.find();
if rc then call missing(ADT, AVAL);
run;
This works great thanks. My only issue is when I apply it to the actual datasets it only populates for the formatted columns but what do you do if there are multiple other column variables in both datasets?. is it the if 0= set statement at the start instead of formats?. Does that make sense?
More variables in the data should not be a problem here.
you're 100% correct I know what the issue was now. I think I need to remove the ANL01FL actually from the equation as it's causing unexpected events but if I just remove the where clause referencing ANL01FL it doesn't seem to work. How would you do it if ANL01FL didn't exist or is it the exact same approach?
Same approach, simply remove it from the Where and Drop statements like this
data have1;
input SUBJID $ PARAMCD $ ADT :$9. VISITNUM :8. VISIT :$12. AVAL :8. ;
infile datalines dlm = '|';
datalines;
1004|EASITOTA|21JUL2022|2|Day 1 |16.3
1004|EASITOTA|03AUG2022|4|Week 2 |8.6
1004|EASITOTA|17AUG2022|5|Week 4 |8.7
1004|EASITOTA|14SEP2022|6|Week 8 |9.6
;
run;
data have2;
input SUBJID $ ADT :$9. VISITNUM :8. VISIT :$12. RTT1L28D :8. RTT1G28D :8. SWREL :8. SWNREL :8. T2CM :8. RTT1L14D :8. RTT1G14D :8. NCUNODEV :8.;
infile datalines dlm = '|';
datalines;
1004|21JUL2022|2|Day 1 |0|0|0|0|0|0|0|1
1004|03AUG2022|4|Week 2 |0|0|0|0|0|0|0|1
1004|17AUG2022|5|Week 4 |0|0|0|0|0|0|0|1
1004|14SEP2022|6|Week 8 |0|0|0|0|0|0|0|1
1004|13OCT2022|7|Week 12|0|0|0|0|0|0|0|1
1004|10NOV2022|8|Week 16|0|0|0|0|0|0|0|1
;
run;
data want(drop = rc);
format SUBJID PARAMCD ADT VISITNUM VISIT AVAL RTT1L28D RTT1G28D SWREL SWNREL T2CM RTT1L14D RTT1G14D NCUNODEV;
if _N_ = 1 then do;
dcl hash h(dataset : 'have1');
h.definekey('SUBJID', 'VISITNUM');
h.definedata(all : 'Y');
h.definedone();
end;
set have2;
if 0 then set have1;
rc = h.find();
if rc then call missing(ADT, AVAL);
run;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.