I have two variables that should match, but some do not. The variables are comma separated strings and if one list is missing values from the ideal list, I need it to output those values.
Have:
Animal Days Days_expected
100 1,2,3,4 1,2,3,4,5
101 2,3,5 1,2,3,4,5
102 1,3,4,5 1,2,3,4,5
Want:
Animal Days Days_expected Days_missing
100 1,2,3,4 1,2,3,4,5 5
101 2,3,5 1,2,3,4,5 1,4
102 1,3,4,5 1,2,3,4,5 2
Just loop over the set of expected values and test if it appears or not.
data have;
input Animal $ Days :$10. Days_expected :$10.;
cards;
100 1,2,3,4 1,2,3,4,5
101 2,3,5 1,2,3,4,5
102 1,3,4,5 1,2,3,4,5
;
data want;
set have;
length Days_missing $10;
do index=1 to countw(days_expected,',');
if not findw(days,scan(days_expected,index,','),',','T') then
days_missing=catx(',',days_missing,scan(days_expected,index,','))
;
end;
drop index;
run;
Just loop over the set of expected values and test if it appears or not.
data have;
input Animal $ Days :$10. Days_expected :$10.;
cards;
100 1,2,3,4 1,2,3,4,5
101 2,3,5 1,2,3,4,5
102 1,3,4,5 1,2,3,4,5
;
data want;
set have;
length Days_missing $10;
do index=1 to countw(days_expected,',');
if not findw(days,scan(days_expected,index,','),',','T') then
days_missing=catx(',',days_missing,scan(days_expected,index,','))
;
end;
drop index;
run;
Generally placing multiple values into a single variable is a bad thing and this is an example.
Questions: Does the value of "Days_expected" ever actually change? Is it actually a variable in the dataset? If so, is it always going to be sequentially numbered values?
With the example provided:
data have; infile datalines dlm='|'; input Animal $ Days:$15. Days_expected :$15. ; length word $ 3 result $ 15; do i=1 to countw(days_expected,','); word = scan(days_expected,i,','); if findw(strip(days), strip(word),',')=0 then result=catx(',',result, word); end; drop i word; datalines; 100|1,2,3,4|1,2,3,4,5 101|2,3,5|1,2,3,4,5 102|1,3,4,5|1,2,3,4,5 ;
Strip is needed in a few places because we do not know the actual lengths of your variables or the maximum possible value(s) involved.
Countw counts how many elements are in the list. If Days_expected is not an actual variable then use a quoted list such as "1,2,3,4,5" every where the variable Days_expected appears.
Scan pulls out the word. Findw finds if a word matches in the first value. If not the returned value is 0 so we add the word to the Result list. Catx places a comma between the words.
data have;
input Animal $ Days :$10. Days_expected :$10.;
cards;
100 1,2,3,4 1,2,3,4,5
101 2,3,5 1,2,3,4,5
102 1,3,4,5 1,2,3,4,5
;
data want;
set have;
Days_missing=prxchange(cats('s/\b(',translate(Days,'|',','),')\b//'),-1,Days_expected);
Days_missing=prxchange('s/^,+|,+$//',-1,strip(Days_missing));
Days_missing=prxchange('s/,+/,/',-1,Days_missing);
run;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.