I have two variables that should match, but some do not. The variables are comma separated strings and if one list is missing values from the ideal list, I need it to output those values.
Have:
Animal Days Days_expected
100 1,2,3,4 1,2,3,4,5
101 2,3,5 1,2,3,4,5
102 1,3,4,5 1,2,3,4,5
Want:
Animal Days Days_expected Days_missing
100 1,2,3,4 1,2,3,4,5 5
101 2,3,5 1,2,3,4,5 1,4
102 1,3,4,5 1,2,3,4,5 2
Just loop over the set of expected values and test if it appears or not.
data have;
input Animal $ Days :$10. Days_expected :$10.;
cards;
100 1,2,3,4 1,2,3,4,5
101 2,3,5 1,2,3,4,5
102 1,3,4,5 1,2,3,4,5
;
data want;
set have;
length Days_missing $10;
do index=1 to countw(days_expected,',');
if not findw(days,scan(days_expected,index,','),',','T') then
days_missing=catx(',',days_missing,scan(days_expected,index,','))
;
end;
drop index;
run;
Just loop over the set of expected values and test if it appears or not.
data have;
input Animal $ Days :$10. Days_expected :$10.;
cards;
100 1,2,3,4 1,2,3,4,5
101 2,3,5 1,2,3,4,5
102 1,3,4,5 1,2,3,4,5
;
data want;
set have;
length Days_missing $10;
do index=1 to countw(days_expected,',');
if not findw(days,scan(days_expected,index,','),',','T') then
days_missing=catx(',',days_missing,scan(days_expected,index,','))
;
end;
drop index;
run;
Generally placing multiple values into a single variable is a bad thing and this is an example.
Questions: Does the value of "Days_expected" ever actually change? Is it actually a variable in the dataset? If so, is it always going to be sequentially numbered values?
With the example provided:
data have; infile datalines dlm='|'; input Animal $ Days:$15. Days_expected :$15. ; length word $ 3 result $ 15; do i=1 to countw(days_expected,','); word = scan(days_expected,i,','); if findw(strip(days), strip(word),',')=0 then result=catx(',',result, word); end; drop i word; datalines; 100|1,2,3,4|1,2,3,4,5 101|2,3,5|1,2,3,4,5 102|1,3,4,5|1,2,3,4,5 ;
Strip is needed in a few places because we do not know the actual lengths of your variables or the maximum possible value(s) involved.
Countw counts how many elements are in the list. If Days_expected is not an actual variable then use a quoted list such as "1,2,3,4,5" every where the variable Days_expected appears.
Scan pulls out the word. Findw finds if a word matches in the first value. If not the returned value is 0 so we add the word to the Result list. Catx places a comma between the words.
data have;
input Animal $ Days :$10. Days_expected :$10.;
cards;
100 1,2,3,4 1,2,3,4,5
101 2,3,5 1,2,3,4,5
102 1,3,4,5 1,2,3,4,5
;
data want;
set have;
Days_missing=prxchange(cats('s/\b(',translate(Days,'|',','),')\b//'),-1,Days_expected);
Days_missing=prxchange('s/^,+|,+$//',-1,strip(Days_missing));
Days_missing=prxchange('s/,+/,/',-1,Days_missing);
run;
Registration is open! SAS is returning to Vegas for an AI and analytics experience like no other! Whether you're an executive, manager, end user or SAS partner, SAS Innovate is designed for everyone on your team. Register for just $495 by 12/31/2023.
If you are interested in speaking, there is still time to submit a session idea. More details are posted on the website.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.