Hello SAS Community,
I want to find and flag rows if either the first column value 'Yes' or if the value 'Yes' is in sequence across the columns like below. It should either be that first column starts with 'Yes' or be all 'Yes' without any other value in between or no missing. Thank you for your time.
Have:
T1 T2 T3 T4 T5 T6 T7 T8 T9 T10
Yes No Yes Yes Yes No Yes . . .
Yes Yes Yes Yes Yes . . . . .
Yes Yes Yes . . . . . . .
No No No No No Yes No Yes Yes No
Want:
T1 T2 T3 T4 T5 T6 T7 T8 T9 T10 Flag
Yes No Yes Yes Yes No Yes . . . 1
Yes Yes Yes Yes Yes . . . . . 1
Yes Yes Yes . . . . . . . 1
No No No No No Yes No Yes Yes No 0
Not sure that I fully understand the "or no missing" requirement but based on the sample data provided below should work.
If there are other patterns in your data to consider then please provide sample data that includes these patterns.
data have;
infile datalines dlm='|' dsd truncover;
input (t1-t10) (:$3.) flag_want;
datalines;
Yes|No|Yes|Yes|Yes|No|Yes||||1
Yes|Yes|Yes|Yes|Yes||||||1
Yes|Yes|Yes||||||||1
No|No|No|No|No|Yes|No|Yes|Yes|No|0
;
data want;
set have;
if t1='Yes' then flag_derived=1;
else flag_derived= ( whichc('No',of t2-t10)=0 );
run;
proc print data=want;
run;
...and given that in the "data" you posted missings are represented as dots here the code that works with numerical variables.
Going forward: Please post sample data always via a working SAS datastep as this helps us help you.
proc format;
value YesNo
0='No'
1='Yes'
;
run;
data have;
infile datalines dlm='|' dsd truncover;
input (t1-t10) (:best32.) flag_want;
format t1-t10 YesNo.;
datalines;
1|0|1|1|1|0|1||||1
1|1|1|1|1||||||1
1|1|1||||||||1
0|0|0|0|0|1|0|1|1|0|0
;
data want;
set have;
if t1=1 then flag_derived=1;
else flag_derived= ( whichn(0,of t2-t10)=0 );
run;
proc print data=want;
run;
If you are going to have a lot to questions about stuff like this, especially if you get to "at least X yes values", "at most" "more yes than no" or similar you may want to save yourself a lot of headaches and either convert the Yes/No to 1/0 (my choice) or reduce them to single character Y/N.
If values are numeric 1/0 coded then you have the functions SUM, that would give a total number of 1's, N the number of variables with any value, Nmiss the number of missing, Range largest - smallest (1 as range tells you that there are some of both values and 0 that all populated values are 1), mean decimal percentage of 1's,
Or using the single letter you could combine them into one string and use Countc to get counts of Y and N.
Also if any of your actual values include MISSING between values you should include some in the examples.
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.