I have the following 7 variables in my dataset:
type_1 - type_7
The observations for each one of these is supposed to be either "checked" or "unchecked"in the datset. The first thing I want to do is check that there is only 1 observation for the 7 variables that is "checked" and the rest are "unchecked." What is the most efficient way to do this?
The second thing I want to do is recode the 7 variables into 1 variable where the value is the label for variable that is "checked." What is the most efficient way to do this?
BTW: Interestinglly, the exact same code works if the variables are numeric 1s and 0s, rather than character "1"s and "0"s:
data have; label type_1='type of surgery is aaa'; label type_2='type of surgery is bbb'; label type_3='type of surgery is ccc'; label type_4='type of surgery is ddd'; label type_5='type of surgery is eee'; label type_6='type of surgery is fff'; label type_7='type of surgery is ggg'; input type_1 - type_7; cards; 1 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 ; data want multiples missing; set have; array vars(*) type_1-type_7; recnum=_n_; if sum(of vars(*)) eq 0 then output missing; else if sum(of vars(*)) gt 1 then output multiples; else do; type=substr(vlabel(vars(whichn(1,of vars(*)))),20); output want; end; run;
Art, CEO, AnalystFinder.com
Please help us help you. How about showing what the data looks like before your process. And what you want it to look like afterwards.
Are the current values "checked" and 'unchecked" or, if not, are the variables character or numeric, and what values represent checked and unchecked. Also, what do you want to do if more than one is checked?
Art, CEO, AnalystFinder.com
Yes sorry! Here is a look at the data:
value type_1 0='Unchecked' 1='Checked';
label type_1 'Type of surgery is xxx';
same for type_2 - type_7 but different surgeries xxx.
type_1 type_2 type_3 .... type_7
unchecked checked unchecked unchecked
unchecked unchecked unchecked checked
...
Would like it to look like:
type
xxx (from type_2)
xxx (from type_7)
if more than 1 is checked, I just want to be able to identify that and check the discrepancy with the investigator.
Thanks for your help!
data have; informat type_1 - type_7 $9.; label type_1='type of surgery is aaa'; label type_2='type of surgery is bbb'; label type_3='type of surgery is ccc'; label type_4='type of surgery is ddd'; label type_5='type of surgery is eee'; label type_6='type of surgery is fff'; label type_7='type of surgery is ggg'; input type_1 - type_7; cards; checked unchecked unchecked unchecked unchecked unchecked unchecked unchecked unchecked unchecked checked unchecked unchecked unchecked checked unchecked checked unchecked unchecked unchecked unchecked ; data want multiples; set have; array vars(*) $ type_1-type_7; if count(catt(of vars(*)), 'un') lt 6 then output multiples; else do; type=substr(vlabel(vars(whichc('checked',of vars(*)))),20); output want; end; run;
Art, CEO, AnalystFinder.com
Minor difference on @art297's approach basically to turn the checked/unchecked into numerics AND report as read that there may be a problem. Then clean it before
proc format library=work; invalue Checked 'checked' = 1 'unchecked'= 0 other=.; run; data have; informat type_1 - type_7 Checked.; label type_1='type of surgery is aaa'; label type_2='type of surgery is bbb'; label type_3='type of surgery is ccc'; label type_4='type of surgery is ddd'; label type_5='type of surgery is eee'; label type_6='type of surgery is fff'; label type_7='type of surgery is ggg'; input type_1 - type_7; TypeSum = sum(of Type_:); if typesum ne 1 then put "Problem with number checked for Type in record: " _n_; cards; checked unchecked unchecked unchecked unchecked unchecked unchecked unchecked unchecked unchecked checked unchecked unchecked unchecked checked unchecked checked unchecked unchecked unchecked unchecked ; run;
OK. I hadn't correctly read you're specs. How about:
data have; informat type_1 - type_7 $9.; label type_1='type of surgery is aaa'; label type_2='type of surgery is bbb'; label type_3='type of surgery is ccc'; label type_4='type of surgery is ddd'; label type_5='type of surgery is eee'; label type_6='type of surgery is fff'; label type_7='type of surgery is ggg'; input type_1 - type_7; cards; 1 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 ; data want multiples missing; set have; array vars(*) type_1-type_7; recnum=_n_; if sum(of vars(*)) eq 0 then output missing; else if sum(of vars(*)) gt 1 then output multiples; else do; type=substr(vlabel(vars(whichn(1,of vars(*)))),20); output want; end; run;
Art, CEO, AnalystFinder.com
BTW: Interestinglly, the exact same code works if the variables are numeric 1s and 0s, rather than character "1"s and "0"s:
data have; label type_1='type of surgery is aaa'; label type_2='type of surgery is bbb'; label type_3='type of surgery is ccc'; label type_4='type of surgery is ddd'; label type_5='type of surgery is eee'; label type_6='type of surgery is fff'; label type_7='type of surgery is ggg'; input type_1 - type_7; cards; 1 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 ; data want multiples missing; set have; array vars(*) type_1-type_7; recnum=_n_; if sum(of vars(*)) eq 0 then output missing; else if sum(of vars(*)) gt 1 then output multiples; else do; type=substr(vlabel(vars(whichn(1,of vars(*)))),20); output want; end; run;
Art, CEO, AnalystFinder.com
Thank you! Worked beautifully.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.