Data have1;
infile datalines dsd dlm=",";
input id $ rsn1 $ rsn2 $ rsn3 $ rsn4 $ rsn5 $;
datalines;
001,X, ,X, , ,
002,X, , , , ,
003,, , , , ,
004,,X, , ,X,
;
run;
data want;
set have1;
overall_rsn=catx(" ,", rsn1, rsn2, rsn3, rsn4, rsn5);
run;
How do I program this so that variable OVERALL_RSN shows the var_name for the actual value?
For example, instead of observation 1 returning back "X, X" for OVERALL_RSN, I want it to return back "rsn1, rsn3".
Thanks
edit: The only solution that I could come up with is this way. But the problem is the actual data set I'm using has 30 Rsns variables, which could be quite burdensome for coding
data want2;
set have1;
if rsn1="X" then rsn1="Rsn1";
if rsn2="X" then rsn2="Rsn2";
if rsn3="X" then rsn3="Rsn3";
if rsn4="X" then rsn4="Rsn4";
if rsn5="X" then rsn5="Rsn5";
overall_rsn=catx(" ,", rsn1, rsn2, rsn3, rsn4);
run;
Data have1;
infile datalines dsd dlm=",";
input id $ rsn1 $ rsn2 $ rsn3 $ rsn4 $ rsn5 $;
datalines;
001,X, ,X, , ,
002,X, , , , ,
003,, , , , ,
004,,X, , ,X,
;
run;
data want;
set have1;
array rs rsn:;
length rsn_list $40;
do over rs;
if not missing(rs) then rsn_list=catx(',',rsn_list,vname(rs));
end;
run;
Just make sure that the length of rsn_list is sufficient to hold a comma-separated list of all the variable names.
Explain why row 1 results in rsn1, rsn2 when the Xs are in columns rsn1 and rsn3.
I get the feeling that this is a situation where instead of all of this working with strings in a horizontal wide layout, your task and subsequent analysis would be much more easily performed if the data was in a long data set and instead of X or blank, you had 1s and 0s.
@Hello_there wrote:
Hi PaigeMiller,
sorry I wrote this in haste and I changed the information. You are correct,, it should result in rsn1, rsn3 for obs. 1.
I was given a data set like this. and I'm supposed to come up with a variable that lists only the reasons why, if there were reasons that exist.
So are you saying I need to transform the data somehow?
Okay, and what are you going to do with this data once you have converted observation 1 to be rsn1, rsn3? What is the next step? What is the next analysis or report or table? I'm not going down this path of string manipulation just yet, as I still think there are many easier ways to go.
Data have1;
infile datalines dsd dlm=",";
input id $ rsn1 $ rsn2 $ rsn3 $ rsn4 $ rsn5 $;
datalines;
001,X, ,X, , ,
002,X, , , , ,
003,, , , , ,
004,,X, , ,X,
;
run;
data want;
set have1;
array rs rsn:;
length rsn_list $40;
do over rs;
if not missing(rs) then rsn_list=catx(',',rsn_list,vname(rs));
end;
run;
Just make sure that the length of rsn_list is sufficient to hold a comma-separated list of all the variable names.
Data have1;
infile datalines dsd dlm=",";
input id $ rsn1 $ rsn2 $ rsn3 $ rsn4 $ rsn5 $;
datalines;
001,X, ,X, , ,
002,X, , , , ,
003,, , , , ,
004,,X, , ,X,
;
run;
data want;
set have1;
length overall_rsn $100.;
array _rsn(*) rsn:;
do i=1 to dim(_rsn);
if _rsn(i) = 'X' then overall_rsn = catx(", ", overall_rsn, vname(_rsn(i)));
end;
run;
This is amazing. Thanks!
First thing is convert the data to a usable structure using PROC TRANSPOSE.
data have;
input id $ (rsn1-rsn5) (:$1.);
datalines;
001 X . X . .
002 X . . . .
003 . . . . .
004 . X . X .
;
proc transpose data=have out=tall;
by id;
var rsn1-rsn5 ;
run;
Then you could produce your report directly from that:
data _null_;
set tall;
by id;
file print;
if first.id then put id @;
if col1='X' then put _name_ @;
if last.id then put;
run;
Results:
Or you could build your long character string from it.
data want;
do until (last.id);
set tall;
by id;
length string $200;
if col1='X' then string=catx(' ',string,_name_);
end;
drop _name_ col1;
run;
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.