Hi I have a data like this;
data symptoms;
input id symptoms $ ;
cards;
1 discharge
2 pain
3 sore
4 odor warts
;
run;
I transposed the data set and it looks this now
id discharge pain sore odor
1 discharge
2 pain odor
3 sore
4 odor
instead of this i would like to have like THIS (below)
id discharge pain sore odor
1 1 0 0 0
2 0 1 0 1
3 0 0 1 0
4 0 0 0 1
I used if then statement like this;
if discharge ="discharge" then discharge="1";
but i have hundreds of symptoms to write if then statement. is there a shorter way macro or something to do it efficiently?
if you are OK with auto conversion log note:
data have;
input (id discharge pain sore odor ) (:$10.);
cards;
1 discharge . . .
2 . pain . odor
3 . . sore .
4 . . . odor
;
data want;
set have;
array t(*) discharge--odor;
do _n_=1 to dim(t);
t(_n_)= vname(t(_n_))=t(_n_);
end;
run;
if you are OK with auto conversion log note:
data have;
input (id discharge pain sore odor ) (:$10.);
cards;
1 discharge . . .
2 . pain . odor
3 . . sore .
4 . . . odor
;
data want;
set have;
array t(*) discharge--odor;
do _n_=1 to dim(t);
t(_n_)= vname(t(_n_))=t(_n_);
end;
run;
Can you post a better and a true representative sample of your real plz?
Regardless of the route you choose, macro language will not be involved.
One possibility: add a dummy variable with a value of 1:
data have2;
set have;
dummy=1;
run;
Then you can use that in your transposing:
proc transpose data=have2 out=want;
var dummy;
id symptoms;
by id;
run;
Issues you will need to address beyond that:
Set a dummy variable and then you can use that as var in transpose.
data have;
set symptoms;
dummy=1;
run;
proc sort data=have;
by id;
proc transpose data=have out=want(drop= _Name_) ;
by id;
id symptoms;
var dummy;
run;
data want;
set want;
array num _numeric_;
do over num;
if missing(num) then num=0;
end;
run;
If you have spaces between the names that needs to be converted to variable names then use options validvarname=v7; before the transpose, this will convert the names to valid variable names (Replaces with "_" for all invalid characters for variable names.
Fastest way might be use some code generation to generate an SQL statement to summarize by ID.
data symptoms;
input id symptom $32. ;
cards;
1 discharge
2 pain
3 sore
4 odor
4 warts
;
proc sort data=symptoms(keep=symptom) nodupkey out=symptomlist ;
by symptom;
run;
filename code temp;
data _null_;
set symptomlist end=eof;
file code ;
if _n_=1 then put
'proc sql noprint;'
/'create table want as select'
/' id'
;
put ',max(symptom=' symptom :$quote. ') as ' symptom ;
if eof then put
'from symptoms'
/'group by id'
/';'
/'quit;'
;
run;
%include code / source2;
proc print;
run;
Obs id discharge odor pain sore warts 1 1 1 0 0 0 0 2 2 0 0 1 0 0 3 3 0 0 0 1 0 4 4 0 1 0 0 1
137 +proc sql noprint; 138 +create table want as select 139 + id 140 +,max(symptom="discharge" ) as discharge 141 +,max(symptom="odor" ) as odor 142 +,max(symptom="pain" ) as pain 143 +,max(symptom="sore" ) as sore 144 +,max(symptom="warts" ) as warts 145 +from symptoms 146 +group by id 147 +; NOTE: SAS threaded sort was used. NOTE: Table WORK.WANT created, with 4 rows and 6 columns.
If your source data is NOT one observation per ID*SYMPTOM then you would need to change the logic of the test for whether a particular observation has an indication of that symptom.
Also if your symptom values are not valid SAS variable names then you will have trouble. You might need to generate unique names for the flag variables.
It is called design matrix.
data symptoms;
input id symptoms $ ;
dummy=1;
cards;
1 discharge
2 pain
3 sore
4 odor warts
;
run;
proc logistic data=symptoms outdesign=want(drop=dummy) outdesignonly;
class symptoms/param=glm;
model dummy=id symptoms/noint;
run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.