BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
stancemcgraw
Obsidian | Level 7

Hello,

   I have several values under the variable Comorbidities per unique studyid. I'd like to create new dummy variables for each comorbidity listed.

Data have;

Studyid   Comorbidities

678             smoker, alcohol use disorder

679             current smoker

680              hypertension, diabetes mellitus

 

data want;

studyid         Smoker   Alcohol_use_disorder   Hypertension Diabetes_Mellitus

678                  1                    1                                  0                  0

679                  1                    0                                 0                   0

680                  0                    0                                 1                    1

1 ACCEPTED SOLUTION

Accepted Solutions
Reeza
Super User
I think this is identical to a previous question you asked recently...

Given that diseases aren't coded cleanly, I'd definitely recommending using the dynamic approach instead.

1. Separate out terms into individual rows (data step)
2. Clean them up here for the issues mentioned in your second post.
3. Add an indicator variable to get the 1
4. Flip the clean version wide using PROC TRANSPOSE

View solution in original post

4 REPLIES 4
Kurt_Bremser
Super User

First, dissect the string:

data long;
set have;
length comorbidity $50;
value = 1;
do i = 1 to countw(comorbidities,',');
  comorbidity = translate(strip(scan(comorbidities,i,',')),'_',' ');
  output;
end;
drop i comorbidities;
run;

Then you can transpose that (although I recommend to keep the long dataset, as it is easier to work with):

proc transpose
  data=long
  out=want (dropy=_name_)
;
by studyid;
id comorbidity;
var value;
run;

Untested, posted from my tablet.

stancemcgraw
Obsidian | Level 7
So now it gives me soooo many different iterations of the same thing. This makes it so that any comorbidity that is cut-off now becomes its own variable.

for example, when I proc freq comorbidites, mental/personaliy disorder, for example now has :
mental
mental/
mental/personalit
mental/personality
mental/personality_diso
And now, each of those become a variable, when those should all be the same variable....
Reeza
Super User
I think this is identical to a previous question you asked recently...

Given that diseases aren't coded cleanly, I'd definitely recommending using the dynamic approach instead.

1. Separate out terms into individual rows (data step)
2. Clean them up here for the issues mentioned in your second post.
3. Add an indicator variable to get the 1
4. Flip the clean version wide using PROC TRANSPOSE

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 491 views
  • 0 likes
  • 3 in conversation