BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
aespinarey
Obsidian | Level 7

Hello all,

 

I am having issues creating dummy variables (CC_Inf1) from a variable (CC_inforcvd2) with multiple values separated by commas. This variable comes from a survey question that asked participants to select all choices that applied to them. There were 13 choices.

 

The code I am using is:

 

Data PrePostZ;
if find(CC_inforcvd2, '1 ','1,') then CC_Inf1=1;
else CC_Inf1=0;
run;

 

I get results that are close but not exact to the actual count. For example there are 111 people who chose 1 in any sort of combination with other choices, But with this code, only 86 are counted. Please help.

 

Thank you,

Andrea

1 ACCEPTED SOLUTION

Accepted Solutions
ballardw
Super User

Stealing from @PeterClemmensen for example data:

 

data exampledata;
CC_inforcvd2="2,3,5,7,10,11";output;
CC_inforcvd2="1,2,3,5";output;
CC_inforcvd2="12";output;
run;

data want;
   set exampledata;
   array CC_Inf{12};
   do i = 1 to dim(CC_Inf);
      CC_Inf[i]= findw(CC_inforcvd2,strip(put(i,best2.)))>0;
   end;
run;

Uses findw default behavior of treating a comma as delimiter. Strip is needed to use "2" instead of " 2" as the search value.

View solution in original post

9 REPLIES 9
PaigeMiller
Diamond | Level 26

Show us a representative portion of the data. Show us the desired output.

--
Paige Miller
PeterClemmensen
Tourmaline | Level 20

See if this logic is helpful to you. Just took the three first rows of your data..

 

data exampledata;
CC_inforcvd2="2,3,5,7,10,11";output;
CC_inforcvd2="1,2,3,5";output;
CC_inforcvd2="12";output;
run;

data want;
   set exampledata;
   array CC_Inf{12};
   do i = 1 to dim(CC_Inf);
      CC_Inf[i]=0;
   end;

   do i = 1 to countw(CC_inforcvd2, ',');
      idx=input(scan(CC_inforcvd2, i, ','), 8.);
      CC_Inf[idx]=1;
   end;
run;

 

From the look of you PDF file it seems like you write a lot of code that is not necessary. With the above logic you only have to read the data once 🙂

ballardw
Super User

Stealing from @PeterClemmensen for example data:

 

data exampledata;
CC_inforcvd2="2,3,5,7,10,11";output;
CC_inforcvd2="1,2,3,5";output;
CC_inforcvd2="12";output;
run;

data want;
   set exampledata;
   array CC_Inf{12};
   do i = 1 to dim(CC_Inf);
      CC_Inf[i]= findw(CC_inforcvd2,strip(put(i,best2.)))>0;
   end;
run;

Uses findw default behavior of treating a comma as delimiter. Strip is needed to use "2" instead of " 2" as the search value.

PeterClemmensen
Tourmaline | Level 20

@ballardw stealing my example data and writing more efficient code.. Shame on you 😉

ballardw
Super User

@PeterClemmensen wrote:

@ballardw stealing my example data and writing more efficient code.. Shame on you 😉


I have had to use this approach to cleaning data for years as I deal with a variety of data sources that give me what I consider pseudo-garbage values like that.

There is one other bit of information this type of format contains that may be of interest if the Order of the output variables might indicate importance. In which case the Findw word position can be used to set a second arrays value of order for the indicator.

aespinarey
Obsidian | Level 7

Thank you so much, it worked beautifully. 🙂

Radwan
Quartz | Level 8

Hello @ballardw 

I have a question 

the multiple values of mine as follows( obs1=1,2,3. obs2=3,2,4,5. obs3=4,1,2,3. so on)

so should i specify each sot of the multiple values individually in this code ?

data exampledata;
CC_inforcvd2="2,3,5,7,10,11";output;
CC_inforcvd2="1,2,3,5";output;
CC_inforcvd2="12";output;
run;

data want;
   set exampledata;
   array CC_Inf{12};
   do i = 1 to dim(CC_Inf);
      CC_Inf[i]= findw(CC_inforcvd2,strip(put(i,best2.)))>0;
   end;
run;

  that what i understood.

Also could i use this proc to categorize the multiple values based on first value in the multiple values? for instance: either obs contains 1 as first value change into zero.

thanks in advance  

aespinarey
Obsidian | Level 7
No, they added that as an example of a dataset. You only need the code below that to analyze your dataset
aespinarey
Obsidian | Level 7

Thank you 🙂 I greatly appreciate it.

 

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 9 replies
  • 3495 views
  • 5 likes
  • 5 in conversation