BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
emaneman
Pyrite | Level 9

Dear all,

 

I have a variable (CHOICES)that contains several codes, separated by a comma:

 

CHOICES
23,55,58,81,82,99,110,119
3,9,23,27,55,58,61,82,98,99,105,106,110,118,119
34

i need to create a new variable that has value 1 only when CHOICES contains the single code 3.

 

If I use;

NEW = count( choices , '3'); 

I get 

 

NEW
1
2
1

 

Because SAS counts every instance of 3. So 23 and 34 also get counted.

 

I would like this output instead:

 

NEW
0
1
0

 

Any suggestion would be most welcome!

Thank you in advance.

 

Eman 

1 ACCEPTED SOLUTION

Accepted Solutions
ballardw
Super User

Try

data example;
   input choices :$50.;
   new = (findw(choices,'3')> 0);
datalines;
23,55,58,81,82,99,110,119
3,9,23,27,55,58,61,82,98,99,105,106,110,118,119
34
;
run;

View solution in original post

12 REPLIES 12
Reeza
Super User
Have you tried using COUNTW() instead which checks for words, not just the string.

new = countw(choices, '3');
emaneman
Pyrite | Level 9

Hello Reeza,

unfortunately that does not work either. it gives me:

2

2

1

Which is weired because in the first observation there is only one 3, part of 23:

23,55,58,81,82,99,110,119

 

Eman 

Reeza
Super User

COUNTW() counts the number of words, the second parameter is the delimiter, in this case 3. 

It's definitely the wrong function so that's my bad. I can't recall another function to count the number of words so I guess a loop is the best option unfortunately. 

 

data have;
	length choices $50.;
	choices="23,55,58,81,82,99,110,119";
	output;
	choices="3,9,23,27,55,58,61,82,98,99,105,106,110,118,119";
	output;
	choices="34";
	output;
run;

data want;
	set have;
	
	*initialize count to 0 for each row;
	number3=0;

    *check if any 3s are present if so, count them;
	if indexw(choices, '3', ',') then
	     /*loop counter based on number of words using COUNTW()*/
		do i=1 to countw(choices);

			if scan(choices, i, ',')='3' then
				number3+1; /*implicit retain with a sum statement*/
		end;
run;
emaneman
Pyrite | Level 9

This works!  

 

It is a bit cumbersome, since I have to look for 65 different substrings (3 was only one of them), and create 65 new corresponding variables with value 0 or 1 depending on whether the specific substring is present, but this is excellent. 

 

Thank you Reeza.

Reeza
Super User
I would instead separate the data and flip it long and use a sing WHERE statement which is more dynamic. Or two loops would do it quite nicely. But you never mentioned this so the solution posted reflects what you state.
ballardw
Super User

@emaneman wrote:

Hello Reeza,

unfortunately that does not work either. it gives me:

2

2

1

Which is weired because in the first observation there is only one 3, part of 23:

23,55,58,81,82,99,110,119

 

Eman 


Countw, counts words, not a specific word, so used that way defines '3' as the delimiter to use in defining words. So '2' is the first word and everything after 3 in the first value of choices is the second word

ballardw
Super User

Try

data example;
   input choices :$50.;
   new = (findw(choices,'3')> 0);
datalines;
23,55,58,81,82,99,110,119
3,9,23,27,55,58,61,82,98,99,105,106,110,118,119
34
;
run;
emaneman
Pyrite | Level 9

hello Ballardw,

this is much easier, thanks you.

The only issue at present is that I already have a file with the variable choice defined. 

Of course I can recreate it with a datalines statement, but is there a way I can put the equivalent of "choice:$50." without having to create a new datafile using datalines?

Eman

ballardw
Super User

@emaneman wrote:

hello Ballardw,

this is much easier, thanks you.

The only issue at present is that I already have a file with the variable choice defined. 

Of course I can recreate it with a datalines statement, but is there a way I can put the equivalent of "choice:$50." without having to create a new datafile using datalines?

Eman


I provided example data in the form of a data step to demonstrate the code behavior. You would typically use a SET statement to bring an existing data set in to the working data set.

Data want;

   set have; /* the name of your existing data set*/

  <and other code statments here?

run;

 

You did not provide either a complete data step, the name of your existing data set. So I created one with the example data you provided just to have something to work with.

The FINDW function returns either the character position number (the example I used does this) or the word number in a string (check the documentation for the options), or 0 if the value is not found. By default commas are one of the delimiters for "words". So  abc,3,444 would have '3' as a word. The comparison I use returns 1 with the comparison is true (3 was found as a word) or 0 otherwise.

 

emaneman
Pyrite | Level 9

My apologies. I thought that defining the variable choices in that way was necessary. But it is not, so your solution using FINDW works perfectly and efficiently!

thank you very much. 

s_lassen
Meteorite | Level 14

The way to go may be to use PROC FCMP to create a custom function:

proc fcmp outlib=work.funcs.string;
  function countw2(instr $,word $,delim $);
    start=1;
    len=length(word);
    do i=0 by 1;
      start=findw(instr,word,delim,start);
      if start=0 then
        leave;
      else 
        start=start+len;
      end;
    return(i);
    endsub;
run;

The function returns the number of occurrences of WORD within INSTR, delimited by DELIM.

 

You can then use the function like this:

options cmplib=work.funcs;

data want;
  set have;
  x=countw2(choices,'3',',');
run;

You may want to save the function in a permanent library, instead of WORK.

emaneman
Pyrite | Level 9

 

thank you S_Lasssen, that works.

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 12 replies
  • 1634 views
  • 1 like
  • 4 in conversation