BookmarkSubscribeRSS Feed
marylin5828
Calcite | Level 5

Hello, I am using a dataset on a large set of individuals. There are 6 questions regarding english proficiency ( each has their own column). I need to separate individuals by the categories speaks some English and no English. My issues is that there are 6 different columns containing yes or no, I need to have Sas combine those who answered yes to 3 or more questions as speaks some english and I need to have those who answered yes to 2 or less of the questions to be categorized as does not speak english. 

5 REPLIES 5
PaigeMiller
Diamond | Level 26

First, I point out that having numeric values of 0 and 1 instead of character values No and Yes as the possible answers makes the coding an awful lot easier.

 

Nevertheless, I point you to a similar problem which was discussed recently here in the forums, which applies to this situation as well:

https://communities.sas.com/t5/SAS-Programming/Sum-by-an-ID-variable-according-to-a-logical-test/m-p...

 

--
Paige Miller
indiaink1
Calcite | Level 5

Try using an array to convert the yes/no columns to numeric, then use the sum function.

 

data have;
	input id var1 $ var2 $ var3 $ var4 $ var5 $ var6 $;
datalines;
1 No No Yes Yes No Yes
2 Yes No Yes No Yes Yes
3 No Yes No Yes No No
4 Yes Yes No No Yes No
5 No No No No Yes No
6 Yes No No Yes No No
7 No No No No No No
8 Yes Yes Yes Yes Yes Yes
9 No Yes No Yes Yes No
10 No Yes No Yes Yes No
run;

data want;
	set have;
	array vars(6) $ var1-var6;
	array vars_n(6) var_n1-var_n6;
	do i=1 to 6;
		if vars(i) = "Yes" then vars_n(i) = 1;
		else if vars(i) = "No" then vars_n(i) = 0;
	end;
	sum = sum(of var_n1-var_n6);
	if sum >= 3 then English = 1;
	else if 0 <= sum < 3 then English = 0;
run;
marylin5828
Calcite | Level 5

Hello,
 I am stuck on the part where I convert it to numeric. They are already numeric but I combined the answers 2-9 as 0.

this is the code I used to filter the numeric variables.

 

data WORK.DATA01;
array vars(6) $ varB2AX--varB3CX;
array vars_n(6) var_nB2AX--var_nB3CX;
do i=1 to 6;
if vars (i)= "1" then vars_n(i) =1;
else if vars (i)= "2-9" then vars_n(i)=0;
end;
 
the error messages I am getting are
ERROR: Variable varB2AX cannot be found on the list of previously defined variables.
ERROR: Too few variables defined for the dimension(s) specified for the array vars.
ERROR: Variable var_nB2AX cannot be found on the list of previously defined variables.
ERROR: Too few variables defined for the dimension(s) specified for the array vars_n.
 
Not sure why it is saying it cannot be found if that is the name of the variable.
 
thanks!
PaigeMiller
Diamond | Level 26

The first problem is that you don't have a SET statement. There may be other problems as well.

 

Also, depending on your data, this may not work as you can't test if a number is between 2 and 9 via this statement:

if vars (i)= "2-9"
--
Paige Miller
Tom
Super User Tom
Super User

Just use the answer from this question.

https://communities.sas.com/t5/SAS-Procedures/Count-a-specific-value-across-columns/td-p/635511

data want;
set have;
  nyes=count(catx('#',of q1 q2 q3 q4 q5 q6 ),'Yes');
  english = (nyes > 2) ;
run;

sas-innovate-wordmark-2025-midnight.png

Register Today!

Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.


Register now!

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 5 replies
  • 2145 views
  • 0 likes
  • 4 in conversation