BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Abdulla1
Quartz | Level 8

Greatings

I have a survey data with responses that have questions about how important is...? do you agree on.. ?  what is the chance of...? and the answerers for those questions vary between 1 which is equivalent to very important/ strong agree/ high chance. and either3 or 7 which indicates the opposite (strong disagree/not important/strong disagree/no chance) 

the survey output has also indicated individuals who left a specific question blank or didn't take the survey. as shown below

Abdulla1_0-1646159859115.png

I want to convert those data into numeric numbers such that if the data starts with 1 it returns 1. while if it starts with 9 or 8 it returns nothing (blank field) and if it doesn't start with a number then it returns 999.

I have the following data code for simplification and training purposes 

data mysurvay;

data mysurvay;
input q1 $ q2 $ q3 $; *my actual data has over 100 question;
DATALINES;
1_strogly_agree	4_not_important 		3_very_low
1_strogly_agree	3_somewhat_importatnt	9_left_blank
2_agree			9_left_blank			9_left_blank
4_disagree		2_important			    normal
9_left_blank    2_important			    1_high
agree			3_somewhat_importatnt   2_normal
;
run;

expected output

q1numaric q2numaric q3numaric
1 4 3
1 3 .
2 . .
4 2 999
. 2 1
999 3 2

 

1 ACCEPTED SOLUTION

Accepted Solutions
HB
Barite | Level 11 HB
Barite | Level 11

 

This code:

data want;
	set mysurvey;

	q1numeric = 0;
	if anydigit(q1) = 0 then q1numeric = 999;
	if anydigit(q1) > 0 then q1numeric = substr(q1, 1, 1);
	if q1numeric in (8,9) then q1numeric = .;

	q2numeric = 0;
	if anydigit(q2) = 0 then q2numeric = 999;
	if anydigit(q2) > 0 then q2numeric = substr(q2, 1, 1);
	if q2numeric in (8,9) then q2numeric = .;

	q3numeric = 0;
	if anydigit(q3) = 0 then q3numeric = 999;
	if anydigit(q3) > 0 then q3numeric = substr(q3, 1, 1);
	if q3numeric in (8,9) then q3numeric = .;

run;

is a brute force way to generate the desired output for this little test data set.

 

For the real deal implementation, I would make this a macro or a do loop to run though the 4 lines for each variable. 

 

 

Edit:

 

here might be the macro:

 

%macro parseme;
data want;
	set mysurvey;
	
	%do i = 1 %to 3;

	q&i.numeric = 0;
	if anydigit(q&i.) = 0 then q&i.numeric = 999;
	if anydigit(q&i.) > 0 then q&i.numeric = substr(q&i., 1, 1);
	if q&i.numeric in (8,9) then q&i.numeric = .;

drop q&i.;
%end; run; %mend parseme; %parseme

View solution in original post

5 REPLIES 5
HB
Barite | Level 11 HB
Barite | Level 11

 

This code:

data want;
	set mysurvey;

	q1numeric = 0;
	if anydigit(q1) = 0 then q1numeric = 999;
	if anydigit(q1) > 0 then q1numeric = substr(q1, 1, 1);
	if q1numeric in (8,9) then q1numeric = .;

	q2numeric = 0;
	if anydigit(q2) = 0 then q2numeric = 999;
	if anydigit(q2) > 0 then q2numeric = substr(q2, 1, 1);
	if q2numeric in (8,9) then q2numeric = .;

	q3numeric = 0;
	if anydigit(q3) = 0 then q3numeric = 999;
	if anydigit(q3) > 0 then q3numeric = substr(q3, 1, 1);
	if q3numeric in (8,9) then q3numeric = .;

run;

is a brute force way to generate the desired output for this little test data set.

 

For the real deal implementation, I would make this a macro or a do loop to run though the 4 lines for each variable. 

 

 

Edit:

 

here might be the macro:

 

%macro parseme;
data want;
	set mysurvey;
	
	%do i = 1 %to 3;

	q&i.numeric = 0;
	if anydigit(q&i.) = 0 then q&i.numeric = 999;
	if anydigit(q&i.) > 0 then q&i.numeric = substr(q&i., 1, 1);
	if q&i.numeric in (8,9) then q&i.numeric = .;

drop q&i.;
%end; run; %mend parseme; %parseme
Abdulla1
Quartz | Level 8

This worked perfectly, thanks. I didn't know about the existence of substr function.

 

Abdulla1
Quartz | Level 8

Thank you very much for adding the Macro function. I really appreciate your help. I did a slight adjusted to your code so that it can deal with what ever the variable name format was. just that you know the code that I used is below: 

thank you again for your help

proc sql noprint;
  /*count all columns in all the data sets in the given library*/
    select left(put(count(*),8.)) 
    	into :numds
    from dictionary.columns
    where libname='WORK' and upcase(memname)=upcase('mysurvey');
  /*store all columns names as macro variables cl1 - cl&numds*/
    select name 
      into :cl1 - :cl&numds 
    from dictionary.columns
    where libname='WORK' and upcase(memname)=upcase('mysurvey');	
quit;
quit;
%put &numds;
%macro DO_BRANCH;
	data want;
		set mysurvey;
		%do i = 1 %to &numds;
			%let nana=&&cl&i;/*nana is the columns name*/
			if &nana='' then &nana.numeric = .;
			if anydigit(substr(&nana., 1, 1))= 0 and not(&nana='') then &nana.numeric = 999;/*if columns doesn't start with a number, write 999*/
				 else &nana.numeric = substr(&nana., 1, 1);/*if columns doesn't start with a number, write it*/
			if &nana.numeric in (8,9) then &nana.numeric = .;/*if columns value is either 9 or 8 write "."*/
		%end;
	run;
%mend DO_BRANCH;
%DO_BRANCH;
ballardw
Super User

Style choice: post text as text, open a text box on the forum using the </> and post the text.

 

If your source file is text I would read the data with custom informats to deal with each question properly.

Typically I would create a separate informat for each of the sets of like answers, such as Importance or an Agreement scale and read the appropriate variables with a matching informat. Depending on why you are want "999" for a value you might be better off with a special missing, a dot and letter or underscore such as .M to indicate categories of missing. That way you don't have to worry about accidentally leaving that 999 in when calculating average response or similar.

 

Here is an ugly example with your example code and data:

proc format;
invalue qanswer
"1_high"          =1
"1_strogly_agree" =1
"2_agree"         =2
"2_important"     =2
"2_normal"        =2
"3_somewhat_importatnt"=3
"3_very_low"      =3
"4_disagree"      =4
"4_not_important" =4		
"9_left_blank"    = .
other             =999
;

data mysurvay;
input q1 :qanswer. q2 :qanswer.  q3 :qanswer. ;
DATALINES;
1_strogly_agree	4_not_important 		3_very_low
1_strogly_agree	3_somewhat_importatnt	9_left_blank
2_agree			9_left_blank			9_left_blank
4_disagree		2_important			    normal
9_left_blank    2_important			    1_high
agree			3_somewhat_importatnt   2_normal
;
run;


Instead of Other=999 you could use one of the special missing like Other = .M. Or the optional _error_, Other=_error_ which will show in the log as invalid data. This is the way I read the first set of data with the expected text and the _error_ catches things that may be misspelled such as your "1_strogly_agree" that probably should have been "1_strongly_agree".

 

The log will show the data row read and notes about which variables are in error.

 

Abdulla1
Quartz | Level 8

that is a great way of dealing with the problem.  and thank you for the advices on the 999 -which I added to indicate that I need to review why it is there-. your method can can detect variations resulted from different survey editions during the years it was issued. however, may you help me with the following 2 questions about it:

1- if I have say CSV file or xlsx and the data include name phone_number  and then the survey responses (say q1 through q3) how can I change the format of such file so that it will only affect the survey responses 

2- if I used _error_  I cannot see it in the datafile resulted. so how can I detect those later?

 

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 5 replies
  • 2048 views
  • 5 likes
  • 3 in conversation