Hello,
I have a survey dataset that includes a scale from 0 to 3, where 3 is the most and 0 is the least. However, for certain questions in the dataset, the answers were wrongly recorded on the assumption that 0 is the most and 3 is the least. There are ultimately 15 variables with the wrong answers, and I have to correct them.
So far I have basic if-then statements that correct the scores for one variable called Q2V1 (Question 2, Visit 1). But I want to include all the other question variables in the same statement if possible.
Data CorrectedQuestion;
Set Survey;
if Q2V1 = 0 then Q2V1 = 3;
if Q2V1 = 1 then Q2V1 = 2;
if Q2V1 = 2 then Q2V1 = 1;
if Q2V1 = 3 then Q2V1 = 0;
if Q2V1 = -99 then Q2V1 = .;
run;
Essentially, I want to say, if any of these variables = x , then any of these variables = y.
Thank you!
-E
The tool for repeating the same operations with multiple variables is array processing. An array is a way to reference each variable using an index value and a list of the variables.
Data CorrectedQuestion;
Set Survey;
array q Q2V1 ; /*list of other variables to process the same way goes after Q2v1*/
do i=1 to dim(q);
if Q[i] = 0 then Q[i] = 3;
if Q[i] = 1 then Q[i] = 2;
if Q[i] = 2 then Q[i] = 1;
if Q[i] = 3 then Q[i] = 0;
if Q[i] = -99 then Q[i] = .;
end;
drop i;
run;
After "array q", list every variable you want processed, separated by spaces. q is the name of the array. Its elements are addressed by position: q[1] refers to the first variable listed, and the index can also be a variable holding a number between 1 and the number of variables assigned to the array.
The variable i is the index. The DIM function returns the number of elements that have been assigned to the array q.
It worked perfectly, thank you!
No, it didn't. It ran without errors, but it gave you the same incorrect result as your original program. You need to either add the word "else" to each statement after the first, or switch to @Reeza's solution.
With the logic you started with, you will change all the "0" values to "3", and then change them back to "0" again. Take a look at the final result you get ... no instances of "2", and no instances of "3". Doesn't that look wrong?
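To make the "else" fix concrete, here is a minimal sketch of the array loop with ELSE added. The small Survey test dataset and the second variable Q3V1 are made up for illustration; substitute your real data and your real list of 15 variables.

```sas
/* Made-up test data; Q3V1 is a hypothetical second question variable */
data Survey;
  input Q2V1 Q3V1;
  datalines;
0 3
1 2
2 1
3 0
-99 -99
;
run;

data CorrectedQuestion;
  set Survey;
  array q Q2V1 Q3V1; /* list all variables to reverse here */
  do i = 1 to dim(q);
    /* ELSE stops the chain after the first match, so a value
       that was just recoded is not tested and recoded again */
    if q[i] = 0 then q[i] = 3;
    else if q[i] = 1 then q[i] = 2;
    else if q[i] = 2 then q[i] = 1;
    else if q[i] = 3 then q[i] = 0;
    else if q[i] = -99 then q[i] = .;
  end;
  drop i;
run;
```

With this version each original value is recoded exactly once: 0 becomes 3, 1 becomes 2, 2 becomes 1, 3 becomes 0, and -99 becomes missing.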
You can use an array here:
data correctedQuestions;
set Survey;
array quest(*) Q2V1 /* list the other variables to be corrected here */;
do i=1 to dim(quest); *loop over questions;
if quest(i) = -99 then quest(i) = .;
else quest(i) = 3 - quest(i); *can avoid if/else with formula;
end;
run;
Recoding values is often done using Formats.
proc format;
value recodeItems
0 = 3
1 = 2
2 = 1
3 = 0
other=.
;
run;
Data CorrectedQuestion(drop=_:);
Set Survey;
array q Q2V1 ; /*list of other variables to process the same way goes after Q2v1*/
do _i=1 to dim(q);
q[_i]=input(put(q[_i],recodeItems.),8.); /* PUT returns text; INPUT converts it back to a number */
end;
run;
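Whichever approach you use, one way to confirm the recode is to cross-tabulate the original values against the new ones. The sketch below uses a made-up single-variable Survey dataset and a hypothetical copy variable named Q2V1_orig; the recode itself uses the 3 - x formula from the earlier reply.

```sas
/* Made-up test data for illustration only */
data Survey;
  input Q2V1;
  datalines;
0
1
2
3
-99
;
run;

data Check;
  set Survey;
  Q2V1_orig = Q2V1;             /* hypothetical copy of the original value */
  if Q2V1 = -99 then Q2V1 = .;  /* -99 is the missing-value code */
  else Q2V1 = 3 - Q2V1;         /* reverse the 0-3 scale */
run;

/* Each original value should map to exactly one new value */
proc freq data=Check;
  tables Q2V1_orig * Q2V1 / missing norow nocol nopercent;
run;
```

In the resulting table every row should have a single nonzero cell. With the sequential IF logic from the start of the thread, the columns for 2 and 3 would be empty.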
Thank you, this fully corrected the conversion problem.