## Recoding Variables and Arithmetic

Solved
Occasional Contributor
Posts: 9

# Recoding Variables and Arithmetic

Hi. I am very new to SAS.I’m working on a project which involves recoding a large amount of evaluation survey questions, scoring them and then aggregating the scores.

In the dataset below thereare the values of 1,2,3,4. “4” means the question was not applicable in those cases. So, for var1 there are 9 cases applicable, for var2 there are 7 and var3all 10 cases are applicable.

data have;

infile datalines dlm=',' dsd;

input Record \$ var1 var2 var3;

datalines;

"Record1", 1,2,2

"Record2 ",1,2,2

"Record3 ",1,2,2

"Record4 ",2,3,3

"Record5 ",1,1,1

" Record6",3,3,3

" Record7",1,3,3

" Record8",3,4,1

" Record9",1,4,1

" Record10",4,4,3

;

run;

Questions are weighted differently –there are three categories 1 point, 2 point, 3 point questions. I recode the questions so the responses are reflective of their point value.

Initial codeing:

1=YES

2=NO

3=Incomplete

4=N/A

Recode:

data new;

set old;

if var1=1 then Var1New= 1; else;

if var1=2 then Var1New= 0; else;

if var1=3 then Var1New=.5; else;

if var1=4 then Var1New=.;else

if var2=1 then Var2New= 2; else;

if var2=2 then Var2New= 0; else;

if var2=3 then Var2New=1; else;

if var2=4 then Var2New=.;else

if var3=1 then Var3new=3; else;

if var3=2 then Var3new=0; else;

if var3=3 then Var3new=1.5; else;

if var3=4 then Var3new=.; else;

run;

and this is where my SAS knowledge gets fuzzy (again, i am a SAS beginner). I need to award each Var a score and then aggregate them for a final score. The math should look something like this table:

 Question Point Value Records Applicable Highest possible Score Points Awarded SCORE VAR1 1 9 1X9 records=9 7 77.7% VAR2 2 7 2X7 records=14 5 35.7% VAR3 3 10 3X10 records=30 16.5 55%

And then total score is 28.5/ 53 = TOTAL SCORE  53.7%

So, my questions are: how do you write code to do the equations and is there a way that you can recode and perform the calculations on a large group of variables at once?

Accepted Solutions
Solution
‎03-04-2012 04:14 PM
Posts: 5,536

## Recoding Variables and Arithmetic

I think there is a typo in the var3 score, it should be 15, not 16.5.

data have;
infile datalines dlm=',' dsd;
input Record \$ var1 var2 var3;
datalines;
"Record1", 1,2,2
"Record2 ",1,2,2
"Record3 ",1,2,2
"Record4 ",2,3,3
"Record5 ",1,1,1
"Record6",3,3,3
"Record7",1,3,3
"Record8",3,4,1
"Record9",1,4,1
"Record10",4,4,3
;

data points;
input question \$ response score;
datalines;
var1 1 1
var1 2 0
var1 3 0.5
var1 4 .
var2 1 2
var2 2 0
var2 3 1
var2 4 .
var3 1 3
var3 2 0
var3 3 1.5
var3 4 .
;

proc transpose data=have out=temp name=question;
var var1-var3;
by record notsorted;
run;

proc sql;
create table questionScores as
select T.question label="Question",
count(score) as applicable, M.scoreMax*calculated applicable as highestPossible,
sum(score) as pointsAwarded, calculated pointsAwarded/calculated highestPossible as questionScore
from temp as T, points as P, (select question, max(score) as scoreMax from points group by question) as M
where T.question=P.question and T.COL1=P.response and T.question=M.question
group by T.question, M.scoreMax;
create table totalScore as
select sum(pointsAwarded) as totalAwarded, sum(highestPossible) as totalPossible,
calculated totalAwarded / calculated totalPossible as score
from questionScores;
quit;

proc sql;
title "Question scores";
select * from questionScores;
title "Total score";
select * from totalScore;
quit;

PG

PG

All Replies
PROC Star
Posts: 8,164

## Recoding Variables and Arithmetic

One slight change to your coding.  You wrote:

data new;

set old;

if var1=1 then Var1New= 1; else;

if var1=2 then Var1New= 0; else;

if var1=3 then Var1New=.5; else;

if var1=4 then Var1New=.;else

if var2=1 then Var2New= 2; else;

if var2=2 then Var2New= 0; else;

if var2=3 then Var2New=1; else;

if var2=4 then Var2New=.;else

if var3=1 then Var3new=3; else;

if var3=2 then Var3new=0; else;

if var3=3 then Var3new=1.5; else;

if var3=4 then Var3new=.; else;

run;

What would work is:

data new;

set old;

if var1=1 then Var1New= 1;

else if var1=2 then Var1New= 0;

else if var1=3 then Var1New=.5;

else if var1=4 then Var1New=.;

if var2=1 then Var2New= 2;

else if var2=2 then Var2New= 0;

else if var2=3 then Var2New=1;

else if var2=4 then Var2New=.;

if var3=1 then Var3new=3;

else if var3=2 then Var3new=0;

else if var3=3 then Var3new=1.5;

else if var3=4 then Var3new=.;

run;

Are the ten records your entire dataset or does every subject have a set of 10 records?  And, if it is the latter, will there always be 10 records per subject and will they always be ordered 1 thru 10?  And, if there are a number of subjects is there, or should there be, an ID field?

Occasional Contributor
Posts: 9

## Recoding Variables and Arithmetic

Thanks! In this case yes, the 10 records are the entire dataset. You could say the record variable is the ID variable for each case.

Thanks.

Solution
‎03-04-2012 04:14 PM
Posts: 5,536

## Recoding Variables and Arithmetic

I think there is a typo in the var3 score, it should be 15, not 16.5.

data have;
infile datalines dlm=',' dsd;
input Record \$ var1 var2 var3;
datalines;
"Record1", 1,2,2
"Record2 ",1,2,2
"Record3 ",1,2,2
"Record4 ",2,3,3
"Record5 ",1,1,1
"Record6",3,3,3
"Record7",1,3,3
"Record8",3,4,1
"Record9",1,4,1
"Record10",4,4,3
;

data points;
input question \$ response score;
datalines;
var1 1 1
var1 2 0
var1 3 0.5
var1 4 .
var2 1 2
var2 2 0
var2 3 1
var2 4 .
var3 1 3
var3 2 0
var3 3 1.5
var3 4 .
;

proc transpose data=have out=temp name=question;
var var1-var3;
by record notsorted;
run;

proc sql;
create table questionScores as
select T.question label="Question",
count(score) as applicable, M.scoreMax*calculated applicable as highestPossible,
sum(score) as pointsAwarded, calculated pointsAwarded/calculated highestPossible as questionScore
from temp as T, points as P, (select question, max(score) as scoreMax from points group by question) as M
where T.question=P.question and T.COL1=P.response and T.question=M.question
group by T.question, M.scoreMax;
create table totalScore as
select sum(pointsAwarded) as totalAwarded, sum(highestPossible) as totalPossible,
calculated totalAwarded / calculated totalPossible as score
from questionScores;
quit;

proc sql;
title "Question scores";
select * from questionScores;
title "Total score";
select * from totalScore;
quit;

PG

PG
Occasional Contributor
Posts: 9

## Recoding Variables and Arithmetic

Thanks so much! This works great.

Super User
Posts: 6,785

## Re: Recoding Variables and Arithmetic

One more small point to consider ...

It is possible that your variables contain missing values BEFORE you change their values.  If you change "4" to missing, you won't be able to distinguish between values that were missing originally and values that were "4" originally.  SAS provides a neat way around that.  It allows special missing values, for example:

else if var1=4 then var1new=.A;

There are 27 special missing values (.A through .Z, and ._).  They would all be thrown out of the calculations by PROC MEANS (or whatever you use for subsequent processing), but could be distinguished one from another.

If you refer to them in your code, the sorted order is ._ then . then .A through .Z, so you might have to change the way you examine for missing values.  Instead of:

if var1new = . then ...

you would need to code:

if var1new <= .Z then ...

if var1new > . then ...

you would need to code:

if var1new > .Z then ...

Good luck.

Occasional Contributor
Posts: 9

## Recoding Variables and Arithmetic

Hi there--

You made a very good point regarding missing values. Right now, my office has a very poor way of handling them in or surveys. Do you have/or know of and very eay to understand literature regarding SAS special missing values I have been looking online and I can't find a lot.

Thanks!

Super User
Posts: 6,785

## Recoding Variables and Arithmetic

Hmmm ... not sure where to find this online.  But what I posted might just be enough.  Experiment with it a bit and it will probably become clear.  Try running some PROC FREQs with the MISSING option on the TABLES statement.

🔒 This topic is solved and locked.