BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Rick79
Calcite | Level 5

Hi. I am very new to SAS.I’m working on a project which involves recoding a large amount of evaluation survey questions, scoring them and then aggregating the scores.

In the dataset below thereare the values of 1,2,3,4. “4” means the question was not applicable in those cases. So, for var1 there are 9 cases applicable, for var2 there are 7 and var3all 10 cases are applicable. 

data have;

  infile datalines dlm=',' dsd;

  input Record $ var1 var2 var3;

datalines;

"Record1", 1,2,2

"Record2 ",1,2,2

"Record3 ",1,2,2

"Record4 ",2,3,3

"Record5 ",1,1,1

" Record6",3,3,3

" Record7",1,3,3

" Record8",3,4,1

" Record9",1,4,1

" Record10",4,4,3

;

run;

Questions are weighted differently –there are three categories 1 point, 2 point, 3 point questions. I recode the questions so the responses are reflective of their point value. 

Initial codeing:


1=YES

2=NO

3=Incomplete

4=N/A

Recode:

data new;

set old;

if var1=1 then Var1New= 1; else;

if var1=2 then Var1New= 0; else;

if var1=3 then Var1New=.5; else;

if var1=4 then Var1New=.;else

if var2=1 then Var2New= 2; else;

if var2=2 then Var2New= 0; else;

if var2=3 then Var2New=1; else;

if var2=4 then Var2New=.;else

if var3=1 then Var3new=3; else;

if var3=2 then Var3new=0; else;

if var3=3 then Var3new=1.5; else;

if var3=4 then Var3new=.; else;

run;

and this is where my SAS knowledge gets fuzzy (again, i am a SAS beginner). I need to award each Var a score and then aggregate them for a final score. The math should look something like this table:

 

Question Point Value

 

Records Applicable

 

Highest possible Score

 

Points Awarded

 

SCORE

 

VAR1

 

1

 

9

 

1X9 records=9

 

7

 

77.7%

 

VAR2

 

2

 

7

 

2X7 records=14

 

5

 

35.7%

 

VAR3

 

3

 

10

 

3X10 records=30

 

16.5

 

55%

 

And then total score is 28.5/ 53 = TOTAL SCORE  53.7%

So, my questions are: how do you write code to do the equations and is there a way that you can recode and perform the calculations on a large group of variables at once?

I thank you in advance.

1 ACCEPTED SOLUTION

Accepted Solutions
PGStats
Opal | Level 21

I think there is a typo in the var3 score, it should be 15, not 16.5.

data have;
  infile datalines dlm=',' dsd;
  input Record $ var1 var2 var3;
datalines;
"Record1", 1,2,2
"Record2 ",1,2,2
"Record3 ",1,2,2
"Record4 ",2,3,3
"Record5 ",1,1,1
"Record6",3,3,3
"Record7",1,3,3
"Record8",3,4,1
"Record9",1,4,1
"Record10",4,4,3
;

data points;
input question $ response score;
datalines;
var1 1 1
var1 2 0
var1 3 0.5
var1 4 .
var2 1 2
var2 2 0
var2 3 1
var2 4 .
var3 1 3
var3 2 0
var3 3 1.5
var3 4 .
;

proc transpose data=have out=temp name=question;
var var1-var3;
by record notsorted;
run;

proc sql;
create table questionScores as
select T.question label="Question",
count(score) as applicable, M.scoreMax*calculated applicable as highestPossible,
sum(score) as pointsAwarded, calculated pointsAwarded/calculated highestPossible as questionScore
from temp as T, points as P, (select question, max(score) as scoreMax from points group by question) as M
where T.question=P.question and T.COL1=P.response and T.question=M.question
group by T.question, M.scoreMax;
create table totalScore as
select sum(pointsAwarded) as totalAwarded, sum(highestPossible) as totalPossible,
calculated totalAwarded / calculated totalPossible as score
from questionScores;
quit;

proc sql;
title "Question scores";
select * from questionScores;
title "Total score";
select * from totalScore;
quit;

PG

PG

View solution in original post

7 REPLIES 7
art297
Opal | Level 21

One slight change to your coding.  You wrote:

data new;

set old;

if var1=1 then Var1New= 1; else;

if var1=2 then Var1New= 0; else;

if var1=3 then Var1New=.5; else;

if var1=4 then Var1New=.;else

if var2=1 then Var2New= 2; else;

if var2=2 then Var2New= 0; else;

if var2=3 then Var2New=1; else;

if var2=4 then Var2New=.;else

if var3=1 then Var3new=3; else;

if var3=2 then Var3new=0; else;

if var3=3 then Var3new=1.5; else;

if var3=4 then Var3new=.; else;

run;

What would work is:

data new;

  set old;

  if var1=1 then Var1New= 1;

   else if var1=2 then Var1New= 0;

   else if var1=3 then Var1New=.5;

   else if var1=4 then Var1New=.;

  if var2=1 then Var2New= 2;

   else if var2=2 then Var2New= 0;

   else if var2=3 then Var2New=1;

   else if var2=4 then Var2New=.;

  if var3=1 then Var3new=3;

   else if var3=2 then Var3new=0;

   else if var3=3 then Var3new=1.5;

   else if var3=4 then Var3new=.;

run;

Are the ten records your entire dataset or does every subject have a set of 10 records?  And, if it is the latter, will there always be 10 records per subject and will they always be ordered 1 thru 10?  And, if there are a number of subjects is there, or should there be, an ID field?

Rick79
Calcite | Level 5

Thanks! In this case yes, the 10 records are the entire dataset. You could say the record variable is the ID variable for each case.

Thanks.

PGStats
Opal | Level 21

I think there is a typo in the var3 score, it should be 15, not 16.5.

data have;
  infile datalines dlm=',' dsd;
  input Record $ var1 var2 var3;
datalines;
"Record1", 1,2,2
"Record2 ",1,2,2
"Record3 ",1,2,2
"Record4 ",2,3,3
"Record5 ",1,1,1
"Record6",3,3,3
"Record7",1,3,3
"Record8",3,4,1
"Record9",1,4,1
"Record10",4,4,3
;

data points;
input question $ response score;
datalines;
var1 1 1
var1 2 0
var1 3 0.5
var1 4 .
var2 1 2
var2 2 0
var2 3 1
var2 4 .
var3 1 3
var3 2 0
var3 3 1.5
var3 4 .
;

proc transpose data=have out=temp name=question;
var var1-var3;
by record notsorted;
run;

proc sql;
create table questionScores as
select T.question label="Question",
count(score) as applicable, M.scoreMax*calculated applicable as highestPossible,
sum(score) as pointsAwarded, calculated pointsAwarded/calculated highestPossible as questionScore
from temp as T, points as P, (select question, max(score) as scoreMax from points group by question) as M
where T.question=P.question and T.COL1=P.response and T.question=M.question
group by T.question, M.scoreMax;
create table totalScore as
select sum(pointsAwarded) as totalAwarded, sum(highestPossible) as totalPossible,
calculated totalAwarded / calculated totalPossible as score
from questionScores;
quit;

proc sql;
title "Question scores";
select * from questionScores;
title "Total score";
select * from totalScore;
quit;

PG

PG
Rick79
Calcite | Level 5

Thanks so much! This works great.

Astounding
PROC Star

One more small point to consider ...

It is possible that your variables contain missing values BEFORE you change their values.  If you change "4" to missing, you won't be able to distinguish between values that were missing originally and values that were "4" originally.  SAS provides a neat way around that.  It allows special missing values, for example:

else if var1=4 then var1new=.A;

There are 27 special missing values (.A through .Z, and ._).  They would all be thrown out of the calculations by PROC MEANS (or whatever you use for subsequent processing), but could be distinguished one from another.

If you refer to them in your code, the sorted order is ._ then . then .A through .Z, so you might have to change the way you examine for missing values.  Instead of:

if var1new = . then ...

you would need to code:

if var1new <= .Z then ...

And instead of:

if var1new > . then ...

you would need to code:

if var1new > .Z then ...

Good luck.

Rick79
Calcite | Level 5

Hi there--

You made a very good point regarding missing values. Right now, my office has a very poor way of handling them in or surveys. Do you have/or know of and very eay to understand literature regarding SAS special missing values I have been looking online and I can't find a lot.

Thanks!

Astounding
PROC Star

Hmmm ... not sure where to find this online.  But what I posted might just be enough.  Experiment with it a bit and it will probably become clear.  Try running some PROC FREQs with the MISSING option on the TABLES statement.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 7 replies
  • 1121 views
  • 0 likes
  • 4 in conversation