
Multiple where-clauses in correlation analysis

Occasional Learner
Posts: 1


Hey Guys, 


I'm new to SAS and am currently trying to examine correlations between answers to survey questions where participants were asked to position themselves on a seven-point scale. In this survey, values above 7 were reserved for things like "don't know" or "don't care", which I obviously want to exclude from the correlation analysis.


But if I'm not mistaken, SAS Studio doesn't let me include multiple where-clauses in a correlation analysis. I already tried putting them directly into the code, but then the log showed that only the last one had been included in the calculation. Does anybody have a solution?


Thanks and best regards, Robert 

Grand Advisor
Posts: 9,719

Re: Multiple where-clauses in correlation analysis

If this is a question about "where clauses", you should show what you have been attempting and what the clauses look like.
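
Without seeing the actual code this is only a guess, but the log message described would fit the usual SAS behavior: a second WHERE statement in the same step replaces the first one unless it is written as WHERE ALSO. A minimal sketch, assuming a dataset called survey with items q1 and q2 (both names are hypothetical):

```sas
/* Option 1: put every condition into a single WHERE statement */
proc corr data=survey;
   where q1 <= 7 and q2 <= 7;
   var q1 q2;
run;

/* Option 2: WHERE ALSO adds to the previous WHERE instead of replacing it */
proc corr data=survey;
   where q1 <= 7;
   where also q2 <= 7;
   var q1 q2;
run;
```

Keep in mind that WHERE drops whole rows, so a respondent who answered "don't know" on one item is removed from every correlation in that step, not only from the pairs involving that item.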


However, with survey data in the situation you describe, it is very common to create recoded variables in which the "don't know", "refused", or "not applicable" type of answers are set to missing, so that they are excluded from analysis of the scales.


In code terms, if you have many variables with similar response ranges to recode, you can use arrays:

data want;   /* "have" and "want" are placeholder dataset names */
   set have;

   /* q1 q2 q5 represent the names of the variables you want to recode.
      You may be able to reference them as q1-q10 if the ones you want
      have numeric suffixes and are in order. */
   array q  q1 q2 q5;

   /* These variables will hold the results of the recodes. The list should
      look exactly like the one in q but with a change to the names. You will
      discover that prefixing is preferred to suffixing, as lists like
      q1r - q10r do not behave as desired. */
   array r  rq1 rq2 rq5;

   /* The loop assumes your scale runs 1 to 7 and the values to remove are
      greater than 7. Any comparison that works is okay, such as
      if q[i] in (10, 12, 0) then r[i] = .; */
   do i = 1 to dim(q);
      if q[i] > 7 then r[i] = .;
      else r[i] = q[i];
   end;

   drop i;
run;



I recommend adding a label to your recoded variables.
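
To show where the labels and the correlation step would fit, here is a self-contained sketch. The sample data, the dataset names have and want, and the label text are all made up for illustration; 8 and 9 stand in for "don't know" style answers.

```sas
/* Hypothetical sample data: values above 7 are non-substantive answers */
data have;
   input q1 q2 q5;
   datalines;
1 2 3
4 8 5
7 6 9
3 3 2
5 4 6
;
run;

data want;
   set have;
   array q  q1 q2 q5;
   array r  rq1 rq2 rq5;
   do i = 1 to dim(q);
      if q[i] > 7 then r[i] = .;
      else r[i] = q[i];
   end;
   drop i;
   /* Labels document what was done to each recoded variable */
   label rq1 = "Q1, values above 7 set to missing"
         rq2 = "Q2, values above 7 set to missing"
         rq5 = "Q5, values above 7 set to missing";
run;

/* By default PROC CORR excludes missing values pair by pair;
   add the NOMISS option if you want only complete cases used */
proc corr data=want;
   var rq1 rq2 rq5;
run;
```

No WHERE statement is needed here, because the unwanted answers are already missing in the recoded variables.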


For some purposes you could recode to special missing values instead, with a custom format that indicates the original reason the data was set to missing. Be aware, though, that special missing values may be included in some places (PROC FREQ output, for example) where you may not want them.

Respected Advisor
Posts: 4,641

Re: Multiple where-clauses in correlation analysis

To me, one of the most important ideas is included in ballardw's response but not explained enough.  Consider this variation on the array processing:


array q [3]  q1 q2 q5;

do i = 1 to 3;

   if q[i] = 8 then q[i] = .A;

   else if q[i] = 9 then q[i] = .B;

   else if q[i] = 10 then q[i] = .C;

end;



The idea is that you can afford to permanently change your existing data ... IF you can still distinguish among the answers greater than 7 and identify what they were originally.  Special missing values let you do that.  Once you have saved special missing values instead of the original values, procedures will automatically treat those values differently than they would nonmissing values.
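
A small end-to-end sketch of this idea follows. The sample data, the dataset names, the format name scalefmt, and the meanings attached to 8, 9, and 10 are assumptions for illustration only; substitute whatever your codebook says.

```sas
/* Hypothetical meanings for the special missing values */
proc format;
   value scalefmt
      .A = "Don't know"
      .B = "Don't care"
      .C = "Refused";
run;

/* Hypothetical sample data containing the codes 8, 9 and 10 */
data have;
   input q1 q2 q5;
   datalines;
1 2 3
4 8 5
7 6 9
3 10 2
5 4 6
;
run;

data want;
   set have;
   array q [3] q1 q2 q5;
   do i = 1 to dim(q);
      if q[i] = 8 then q[i] = .A;
      else if q[i] = 9 then q[i] = .B;
      else if q[i] = 10 then q[i] = .C;
   end;
   drop i;
   format q1 q2 q5 scalefmt.;
run;

/* The MISSING option lists each special missing value as its own row,
   so the original reason is still visible */
proc freq data=want;
   tables q2 / missing;
run;

/* Special missing values are treated as missing, so they drop out here */
proc corr data=want;
   var q1 q2 q5;
run;
```

Because the original variables are overwritten, it is worth keeping an untouched copy of the raw data in case the recoding rules change later.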
