Pritish
Quartz | Level 8

Hi,

I am trying to calculate z-scores for the variables in my dataset using PROC STANDARD. Each column has a different mean and standard deviation, so my question is: should I use one common mean and standard deviation for all of them, or should I standardize each variable separately?

In terms of code:

PROC STANDARD DATA = X MEAN = 0 STD = 1 OUT = ZSCORE;
     VAR
     A /* it has a mean of 5 and std of 5 */
     B /* it has a mean of 500 and std of 7 */
     C /* it has a mean of 900 and std of 1000 */
     ;
run;

Or should I use this approach?

PROC STANDARD DATA = X MEAN = 5 STD = 5 OUT = ZSCORE_a;
     VAR A; /* it has a mean of 5 and std of 5 */
run;

PROC STANDARD DATA = X MEAN = 500 STD = 7 OUT = ZSCORE_b;
     VAR B; /* it has a mean of 500 and std of 7 */
run;

PROC STANDARD DATA = X MEAN = 900 STD = 1000 OUT = ZSCORE_c;
     VAR C; /* it has a mean of 900 and std of 1000 */
run;

and then merge all the columns?

I really appreciate your time and guidance.

Thanks!

SteveDenham
Jade | Level 19

Your first block of code will standardize all three variables to a mean of 0 and a standard deviation of 1. That is a z score. None of the other code blocks will give z scores. Because they standardize each variable to its own sample mean and standard deviation, they return scaled scores that look very much like the raw scores.
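
A short worked equation may make this concrete. It assumes the usual PROC STANDARD transformation; the symbols are introduced here for illustration only: M and S are the MEAN= and STD= target values, and \bar{x} and s are the variable's own sample mean and standard deviation.

```latex
\begin{aligned}
x_i' &= M + S \cdot \frac{x_i - \bar{x}}{s} \\[4pt]
M = 0,\ S = 1: \quad x_i' &= \frac{x_i - \bar{x}}{s} = z_i \\[4pt]
M = \bar{x},\ S = s: \quad x_i' &= \bar{x} + s \cdot \frac{x_i - \bar{x}}{s} = x_i
\end{aligned}
```

For variable A, if its sample mean and std really are 5 and 5 as the code comment says, then MEAN=5 STD=5 gives x' = 5 + 5(x - 5)/5 = x, so the output is just the input again.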

Steve Denham

Pritish
Quartz | Level 8

Steve, thanks for your reply!

So would it be fair to standardize my data to a mean of 0 and a std of 1, even though all my variables have different means and stds? Or should I take some kind of average of the means and stds and plug that into my first block of code?

SteveDenham
Jade | Level 19 | Accepted solution

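If you want z scores, use the approach in your first block of code (MEAN=0 and STD=1 for all three variables).  The mean= and std= options give the TARGET values, not the values of your sample.  The procedure works out each variable's own sample mean and standard deviation itself, so there is no need to supply them or to average them across variables.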

Another approach is PROC STDIZE.  Something like this:

proc stdize data=X out=zscore sprefix=z_ oprefix=orig;
   var A B C;
run;

This will give an output dataset with the original variables prefixed with orig and the z scores prefixed with z_.
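
One way to confirm the result is to run PROC MEANS on the standardized columns. The following is only a sketch under stated assumptions: the toy dataset, the seed, and the sample size are made up for illustration, and METHOD=STD is spelled out so the intent (subtract the mean, divide by the standard deviation) is explicit.

```sas
/* Toy data, invented for illustration only: three variables on very different scales */
data X;
   call streaminit(12345);           /* arbitrary seed, so the run is repeatable */
   do i = 1 to 200;                  /* arbitrary sample size */
      A = rand('normal', 5, 5);
      B = rand('normal', 500, 7);
      C = rand('normal', 900, 1000);
      output;
   end;
   drop i;
run;

/* Standardize each variable by its own sample mean and standard deviation */
proc stdize data=X out=zscore method=std sprefix=z_ oprefix=orig;
   var A B C;
run;

/* Every z_ column should report a mean of 0 and a std of 1, up to rounding */
proc means data=zscore mean std maxdec=4;
   var z_: ;
run;
```

The same PROC MEANS check can be pointed at the OUT= dataset from the PROC STANDARD version (with var A B C), since that procedure overwrites the original columns rather than adding prefixed ones.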

I hope this helps.

Steve Denham

