I want to standardize the variable "Value" so that it has a minimum value of X (e.g., -2) and a maximum value of Y (e.g., 10). All observations should be adjusted in the same way.
Below is the code I have written so far; it is based on a post by @Rick_SAS (link below):
DATA work.sample;
INPUT Institution $ Value;
DATALINES;
ABC 0.8
BCD 0.9
CDF 1.2
DEF -0.1
;
RUN;
*Normalize the data above to a max of 1 and min of 0 such that all other observations (non-max and non-min values) will also change;
proc stdize data=work.sample out=work.sample2 method=RANGE;
var Value;
run;
/*To change to a different range (e.g., [-1,1])*/
data work.sample3;
set work.sample2;
Value = 2*Value - 1;
run;
Question 1:
By adjusting the range, all of the non-boundary values will also be adjusted, right? For instance, if we have 3 observations (e.g., -10, 5, 10), then the median will also change in line with the adjustment of the lower and upper boundaries (which change to 0 and 1, respectively).
Question 2:
If I want an upper bound of 10 and a lower bound of 2, what change do I have to make in the last part of the code?
Link: https://communities.sas.com/t5/SAS-Procedures/scaling-variable-in-a-dataset/td-p/13294
Q1: Correct.
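As a quick check of Q1, here is a minimal sketch using the three example values from the question (-10, 5, 10). The data set names work.check and work.check_scaled are made up for illustration. METHOD=RANGE subtracts the minimum and divides by the range, so the middle value 5 should become (5 - (-10)) / (10 - (-10)) = 0.75, while -10 and 10 map to 0 and 1.

```sas
/* Hypothetical three-row data set, used only for this check */
data work.check;
input Value;
datalines;
-10
5
10
;
run;

/* METHOD=RANGE computes (Value - min) / (max - min) for each row */
proc stdize data=work.check out=work.check_scaled method=RANGE;
var Value;
run;

/* Print the rescaled values to inspect them */
proc print data=work.check_scaled;
run;
```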
Q2: You can use the MULT= and ADD= options to scale and translate directly in PROC STDIZE. So instead of STDIZE followed by a DATA step, you can get the same result as sample3 by using
proc stdize data=work.sample out=work.sample4 method=RANGE mult=2 add=-1;
var Value;
run;
In particular, if you want an upper bound of 10 and a lower bound of 2, then the scale of the data is (10-2)=8 and the offset is 2, so you can use
proc stdize data=work.sample out=work.sample5 method=RANGE mult=8 add=2;
var Value;
run;
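If it helps to see what MULT= and ADD= are doing, the same rescaling can be written out by hand as lower + (upper - lower) * (Value - min) / (max - min). The sketch below does that in PROC SQL for the bounds 2 and 10. The output name work.sample5_check is hypothetical, and the query relies on PROC SQL remerging the MIN and MAX summary statistics back onto each row (SAS writes a note about this to the log).

```sas
/* Manual equivalent of method=RANGE mult=8 add=2 (a sketch for comparison) */
proc sql;
create table work.sample5_check as
select Institution,
       /* lower + (upper - lower) * (Value - min) / (max - min) */
       2 + (10 - 2) * (Value - min(Value)) / (max(Value) - min(Value)) as Value
from work.sample;
quit;
```

For the sample data, DEF (the minimum, -0.1) should map to 2 and CDF (the maximum, 1.2) should map to 10, matching work.sample5 up to floating-point rounding.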
Perfect, thank you for this helpful reply. Lastly, I am wondering whether there is a way to use "proc stdize" so that one can specify the upper and lower bounds directly, without having to add / subtract or multiply / divide values. For instance, if one wants a lower bound of 1.61 and an upper bound of 3.14, would it be possible to specify these bounds directly?
%let lower = 1.61;
%let upper = 3.14;
%let range = %sysevalf(&upper - &lower);
proc stdize data=sample out=sample6 method=RANGE mult=&range add=&lower;
var Value;
run;
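To confirm the bounds landed where intended, one option (not part of the reply above, just a sanity check) is to run PROC MEANS on the output and look at the minimum and maximum:

```sas
/* Sanity check: report the min and max of the rescaled variable */
proc means data=sample6 min max maxdec=2;
var Value;
run;
```

The reported minimum and maximum should equal &lower and &upper, i.e. 1.61 and 3.14.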
This is exactly what I needed! Thank you very much, @Rick_SAS!