I have some data that I'd like to standardize but can i also set a minimum and/or maximum value? Let's say in my example below that i wanted the minimum to be at least 0.25, is that possible?
proc standard data =sashelp.Baseball out=Baseball_Normalized
mean=0 std=1;
var CrAtBat;
run;
Hi @SAShole,
There is no linear transformation that satisfies all three conditions. So, if the condition "transformed values sum to the number of observations" is mandatory, you can obtain either a minimum of .25 or a maximum of 4:
data trans1;
set have;
a=0.75/1321.18;
b=1-1333.18*a;
HH_Weight_t=a*HH_Weight+b;
run;
proc means data=trans1 sum min max;
var HH_Weight_t;
run;
Result:
Sum Minimum Maximum -------------------------------------------- 50.0000000 0.2500000 1.6538587 --------------------------------------------
Same with a=3/1151.82 (and again b=1-1333.18*a):
Sum Minimum Maximum -------------------------------------------- 50.0000000 -2.4411106 4.0000000 --------------------------------------------
Or a compromise between these two (using least squares)? (a=0.0014472809)
Sum Minimum Maximum -------------------------------------------- 50.0000000 -0.9121186 2.6670071 --------------------------------------------
However, there are non-linear transformations which satisfy all three conditions. Would this make sense? Unfortunately, there is no suitable quadratic transformation (t(x)=ax²+bx+c) -- in spite of three free parameters. Reason: It would not be monotone, hence there would be no unique inverse transformation!
But how about an exponential transformation (t(x)=a exp(bx)+c), which is of course monotone?
data trans;
retain a 0.01556848794537
b 2.20865846733238e-3
c 0.2340133696564;
set have;
HH_Weight_t=a*exp(b*HH_Weight)+c;
run;
Result of PROC MEANS:
Sum Minimum Maximum -------------------------------------------- 50.0000000 0.2500000 4.0000000 --------------------------------------------
PROC STDIZE may have a method that will suite you.
@SAShole wrote:
I have some data that I'd like to standardize but can i also set a minimum and/or maximum value? Let's say in my example below that i wanted the minimum to be at least 0.25, is that possible?
proc standard data =sashelp.Baseball out=Baseball_Normalized mean=0 std=1; var CrAtBat; run;
While I think I understand the request when you want the MAXIMUM to be 0.25, and there is the MAXABS method of PROC STDIZE (which is a form of standardizing, but it might not have mean zero and would not have a variance of 1).
However, you specified you want the MINIMUM to be 0.25, and that doesn't make any mathematical sense to me. Minimum of what? Can you explain further, and even provide a small example?
I have survey weights that I need to transform such that they sum to the number of observations and have a minimum of .25 and a maximum of 4 (if possible). Here is a small example of my dataset:
data Have;
input HH_Weight 8.;
datalines;
213
2161
973
1870
1335
2087
2002
312
2076
2226
484
669
1190
972
711
1874
1773
2291
2213
1582
1584
1581
2182
2329
382
158
1567
183
1171
2485
12
1283
1008
1283
1631
1206
2399
871
1892
1475
876
1373
2357
285
971
229
545
1800
1137
1390
run;
@SAShole wrote:
I have survey weights that I need to transform such that they sum to the number of observations and have a minimum of .25 and a maximum of 4 (if possible). Here is a small example of my dataset:
data Have; input HH_Weight 8.; datalines; 213 2161 973 1870 1335 2087 2002 312 2076 2226 484 669 1190 972 711 1874 1773 2291 2213 1582 1584 1581 2182 2329 382 158 1567 183 1171 2485 12 1283 1008 1283 1631 1206 2399 871 1892 1475 876 1373 2357 285 971 229 545 1800 1137 1390 run;
Okay, this isn't really standardizing, is it? And I'm not sure you can get all of the conditions you specify to hold at the same time, min of 0.25, max of 4 and sum to the number of observations. Has this actually been done in other applications, getting all three of the conditions to hold at the same time, or is it something you have decided you need for this application?
The part where you want them to sum to the total number of observations is relatively easy. I am skeptical that you can then take the result and set the min to 0.25 and max to 4 and get the same sum.
To get them to sum to the number of observations, count the number of observatrions (let's call this N) and find the sum of all observations (let's call this SUM), and then each value would be multiplied by (N/SUM). After that, I don't see a path forward.
Hi @SAShole,
There is no linear transformation that satisfies all three conditions. So, if the condition "transformed values sum to the number of observations" is mandatory, you can obtain either a minimum of .25 or a maximum of 4:
data trans1;
set have;
a=0.75/1321.18;
b=1-1333.18*a;
HH_Weight_t=a*HH_Weight+b;
run;
proc means data=trans1 sum min max;
var HH_Weight_t;
run;
Result:
Sum Minimum Maximum -------------------------------------------- 50.0000000 0.2500000 1.6538587 --------------------------------------------
Same with a=3/1151.82 (and again b=1-1333.18*a):
Sum Minimum Maximum -------------------------------------------- 50.0000000 -2.4411106 4.0000000 --------------------------------------------
Or a compromise between these two (using least squares)? (a=0.0014472809)
Sum Minimum Maximum -------------------------------------------- 50.0000000 -0.9121186 2.6670071 --------------------------------------------
However, there are non-linear transformations which satisfy all three conditions. Would this make sense? Unfortunately, there is no suitable quadratic transformation (t(x)=ax²+bx+c) -- in spite of three free parameters. Reason: It would not be monotone, hence there would be no unique inverse transformation!
But how about an exponential transformation (t(x)=a exp(bx)+c), which is of course monotone?
data trans;
retain a 0.01556848794537
b 2.20865846733238e-3
c 0.2340133696564;
set have;
HH_Weight_t=a*exp(b*HH_Weight)+c;
run;
Result of PROC MEANS:
Sum Minimum Maximum -------------------------------------------- 50.0000000 0.2500000 4.0000000 --------------------------------------------
@SAShole wrote:
I have some data that I'd like to standardize but can i also set a minimum and/or maximum value? Let's say in my example below that i wanted the minimum to be at least 0.25, is that possible?
proc standard data =sashelp.Baseball out=Baseball_Normalized mean=0 std=1; var CrAtBat; run;
I am wondering what you will do with the resulting values.
Are you looking for a sampling probability?
Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.
Register today!ANOVA, or Analysis Of Variance, is used to compare the averages or means of two or more populations to better understand how they differ. Watch this tutorial for more.
Find more tutorials on the SAS Users YouTube channel.