BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
SAShole
Pyrite | Level 9

I have some data that I'd like to standardize but can i also set a minimum and/or maximum value? Let's say in my example below that i wanted the minimum to be at least 0.25, is that possible?

 

proc standard data =sashelp.Baseball out=Baseball_Normalized 
mean=0 std=1;
	var CrAtBat;
run;

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
FreelanceReinh
Jade | Level 19

Hi @SAShole,

 

There is no linear transformation that satisfies all three conditions. So, if the condition "transformed values sum to the number of observations" is mandatory, you can obtain either a minimum of .25 or a maximum of 4:

data trans1;
set have;
a=0.75/1321.18;
b=1-1333.18*a;
HH_Weight_t=a*HH_Weight+b;
run;

proc means data=trans1 sum min max;
var HH_Weight_t;
run;

Result:

         Sum         Minimum         Maximum
--------------------------------------------
  50.0000000       0.2500000       1.6538587
--------------------------------------------

Same with a=3/1151.82 (and again b=1-1333.18*a):

         Sum         Minimum         Maximum
--------------------------------------------
  50.0000000      -2.4411106       4.0000000
--------------------------------------------

Or a compromise between these two (using least squares)? (a=0.0014472809)

         Sum         Minimum         Maximum
--------------------------------------------
  50.0000000      -0.9121186       2.6670071
--------------------------------------------

 

However, there are non-linear transformations which satisfy all three conditions. Would this make sense? Unfortunately, there is no suitable quadratic transformation (t(x)=ax²+bx+c) -- in spite of three free parameters. Reason: It would not be monotone, hence there would be no unique inverse transformation!

 

But how about an exponential transformation (t(x)=a exp(bx)+c), which is of course monotone?

data trans;
retain a 0.01556848794537
       b 2.20865846733238e-3
       c 0.2340133696564;
set have;
HH_Weight_t=a*exp(b*HH_Weight)+c;
run;

Result of PROC MEANS:

         Sum         Minimum         Maximum
--------------------------------------------
  50.0000000       0.2500000       4.0000000
--------------------------------------------

View solution in original post

6 REPLIES 6
data_null__
Jade | Level 19

PROC STDIZE may have a method that will suite you.

PaigeMiller
Diamond | Level 26

@SAShole wrote:

I have some data that I'd like to standardize but can i also set a minimum and/or maximum value? Let's say in my example below that i wanted the minimum to be at least 0.25, is that possible?

 

proc standard data =sashelp.Baseball out=Baseball_Normalized 
mean=0 std=1;
	var CrAtBat;
run;

 

 


While I think I understand the request when you want the MAXIMUM to be 0.25, and there is the MAXABS method of PROC STDIZE (which is a form of standardizing, but it might not have mean zero and would not have a variance of 1).

 

However, you specified you want the MINIMUM to be 0.25, and that doesn't make any mathematical sense to me. Minimum of what? Can you explain further, and even provide a small example?

--
Paige Miller
SAShole
Pyrite | Level 9

I have survey weights that I need to transform such  that they sum to the number of observations and have a minimum of .25 and a maximum of 4 (if possible). Here is a small example of my dataset:

 

			
data Have;
	input HH_Weight 8.;
	datalines;
213
2161
973
1870
1335
2087
2002
312
2076
2226
484
669
1190
972
711
1874
1773
2291
2213
1582
1584
1581
2182
2329
382
158
1567
183
1171
2485
12
1283
1008
1283
1631
1206
2399
871
1892
1475
876
1373
2357
285
971
229
545
1800
1137
1390
run;
PaigeMiller
Diamond | Level 26

@SAShole wrote:

I have survey weights that I need to transform such  that they sum to the number of observations and have a minimum of .25 and a maximum of 4 (if possible). Here is a small example of my dataset:

 

			
data Have;
	input HH_Weight 8.;
	datalines;
213
2161
973
1870
1335
2087
2002
312
2076
2226
484
669
1190
972
711
1874
1773
2291
2213
1582
1584
1581
2182
2329
382
158
1567
183
1171
2485
12
1283
1008
1283
1631
1206
2399
871
1892
1475
876
1373
2357
285
971
229
545
1800
1137
1390
run;

Okay, this isn't really standardizing, is it? And I'm not sure you can get all of the conditions you specify to hold at the same time, min of 0.25, max of 4 and sum to the number of observations. Has this actually been done in other applications, getting all three of the conditions to hold at the same time, or is it something you have decided you need for this application?

 

The part where you want them to sum to the total number of observations is relatively easy. I am skeptical that you can then take the result and set the min to 0.25 and max to 4 and get the same sum.

 

To get them to sum to the number of observations, count the number of observatrions (let's call this N) and find the sum of all observations (let's call this SUM), and then each value would be multiplied by (N/SUM). After that, I don't see a path forward.

 

--
Paige Miller
FreelanceReinh
Jade | Level 19

Hi @SAShole,

 

There is no linear transformation that satisfies all three conditions. So, if the condition "transformed values sum to the number of observations" is mandatory, you can obtain either a minimum of .25 or a maximum of 4:

data trans1;
set have;
a=0.75/1321.18;
b=1-1333.18*a;
HH_Weight_t=a*HH_Weight+b;
run;

proc means data=trans1 sum min max;
var HH_Weight_t;
run;

Result:

         Sum         Minimum         Maximum
--------------------------------------------
  50.0000000       0.2500000       1.6538587
--------------------------------------------

Same with a=3/1151.82 (and again b=1-1333.18*a):

         Sum         Minimum         Maximum
--------------------------------------------
  50.0000000      -2.4411106       4.0000000
--------------------------------------------

Or a compromise between these two (using least squares)? (a=0.0014472809)

         Sum         Minimum         Maximum
--------------------------------------------
  50.0000000      -0.9121186       2.6670071
--------------------------------------------

 

However, there are non-linear transformations which satisfy all three conditions. Would this make sense? Unfortunately, there is no suitable quadratic transformation (t(x)=ax²+bx+c) -- in spite of three free parameters. Reason: It would not be monotone, hence there would be no unique inverse transformation!

 

But how about an exponential transformation (t(x)=a exp(bx)+c), which is of course monotone?

data trans;
retain a 0.01556848794537
       b 2.20865846733238e-3
       c 0.2340133696564;
set have;
HH_Weight_t=a*exp(b*HH_Weight)+c;
run;

Result of PROC MEANS:

         Sum         Minimum         Maximum
--------------------------------------------
  50.0000000       0.2500000       4.0000000
--------------------------------------------
ballardw
Super User

@SAShole wrote:

I have some data that I'd like to standardize but can i also set a minimum and/or maximum value? Let's say in my example below that i wanted the minimum to be at least 0.25, is that possible?

 

proc standard data =sashelp.Baseball out=Baseball_Normalized 
mean=0 std=1;
	var CrAtBat;
run;

 

 


I am wondering what you will do with the resulting values.

Are you looking for a sampling probability?

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
What is ANOVA?

ANOVA, or Analysis Of Variance, is used to compare the averages or means of two or more populations to better understand how they differ. Watch this tutorial for more.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 6 replies
  • 1813 views
  • 4 likes
  • 5 in conversation