Good Morning, I am willing to bet this isn't that hard, but with my little experience with SAS, I haven't figured this one out.
I have a data set with 2 variables being used as classification variables, Product and Machine. For each observation, I have a numeric value for TEST
I want to create for each observation a new variable DIF which is the result of this calculation: DIF= TEST - (Average TEST for all observations for each combination of product and machine)
Basically I want to calculate the difference between the observation versus the average result for each Class (Product*Machine*) and put that in the DIF variable for each observation.
Thanks!
Kim
If you want the difference between TEST and the average TEST of that Class you can make use of 1st scenario.
Second and third ones are average of average group means and overall mean for TEST, respectively.
data x;
input Product Machine Test;
datalines;
0 0 1
0 0 4
0 0 3
0 1 2
0 1 4
1 0 3
1 0 7
1 1 4
1 1 2
;
run;
proc sort data=x;
by Product Machine;
run;
/* Difference between TEST and group mean */
proc summary data=x nway missing;
class Product Machine;
var Test;
output Out=x_group_mean(drop=_:)
mean=Mean;
run;
data want_dif_group_mean;
merge x x_group_mean;
by Product Machine;
DIF = Test - Mean;
run;
/* Difference between TEST and mean of group means */
proc summary data=x_group_mean nway missing;
var Mean;
output Out=x_mean_of_group_mean(drop=_:)
mean=Mean;
run;
data want_dif_mean_of_group_mean;
if _n_ = 1 then set x_mean_of_group_mean;
set x ;
DIF = Test - Mean;
run;
/* Difference between TEST and overall mean */
proc summary data=x nway missing;
var Test;
output Out=x_overall_mean(drop=_:)
mean=Mean;
run;
data want_dif_overall_mean;
if _n_ = 1 then set x_overall_mean;
set x ;
DIF = Test - Mean;
run;
do you use proc sql?
data have;
input product $ machine $ value;
cards;
a aa 20
a bb 30
a cc 40
b aa 20
b bb 30
b cc 40
a aa 200
a bb 300
a cc 400
b aa 200
b bb 300
b cc 400
;
proc sql;
create table want as
select *,value-mean(value) as dif
from have
group by product,machine;
quit;
proc print;run;
obs product machine dif
1 a aa -90
2 a bb -135
3 a cc -180
4 b aa -90
5 b bb -135
6 b cc -180
7 a aa 90
8 a bb 135
9 a cc 180
10 b aa 90
11 b bb 135
12 b cc 180
Linlin
Message was edited by: Linlin
I have never used PROC SQL - I will need to learn this! I am not quite sure how to code the SQL, so I will spend some time better understanding that procedure. Thank you!
It amazes me how quickly you all coem up with code! Thanks everyone!
Kim
If you want the difference between TEST and the average TEST of that Class you can make use of 1st scenario.
Second and third ones are average of average group means and overall mean for TEST, respectively.
data x;
input Product Machine Test;
datalines;
0 0 1
0 0 4
0 0 3
0 1 2
0 1 4
1 0 3
1 0 7
1 1 4
1 1 2
;
run;
proc sort data=x;
by Product Machine;
run;
/* Difference between TEST and group mean */
proc summary data=x nway missing;
class Product Machine;
var Test;
output Out=x_group_mean(drop=_:)
mean=Mean;
run;
data want_dif_group_mean;
merge x x_group_mean;
by Product Machine;
DIF = Test - Mean;
run;
/* Difference between TEST and mean of group means */
proc summary data=x_group_mean nway missing;
var Mean;
output Out=x_mean_of_group_mean(drop=_:)
mean=Mean;
run;
data want_dif_mean_of_group_mean;
if _n_ = 1 then set x_mean_of_group_mean;
set x ;
DIF = Test - Mean;
run;
/* Difference between TEST and overall mean */
proc summary data=x nway missing;
var Test;
output Out=x_overall_mean(drop=_:)
mean=Mean;
run;
data want_dif_overall_mean;
if _n_ = 1 then set x_overall_mean;
set x ;
DIF = Test - Mean;
run;
Or 2XDOW will do:
data x;
input Product Machine Test;
datalines;
0 0 1
0 0 4
0 0 3
0 1 2
0 1 4
1 0 3
1 0 7
1 1 4
1 1 2
;
run;
proc sort data=x;
by Product Machine;
run;
data want;
do _n_=1 by 1 until (last.machine);
set x;
by product machine;
_total+test;
if last.machine then _mean=_total/_n_;
end;
do until (last.machine);
set x;
by product machine;
dif=test-_mean;
output;
end;
call missing (_total);
drop _:;
run;
proc print;run;
Haikuo
hi ... using Linlin's data (no sort needed) ...
proc summary data=have nway;
class product machine;
var value;
output out=stats (drop=_: index=(pm=(product machine))) mean=mean;
run;
data want (drop=mean);
set have;
set stats key=pm / unique;
diff = value-mean;
run;
or use hash:
data want;
if _n_=1 then do;
if 0 then set stats;
declare hash h(dataset:'stats');
h.definekey('product','machine');
h.definedata('mean');
h.definedone();
end;
set have;
if h.find()=0 then do;
dif=value-mean;
output;
end;
run;
proc print;run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.