BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Kimberley
Calcite | Level 5

Good Morning,  I am willing to bet this isn't that hard, but with my little experience with SAS, I haven't figured this one out.

I have a data set with 2 variables being used as classification variables, Product and Machine. For each observation, I have a numeric value for TEST

I want to create for each observation a new variable DIF which is the result of this calculation: DIF= TEST - (Average TEST for all observations for each combination of product and machine)

Basically I want to calculate the difference between the observation versus the average result for each Class (Product*Machine*) and put that in the DIF variable for each observation.

Thanks!

Kim

1 ACCEPTED SOLUTION

Accepted Solutions
Alpay
Fluorite | Level 6

If you want the difference between TEST and the average TEST of that Class you can make use of 1st scenario.

Second and third ones are average of average group means and overall mean for TEST, respectively.

data x;

  input Product Machine Test;

datalines;

0 0 1

0 0 4

0 0 3

0 1 2

0 1 4

1 0 3

1 0 7

1 1 4

1 1 2

;

run;

proc sort data=x;

  by Product Machine;

run;

/* Difference between TEST and group mean */

proc summary data=x nway missing;

  class Product Machine;

  var Test;

  output Out=x_group_mean(drop=_:)

  mean=Mean;

run;

data want_dif_group_mean;

  merge x x_group_mean;

  by Product Machine;

  DIF = Test - Mean;

run;

/* Difference between TEST and mean of group means */

proc summary data=x_group_mean nway missing;

  var Mean;

  output Out=x_mean_of_group_mean(drop=_:)

  mean=Mean;

run;

data want_dif_mean_of_group_mean;

  if _n_ = 1 then set x_mean_of_group_mean;

  set x ;

  DIF = Test - Mean;

run;

/* Difference between TEST and overall mean */

proc summary data=x nway missing;

  var Test;

  output Out=x_overall_mean(drop=_:)

  mean=Mean;

run;

data want_dif_overall_mean;

  if _n_ = 1 then set x_overall_mean;

  set x ;

  DIF = Test - Mean;

run;

View solution in original post

6 REPLIES 6
Linlin
Lapis Lazuli | Level 10

do you use proc sql?

data have;

input product $ machine $ value;

cards;

a aa 20

a bb 30

a cc 40

b aa 20

b bb 30

b cc 40

a aa 200

a bb 300

a cc 400

b aa 200

b bb 300

b cc 400

;

proc sql;

  create table want as

    select *,value-mean(value) as dif

   from have

     group by product,machine;

quit;

proc print;run;

                              obs    product    machine     dif

                                  1       a         aa        -90
                                  2       a         bb       -135
                                  3       a         cc       -180
                                  4       b         aa        -90
                                  5       b         bb       -135
                                  6       b         cc       -180
                                  7       a         aa         90
                                  8       a         bb        135
                                  9       a         cc        180
                                 10       b         aa         90
                                 11       b         bb        135
                                 12       b         cc        180

Linlin

Message was edited by: Linlin

Kimberley
Calcite | Level 5

I have never used PROC SQL - I will need to learn this! I am not quite sure how to code the SQL, so I will spend some time better understanding that procedure. Thank you!

It amazes me how quickly you all coem up with code! Thanks everyone!

Kim

Alpay
Fluorite | Level 6

If you want the difference between TEST and the average TEST of that Class you can make use of 1st scenario.

Second and third ones are average of average group means and overall mean for TEST, respectively.

data x;

  input Product Machine Test;

datalines;

0 0 1

0 0 4

0 0 3

0 1 2

0 1 4

1 0 3

1 0 7

1 1 4

1 1 2

;

run;

proc sort data=x;

  by Product Machine;

run;

/* Difference between TEST and group mean */

proc summary data=x nway missing;

  class Product Machine;

  var Test;

  output Out=x_group_mean(drop=_:)

  mean=Mean;

run;

data want_dif_group_mean;

  merge x x_group_mean;

  by Product Machine;

  DIF = Test - Mean;

run;

/* Difference between TEST and mean of group means */

proc summary data=x_group_mean nway missing;

  var Mean;

  output Out=x_mean_of_group_mean(drop=_:)

  mean=Mean;

run;

data want_dif_mean_of_group_mean;

  if _n_ = 1 then set x_mean_of_group_mean;

  set x ;

  DIF = Test - Mean;

run;

/* Difference between TEST and overall mean */

proc summary data=x nway missing;

  var Test;

  output Out=x_overall_mean(drop=_:)

  mean=Mean;

run;

data want_dif_overall_mean;

  if _n_ = 1 then set x_overall_mean;

  set x ;

  DIF = Test - Mean;

run;

Haikuo
Onyx | Level 15

Or 2XDOW will do:

data x;

  input Product Machine Test;

datalines;

0 0 1

0 0 4

0 0 3

0 1 2

0 1 4

1 0 3

1 0 7

1 1 4

1 1 2

;

run;

proc sort data=x;

  by Product Machine;

run;

data want;

  do _n_=1 by 1 until (last.machine);

  set x;

by product machine;

_total+test;

if last.machine then _mean=_total/_n_;

end;

do until (last.machine);

  set x;

by product machine;

dif=test-_mean;

output;

end;

call missing (_total);

drop _:;

run;

proc print;run;

Haikuo

MikeZdeb
Rhodochrosite | Level 12

hi ... using Linlin's data (no sort needed) ...

proc summary data=have nway;

class product machine;

var value;

output out=stats (drop=_: index=(pm=(product machine))) mean=mean;

run;

data want (drop=mean);

set have;

set stats key=pm / unique;

diff = value-mean;

run;

Linlin
Lapis Lazuli | Level 10

or use hash:

data want;

  if _n_=1 then do;

   if 0 then set stats;

    declare hash h(dataset:'stats');

  h.definekey('product','machine');

  h.definedata('mean');

  h.definedone();

  end;

  set have;

    if h.find()=0 then do;

      dif=value-mean;

      output;

                    end;

  run;

  proc print;run;

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 6 replies
  • 1081 views
  • 7 likes
  • 5 in conversation