Solved: For each YM, finding Average Amount for each product

John04 · Posted 01-28-2021 09:58 AM

I am trying to find average amount by product_group, product, YM and then using that average for new YM.

for e.g

Until now I tried partition by and indexing but I think it doesn't work in sas.

Hoping that someone can point me in right direction on how to solve this problem.

data have;
input YM Product_Group$ Product$ Amount;
datalines;
200101 X P1 1.2
200201 X P1 8.9
200301 X P1 432.4
200101 Y P2 24.23
200201 Y P2 842.9
200301 Y P2 43.4
run;

I want output something like:

data want;
input YM Product_Group$ Product$ Average;
datalines;
200401 X P1 147.5
200401 Y P1 303.51
run;

ballardw · Posted 01-28-2021 11:42 AM

@John04 wrote:

I am trying to find average amount by product_group, product, YM and then using that average for new YM.

for e.g

Until now I tried partition by and indexing but I think it doesn't work in sas.

Hoping that someone can point me in right direction on how to solve this problem.
data have;
input YM Product_Group$ Product$ Amount;
datalines;
200101 X P1 1.2
200201 X P1 8.9
200301 X P1 432.4
200101 Y P2 24.23
200201 Y P2 842.9
200301 Y P2 43.4
run;
I want output something like:
data want;
input YM Product_Group$ Product$ Average;
datalines;
200401 X P1 147.5
200401 Y P1 303.51
run;

This is the easiest way I know to get the mean amount by product_group and Product

proc summary data=have nway;
   class  product_group product;
   var amount;
   output out=want (drop=_:) mean=;
run;

But getting a value that does not exist in the source data, ie that 200401, as part of any summary will require further processing, such as data step to add the value.

Large economy size hint: Instead of using random numbers containing parts of a date, create actual date values with appropriate formats such as:

data have;
input YM :yymmn6. Product_Group$ Product$ Amount;
format ym yymmn.;
datalines;
200101 X P1 1.2
200201 X P1 8.9
200301 X P1 432.4
200101 Y P2 24.23
200201 Y P2 842.9
200301 Y P2 43.4
run;

Then you have all the power of the SAS functions such as INTNX to increment years, such as Intnx('year',ym,1) instead of weird arithmetic like 200101 +100 to simulate a next year. Plus graphs with dates will space tickmarks in nicer intervals. The above would assume the first day of a month for date.

View solution in original post

PaigeMiller · Posted 01-28-2021 10:04 AM

Why do you need to create an average by product and YM? There is only 1 data point for each product and YM, so the average is just the value of that data point.

data want;
    set have;
run;

Perhaps a more representative example data set would help here.

--
Paige Miller

John04 · Posted 01-28-2021 10:10 AM

Goal is to use the Average value as the "Inline Predicted value" for the next YM; incase it would be for 200401 for product X and Y.

PaigeMiller · Posted 01-28-2021 10:14 AM

@John04 wrote:
Goal is to use the Average value as the "Inline Predicted value" for the next YM; incase it would be for 200401 for product X and Y.

I'm afraid this doesn't mean anything to me in the context of your original question. Please explain in more detail. And please provide a more representative data set.

--
Paige Miller

John04 · Posted 01-28-2021 10:31 AM

Sorry about that. I changed my data set if that helps.

PaigeMiller · Posted 01-28-2021 10:34 AM

Again, you haven't provided a meaningful example, you are computing means of ONE data point in each YM/Product combination, which is completely unnecessary, there's no programming involved, the mean of a single data point is that single data point. And then of course the desired output can't be computed from the shown input, as there is no 200401 in the input.

Please put some effort into creating meaningful and realistic examples of the input data, and output data that is based upon the input data.

Should you really be talking about a situation where you have multiple records for a YM/Product combination, this code will compute the means.

data have;
input YM Product_Group Product Amount;
datalines;
200101 X P1 1.2
200201 X P1 8.9
200301 X P1 432.4
200101 Y P2 24.23
200201 Y P2 842.9
200301 Y P2 43.4
run;
proc summary data=have nway;
    class ym product;
    var amount;
    output out=want mean=;
Run;

--
Paige Miller

ballardw · Posted 01-28-2021 10:09 AM

First, please test your data step code. It generates missing values for Product and Product_group because you don't have provision for reading character variables.

I think you meant something more like:

data have;
   input YM Product_Group $ Product $ Amount;
datalines;
200101 X P1 1.2
200101 Y P4 4.3
200102 X P2 2.3
200102 Y P5 4.6
200103 X P3 3.4
200103 Y P6 5.6
200201 X P1 8.9
200201 Y P4 35.6
200202 X P2 53.6
200202 Y P5 24.6
200203 X P3 346.7
200203 Y P6 23.5
200301 X P1 432.4
200301 Y P4 13.5
200302 X P2 1.25
200302 Y P5 32.6
200303 X P3 13.5
200303 Y P6 1236.1
;

You don't say if you want a data set (for further manipulation) or report (people read these). This one way to get a report:

proc means data=have mean;
  class ym product;
  var amount;
run;

Means will also create a data set with the OUTPUT statement.

Your example data is pretty boring for the question as your grouping variables don't show any duplicates so the "mean" is the value in the data.

John04 · Posted 01-28-2021 10:36 AM

I should have specified in more detail. However I changed my dataset if that helps. Thanks

whymath · Posted 01-28-2021 10:09 AM

Are you look for proc means?

proc means mean;
  class YM Product;
  var Amount;
run;

By the way, you input statement need to be modified:

input YM Product_Group$ Product$ Amount;

ballardw · Posted 01-28-2021 11:42 AM

@John04 wrote:

I am trying to find average amount by product_group, product, YM and then using that average for new YM.

for e.g

Until now I tried partition by and indexing but I think it doesn't work in sas.

Hoping that someone can point me in right direction on how to solve this problem.
data have;
input YM Product_Group$ Product$ Amount;
datalines;
200101 X P1 1.2
200201 X P1 8.9
200301 X P1 432.4
200101 Y P2 24.23
200201 Y P2 842.9
200301 Y P2 43.4
run;
I want output something like:
data want;
input YM Product_Group$ Product$ Average;
datalines;
200401 X P1 147.5
200401 Y P1 303.51
run;

This is the easiest way I know to get the mean amount by product_group and Product

proc summary data=have nway;
   class  product_group product;
   var amount;
   output out=want (drop=_:) mean=;
run;

But getting a value that does not exist in the source data, ie that 200401, as part of any summary will require further processing, such as data step to add the value.

Large economy size hint: Instead of using random numbers containing parts of a date, create actual date values with appropriate formats such as:

data have;
input YM :yymmn6. Product_Group$ Product$ Amount;
format ym yymmn.;
datalines;
200101 X P1 1.2
200201 X P1 8.9
200301 X P1 432.4
200101 Y P2 24.23
200201 Y P2 842.9
200301 Y P2 43.4
run;

Then you have all the power of the SAS functions such as INTNX to increment years, such as Intnx('year',ym,1) instead of weird arithmetic like 200101 +100 to simulate a next year. Plus graphs with dates will space tickmarks in nicer intervals. The above would assume the first day of a month for date.

For each YM, finding Average Amount for each product

Re: For each YM, finding Average Amount for each product

Re: For each YM, finding Average Amount for each product

Re: For each YM, finding Average Amount for each product

Re: For each YM, finding Average Amount for each product

Re: For each YM, finding Average Amount for each product

Re: For each YM, finding Average Amount for each product

Re: For each YM, finding Average Amount for each product

Re: For each YM, finding Average Amount for each product

Re: For each YM, finding Average Amount for each product

Re: For each YM, finding Average Amount for each product

Registration is open

Registration is open

SAS Training: Just a Click Away