BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
John04
Fluorite | Level 6

 

I am trying to find average amount by product_group, product, YM and then using that average for new YM.

for e.g 

Until now I tried partition by and indexing but I think it doesn't work in sas.

Hoping that someone can point me in right direction on how to solve this problem.

 

data have;
input YM Product_Group$ Product$ Amount;
datalines;
200101 X P1 1.2
200201 X P1 8.9
200301 X P1 432.4
200101 Y P2 24.23
200201 Y P2 842.9
200301 Y P2 43.4
run;

I want output something like:

 

data want;
input YM Product_Group$ Product$ Average;
datalines;
200401 X P1 147.5
200401 Y P1 303.51
run;
1 ACCEPTED SOLUTION

Accepted Solutions
ballardw
Super User

@John04 wrote:

 

I am trying to find average amount by product_group, product, YM and then using that average for new YM.

for e.g 

Until now I tried partition by and indexing but I think it doesn't work in sas.

Hoping that someone can point me in right direction on how to solve this problem.

 

data have;
input YM Product_Group$ Product$ Amount;
datalines;
200101 X P1 1.2
200201 X P1 8.9
200301 X P1 432.4
200101 Y P2 24.23
200201 Y P2 842.9
200301 Y P2 43.4
run;

I want output something like:

 

data want;
input YM Product_Group$ Product$ Average;
datalines;
200401 X P1 147.5
200401 Y P1 303.51
run;

This is the easiest way I know to get the mean amount by product_group and Product

proc summary data=have nway;
   class  product_group product;
   var amount;
   output out=want (drop=_:) mean=;
run;

But getting a value that does not exist in the source data, ie that 200401, as part of any summary will require further processing, such as data step to add the value.

 

 

Large economy size hint: Instead of using random numbers containing parts of a date, create actual date values with appropriate formats such as:

data have;
input YM :yymmn6. Product_Group$ Product$ Amount;
format ym yymmn.;
datalines;
200101 X P1 1.2
200201 X P1 8.9
200301 X P1 432.4
200101 Y P2 24.23
200201 Y P2 842.9
200301 Y P2 43.4
run;

Then you have all the power of the SAS functions such as INTNX to increment years, such as Intnx('year',ym,1)  instead of weird arithmetic like 200101 +100 to simulate a next year. Plus graphs with dates will space tickmarks in nicer intervals. The above would assume the first day of a month for date.

View solution in original post

9 REPLIES 9
PaigeMiller
Diamond | Level 26

Why do you need to create an average by product and YM? There is only 1 data point for each product and YM, so the average is just the value of that data point.

 

data want;
    set have;
run;

Perhaps a more representative example data set would help here.

--
Paige Miller
John04
Fluorite | Level 6
Goal is to use the Average value as the "Inline Predicted value" for the next YM; incase it would be for 200401 for product X and Y.
PaigeMiller
Diamond | Level 26

@John04 wrote:
Goal is to use the Average value as the "Inline Predicted value" for the next YM; incase it would be for 200401 for product X and Y.

I'm afraid this doesn't mean anything to me in the context of your original question. Please explain in more detail. And please provide a more representative data set.

--
Paige Miller
John04
Fluorite | Level 6
Sorry about that. I changed my data set if that helps.
PaigeMiller
Diamond | Level 26

Again, you haven't provided a meaningful example, you are computing means of ONE data point in each YM/Product combination, which is completely unnecessary, there's no programming involved, the mean of a single data point is that single data point. And then of course the desired output can't be computed from the shown input, as there is no 200401 in the input.

 

Please put some effort into creating meaningful and realistic examples of the input data, and output data that is based upon the input data.

 

Should you really be talking about a situation where you have multiple records for a YM/Product combination, this code will compute the means.

 

data have;
input YM Product_Group Product Amount;
datalines;
200101 X P1 1.2
200201 X P1 8.9
200301 X P1 432.4
200101 Y P2 24.23
200201 Y P2 842.9
200301 Y P2 43.4
run;
proc summary data=have nway;
    class ym product;
    var amount;
    output out=want mean=;
Run;

 

--
Paige Miller
ballardw
Super User

First, please test your data step code. It generates missing values for Product and Product_group because you don't have provision for reading character variables.

I think you meant something more like:

data have;
   input YM Product_Group $ Product $ Amount;
datalines;
200101 X P1 1.2
200101 Y P4 4.3
200102 X P2 2.3
200102 Y P5 4.6
200103 X P3 3.4
200103 Y P6 5.6
200201 X P1 8.9
200201 Y P4 35.6
200202 X P2 53.6
200202 Y P5 24.6
200203 X P3 346.7
200203 Y P6 23.5
200301 X P1 432.4
200301 Y P4 13.5
200302 X P2 1.25
200302 Y P5 32.6
200303 X P3 13.5
200303 Y P6 1236.1
;

You don't say if you want a data set (for further manipulation) or report (people read these). This one way to get a report:

proc means data=have mean;
  class ym product;
  var amount;
run;
 

Means will also create a data set with the OUTPUT statement.

 

Your example data is pretty boring for the question as your grouping variables don't show any duplicates so the "mean" is the value in the data.

John04
Fluorite | Level 6
I should have specified in more detail. However I changed my dataset if that helps. Thanks
whymath
Lapis Lazuli | Level 10

Are you look for proc means?

proc means mean;
  class YM Product;
  var Amount;
run;

By the way, you input statement need to be modified:

input YM Product_Group$ Product$ Amount;
ballardw
Super User

@John04 wrote:

 

I am trying to find average amount by product_group, product, YM and then using that average for new YM.

for e.g 

Until now I tried partition by and indexing but I think it doesn't work in sas.

Hoping that someone can point me in right direction on how to solve this problem.

 

data have;
input YM Product_Group$ Product$ Amount;
datalines;
200101 X P1 1.2
200201 X P1 8.9
200301 X P1 432.4
200101 Y P2 24.23
200201 Y P2 842.9
200301 Y P2 43.4
run;

I want output something like:

 

data want;
input YM Product_Group$ Product$ Average;
datalines;
200401 X P1 147.5
200401 Y P1 303.51
run;

This is the easiest way I know to get the mean amount by product_group and Product

proc summary data=have nway;
   class  product_group product;
   var amount;
   output out=want (drop=_:) mean=;
run;

But getting a value that does not exist in the source data, ie that 200401, as part of any summary will require further processing, such as data step to add the value.

 

 

Large economy size hint: Instead of using random numbers containing parts of a date, create actual date values with appropriate formats such as:

data have;
input YM :yymmn6. Product_Group$ Product$ Amount;
format ym yymmn.;
datalines;
200101 X P1 1.2
200201 X P1 8.9
200301 X P1 432.4
200101 Y P2 24.23
200201 Y P2 842.9
200301 Y P2 43.4
run;

Then you have all the power of the SAS functions such as INTNX to increment years, such as Intnx('year',ym,1)  instead of weird arithmetic like 200101 +100 to simulate a next year. Plus graphs with dates will space tickmarks in nicer intervals. The above would assume the first day of a month for date.

hackathon24-white-horiz.png

The 2025 SAS Hackathon Kicks Off on June 11!

Watch the live Hackathon Kickoff to get all the essential information about the SAS Hackathon—including how to join, how to participate, and expert tips for success.

YouTube LinkedIn

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 9 replies
  • 1967 views
  • 0 likes
  • 4 in conversation