BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
NCEU
Calcite | Level 5

I am learning about how to merge summary statistics with the original data in order to compute fraction values which need e.g. an overall mean or a mean per category. It could also be done e.g. with proc sql, but my problem is not the logic, but why I get at the final report a line I do not want to have.

I have a data set "mergesummary":

Max Flight          running 1930

Zip Fit Leather    walking 2250

Zoom Airborne    running 4150

Light Step          walking 1130

Max Step Woven walking 2230

Zip Sneak          c-train  1190

I read this data with an infile and input statement. Then I sort it:

data mergesummary;

    infile 'D:\SAS\running.txt';

    input stylename $15. +1 typeex $7. +1 totalsales 4.;

run;

proc sort data=mergesummary;

    by typeex;

run;

Then I calculate the summaries and use them to output the SUM of totalsales into a new data set "summarydata":

proc means NOPRINT data=mergesummary ;

    var totalsales;

    class typeex;

    output out=summarydata SUM(totalsales)=Totalcategory;

run;

Now I sort again:

proc sort data=summarydata;

    by typeex;

run;

Then I merge the original data with the summary data set:

data finareport;

    merge mergesummary summarydata;

    by typeex;

    PercentageEach=totalsales/totalcategory;

run;

It all works fine, so I have no problem here!

Then I use a proc print to show the results:

proc print data=finareport;

    by typeex;

    ID typeex;

    var stylename totalsales totalcategory percentageeach;

    format PercentageEach Percent9.3;

run;

This looks like:

At the beginning the first row is just showing the total sum of all sales (the variable unfortunately is called totalsales, so it shows the total sum of totalsales). This is due to the fact that is has been calculated and output to the "summarydata" set. However, due to matching (since there is no match and it has no typeex entry) I thought it will drop out. I have the same solution in a book and there the line is not shown? But they use the same code?

Where is my mistake and how can I get rid of this row?

1 ACCEPTED SOLUTION

Accepted Solutions
RW9
Diamond | Level 26 RW9
Diamond | Level 26

Would replacing class with by also work?

View solution in original post

8 REPLIES 8
PaigeMiller
Diamond | Level 26

It comes from PROC MEANS, if you don't want it use the NWAY option in PROC MEANS

There's no need to do a sort before your PROC MEANS.

--
Paige Miller
RW9
Diamond | Level 26 RW9
Diamond | Level 26

Would replacing class with by also work?

NCEU
Calcite | Level 5

Wow! Nice one! This is EXACTLY my mistake! I used a class statement, but the book uses a by statement. What's the reason for this? Why does class give me this and by not?

RW9
Diamond | Level 26 RW9
Diamond | Level 26

Third from last section in here:

Base SAS(R) 9.2 Procedures Guide

Yes, I have always used by statements in means until recently someone stated to use class.  They are similar but have differing options.

PaigeMiller
Diamond | Level 26

CLASS doesn't require pre-sorting of the data, and it gives you all possible comibnations of the class variables, including the "null combination" which is the statistics for all of your data regardless of the level of the class variables. That is the first line that you see in your output from PROC MEANS, it is the sum across all of the data. The NWAY turns off this feature and only gives you the sums for each level of the class variables. In a lot of ways, CLASS is a much more powerful tool than using BY in PROC MEANS. CLASS does everything that BY does, so BY is a subset of CLASS.

--
Paige Miller
ballardw
Super User

And an occasionally beneficial side effect with NWAY is the data on output is sorted by the class variables.

NCEU
Calcite | Level 5

Thanks a lot that helped! However this is not included in the book solution... but thanks!

I have to sort before proc means, not because of proc means, but because later on I use a data merge with a by statement.

ballardw
Super User

Maybe this gets close to what you want

proc tabulate data=mergesummary;

class typeex;

class stylename;

var totalsales;

table typeex*stylename,

      totalsales=''*(sum='Total Sales'*f=best6. pctsum<stylename>='Percentage Each'*f=best7.);

run;

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 8 replies
  • 2413 views
  • 12 likes
  • 4 in conversation