gsk
Obsidian | Level 7

When I put BREAK AFTER or RBREAK AFTER on some variable, PROC REPORT sometimes prints sums, sometimes means, and sometimes neither. In the attached output of the code below, the first highlighted value, 25.2, seems to be the mean, but the other values do not. What are the numbers that PROC REPORT shows as a summary?

The SAS documentation just says that the SUMMARIZE option shows a summary line. What kind of summary is it showing, though? There could be many, including means, medians, and quartiles.

proc report nowd data=cars;

where upcase(country) in ('GERMANY','JAPAN','USA') and
upcase(type) in ('SUV','SEDAN','HATCHBACK');

columns type country citympg hwympg ('Std Dev' citympg=citistd hwympg=hwystd);

define country / group 'Country of Origin';
define type / group 'Type of Car';
define citympg / analysis mean 'City MPG' format=5.1;
define hwympg / analysis mean 'Highway MPG' format=5.1;
define citistd / analysis std 'City MPG Std' format=5.1;
define hwystd / analysis std 'Highway MPG std' format=5.1;

break after type / summarize suppress DOL ul style=[BACKGROUNDCOLOR=ltgray];
/* break after cylinders/ summarize suppress ol ul; */
rbreak after / summarize ol ul;
run;

 

proc report.JPG

ACCEPTED SOLUTION
ballardw
Super User

@gsk wrote:

Thank you for the reply!


But for Hatchback and Highway MPG for instance, 32.5 is not the average of 31.7, 32.2, and 36. Same as SUV and City MPG, SUV and Highway MPG, etc.

 

Also, what does rbreak indicate? It doesn't have averages of all cars for each City MPG either. 


You need to look at how many models fall into each category. Suppose the country with the 32.2 average has 6 models but the others have only one each: the overall mean is then different, because every model counts once and the divisor is 8 rather than 3.

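data example;
   /* Mean of the three group means: almost certainly incorrect unless each country has exactly one model */
   simplemean = mean(31.7, 32.2, 36);
   /* Mean over all models when the 32.2 country has six models and the others one each */
   meanwithcounts = mean(31.7, 32.2, 32.2, 32.2, 32.2, 32.2, 32.2, 36);
   put simplemean= meanwithcounts=;   /* write both values to the log */
run;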

4 REPLIES
Cynthia_sas
SAS Super FREQ

Hi:
The statistic that is produced on a BREAK or RBREAK summary line depends on the statistic that you list in your DEFINE statement for the variable.

For example, in your code, you have asked for MEAN for 2 of your variables and STD for the other 2 numeric variables. If you want SUM instead of MEAN or STD, you have to change the statistic listed. When no statistic is named, a numeric variable defaults to the SUM statistic. As soon as you name a statistic, you get THAT statistic at the break.

Cynthia (see the example below)

 

proc report data=sashelp.cars
  style(summary) = Header;
  where make in ('Honda' 'Toyota' 'BMW' 'Cadillac');
  column type make wheelbase msrp mpg_highway mpg_city enginesize
         invoice horsepower weight length;
  define type / group;
  define make / group;
  define wheelbase / n 'N WheelBase';
  define msrp / mean 'Mean MSRP';
  define mpg_highway / sum 'SUM MPG HIGHWAY';
  define mpg_city / std 'STD MPG City';
  define enginesize / css 'CSS EngineSize';
  define invoice / median 'Median Invoice';
  define horsepower / stderr 'StdErr HorsePower'; 
  define weight / min 'Min Weight';
  define length / max 'Max Length';
  break after type / summarize;
  rbreak after / summarize;
run;
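
A minimal sketch of the default behavior described above, assuming the SASHELP.CLASS sample data set is available (it is not part of the original question). HEIGHT names a statistic and WEIGHT does not, so the summary line should show a mean for one and a sum for the other:

```sas
/* Sketch only: SASHELP.CLASS is used as stand-in data */
proc report data=sashelp.class;
  column sex height weight;
  define sex    / group 'Sex';
  define height / mean 'Mean Height' format=6.1;  /* MEAN on group rows and on the summary line */
  define weight / 'Weight (default SUM)';         /* no statistic named, so SUM is used */
  rbreak after / summarize;
run;
```

The RBREAK line carries whichever statistic each column was defined with, so the two numeric columns on that line are not comparable to each other.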
gsk
Obsidian | Level 7

Thank you for the reply!


But for Hatchback and Highway MPG for instance, 32.5 is not the average of 31.7, 32.2, and 36. Same as SUV and City MPG, SUV and Highway MPG, etc.

 

Also, what does rbreak indicate? It doesn't have averages of all cars for each City MPG either. 

Cynthia_sas
SAS Super FREQ

Hi:

  I did not use your data, but for SASHELP.CARS, when I double-check the numbers from PROC REPORT against the same statistics from PROC MEANS, they match. For my data (and all the other test cases I use), PROC MEANS and PROC REPORT give the same values. Do remember that PROC REPORT shows formatted numbers, which is why some rounding occurs:

same_numbers_report_mean.png

 

 

  Try the revised code below and you should get the same results in PROC MEANS for all the BREAK lines.

 

  An RBREAK statement produces the summary at the bottom (or top) of the report. It summarizes the entire report; if you are getting the SUM statistic, it is what you might call a Grand Total. The only statistic you get on the RBREAK line (in my case, at the bottom of the report) is the statistic requested for the variable.

 

  Since you did not provide any data, it is impossible to verify that PROC REPORT is generating incorrect numbers. However, remember that if you have grouped the rows by your Country variable, so that each report row shows the AVERAGE for a country, PROC REPORT is not averaging the averages on the break line. It is taking the grand mean: the sum of ALL the values divided by the TOTAL count of non-missing observations. This will usually differ from the average of the averages, which is why you have to use a procedure like PROC MEANS to compare against. The best thing to do is either run a verification like mine with PROC MEANS, or open a track with Tech Support and send them ALL your data and ALL your code so they can explain why PROC REPORT is not generating the results you expect.
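
  The difference between the grand mean and the average of the averages can be checked directly. The following is only a sketch under assumptions: it uses SASHELP.CARS with ORIGIN standing in for the country variable and Type = 'SUV' as the break group, and the names by_origin, n_models, and mean_hwy are made up for illustration.

```sas
/* Step 1: one row per Origin holding the group mean and the number of models */
proc means data=sashelp.cars noprint nway;
  where type = 'SUV';
  class origin;
  var mpg_highway;
  output out=by_origin n=n_models mean=mean_hwy;
run;

/* Step 2: unweighted mean of the group means vs. the mean weighted by model count */
proc sql;
  select mean(mean_hwy)                         as mean_of_means format=6.2,
         sum(mean_hwy * n_models) / sum(n_models) as grand_mean    format=6.2
  from by_origin;
quit;
```

  The grand_mean column is the value a BREAK AFTER type line with the MEAN statistic would be expected to show; mean_of_means matches it only when every origin has the same number of models.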

 

Cynthia

 

** Revised code that proves PROC REPORT is generating the same statistics as PROC MEANS;

proc report data=sashelp.cars
  style(summary) = Header;
  where make in ('Honda' 'Toyota' 'BMW' 'Cadillac');
  column type make wheelbase msrp mpg_highway mpg_city enginesize
         invoice horsepower weight length;
  define type / group;
  define make / group;
  define wheelbase / n 'N WheelBase';
  define msrp / mean 'Mean MSRP';
  define mpg_highway / sum 'SUM MPG HIGHWAY';
  define mpg_city / std 'STD MPG City';
  define enginesize / css 'CSS EngineSize';
  define invoice / median 'Median Invoice';
  define horsepower / stderr 'StdErr HorsePower'; 
  define weight / min 'Min Weight';
  define length / max 'Max Length';
  break after type / summarize;
  rbreak after / summarize;
run;

** get same numbers with PROC MEANS as PROC REPORT;
proc means data=sashelp.cars n;
   where make in ('Honda' 'Toyota' 'BMW' 'Cadillac');
   class type ;
  var wheelbase ; 
run;

proc means data=sashelp.cars mean;
   where make in ('Honda' 'Toyota' 'BMW' 'Cadillac');
   class type ;
  var msrp ; 
run;

proc means data=sashelp.cars sum;
   where make in ('Honda' 'Toyota' 'BMW' 'Cadillac');
   class type ;
  var mpg_highway ; 
run;

proc means data=sashelp.cars std;
   where make in ('Honda' 'Toyota' 'BMW' 'Cadillac');
   class type ;
  var mpg_city ; 
run;

proc means data=sashelp.cars css;
   where make in ('Honda' 'Toyota' 'BMW' 'Cadillac');
   class type ;
  var enginesize ; 
run;

proc means data=sashelp.cars median;
   where make in ('Honda' 'Toyota' 'BMW' 'Cadillac');
   class type ;
  var invoice ; 
run;

proc means data=sashelp.cars stderr;
   where make in ('Honda' 'Toyota' 'BMW' 'Cadillac');
   class type ;
  var horsepower ; 
run;

proc means data=sashelp.cars min;
   where make in ('Honda' 'Toyota' 'BMW' 'Cadillac');
   class type ;
  var weight ; 
run;

proc means data=sashelp.cars max;
   where make in ('Honda' 'Toyota' 'BMW' 'Cadillac');
   class type ;
  var length ; 
run;
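
The PROC MEANS steps above use CLASS type, so they line up with the BREAK AFTER type lines. As a sketch for checking the RBREAK line, the same procedure can be run without a CLASS statement; the choice of four variables and MAXDEC=2 here is arbitrary:

```sas
/* No CLASS statement: statistics are computed over every row that passes the WHERE filter */
proc means data=sashelp.cars n mean sum std maxdec=2;
  where make in ('Honda' 'Toyota' 'BMW' 'Cadillac');
  var wheelbase msrp mpg_highway mpg_city;
run;
```

Every statistic is printed for every variable, so compare only the one requested in the matching DEFINE statement: N for wheelbase, MEAN for msrp, SUM for mpg_highway, and STD for mpg_city.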

 

 
