BookmarkSubscribeRSS Feed
andypandy_swe
Obsidian | Level 7

I rarely use proc report but now I need to and I can't figure out how to write this...

I have a report that needs to show number of registrations per month for months that fall within a set timeperiod.

There are some grouping variables and then the months (only the ones where there is at least one registration) as columns and then two summary columns.

The first should calculate the number of registrations (for each group) that falls within the timespan.

The second should calculate the same + the number of registrations that fall outside of the timespan (for each group). The registrations outside the timeperiod don't have a month-value.


My thinking so far:

In order to first sum the columns except for the "outside timespan" I think I need to have "outside timespan" first and summarize all the columns that come after it (as the number of "inside" columns are dynamic).
Then I create another summary column that starts with the "outside" column first and sums up all columns that comes after it (only then the first summary column is also included so I have to deduct that from the total.

Below is some sample code.
With parts of the code commented I get the correct result in the "ouside timespan" column according to the input data. One registration for group 1A-2F-3K and one for group 1B-2J-3R
However when I try to do an across efter the "ouside" column with the "inside" columns below it... The values for the "outside" column disappears and both summary columns show the same value (when the second should show more registrations).

The "outside" column is only in the report for calculating the second summary column... It's shouldn't be visible in the final report...

Any tips on what I'm doing wrong or how to solve it another way would be greatly appreciated.

/Andreas

data work.registrations;
    input group1 $ group2 $ group3 $ year_month date9. inside_timespan outside_timespan;
format year_month yymmd7.;
    datalines;
1A 2F 3K           0 1
1A 2G 3K 01jan2020 1 0
1A 2G 3K 03feb2020 1 0
1A 2G 3L 07apr2020 1 0
1A 2H 3M 20may2020 1 0
1A 2H 3M 02jun2020 1 0
1A 2H 3M 01jul2020 1 0
1A 2H 3N 09jul2020 1 0
1B 2I 3O 12jul2020 1 0
1B 2I 3P 01aug2020 1 0
1B 2I 3Q 01aug2020 1 0
1B 2J 3R           0 1
1B 2J 3R 04sep2020 1 0
1B 2J 3S 07sep2020 1 0
 
;
run;
 
proc report data=work.registrations nowd
style(summary)=Header;
column group1
group2
group3 
outside_timespan
/* year_month, inside_timespan tot_inside tot_all */;
 
define group1 / group 'Group 1' missing order=formatted style(column)=Header ' ';
 
define group2 / group 'Group 2' missing order=formatted style(column)=Header ' ';
 
define group3 / group 'Group 3' missing order=formatted style(column)=Header ' ';
 
define outside_timespan / analysis sum format=6. width=8 /*noprint (will be hidden in finished report)*/;
/* define year_month / across order=formatted 'Year/Month';
define inside_timespan / analysis sum format=6. width=8 ' ';
define tot_inside / computed format=8. 'Total inside period';
define tot_all / computed format=8. 'Total inside and outside period';
rbreak after / summarize;
compute tot_inside;
  tot_inside = sum(_C5_,_C6_,_C7_,_C8_,_C9_,_C10_,_C11_,_C12_);
  endcompute;
compute tot_all;
  tot_all = sum(_C4_,_C5_,_C6_,_C7_,_C8_,_C9_,_C10_,_C11_,_C12_) - tot_inside;
  endcompute; */
by group1;
run;
quit;



  

5 REPLIES 5
ballardw
Super User

Can you show what the report is supposed to actually look like given that example data? Complex descriptions really call for a shown result.

 

andypandy_swe
Obsidian | Level 7
I just discovered that if there is a date (any date) in the outside registrations (the two in the included test data with missing dates) then it works. I don’t know why…

You can see what it’s supposed to look like by changing those two missing values to some dates.

However, when I run the same code with production data there is nothing in any of the two summary columns...

Real data has 18000 registrations within the timeframe and around 211000 outside the timeframe... Can't really understand why it works with test data but not a larger dataset.
andypandy_swe
Obsidian | Level 7
Ok, I tried outputting a dataset to see what is happening. It turns out that (if I remove the noprint for year_month) I see one column in the result but the produced dataset has like 255 columns. I'm guessing it's the number of different year_month -values that is ouside the range. So proc report creates an across for this variable too but hides it.
The reason I didn't get anything in my summary columns was that the "inside" columns started at _C256_!
andypandy_swe
Obsidian | Level 7

So the sums should be like this:

andypandy_swe_0-1686237650217.png

 

 

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 610 views
  • 0 likes
  • 2 in conversation