I have a transactional dataset that effectively shows the items sold in every region with the date it was sold ( 1 row per item sold). I have created a week number variable and want a simple output summary that shows the region name in the rows and week number across the columns. The number of regions increases throughout the year as they make sales so this is run every week.
I've summarised the transactional table using a proc summary
proc summary data=raw_trans nway missing;
class region week_no;
var widgets;
output out=regional_summary (drop=_type_ _freq_) sum=;
run;
This gives an output of e.g.:
Region Week_no Widgets
A 2 50
B 2 7
B 3 5
C 1 14
C 2 6
C 3 4
etc
Once I use a proc transpose:
proc transpose data=region_summary out= region_summary1 prefix = W_;
by region;
id week_no;
var widgets;
run;
Then I get the final output with Week2 and week3 first because region A didn't have any sales in week 1. What could I add to my code or is there another function that will ensure the output is always ordered alphabetically for the regions and the weeks are in numerical order across the columns?
Region W_2 W_3 W_1
A 50
B 7 5
C 14 6 4
/*If you want output to be a dataset,Just reorder it.*/
data have;
input Region  $     Week_no        Widgets;
cards;
A                      2                      50
B                       2                     7
B                       3                     5
C                      1                      14
C                      2                      6
C                      3                      4
;
proc transpose data=have out= region_summary1 prefix = W_;
by region;
id week_no;
var widgets;
run;
proc sql noprint nowarn;
select distinct cats('W_',Week_no) into : vnames separated by ',' from have  order by Week_no;
create table want as
select region,&vnames. from region_summary1;
quit;
TRANSPOSE is the wrong PROC to use here. Instead, try this:
proc report data=region_summary;
    columns region week,widgets;
    define region/group 'Region';
    define week/across 'Week' order=internal;
    define widgets/sum ' ';
run; 
In fact, if you are going to use PROC REPORT, then you don't really need PROC SUMMARY, the sums can be computed in PROC REPORT on the RAW_TRANS data set.
And to add to @PaigeMiller's answer: You also wouldn't need to create a week number variable but could use a format like week<u,v,w> applied to a SAS date variable.
Thank you @PaigeMiller - I've never used proc report before. I've just had a read at the documentation and I've another quick question is possible please.
I had over simplified my query (thinking that would help) but I actually have a few other variables in the transactional summary that I also need to pull through - like the brand and country the region is in. I've tried using the code below (which does mostly produce what I need) but it only shows the brand and country name for the first record in each brand and in each country (there can be up to 2 regions in some countries and there are 2 brands). Is there an option I can add that would show those variables value in each record?
proc report data=region_summary;
columns brand country region id_region week_no widgets;
define brand / group;
define country / group;
define region /group ;
define id_region / group ;
define week_no/across order=internal;
define widgets /sum ' ';
run;
You heed to combine the ACROSS with the ANALYSIS variable:
proc report data=region_summary;
columns brand country region id_region week_no,widgets;
define brand / group;
define country / group;
define region /group ;
define id_region / group ;
define week_no/across order=internal;
define widgets /sum ' ';
run;To have a group value on all lines of a group, see https://support.sas.com/kb/24/322.html
Thank you - will take me a while to fully work through proc report and all it's capabilities - but this looks a very useful procedure.
@ScoobieR wrote:
Thank you - will take me a while to fully work through proc report and all it's capabilities - but this looks a very useful procedure.
Although I have been using PROC REPORT since the 13th century, and I have been using it extensively since Pope Gregory XIII added 11 days to the calendar in 1582, I still don't know all of it. Perhaps no one knows all of it — except for @Cynthia_sas . Learn as you go, when you need PROC REPORT to do things that you don't know how to, ask questions. I get great answers in this forum from @Kurt_Bremser , @Ksharp and many others.
Also, PROC REPORT Basics: Getting Started with the Primary Statements
There is also Proc Tabulate as an alternative reporting procedure. Differences are Proc Tabulate doesn't let you have ways to calculate with the summary values shown (which feature you haven't shown a need for yet).
Tabulate has different syntax for building a table. You indicate which variables with be row/colum (or page) headers which are analogous to "group" variables with CLASS statement(s) and analysis variables go on a VAR statement.
Some of the advantages with Tabulate different behaviors between nested items (those with * between) and without *.
A comma is used to separate row/column statments (or Page/row/column).
Tabulate also allows multiple Table statements.
Does not use Break or Rbreak but has a predicate All to indicate summaries of combined Class values.
Also requesting multiple statistics for a single variable is a bit easier.
However you cannot request intersections of row and column statistics (report sot of allows a calculate it yourself in some situations).
Consider this brief example:
proc tabulate data=region_summary;
 class brand country region id_region week_no ;
 var widgets;
 table brand* country * region *id_region   ,
       week_no * widgets*sum = '';
 table brand country  region id_region   ,
       week_no * widgets*sum = '';
 table brand* (country All='All Countries')* region*id_region ,
       week_no * widgets*(sum n );
    
run;
The = ' ' after Sum is an example of suppressing the default result label. You could provide alternate label text at any point that a variable or statistic is used in the table statement. So it is possible to have one label for "brand" in one table and a different one in a different table if that is useful.
For exploratory purposes I often will create multiple tables changing the order of the row headings. For example I can use City * Zip code to see all the zipcodes that appear within each level of City and then Zipcode * City to see cities within Zip code. Most Zip codes within a geography are "similar" at least in the first few digits. If a city should have mostly 837XX and I find 378XX then I have a data entry issue. Really shouldn't see to many different cities within a Zip and such may indicate data entry issues.
Caveat: a missing value for a Class variable by default means that Tabulate will drop the observation.
Thank you - will take me a while to fully work through proc tabulate and all it's capabilities - this looks a very useful procedure.
@ScoobieR wrote:
Thank you @PaigeMiller - I've never used proc report before. I've just had a read at the documentation and I've another quick question is possible please.
I had over simplified my query (thinking that would help) but I actually have a few other variables in the transactional summary that I also need to pull through - like the brand and country the region is in. I've tried using the code below (which does mostly produce what I need) but it only shows the brand and country name for the first record in each brand and in each country (there can be up to 2 regions in some countries and there are 2 brands). Is there an option I can add that would show those variables value in each record?
What you describe is the default behavior of PROC REPORT. Here is a method to obtain variable values on each record: PROC REPORT option to repeat values of group variables on every row - SAS Support Communities
/*If you want output to be a dataset,Just reorder it.*/
data have;
input Region  $     Week_no        Widgets;
cards;
A                      2                      50
B                       2                     7
B                       3                     5
C                      1                      14
C                      2                      6
C                      3                      4
;
proc transpose data=have out= region_summary1 prefix = W_;
by region;
id week_no;
var widgets;
run;
proc sql noprint nowarn;
select distinct cats('W_',Week_no) into : vnames separated by ',' from have  order by Week_no;
create table want as
select region,&vnames. from region_summary1;
quit;
Thank you - am going to use this solution for now and will research proc tabulate and proc report for future uses,
I am going to express my opinion that the solution using PROC REPORT is far superior than the solution using PROC TRANSPOSE. It has long been my opinion that the job of a data analyst is not simply to produce correct numbers in a table on whatever the problem is. In my opinion, the job of a data analyst is to produce results that are not only correct, but to make the output look good to the audience who will be viewing the results. To do this you use proper English (or whatever your language is) phrases as column headings and row labels. Using SAS variable names, like W_1 W_2 W_3, is not English, and does not look as good, and is not as professional, as the same report where the columns are labelled by "Week" and then show the week number underneath. In fact, this is a great advantage of PROC REPORT, in that it can make your output look good in many ways, but if you are just relying on a SAS data set, you are limited in that regard.
I see far too many times where I work that people leave the somewhat cryptic SAS variable names where with a little extra work they can have readable English columns heading and row headings. While the difference in this case may be trivial, and it may be that everyone understands W_1 W_2 W_3 (or it may not be that they understand), that's not always going to be the case; creating readable reports with proper language headings and labels is a very good habit to get into.
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.
