Hi All,
I have 4 groups in my data and I need to summarize based on each column but no repetition of rows. Data looks like this
Zone | Region | ZH_Abbreviation | RH_Abbreviation | Expense_In_Thousands |
Western | Maharashtra | MH | Rahul | 100 |
Western | Maharashtra | MH | Ajay | 600 |
Western | Maharashtra | MH | Ajay | 200 |
Western | Maharashtra | NGP | Rahul | 400 |
Western | Maharashtra | NGP | Pankaj | 450 |
Western | Gujarat | AM | Ashok | 600 |
Western | Gujarat | AM | Ajay | 400 |
Western | Gujarat | GK | Pankaj | 300 |
Eastern | Bengal | AM | Pavan | 300 |
Eastern | Bengal | AM | Rajesh | 400 |
Eastern | Tripura | GK | Rohan | 300 |
And I need output like below
Zone | Region | ZH_Abbreviation | RH_Abbreviation | Expense_In_Thousands |
Western | Maharashtra | MH | Rahul | 100 |
Total for Rahul | 100 | |||
Ajay | 800 | |||
Total for Ajay | 800 | |||
Total for MH | 900 | |||
NGP | Rahul | 400 | ||
Total for Rahul | 400 | |||
Pankaj | 450 | |||
Total for Pankaj | 450 | |||
Total for NGP | 850 | |||
Total for Maharashtra | 1750 | |||
Total for Western | 1750 | |||
Eastern | Bengal | AM | Pavan | 300 |
Total for Pavan | 300 | |||
Rajesh | 400 | |||
Total for Rajesh | 400 | |||
Total for AM | 700 | |||
Tripura | GK | Rohan | 300 | |
Total for Rohan | 300 | |||
Total for GK | 300 | |||
Total for Tripura | 300 | |||
Total for Eastern | 300 | |||
Grand Total | 2050 |
I tried with the Proc report and Group by on 4 columns but it results in repetition of cell values. Is this doable ? Any help is really appreciated.
Thanks in advance
Swapnil
What happened to Gujarat and AM under Western? What order are the rows supposed to be in? Normally the G in Gujarat would sort before the M in Maharashtra. And, Ajay would sort before Rahul. Just curious because what you show here seems different than what you would get.
The break variable information will be repeated unless you use SUPPRESS on the Break statement. However, if you want to customize the text at the break, you'll need a COMPUTE block.
But I wouldn't bother with using SUPPRESS, since you need the values at the break in order to customize the BREAK line as shown below for HTML output:
Basically a COMPUTE block for an AFTER location can "touch" all the other cells on the row, so in the COMPUTE blocks below, after I customize the main cell for the COMPUTE block, I blank out the other cells on the row:
That did require a LENGTH statement when the variables were created so that the cells would be wide enough for the customized text:
I added order=data to the PROC REPORT DEFINE statements to keep the report in the same order as the original data. Otherwise everything would be in alpha order.
Cynthia
Absolutely doable in PROC REPORT. Define ZONE, REGION, ZH_Abbreviation, RH_Abbreviation as GROUP columns, and Expense_In_Thousands as an ANALYSIS column.
Then add BREAK AFTER <Varname> / SUMMARIZE statements for ZONE, REGION, ZH_Abbreviation, RH_Abbreviation.
Lastly add a RBREAK AFTER / SUMMARIZE statement.
@SASKiwi I have tried that way. However, result is not as per our expectations. It shows repeated values on the groups.
Proc report data=have;
column zone Region ZH_Abbreviation RH_Abbreviation Expense_in_Thousands;
define Zone / Group "Zone"
define ZH_Abbreviation / Group "ZH Abbreviaton";
define RH_Abbreviation / Group "RH Abbreviation";
define Expense_In_Thousands/ Analysis;
break after Zone / summarize;
break after Region / summarize;
break after ZH_Abbreviation / summarize;
break after RH_Abbreviation / summarize;
RBREAK AFTER/ Summarize;
Run;
It produces multiple rows per group. However, I want only first value of the member and rest to be blank.
What happened to Gujarat and AM under Western? What order are the rows supposed to be in? Normally the G in Gujarat would sort before the M in Maharashtra. And, Ajay would sort before Rahul. Just curious because what you show here seems different than what you would get.
The break variable information will be repeated unless you use SUPPRESS on the Break statement. However, if you want to customize the text at the break, you'll need a COMPUTE block.
But I wouldn't bother with using SUPPRESS, since you need the values at the break in order to customize the BREAK line as shown below for HTML output:
Basically a COMPUTE block for an AFTER location can "touch" all the other cells on the row, so in the COMPUTE blocks below, after I customize the main cell for the COMPUTE block, I blank out the other cells on the row:
That did require a LENGTH statement when the variables were created so that the cells would be wide enough for the customized text:
I added order=data to the PROC REPORT DEFINE statements to keep the report in the same order as the original data. Otherwise everything would be in alpha order.
Cynthia
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.