BookmarkSubscribeRSS Feed
alepage
Barite | Level 11

Hello,

 

I was asked to calculate the sum of var1 through var6 and that per category. But now they want the sum of var5 and the sum of var6 and they want that sum reported in CAT2 and CBT1 respectively.

 

Is there a more efficient way to carry out that task 

see the code below.

 

data test1;
   infile datalines delimiter=',';
   input category $ TransType $ var1 var2 var3 var4 var5 var6 ; 
   datalines;
CAT1, DIR, 100.00,  8.49, 0.08, 1.00, 0.00, 0.00
CAT2, DIR, 200.00,  2.12, 0.08, 1.00, 0.20, 0.02
CAT3, DIR, 800.00, 67.95, 0.08, 1.00, 0.20, 0.02
CAT4, DIR, 100.00, 10.19, 0.08, 1.00, 0.00, 0.00
CAT5, DIR, 700.00, 63.70, 0.08, 1.00, 0.20, 0.02
CBT1, DIR,  50.00,  4.25, 0.08, 1.00, 0.00, 0.00
CBT2, DIR, 300.00, 25.48, 0.08, 1.00, 0.25, 0.02
CBT3, DIR, 900.00, 76.44, 0.08, 1.00, 0.25, 0.02
CBT4, DIR, 100.00,  8.49, 0.08, 1.00, 0.00, 0.00

;
run;
data test1;
format category $8. TransType $8. mjcategory $8. var1 var2 var3 var4 var5 var6 f7.2 ;
set test1;
if substr(category,1,2) eq 'CA' then mjcategory = 'CAT2';
if substr(category,1,2) eq 'CB' then mjcategory = 'CBT1';
run;
proc sql;
create table test2 as
select a.*
		,sum(var5) as var7
		,sum(var6) as var8
from test1 as a
group by mjcategory;
quit;

Data test2 (drop= var7 var8 mjcategory);
set test2;
if category eq mjcategory then 
	do;
		var5=var7;
		var6=var8;
	end;
Else
	do;
		var5=0;
		var6=0;
	end;
run;
6 REPLIES 6
Tom
Super User Tom
Super User

I do not understand what you are trying to do.  Is there some business logic behind this?

 

If you want to treat CAT1,CAT2,... as one category and CBT1,CBT2,.. as other then simply using a format would be simpler.

data have;
   input category $ TransType $ var1 var2 var3 var4 var5 var6 ; 
datalines;
CAT1  DIR  100.00   8.49  0.08  1.00  0.00  0.00
CAT2  DIR  200.00   2.12  0.08  1.00  0.20  0.02
CAT3  DIR  800.00  67.95  0.08  1.00  0.20  0.02
CAT4  DIR  100.00  10.19  0.08  1.00  0.00  0.00
CAT5  DIR  700.00  63.70  0.08  1.00  0.20  0.02
CBT1  DIR   50.00   4.25  0.08  1.00  0.00  0.00
CBT2  DIR  300.00  25.48  0.08  1.00  0.25  0.02
CBT3  DIR  900.00  76.44  0.08  1.00  0.25  0.02
CBT4  DIR  100.00   8.49  0.08  1.00  0.00  0.00
;

proc summary data=have nway;
  class category / mlf ;
  format category $3. ;
  var var1-var6 ;
  output out=two_categories sum=;
run;

Results

 

Obs    category    _TYPE_    _FREQ_    var1     var2     var3    var4    var5    var6

 1       CAT          1         5      1900    152.45    0.40      5      0.6    0.06
 2       CBT          1         4      1350    114.66    0.32      4      0.5    0.04
PaigeMiller
Diamond | Level 26

Did you leave out PROC FORMAT from your code?

 

I agree, a custom format is the best way to do this.

--
Paige Miller
Tom
Super User Tom
Super User

For now I am just use the built in $3 format.

 

I could not figure out what the heck the original posted actually wanted.

alepage
Barite | Level 11

Is there some business logic behind this?  I am affraid not.  I was asked to calculate the sum of var1 - var6 and that, per category.  Then because they don't like fraction in var5 and var6, their requirement is now to rollup the sum for var5 and var6 and to report it as cat2 and cbt1.  (see the obtained dataset below)

 

category TransType var1 var2 var3 var4 var5 var6
CAT3 DIR 800.00 67.95 0.08 1.00 0.00 0.00
CAT5 DIR 700.00 63.70 0.08 1.00 0.00 0.00
CAT2 DIR 200.00 2.12 0.08 1.00 0.60 0.06
CAT1 DIR 100.00 8.49 0.08 1.00 0.00 0.00
CAT4 DIR 100.00 10.19 0.08 1.00 0.00 0.00
CBT2 DIR 300.00 25.48 0.08 1.00 0.00 0.00
CBT1 DIR 50.00 4.25 0.08 1.00 0.50 0.04
CBT4 DIR 100.00 8.49 0.08 1.00 0.00 0.00
CBT3 DIR 900.00 76.44 0.08 1.00 0.00 0.00

 

 

 

Tom
Super User Tom
Super User

So you have the detailed summary already:

data summary;
   input category $ TransType $ var1 var2 var3 var4 var5 var6 ; 
datalines;
CAT1  DIR  100.00   8.49  0.08  1.00  0.00  0.00
CAT2  DIR  200.00   2.12  0.08  1.00  0.20  0.02
CAT3  DIR  800.00  67.95  0.08  1.00  0.20  0.02
CAT4  DIR  100.00  10.19  0.08  1.00  0.00  0.00
CAT5  DIR  700.00  63.70  0.08  1.00  0.20  0.02
CBT1  DIR   50.00   4.25  0.08  1.00  0.00  0.00
CBT2  DIR  300.00  25.48  0.08  1.00  0.25  0.02
CBT3  DIR  900.00  76.44  0.08  1.00  0.25  0.02
CBT4  DIR  100.00   8.49  0.08  1.00  0.00  0.00
;

And you want to collapse VAR5 and VAR6 to have only one non-zero value that is the sum of all the values in that group.

 

You can get there directly with PROC SQL code like this:

proc sql ;
 create table want as 
   select category, transtype
        , var1 ,var2 ,var3 ,var4
        , case when (category in ('CAT2' 'CBT1')) then sum(var5) else 0 end as var5 
        , case when (category in ('CAT2' 'CBT1')) then sum(var6) else 0 end as var6
   from summary 
   group by substr(category,1,3)
   order by category
 ;
quit;

Which takes advantage of the fact that PROC SQL will re-merge the aggregate values onto all of the observations for you.  As noted in the log:

NOTE: The query requires remerging summary statistics back with the original data.

Results

                   Trans
Obs    category    Type     var1     var2    var3    var4    var5    var6

 1       CAT1       DIR      100     8.49    0.08      1      0.0    0.00
 2       CAT2       DIR      200     2.12    0.08      1      0.6    0.06
 3       CAT3       DIR      800    67.95    0.08      1      0.0    0.00
 4       CAT4       DIR      100    10.19    0.08      1      0.0    0.00
 5       CAT5       DIR      700    63.70    0.08      1      0.0    0.00
 6       CBT1       DIR       50     4.25    0.08      1      0.5    0.04
 7       CBT2       DIR      300    25.48    0.08      1      0.0    0.00
 8       CBT3       DIR      900    76.44    0.08      1      0.0    0.00
 9       CBT4       DIR      100     8.49    0.08      1      0.0    0.00
alepage
Barite | Level 11
Could you please provide an example of a custom format. I am not familar with it

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

Creating Custom Steps in SAS Studio

Check out this tutorial series to learn how to build your own steps in SAS Studio.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 6 replies
  • 1276 views
  • 1 like
  • 3 in conversation