BookmarkSubscribeRSS Feed
vietlinh12hoa
Obsidian | Level 7

I have a table

Group Revenue
A 100
A 50
B 200
B 100

 

I want to sum aggregation by proc sql, like:

Group Total Revenue
A 150
B 300

 

proc sql;
        select 
             Group, sum(Revenue) as Total_Revenue
        from mytable group by Group
;quit;

However, it produces duplicates records:

Group Total Revenue
A 150
A 150
B 300
B 300

 

Of course, there are couple of solution like NOMERGE SQL (not always work) or SORT NODUPKEY. But I wonder any efficient and direct way, something like:

  

proc sql;
        select 
             first(Group), # Take the first name record in the Group 
             sum(Revenue) as Total_Revenue
        from mytable group by Group
;quit;

 

6 REPLIES 6
Reeza
Super User

Either your GROUP variable values are not exact because of leading spaces or invisible characters, but this should work as expected. What do you get if you run the following on the same data set. Please show the log from your original code and this.

 

 

proc freq data=mytable;
table group;
weight revenue;
run;

Note that I cannot replicate your issue either:

data have;
infile cards dlm=',';
input Group $	Revenue;
cards;
A,	100
A,	50
B,	200
B,	100
;;;;

proc sql;
     create table want as
        select 
             Group, sum(Revenue) as Total_Revenue
        from have group by Group
;quit;

proc print data=want;run;

This generates the table shown in your original post.

 

Log:

 1          OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
 68         
 69         data have;
 70         infile cards dlm=',';
 71         input Group $Revenue;
 72         cards;
 
 NOTE: The data set WORK.HAVE has 4 observations and 2 variables.
 NOTE: DATA statement used (Total process time):
       real time           0.00 seconds
       user cpu time       0.00 seconds
       system cpu time     0.00 seconds
       memory              524.12k
       OS Memory           24740.00k
       Timestamp           05/31/2021 05:23:57 PM
       Step Count                        45  Switch Count  2
       Page Faults                       0
       Page Reclaims                     126
       Page Swaps                        0
       Voluntary Context Switches        10
       Involuntary Context Switches      0
       Block Input Operations            0
       Block Output Operations           264
       
 77         ;;;;
 
 78         
 79         proc sql;
 80              create table want as
 81                 select
 82                      Group, sum(Revenue) as Total_Revenue
 83                 from have group by Group
 84         ;
 NOTE: Table WORK.WANT created, with 2 rows and 2 columns.
 
 84       !  quit;
 NOTE: PROCEDURE SQL used (Total process time):
       real time           0.00 seconds
       user cpu time       0.01 seconds
       system cpu time     0.00 seconds
       memory              5433.81k
       OS Memory           29864.00k
       Timestamp           05/31/2021 05:23:57 PM
       Step Count                        46  Switch Count  2
       Page Faults                       0
       Page Reclaims                     219
       Page Swaps                        0
       Voluntary Context Switches        10
       Involuntary Context Switches      0
       Block Input Operations            0
       Block Output Operations           272
Spoiler

@vietlinh12hoa wrote:

I have a table

Group Revenue
A 100
A 50
B 200
B 100

 

I want to sum aggregation by proc sql, like:

Group Total Revenue
A 150
B 300

 

proc sql;
        select 
             Group, sum(Revenue) as Total_Revenue
        from mytable group by Group
;quit;

However, it produces duplicates records:

Group Total Revenue
A 150
A 150
B 300
B 300

 

Of course, there are couple of solution like NOMERGE SQL (not always work) or SORT NODUPKEY. But I wonder any efficient and direct way, something like:

  

proc sql;
        select 
             first(Group), # Take the first name record in the Group 
             sum(Revenue) as Total_Revenue
        from mytable group by Group
;quit;

 


 

vietlinh12hoa
Obsidian | Level 7

Ok Thanks. I check and so it apparently had the space in Group. But anyway, any chance I can use first name (or last name) observation of Group (in case all Group values are not perfectly matched)?  

Reeza
Super User
The differences would cause them to be considered different groups, so first/last would not work. FIRST/LAST are data step concepts that do not overlap with SQL anyways.

You can try to use COMPRESS() or TRIM() within a query.

proc sql;
create table want as
select
compress(Group) as group, sum(Revenue) as Total_Revenue
from have group by compress(Group)
;quit;




Reeza
Super User
FYI this would happen if you also had a "Select *" or another column/variable in the query that isn't in your GROUP BY or in an aggregation statement. But that means your query wouldn't match what you've shown us.
yabwon
Onyx | Level 15

Just for fun, you can have both situations generated in one data step 😉

data have;
infile cards dlm=',';
input Group $	Revenue;
cards;
A,	100
A,	50
B,	200
B,	100
;;;;
run;

data want1 want2;
  do _N_=1 by 1 until(last.group);
    set have;
    by group;
    Total_Revenue + Revenue;
  end;

  do _N_ = 1 to _N_;
    set have;
    output want1;
  end;
  output want2;
  Total_Revenue=.;
run;

proc print data = want1;
run;
proc print data = want2;
run;

 

Bart

_______________
Polish SAS Users Group: www.polsug.com and communities.sas.com/polsug

"SAS Packages: the way to share" at SGF2020 Proceedings (the latest version), GitHub Repository, and YouTube Video.
Hands-on-Workshop: "Share your code with SAS Packages"
"My First SAS Package: A How-To" at SGF2021 Proceedings

SAS Ballot Ideas: one: SPF in SAS, two, and three
SAS Documentation



Ksharp
Super User
It should not be, there is some clause you didn't posted.


proc sql;
select
DISTINCT Group, sum(Revenue) as Total_Revenue
from mytable group by Group
;quit;

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 6 replies
  • 2155 views
  • 2 likes
  • 4 in conversation