BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
sameer112217
Quartz | Level 8

For some reason I have to use PROC SQL to do cumulative sum for the number variables but I am unable to do it using Proc SQL

 

Table1

 

Fiscal_year   Premium_date  type     premiumamount  sum_assured

2016-17       01JAN2017      corp     100                       100

2016-17       01JAN2017      corp     100                       100

2016-17       01FEB2017      corp     200                       200

2016-17       01FEB2017      corp     200                       200

 

I used

PROC SQL;

Create table want as

Select fiscal_year, premium_date, type, sum(premiumamount) as YTD_Premiumamount,

sum(sum_assured) as YTD_sum_assured

from table1 as a

group by fiscal_year, premium_date, type

quit

 

I would get result like this

fiscal_year   Premium_date  type     YTD_premiumamount  YTD_sum_assured

2016-17       01JAN2017      corp     200             200

2016-17       01FEB2017      corp    400              400

 

But I want like this which is culmulative sum YTD

 

fiscal_year  Premium_date  type    YTD_premiumamount  YTD_sum_assured

2016-17       01JAN2017      corp     200             200

2016-17       01FEB2017      corp     600             600

 

In short it should have cumulative sum of measures until the fiscal year ends in 31mar2017. I have tried in data step using first.var but want in Proc SQL

1 ACCEPTED SOLUTION

Accepted Solutions
novinosrin
Tourmaline | Level 20

HI @Reeza  While it is trivial in this thread and I am 100% certain, a professional orgranization will/should not resort to SQL for this, I'm afraid your 2*summary times initial and then group by product and filter will slow down the process as opposed to having summary done in one go and one by group product & filter before the final summary,  hence was my approach.

 

But of course you made me think what I could have done better, and that is get the temp as view rather than a table as follows. 

 

data have;
input Fiscal_year $  Premium_date :date9. type   $  premiumamount  sum_assured;
format Premium_date date9.;
cards;
2016-17       01JAN2017      corp     100                       100
2016-17       01JAN2017      corp     100                       100
2016-17       01FEB2017      corp     200                       200
2016-17       01FEB2017      corp     200                       200
;
/*create view temp */
proc sql;
create view temp as
select fiscal_year, premium_date, type, sum(premiumamount) as YTD_Premiumamount,sum(sum_assured) as YTD_sum_assured
from have 
group by fiscal_year, premium_date, type;
quit;

proc sql;
create table want as
select a.Fiscal_year ,a.Premium_date ,a.type,sum(b.YTD_Premiumamount) as YTD_Premiumamount,sum(b.YTD_sum_assured) as YTD_sum_assured 
from temp a inner join temp b
on a.Fiscal_year   =b.Fiscal_year    and a.type=b.type and  b.Premium_date<=a.Premium_date  
group by a.fiscal_year,a.premium_date,a.type
order by a.fiscal_year,a.Premium_date,a.type;
quit;

 which should improve efficiency.

 

My two cents!

 

View solution in original post

15 REPLIES 15
PaigeMiller
Diamond | Level 26

@sameer112217 wrote:

For some reason I have to use PROC SQL to do cumulative sum for the number variables but I am unable to do it using Proc SQL

 


I'm not surprised, this is NOT easy to do (if it is possible at all) in PROC SQL.

 

I don't know why you feel restricted to doing it in PROC SQL, but I have no such restrictions, so for everyone else reading along, this is a very simple method without SQL to compute these cumulative sums.

 

proc summary data=have nway;
    var premiumamount sum_assured;
    class fiscal_year premium_date;
    output out=sums sum=;
run;
data want;
    set sums;
    by fiscal_year;
    if first.fiscal_year then do;
         cumulative_premiumamount=0;
         cumulative_sum_assured=0;
    end;
    cumulative_premiumamount+premiumamount;
    cumulative_sum_assured+sum_assured;
run;
--
Paige Miller
MarkWik
Quartz | Level 8

May we know what is that some reason that precludes the usage of datastep/other summary procs assuming you have a commercial or student SAS software version. 

 

 


@sameer112217 wrote:

For some reason I have to use PROC SQL to do cumulative sum for the number variables but I am unable to do it using Proc SQL

 

 


 

MarkWik
Quartz | Level 8

As a matter of fact, we the community including you can build up supportive arguments to challenge your boss/team/peers who may have assigned you such tasks with real legitimate reasons

novinosrin
Tourmaline | Level 20
data have;
input Fiscal_year $  Premium_date :date9. type   $  premiumamount  sum_assured;
format Premium_date date9.;
cards;
2016-17       01JAN2017      corp     100                       100
2016-17       01JAN2017      corp     100                       100
2016-17       01FEB2017      corp     200                       200
2016-17       01FEB2017      corp     200                       200
;
proc sql;
create table temp as
select fiscal_year, premium_date, type, sum(premiumamount) as YTD_Premiumamount,sum(sum_assured) as YTD_sum_assured
from have 
group by fiscal_year, premium_date, type;
quit;
proc sql;
create table want as
select a.Fiscal_year ,a.Premium_date ,a.type,sum(b.YTD_Premiumamount) as YTD_Premiumamount,sum(b.YTD_sum_assured) as YTD_sum_assured 
from temp a inner join temp b
on a.Fiscal_year   =b.Fiscal_year    and a.type=b.type and  b.Premium_date<=a.Premium_date  
group by a.fiscal_year,a.premium_date,a.type
order by a.fiscal_year,a.Premium_date,a.type;
quit;
PaigeMiller
Diamond | Level 26

Ok, this uses PROC SQL but is a whole lot easier than I thought it would be 🙂

 

proc summary data=have nway;
    var premiumamount sum_assured;
    class fiscal_year premium_date;
    output out=sums sum=;
run;
data sums1;
    set sums;
    by fiscal_year;
    if first.fiscal_year then do;
         cumulative_premiumamount=0;
         cumulative_sum_assured=0;
    end;
    cumulative_premiumamount+premiumamount;
    cumulative_sum_assured+sum_assured;
run;
proc sql;
    create table cumulative_sums as select * from sums1;
quit;
--
Paige Miller
novinosrin
Tourmaline | Level 20

lol point taken. You must have had  a pretty good lunch. I just had bigMAC haha

Reeza
Super User

You shouldn't do this. It's inefficient and better done in a data step and is probably faster there too. 

 

But since most people ignore that advice here's one solution. I highly suspect this could be simplified, but since you have repeating values for each month, the data first needs to be summaries to a monthly level then the cumulative month calculation can occur, otherwise when you join you get too duplicate records because the keys are not unique and then your numbers are duplicated. 

 

	data have;
input Fiscal_year $  Premium_date :date9. type   $  premiumamount  sum_assured;
format Premium_date date9.;
cards;
2016-17       01JAN2017      corp     100                       100
2016-17       01JAN2017      corp     100                       100
2016-17       01FEB2017      corp     200                       200
2016-17       01FEB2017      corp     200                       200
;


proc sql;
create table want as 
select h1.fiscal_year, h1.premium_date, sum(h2.T_amount) as YTD_premium, sum(h2.T_assured) as YTD_assured
from 
	(select fiscal_year, premium_date, type, sum(premiumamount) as T_amount, sum(sum_assured) as T_assured
	from have
	group by fiscal_year, premium_date, type) as h1
left join 
	(select fiscal_year, premium_date, type, sum(premiumamount) as T_amount, sum(sum_assured) as T_assured
	from have
	group by fiscal_year, premium_date, type) as h2

on h2.premium_date <= h1.premium_date
and h1.fiscal_year=h2.fiscal_year
and h1.type=h2.type
group by h1.fiscal_year, h1.premium_date;
quit;


novinosrin
Tourmaline | Level 20

HI @Reeza  While it is trivial in this thread and I am 100% certain, a professional orgranization will/should not resort to SQL for this, I'm afraid your 2*summary times initial and then group by product and filter will slow down the process as opposed to having summary done in one go and one by group product & filter before the final summary,  hence was my approach.

 

But of course you made me think what I could have done better, and that is get the temp as view rather than a table as follows. 

 

data have;
input Fiscal_year $  Premium_date :date9. type   $  premiumamount  sum_assured;
format Premium_date date9.;
cards;
2016-17       01JAN2017      corp     100                       100
2016-17       01JAN2017      corp     100                       100
2016-17       01FEB2017      corp     200                       200
2016-17       01FEB2017      corp     200                       200
;
/*create view temp */
proc sql;
create view temp as
select fiscal_year, premium_date, type, sum(premiumamount) as YTD_Premiumamount,sum(sum_assured) as YTD_sum_assured
from have 
group by fiscal_year, premium_date, type;
quit;

proc sql;
create table want as
select a.Fiscal_year ,a.Premium_date ,a.type,sum(b.YTD_Premiumamount) as YTD_Premiumamount,sum(b.YTD_sum_assured) as YTD_sum_assured 
from temp a inner join temp b
on a.Fiscal_year   =b.Fiscal_year    and a.type=b.type and  b.Premium_date<=a.Premium_date  
group by a.fiscal_year,a.premium_date,a.type
order by a.fiscal_year,a.Premium_date,a.type;
quit;

 which should improve efficiency.

 

My two cents!

 

Reeza
Super User

While it is trivial in this thread and I am 100% certain, a professional orgranization will/should not resort to SQL for this, I'm afraid your 2*summary times initial and then group by product and filter will slow down the process as opposed to having summary done in one go and one by group product & filter before the final summary,  hence was my approach.

 

The question explicitly asks for an inefficient solution though 😉

 

novinosrin
Tourmaline | Level 20

So true. And believe it or not,

 

I had a very similar VIZ Almost same question in a HW and here is a feedback from my Prof lol

 

 

 

Grade Item Points Grade Feedback

Part I:

Good, but not efficient solution, try using "Cursor/Merge".You should have challenged me

Part II:

Good.

Part III: -1

The result of your query should display ALL projectNo even if the quantity is "0". Plus the sum is Quantity Not PartNo.

PaigeMiller
Diamond | Level 26

@Reeza wrote:

The question explicitly asks for an inefficient solution though 😉

 


My current philosophy for situations where the user excludes the most appropriate and efficient method is that I will give the answer I want to give, without the original restrictions, because someone (perhaps not the original poster) may benefit. And of course, sometimes the original poster may "see the light" and benefit as well.

 

The only time I try to follow the originally stated restrictions is if a user has not licensed the proper SAS  product, for example, PROC EXPAND in SAS/ETS, and wants to replicate some of the functionality in PROC EXPAND using DATA steps or other PROCs that the user has available.

--
Paige Miller
novinosrin
Tourmaline | Level 20

But I am sure you will grant "partial credit" if you were to grade? (winks)

PaigeMiller
Diamond | Level 26

In my opinion, another benefit of not using PROC SQL here is:

 

The code I originally provided is much more readable and understandable and maintainable than any PROC SQL solution in this thread. If a new person had to take over the code, I think there's a much better chance that the new person would understand the PROC SUMMARY/DATA step solution than the SQL solutions.

--
Paige Miller
PGStats
Opal | Level 21

Finding a context where this request would be legitimate requires a fertile imagination, but it can be a fun exercise. Here is a single query variation on the theme initiated by @novinosrin:

 

proc sql;
create table want as
select 
    a.Fiscal_year,
    a.Premium_date,
    a.type,
    sum(b.premiumAmount) as YTD_PremiumAmount,
    sum(b.sum_assured) as YTD_sum_assured 
from
  ( select distinct
        a.Fiscal_year,
        a.Premium_date,
        a.type,
        b.Premium_date as other_date
    from have as a inner join have as b 
        on  a.Fiscal_year = b.Fiscal_year and 
            a.type = b.type and 
            a.Premium_date >= b.Premium_date ) as a inner join
    have as b 
        on  a.Fiscal_year = b.Fiscal_year and 
            a.type = b.type and
            a.other_date = b.Premium_date
group by
    a.Fiscal_year,
    a.Premium_date,
    a.type
;
quit;
PG

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 15 replies
  • 24683 views
  • 16 likes
  • 6 in conversation