- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
For some reason I have to use PROC SQL to do cumulative sum for the number variables but I am unable to do it using Proc SQL
Table1
Fiscal_year Premium_date type premiumamount sum_assured
2016-17 01JAN2017 corp 100 100
2016-17 01JAN2017 corp 100 100
2016-17 01FEB2017 corp 200 200
2016-17 01FEB2017 corp 200 200
I used
PROC SQL;
Create table want as
Select fiscal_year, premium_date, type, sum(premiumamount) as YTD_Premiumamount,
sum(sum_assured) as YTD_sum_assured
from table1 as a
group by fiscal_year, premium_date, type
quit
I would get result like this
fiscal_year Premium_date type YTD_premiumamount YTD_sum_assured
2016-17 01JAN2017 corp 200 200
2016-17 01FEB2017 corp 400 400
But I want like this which is culmulative sum YTD
fiscal_year Premium_date type YTD_premiumamount YTD_sum_assured
2016-17 01JAN2017 corp 200 200
2016-17 01FEB2017 corp 600 600
In short it should have cumulative sum of measures until the fiscal year ends in 31mar2017. I have tried in data step using first.var but want in Proc SQL
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
HI @Reeza While it is trivial in this thread and I am 100% certain, a professional orgranization will/should not resort to SQL for this, I'm afraid your 2*summary times initial and then group by product and filter will slow down the process as opposed to having summary done in one go and one by group product & filter before the final summary, hence was my approach.
But of course you made me think what I could have done better, and that is get the temp as view rather than a table as follows.
data have;
input Fiscal_year $ Premium_date :date9. type $ premiumamount sum_assured;
format Premium_date date9.;
cards;
2016-17 01JAN2017 corp 100 100
2016-17 01JAN2017 corp 100 100
2016-17 01FEB2017 corp 200 200
2016-17 01FEB2017 corp 200 200
;
/*create view temp */
proc sql;
create view temp as
select fiscal_year, premium_date, type, sum(premiumamount) as YTD_Premiumamount,sum(sum_assured) as YTD_sum_assured
from have
group by fiscal_year, premium_date, type;
quit;
proc sql;
create table want as
select a.Fiscal_year ,a.Premium_date ,a.type,sum(b.YTD_Premiumamount) as YTD_Premiumamount,sum(b.YTD_sum_assured) as YTD_sum_assured
from temp a inner join temp b
on a.Fiscal_year =b.Fiscal_year and a.type=b.type and b.Premium_date<=a.Premium_date
group by a.fiscal_year,a.premium_date,a.type
order by a.fiscal_year,a.Premium_date,a.type;
quit;
which should improve efficiency.
My two cents!
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
@sameer112217 wrote:
For some reason I have to use PROC SQL to do cumulative sum for the number variables but I am unable to do it using Proc SQL
I'm not surprised, this is NOT easy to do (if it is possible at all) in PROC SQL.
I don't know why you feel restricted to doing it in PROC SQL, but I have no such restrictions, so for everyone else reading along, this is a very simple method without SQL to compute these cumulative sums.
proc summary data=have nway;
var premiumamount sum_assured;
class fiscal_year premium_date;
output out=sums sum=;
run;
data want;
set sums;
by fiscal_year;
if first.fiscal_year then do;
cumulative_premiumamount=0;
cumulative_sum_assured=0;
end;
cumulative_premiumamount+premiumamount;
cumulative_sum_assured+sum_assured;
run;
Paige Miller
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
May we know what is that some reason that precludes the usage of datastep/other summary procs assuming you have a commercial or student SAS software version.
@sameer112217 wrote:
For some reason I have to use PROC SQL to do cumulative sum for the number variables but I am unable to do it using Proc SQL
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
As a matter of fact, we the community including you can build up supportive arguments to challenge your boss/team/peers who may have assigned you such tasks with real legitimate reasons
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
data have;
input Fiscal_year $ Premium_date :date9. type $ premiumamount sum_assured;
format Premium_date date9.;
cards;
2016-17 01JAN2017 corp 100 100
2016-17 01JAN2017 corp 100 100
2016-17 01FEB2017 corp 200 200
2016-17 01FEB2017 corp 200 200
;
proc sql;
create table temp as
select fiscal_year, premium_date, type, sum(premiumamount) as YTD_Premiumamount,sum(sum_assured) as YTD_sum_assured
from have
group by fiscal_year, premium_date, type;
quit;
proc sql;
create table want as
select a.Fiscal_year ,a.Premium_date ,a.type,sum(b.YTD_Premiumamount) as YTD_Premiumamount,sum(b.YTD_sum_assured) as YTD_sum_assured
from temp a inner join temp b
on a.Fiscal_year =b.Fiscal_year and a.type=b.type and b.Premium_date<=a.Premium_date
group by a.fiscal_year,a.premium_date,a.type
order by a.fiscal_year,a.Premium_date,a.type;
quit;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Ok, this uses PROC SQL but is a whole lot easier than I thought it would be 🙂
proc summary data=have nway;
var premiumamount sum_assured;
class fiscal_year premium_date;
output out=sums sum=;
run;
data sums1;
set sums;
by fiscal_year;
if first.fiscal_year then do;
cumulative_premiumamount=0;
cumulative_sum_assured=0;
end;
cumulative_premiumamount+premiumamount;
cumulative_sum_assured+sum_assured;
run;
proc sql;
create table cumulative_sums as select * from sums1;
quit;
Paige Miller
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
lol point taken. You must have had a pretty good lunch. I just had bigMAC haha
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
You shouldn't do this. It's inefficient and better done in a data step and is probably faster there too.
But since most people ignore that advice here's one solution. I highly suspect this could be simplified, but since you have repeating values for each month, the data first needs to be summaries to a monthly level then the cumulative month calculation can occur, otherwise when you join you get too duplicate records because the keys are not unique and then your numbers are duplicated.
data have;
input Fiscal_year $ Premium_date :date9. type $ premiumamount sum_assured;
format Premium_date date9.;
cards;
2016-17 01JAN2017 corp 100 100
2016-17 01JAN2017 corp 100 100
2016-17 01FEB2017 corp 200 200
2016-17 01FEB2017 corp 200 200
;
proc sql;
create table want as
select h1.fiscal_year, h1.premium_date, sum(h2.T_amount) as YTD_premium, sum(h2.T_assured) as YTD_assured
from
(select fiscal_year, premium_date, type, sum(premiumamount) as T_amount, sum(sum_assured) as T_assured
from have
group by fiscal_year, premium_date, type) as h1
left join
(select fiscal_year, premium_date, type, sum(premiumamount) as T_amount, sum(sum_assured) as T_assured
from have
group by fiscal_year, premium_date, type) as h2
on h2.premium_date <= h1.premium_date
and h1.fiscal_year=h2.fiscal_year
and h1.type=h2.type
group by h1.fiscal_year, h1.premium_date;
quit;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
HI @Reeza While it is trivial in this thread and I am 100% certain, a professional orgranization will/should not resort to SQL for this, I'm afraid your 2*summary times initial and then group by product and filter will slow down the process as opposed to having summary done in one go and one by group product & filter before the final summary, hence was my approach.
But of course you made me think what I could have done better, and that is get the temp as view rather than a table as follows.
data have;
input Fiscal_year $ Premium_date :date9. type $ premiumamount sum_assured;
format Premium_date date9.;
cards;
2016-17 01JAN2017 corp 100 100
2016-17 01JAN2017 corp 100 100
2016-17 01FEB2017 corp 200 200
2016-17 01FEB2017 corp 200 200
;
/*create view temp */
proc sql;
create view temp as
select fiscal_year, premium_date, type, sum(premiumamount) as YTD_Premiumamount,sum(sum_assured) as YTD_sum_assured
from have
group by fiscal_year, premium_date, type;
quit;
proc sql;
create table want as
select a.Fiscal_year ,a.Premium_date ,a.type,sum(b.YTD_Premiumamount) as YTD_Premiumamount,sum(b.YTD_sum_assured) as YTD_sum_assured
from temp a inner join temp b
on a.Fiscal_year =b.Fiscal_year and a.type=b.type and b.Premium_date<=a.Premium_date
group by a.fiscal_year,a.premium_date,a.type
order by a.fiscal_year,a.Premium_date,a.type;
quit;
which should improve efficiency.
My two cents!
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
While it is trivial in this thread and I am 100% certain, a professional orgranization will/should not resort to SQL for this, I'm afraid your 2*summary times initial and then group by product and filter will slow down the process as opposed to having summary done in one go and one by group product & filter before the final summary, hence was my approach.
The question explicitly asks for an inefficient solution though 😉
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
So true. And believe it or not,
I had a very similar VIZ Almost same question in a HW and here is a feedback from my Prof lol
Grade Item | Points | Grade | Feedback |
---|---|---|---|
|
|
|
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
@Reeza wrote:
The question explicitly asks for an inefficient solution though 😉
My current philosophy for situations where the user excludes the most appropriate and efficient method is that I will give the answer I want to give, without the original restrictions, because someone (perhaps not the original poster) may benefit. And of course, sometimes the original poster may "see the light" and benefit as well.
The only time I try to follow the originally stated restrictions is if a user has not licensed the proper SAS product, for example, PROC EXPAND in SAS/ETS, and wants to replicate some of the functionality in PROC EXPAND using DATA steps or other PROCs that the user has available.
Paige Miller
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
But I am sure you will grant "partial credit" if you were to grade? (winks)
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
In my opinion, another benefit of not using PROC SQL here is:
The code I originally provided is much more readable and understandable and maintainable than any PROC SQL solution in this thread. If a new person had to take over the code, I think there's a much better chance that the new person would understand the PROC SUMMARY/DATA step solution than the SQL solutions.
Paige Miller
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Finding a context where this request would be legitimate requires a fertile imagination, but it can be a fun exercise. Here is a single query variation on the theme initiated by @novinosrin:
proc sql;
create table want as
select
a.Fiscal_year,
a.Premium_date,
a.type,
sum(b.premiumAmount) as YTD_PremiumAmount,
sum(b.sum_assured) as YTD_sum_assured
from
( select distinct
a.Fiscal_year,
a.Premium_date,
a.type,
b.Premium_date as other_date
from have as a inner join have as b
on a.Fiscal_year = b.Fiscal_year and
a.type = b.type and
a.Premium_date >= b.Premium_date ) as a inner join
have as b
on a.Fiscal_year = b.Fiscal_year and
a.type = b.type and
a.other_date = b.Premium_date
group by
a.Fiscal_year,
a.Premium_date,
a.type
;
quit;
Part I:
Good, but not efficient solution, try using "Cursor/Merge".You should have challenged me
Part II:
Good.
Part III: -1
The result of your query should display ALL projectNo even if the quantity is "0". Plus the sum is Quantity Not PartNo.