DATA Step, Macro, Functions and more

PROC SQL - Running sum/total for a range

Accepted Solution Solved
Reply
Contributor
Posts: 34
Accepted Solution

PROC SQL - Running sum/total for a range

I am runnnig query in PROC SQL and need to resolved below,

 

I want - Sum of weight for next 12month

(for time=0 need sum of weight from time 0 to 11, then

for time=1 need sum of weight from time 1 to 12)

so on.. util time=24.

 

The tricky part is, I need this in PROC SQL, so I believe it has to be SQL query. Please find attached sample file and column of what I want.

 

Thank you

Ranny 


Accepted Solutions
Solution
a week ago
Super User
Posts: 2,049

Re: PROC SQL - Running sum/total for a range

Slight improvement to my previous. Yet still violates your rules  by a mile. 

 


data have;
input Name $	Sex $	Age	Height	Weight	Time_month;
cards;
Alfred	M	14	69	112.5	0
Alice	F	13	56.5	84	1
Barbara	F	13	65.3	98	2
Carol	F	14	62.8	102.5	3
Henry	M	14	63.5	102.5	4
James	M	12	57.3	83	5
Jane	F	12	59.8	84.5	6
Janet	F	15	62.5	112.5	7
Jeffrey	M	13	62.5	84	8
John	M	12	59	99.5	9
Joyce	F	11	51.3	50.5	10
Judy	F	14	64.3	90	11
Louise	F	12	56.3	77	12
Mary	F	15	66.5	112	13
Philip	M	16	72	150	14
Robert	M	12	64.8	128	15
Ronald	M	15	67	133	16
Thomas	M	11	57.5	85	17
William	M	15	66.5	112	18
Louise	F	12	56.3	77	19
Mary	F	15	66.5	112	20
Philip	M	16	72	150	21
Robert	M	12	64.8	128	22
Ronald	M	15	67	133	23
Alfred	M	14	69	112.5	24
Alice	F	13	56.5	84	25
Barbara	F	13	65.3	98	26
Carol	F	14	62.8	102.5	27
Henry	M	14	63.5	102.5	28
James	M	12	57.3	83	29
Jane	F	12	59.8	84.5	30
Janet	F	15	62.5	112.5	31
Jeffrey	M	13	62.5	84	32
John	M	12	59	99.5	33
Joyce	F	11	51.3	50.5	34
Alfred	M	14	69	112.5	35
Alice	F	13	56.5	84	36
;
proc sql;
create table want(drop=t Time_month_) as
select a.*,b.Time_month as Time_month_,b.Time_month<=a.Time_month<=b.Time_month+11 as t,sum(a.weight)*(calculated t=1) as want
from have a, have b
group by b.Time_month,t
having time_month=min(time_month) and t=1
order by  b.Time_month,a.time_month;
quit;

 

 

 

 

View solution in original post


All Replies
Respected Advisor
Posts: 3,257

Re: PROC SQL - Running sum/total for a range


@Ranny wrote:

 

The tricky part is, I need this in PROC SQL, so I believe it has to be SQL query. Please find attached sample file and column of what I want.

 


Yes, very tricky indeed, and if it is possible, you'd have to write a lot of code to get it to work. But off the top of my head, I don't think it is possible. So, don't do this in PROC SQL.

 

Don't do this in PROC SQL.

 

Don't do this in PROC SQL.

 

It would be rather simple in PROC EXPAND.

--
Paige Miller
Contributor
Posts: 34

Re: PROC SQL - Running sum/total for a range

Posted in reply to PaigeMiller

Thank you for your input, but my process is build in sql and just to change this part in sas is going to be nightmare at this moment as it is impacting lot more on this calculation. so badly needed this in sql.

Respected Advisor
Posts: 3,257

Re: PROC SQL - Running sum/total for a range

Don't do this in PROC SQL.

 

If it has to be built in PROC SQL, then I think you have the wrong process, but anyway, I can't help you further.

--
Paige Miller
Contributor
Posts: 34

Re: PROC SQL - Running sum/total for a range

Posted in reply to PaigeMiller

Thank you

Super User
Super User
Posts: 9,813

Re: PROC SQL - Running sum/total for a range

Sorry, this does not make sense:
"but my process is build in sql and just to change this part in sas is going to be nightmare at this moment as it is impacting lot more on this calculation".  SAS is the programming language, proc sql is just one procedure of many in the language.  You can use Base SAS datasteps and procedures, alongside proc sql steps without any problems.  Therefore your constraints are your only issue here and not something we can fix for you.  Running sums and totals can be done quite easily in a number of methods using datasteps - you can find plenty of examples on here and in papers written on the subject.

Super User
Posts: 10,534

Re: PROC SQL - Running sum/total for a range

The task defines the tool, not the other way 'round. aka Maxim 14.

 

If you don't have the necessary license for proc expand, you can do it in a data step.

Post your source dataset (in a data step with datalines), so we have something to play with.

Note that spreadsheets are not useful. Many organizations block the download of Office files from the net, and spreadsheets cannot convey important dataset attributes (variable types, lengths, formats etc).

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
How to post code
Contributor
Posts: 34

Re: PROC SQL - Running sum/total for a range

Posted in reply to KurtBremser

Thank you all for your inputs,

 

here is the data step raw data - code

 

data input;
input Name$ Sex$ Age Height Weight month;
cards;
Alfred M 14 69 112.5 0
Alice F 13 56.5 84 1
Barbara F 13 65.3 98 2
Carol F 14 62.8 102.5 3
Henry M 14 63.5 102.5 4
James M 12 57.3 83 5
Jane F 12 59.8 84.5 6
Janet F 15 62.5 112.5 7
Jeffrey M 13 62.5 84 8
John M 12 59 99.5 9
Joyce F 11 51.3 50.5 10
Judy F 14 64.3 90 11
Louise F 12 56.3 77 12
Mary F 15 66.5 112 13
Philip M 16 72 150 14
Robert M 12 64.8 128 15
Ronald M 15 67 133 16
Thomas M 11 57.5 85 17
William M 15 66.5 112 18
Louise F 12 56.3 77 19
Mary F 15 66.5 112 20
Philip M 16 72 150 21
Robert M 12 64.8 128 22
Ronald M 15 67 133 23
Alfred M 14 69 112.5 24
Alice F 13 56.5 84 25
Barbara F 13 65.3 98 26
Carol F 14 62.8 102.5 27
Henry M 14 63.5 102.5 28
James M 12 57.3 83 29
Jane F 12 59.8 84.5 30
Janet F 15 62.5 112.5 31
Jeffrey M 13 62.5 84 32
John M 12 59 99.5 33
Joyce F 11 51.3 50.5 34
Alfred M 14 69 112.5 35
Alice F 13 56.5 84 36
run;

 

Output - looking for 

 

Data    Want
Name Sex Age Height Weight Time - month   want - Sum of weight for 12month
(for time=0 need sum of weight from time 0 to 11,
then for time=1 need sum of weight from time 1 to 12)
Alfred M 14 69 112.5 0   1103.5
Alice F 13 56.5 84 1   1068
Barbara F 13 65.3 98 2   1096
Carol F 14 62.8 102.5 3   1148
Henry M 14 63.5 102.5 4   1173.5
James M 12 57.3 83 5   1204
Jane F 12 59.8 84.5 6   1206
Janet F 15 62.5 112.5 7   1233.5
Jeffrey M 13 62.5 84 8   1198
John M 12 59 99.5 9   1226
Joyce F 11 51.3 50.5 10   1276.5
Judy F 14 64.3 90 11   1354
Louise F 12 56.3 77 12   1397
Mary F 15 66.5 112 13   1432.5
Philip M 16 72 150 14   1404.5
Robert M 12 64.8 128 15   1352.5
Ronald M 15 67 133 16   1327
Thomas M 11 57.5 85 17   1296.5
William M 15 66.5 112 18   1294.5
Louise F 12 56.3 77 19   1267
Mary F 15 66.5 112 20   1302.5
Philip M 16 72 150 21   1274.5
Robert M 12 64.8 128 22   1224
Ronald M 15 67 133 23   1146.5
Alfred M 14 69 112.5 24   N/A
Alice F 13 56.5 84 25   N/A
Barbara F 13 65.3 98 26   N/A
Carol F 14 62.8 102.5 27   N/A
Henry M 14 63.5 102.5 28   N/A
James M 12 57.3 83 29   N/A
Jane F 12 59.8 84.5 30   N/A
Janet F 15 62.5 112.5 31   N/A
Jeffrey M 13 62.5 84 32   N/A
John M 12 59 99.5 33   N/A
Joyce F 11 51.3 50.5 34   N/A
Alfred M 14 69 112.5 35   N/A
Alice F 13 56.5 84 36   N/A
Trusted Advisor
Posts: 1,322

Re: PROC SQL - Running sum/total for a range

[ Edited ]

Try this:

proc sql;
create table want as
select c.*, sumwgt12
from input c left join
  (select a.month, sum(b.weight) as sumwgt12
   from input a left join input b
   on a.month<=b.month<a.month+12
   where a.month<24
   group by 1
   having count(b.month)=12) d
on c.month=d.month;
quit;

(Edit: simplified GROUP BY clause and corresponding SELECT variables.) 

PROC Star
Posts: 549

Re: PROC SQL - Running sum/total for a range

[ Edited ]

Your data should have single name will be ther for 36 months as per your description and you need to total within that name. So I did an update and then run the query.

 



Proc Sql;
update input
set name ='Alfred'; proc sql; select name, month, weight, case when month ge 23 then . else tot end as tot from (select name, month,weight,(select sum(weight) from input a where a.name = b.name and a.month ge b.month and a.month between b.month and b.month+11group by name) as tot from input b);

 

Contributor
Posts: 34

Re: PROC SQL - Running sum/total for a range

This is giving me an error

 

ERROR: Column month could not be found in the table/view identified with the correlation name A.
ERROR: Unresolved reference to table/correlation name a.
ERROR: Column month could not be found in the table/view identified with the correlation name B.
ERROR: Column month could not be found in the table/view identified with the correlation name B.
ERROR: Column month could not be found in the table/view identified with the correlation name A.
ERROR: Unresolved reference to table/correlation name a.
ERROR: Column month could not be found in the table/view identified with the correlation name B.
ERROR: Column month could not be found in the table/view identified with the correlation name B.
ERROR: Column month could not be found in the table/view identified with the correlation name B.
ERROR: Column month could not be found in the table/view identified with the correlation name B.
ERROR: The following columns were not found in the contributing tables: month.
NOTE: PROC SQL set option NOEXEC and will continue to check the syntax of statements.
25 quit;

PROC Star
Posts: 549

Re: PROC SQL - Running sum/total for a range

I used the same data you gave

data input;
input Name$ Sex$ Age Height Weight month;
cards;
Alfred M 14 69 112.5 0
Alice F 13 56.5 84 1
Barbara F 13 65.3 98 2
Carol F 14 62.8 102.5 3
Henry M 14 63.5 102.5 4
James M 12 57.3 83 5
Jane F 12 59.8 84.5 6
Janet F 15 62.5 112.5 7
Jeffrey M 13 62.5 84 8
John M 12 59 99.5 9
Joyce F 11 51.3 50.5 10
Judy F 14 64.3 90 11
Louise F 12 56.3 77 12
Mary F 15 66.5 112 13
Philip M 16 72 150 14
Robert M 12 64.8 128 15
Ronald M 15 67 133 16
Thomas M 11 57.5 85 17
William M 15 66.5 112 18
Louise F 12 56.3 77 19
Mary F 15 66.5 112 20
Philip M 16 72 150 21
Robert M 12 64.8 128 22
Ronald M 15 67 133 23
Alfred M 14 69 112.5 24
Alice F 13 56.5 84 25
Barbara F 13 65.3 98 26
Carol F 14 62.8 102.5 27
Henry M 14 63.5 102.5 28
James M 12 57.3 83 29
Jane F 12 59.8 84.5 30
Janet F 15 62.5 112.5 31
Jeffrey M 13 62.5 84 32
John M 12 59 99.5 33
Joyce F 11 51.3 50.5 34
Alfred M 14 69 112.5 35
Alice F 13 56.5 84 36
run;

proc sql;
update input
set name = 'Alfred' ;


proc sql;
select name, month, weight, case when month ge 23 then .
else tot end as tot
from 
(select name, month,weight,
(select sum(weight) from 
input a
where a.name = b.name
and a.month ge b.month
and a.month between b.month and b.month+11
group by name) as  tot
from input b);


Super User
Posts: 13,909

Re: PROC SQL - Running sum/total for a range


@Ranny wrote:

I am runnnig query in PROC SQL and need to resolved below,

 

I want - Sum of weight for next 12month

(for time=0 need sum of weight from time 0 to 11, then

for time=1 need sum of weight from time 1 to 12)

so on.. util time=24.

 

The tricky part is, I need this in PROC SQL, so I believe it has to be SQL query. Please find attached sample file and column of what I want.

 

Thank you

Ranny 


One of the reasons you are getting some push-back on Proc Sql is that SQL deals with set operations not "order of data". So it adds a great deal of additional code. Especially when you say "sum for next 12 months". What do you want if there are not 12 months following (or 11 or 10 or any number less than 12)? The code for SQL to identify and implement such is much more complex.

 

Also if you do not have a variable that contains actual dates, SAS numeric with an appropriate date format, then "next 12" can be extremely to identify in some cases.

Since the example data you provided was basically the SASHELP.Class data set with a sequence identifier of 1 less than record position it is fairly likely that your actual data will behave significantly differently.

Contributor
Posts: 34

Re: PROC SQL - Running sum/total for a range

my actual data has 36month for same customer, and all I want this until 24month, so we will not run out of data to sum for next 12month.

Super User
Posts: 2,049

Re: PROC SQL - Running sum/total for a range

data have;
input Name $	Sex $	Age	Height	Weight	Time_month;
cards;
Alfred	M	14	69	112.5	0
Alice	F	13	56.5	84	1
Barbara	F	13	65.3	98	2
Carol	F	14	62.8	102.5	3
Henry	M	14	63.5	102.5	4
James	M	12	57.3	83	5
Jane	F	12	59.8	84.5	6
Janet	F	15	62.5	112.5	7
Jeffrey	M	13	62.5	84	8
John	M	12	59	99.5	9
Joyce	F	11	51.3	50.5	10
Judy	F	14	64.3	90	11
Louise	F	12	56.3	77	12
Mary	F	15	66.5	112	13
Philip	M	16	72	150	14
Robert	M	12	64.8	128	15
Ronald	M	15	67	133	16
Thomas	M	11	57.5	85	17
William	M	15	66.5	112	18
Louise	F	12	56.3	77	19
Mary	F	15	66.5	112	20
Philip	M	16	72	150	21
Robert	M	12	64.8	128	22
Ronald	M	15	67	133	23
Alfred	M	14	69	112.5	24
Alice	F	13	56.5	84	25
Barbara	F	13	65.3	98	26
Carol	F	14	62.8	102.5	27
Henry	M	14	63.5	102.5	28
James	M	12	57.3	83	29
Jane	F	12	59.8	84.5	30
Janet	F	15	62.5	112.5	31
Jeffrey	M	13	62.5	84	32
John	M	12	59	99.5	33
Joyce	F	11	51.3	50.5	34
Alfred	M	14	69	112.5	35
Alice	F	13	56.5	84	36
;
proc sql;
create table want(drop=t Time_month_)  as
select *, sum(weight) as want
from
(select a.*,b.Time_month as Time_month_,b.Time_month<=a.Time_month<=b.Time_month+11 as t
from have a, have b
group by b.Time_month,a.Time_month
having  t=1)
group by Time_month_,t
having time_month=min(time_month);
quit;
☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 31 replies
  • 178 views
  • 12 likes
  • 10 in conversation