- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I am runnnig query in PROC SQL and need to resolved below,
I want - Sum of weight for next 12month
(for time=0 need sum of weight from time 0 to 11, then
for time=1 need sum of weight from time 1 to 12)
so on.. util time=24.
The tricky part is, I need this in PROC SQL, so I believe it has to be SQL query. Please find attached sample file and column of what I want.
Thank you
Ranny
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Slight improvement to my previous. Yet still violates your rules by a mile.
data have;
input Name $ Sex $ Age Height Weight Time_month;
cards;
Alfred M 14 69 112.5 0
Alice F 13 56.5 84 1
Barbara F 13 65.3 98 2
Carol F 14 62.8 102.5 3
Henry M 14 63.5 102.5 4
James M 12 57.3 83 5
Jane F 12 59.8 84.5 6
Janet F 15 62.5 112.5 7
Jeffrey M 13 62.5 84 8
John M 12 59 99.5 9
Joyce F 11 51.3 50.5 10
Judy F 14 64.3 90 11
Louise F 12 56.3 77 12
Mary F 15 66.5 112 13
Philip M 16 72 150 14
Robert M 12 64.8 128 15
Ronald M 15 67 133 16
Thomas M 11 57.5 85 17
William M 15 66.5 112 18
Louise F 12 56.3 77 19
Mary F 15 66.5 112 20
Philip M 16 72 150 21
Robert M 12 64.8 128 22
Ronald M 15 67 133 23
Alfred M 14 69 112.5 24
Alice F 13 56.5 84 25
Barbara F 13 65.3 98 26
Carol F 14 62.8 102.5 27
Henry M 14 63.5 102.5 28
James M 12 57.3 83 29
Jane F 12 59.8 84.5 30
Janet F 15 62.5 112.5 31
Jeffrey M 13 62.5 84 32
John M 12 59 99.5 33
Joyce F 11 51.3 50.5 34
Alfred M 14 69 112.5 35
Alice F 13 56.5 84 36
;
proc sql;
create table want(drop=t Time_month_) as
select a.*,b.Time_month as Time_month_,b.Time_month<=a.Time_month<=b.Time_month+11 as t,sum(a.weight)*(calculated t=1) as want
from have a, have b
group by b.Time_month,t
having time_month=min(time_month) and t=1
order by b.Time_month,a.time_month;
quit;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
@Ranny wrote:
The tricky part is, I need this in PROC SQL, so I believe it has to be SQL query. Please find attached sample file and column of what I want.
Yes, very tricky indeed, and if it is possible, you'd have to write a lot of code to get it to work. But off the top of my head, I don't think it is possible. So, don't do this in PROC SQL.
Don't do this in PROC SQL.
Don't do this in PROC SQL.
It would be rather simple in PROC EXPAND.
Paige Miller
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Thank you for your input, but my process is build in sql and just to change this part in sas is going to be nightmare at this moment as it is impacting lot more on this calculation. so badly needed this in sql.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Don't do this in PROC SQL.
If it has to be built in PROC SQL, then I think you have the wrong process, but anyway, I can't help you further.
Paige Miller
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Thank you
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Sorry, this does not make sense:
"but my process is build in sql and just to change this part in sas is going to be nightmare at this moment as it is impacting lot more on this calculation". SAS is the programming language, proc sql is just one procedure of many in the language. You can use Base SAS datasteps and procedures, alongside proc sql steps without any problems. Therefore your constraints are your only issue here and not something we can fix for you. Running sums and totals can be done quite easily in a number of methods using datasteps - you can find plenty of examples on here and in papers written on the subject.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
The task defines the tool, not the other way 'round. aka Maxim 14.
If you don't have the necessary license for proc expand, you can do it in a data step.
Post your source dataset (in a data step with datalines), so we have something to play with.
Note that spreadsheets are not useful. Many organizations block the download of Office files from the net, and spreadsheets cannot convey important dataset attributes (variable types, lengths, formats etc).
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Thank you all for your inputs,
here is the data step raw data - code
data input;
input Name$ Sex$ Age Height Weight month;
cards;
Alfred M 14 69 112.5 0
Alice F 13 56.5 84 1
Barbara F 13 65.3 98 2
Carol F 14 62.8 102.5 3
Henry M 14 63.5 102.5 4
James M 12 57.3 83 5
Jane F 12 59.8 84.5 6
Janet F 15 62.5 112.5 7
Jeffrey M 13 62.5 84 8
John M 12 59 99.5 9
Joyce F 11 51.3 50.5 10
Judy F 14 64.3 90 11
Louise F 12 56.3 77 12
Mary F 15 66.5 112 13
Philip M 16 72 150 14
Robert M 12 64.8 128 15
Ronald M 15 67 133 16
Thomas M 11 57.5 85 17
William M 15 66.5 112 18
Louise F 12 56.3 77 19
Mary F 15 66.5 112 20
Philip M 16 72 150 21
Robert M 12 64.8 128 22
Ronald M 15 67 133 23
Alfred M 14 69 112.5 24
Alice F 13 56.5 84 25
Barbara F 13 65.3 98 26
Carol F 14 62.8 102.5 27
Henry M 14 63.5 102.5 28
James M 12 57.3 83 29
Jane F 12 59.8 84.5 30
Janet F 15 62.5 112.5 31
Jeffrey M 13 62.5 84 32
John M 12 59 99.5 33
Joyce F 11 51.3 50.5 34
Alfred M 14 69 112.5 35
Alice F 13 56.5 84 36
run;
Output - looking for
Data | Want | ||||||
Name | Sex | Age | Height | Weight | Time - month | want - Sum of weight for 12month (for time=0 need sum of weight from time 0 to 11, then for time=1 need sum of weight from time 1 to 12) |
|
Alfred | M | 14 | 69 | 112.5 | 0 | 1103.5 | |
Alice | F | 13 | 56.5 | 84 | 1 | 1068 | |
Barbara | F | 13 | 65.3 | 98 | 2 | 1096 | |
Carol | F | 14 | 62.8 | 102.5 | 3 | 1148 | |
Henry | M | 14 | 63.5 | 102.5 | 4 | 1173.5 | |
James | M | 12 | 57.3 | 83 | 5 | 1204 | |
Jane | F | 12 | 59.8 | 84.5 | 6 | 1206 | |
Janet | F | 15 | 62.5 | 112.5 | 7 | 1233.5 | |
Jeffrey | M | 13 | 62.5 | 84 | 8 | 1198 | |
John | M | 12 | 59 | 99.5 | 9 | 1226 | |
Joyce | F | 11 | 51.3 | 50.5 | 10 | 1276.5 | |
Judy | F | 14 | 64.3 | 90 | 11 | 1354 | |
Louise | F | 12 | 56.3 | 77 | 12 | 1397 | |
Mary | F | 15 | 66.5 | 112 | 13 | 1432.5 | |
Philip | M | 16 | 72 | 150 | 14 | 1404.5 | |
Robert | M | 12 | 64.8 | 128 | 15 | 1352.5 | |
Ronald | M | 15 | 67 | 133 | 16 | 1327 | |
Thomas | M | 11 | 57.5 | 85 | 17 | 1296.5 | |
William | M | 15 | 66.5 | 112 | 18 | 1294.5 | |
Louise | F | 12 | 56.3 | 77 | 19 | 1267 | |
Mary | F | 15 | 66.5 | 112 | 20 | 1302.5 | |
Philip | M | 16 | 72 | 150 | 21 | 1274.5 | |
Robert | M | 12 | 64.8 | 128 | 22 | 1224 | |
Ronald | M | 15 | 67 | 133 | 23 | 1146.5 | |
Alfred | M | 14 | 69 | 112.5 | 24 | N/A | |
Alice | F | 13 | 56.5 | 84 | 25 | N/A | |
Barbara | F | 13 | 65.3 | 98 | 26 | N/A | |
Carol | F | 14 | 62.8 | 102.5 | 27 | N/A | |
Henry | M | 14 | 63.5 | 102.5 | 28 | N/A | |
James | M | 12 | 57.3 | 83 | 29 | N/A | |
Jane | F | 12 | 59.8 | 84.5 | 30 | N/A | |
Janet | F | 15 | 62.5 | 112.5 | 31 | N/A | |
Jeffrey | M | 13 | 62.5 | 84 | 32 | N/A | |
John | M | 12 | 59 | 99.5 | 33 | N/A | |
Joyce | F | 11 | 51.3 | 50.5 | 34 | N/A | |
Alfred | M | 14 | 69 | 112.5 | 35 | N/A | |
Alice | F | 13 | 56.5 | 84 | 36 | N/A |
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Try this:
proc sql;
create table want as
select c.*, sumwgt12
from input c left join
(select a.month, sum(b.weight) as sumwgt12
from input a left join input b
on a.month<=b.month<a.month+12
where a.month<24
group by 1
having count(b.month)=12) d
on c.month=d.month;
quit;
(Edit: simplified GROUP BY clause and corresponding SELECT variables.)
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Your data should have single name will be ther for 36 months as per your description and you need to total within that name. So I did an update and then run the query.
Proc Sql;update inputset name ='Alfred';
proc sql;
select name,
month,
weight,
case when month ge 23 then . else tot end as tot
from
(select name, month,weight,(select sum(weight)
from input a
where a.name = b.name
and a.month ge b.month
and a.month between b.month and b.month+11group by name) as tot from input b);
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
This is giving me an error
ERROR: Column month could not be found in the table/view identified with the correlation name A.
ERROR: Unresolved reference to table/correlation name a.
ERROR: Column month could not be found in the table/view identified with the correlation name B.
ERROR: Column month could not be found in the table/view identified with the correlation name B.
ERROR: Column month could not be found in the table/view identified with the correlation name A.
ERROR: Unresolved reference to table/correlation name a.
ERROR: Column month could not be found in the table/view identified with the correlation name B.
ERROR: Column month could not be found in the table/view identified with the correlation name B.
ERROR: Column month could not be found in the table/view identified with the correlation name B.
ERROR: Column month could not be found in the table/view identified with the correlation name B.
ERROR: The following columns were not found in the contributing tables: month.
NOTE: PROC SQL set option NOEXEC and will continue to check the syntax of statements.
25 quit;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I used the same data you gave
data input;
input Name$ Sex$ Age Height Weight month;
cards;
Alfred M 14 69 112.5 0
Alice F 13 56.5 84 1
Barbara F 13 65.3 98 2
Carol F 14 62.8 102.5 3
Henry M 14 63.5 102.5 4
James M 12 57.3 83 5
Jane F 12 59.8 84.5 6
Janet F 15 62.5 112.5 7
Jeffrey M 13 62.5 84 8
John M 12 59 99.5 9
Joyce F 11 51.3 50.5 10
Judy F 14 64.3 90 11
Louise F 12 56.3 77 12
Mary F 15 66.5 112 13
Philip M 16 72 150 14
Robert M 12 64.8 128 15
Ronald M 15 67 133 16
Thomas M 11 57.5 85 17
William M 15 66.5 112 18
Louise F 12 56.3 77 19
Mary F 15 66.5 112 20
Philip M 16 72 150 21
Robert M 12 64.8 128 22
Ronald M 15 67 133 23
Alfred M 14 69 112.5 24
Alice F 13 56.5 84 25
Barbara F 13 65.3 98 26
Carol F 14 62.8 102.5 27
Henry M 14 63.5 102.5 28
James M 12 57.3 83 29
Jane F 12 59.8 84.5 30
Janet F 15 62.5 112.5 31
Jeffrey M 13 62.5 84 32
John M 12 59 99.5 33
Joyce F 11 51.3 50.5 34
Alfred M 14 69 112.5 35
Alice F 13 56.5 84 36
run;
proc sql;
update input
set name = 'Alfred' ;
proc sql;
select name, month, weight, case when month ge 23 then .
else tot end as tot
from
(select name, month,weight,
(select sum(weight) from
input a
where a.name = b.name
and a.month ge b.month
and a.month between b.month and b.month+11
group by name) as tot
from input b);
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
@Ranny wrote:
I am runnnig query in PROC SQL and need to resolved below,
I want - Sum of weight for next 12month
(for time=0 need sum of weight from time 0 to 11, then
for time=1 need sum of weight from time 1 to 12)
so on.. util time=24.
The tricky part is, I need this in PROC SQL, so I believe it has to be SQL query. Please find attached sample file and column of what I want.
Thank you
Ranny
One of the reasons you are getting some push-back on Proc Sql is that SQL deals with set operations not "order of data". So it adds a great deal of additional code. Especially when you say "sum for next 12 months". What do you want if there are not 12 months following (or 11 or 10 or any number less than 12)? The code for SQL to identify and implement such is much more complex.
Also if you do not have a variable that contains actual dates, SAS numeric with an appropriate date format, then "next 12" can be extremely to identify in some cases.
Since the example data you provided was basically the SASHELP.Class data set with a sequence identifier of 1 less than record position it is fairly likely that your actual data will behave significantly differently.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
my actual data has 36month for same customer, and all I want this until 24month, so we will not run out of data to sum for next 12month.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
data have;
input Name $ Sex $ Age Height Weight Time_month;
cards;
Alfred M 14 69 112.5 0
Alice F 13 56.5 84 1
Barbara F 13 65.3 98 2
Carol F 14 62.8 102.5 3
Henry M 14 63.5 102.5 4
James M 12 57.3 83 5
Jane F 12 59.8 84.5 6
Janet F 15 62.5 112.5 7
Jeffrey M 13 62.5 84 8
John M 12 59 99.5 9
Joyce F 11 51.3 50.5 10
Judy F 14 64.3 90 11
Louise F 12 56.3 77 12
Mary F 15 66.5 112 13
Philip M 16 72 150 14
Robert M 12 64.8 128 15
Ronald M 15 67 133 16
Thomas M 11 57.5 85 17
William M 15 66.5 112 18
Louise F 12 56.3 77 19
Mary F 15 66.5 112 20
Philip M 16 72 150 21
Robert M 12 64.8 128 22
Ronald M 15 67 133 23
Alfred M 14 69 112.5 24
Alice F 13 56.5 84 25
Barbara F 13 65.3 98 26
Carol F 14 62.8 102.5 27
Henry M 14 63.5 102.5 28
James M 12 57.3 83 29
Jane F 12 59.8 84.5 30
Janet F 15 62.5 112.5 31
Jeffrey M 13 62.5 84 32
John M 12 59 99.5 33
Joyce F 11 51.3 50.5 34
Alfred M 14 69 112.5 35
Alice F 13 56.5 84 36
;
proc sql;
create table want(drop=t Time_month_) as
select *, sum(weight) as want
from
(select a.*,b.Time_month as Time_month_,b.Time_month<=a.Time_month<=b.Time_month+11 as t
from have a, have b
group by b.Time_month,a.Time_month
having t=1)
group by Time_month_,t
having time_month=min(time_month);
quit;