- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I'm trying find a sum for var1 and separate sum for var2 for each row based on the sas date column labelled dt and two days prior (which is in the sum_to_date) within each ID.
For example,
ID232445
line 5 sum(var1) = 0 sum(var2) =1
line 6 sum(var1) = 0 sum(var2) =2
line 7 sum(var1) = 0 sum(var2) =3
line 8 sum(var1) = 0 sum(var2) =3
line 9 sum(var1) = 0 sum(var2) =4
line 10 sum(var1) = 0 sum(var2) =3
line 11 sum(var1) = 0 sum(var2) =1
line 12 sum(var1) =0 sum(var2)=2
continue summing until reaching ID 112344, then begin the summing again.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Can you post this example data as a CARDS data step, for the data you HAVE, along with the data you WANT to create, and also the code you have tried? Are you familiar with BY-group processing? Seeing the code you have tried will help others help you.
I don't really understand how DT is involved in your logic. Seeing the data you HAVE as input and the data you WANT as output will help.
Next webinar will be in January 2025. Until then, check out our archives: https://www.basug.org/videos. And be sure to subscribe to our our email list.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Quentin,
Thank you! The date that I have as input is in the picture of the spreadsheet. I will create what I want in another spreadsheet.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I'm new to this forum. Is this what you mean by 'posting data as text directly into forum'?
HAVE | ||||||
ID | var1 | var2 | dt | sum_to_date | ||
232445 | 0 | 1 | 23145 | 23143 | ||
232445 | 0 | 0 | 23145 | 23143 | ||
232445 | 0 | 1 | 23150 | 23148 | ||
232445 | 0 | 0 | 23151 | 23149 | ||
232445 | 0 | 1 | 23151 | 23149 | ||
232445 | 0 | 1 | 23151 | 23149 | ||
232445 | 0 | 0 | 23152 | 23150 | ||
232445 | 0 | 1 | 23152 | 23150 | ||
232445 | 0 | 0 | 23153 | 23151 | ||
232445 | 0 | 1 | 23157 | 23155 | ||
232445 | 0 | 1 | 23157 | 23155 | ||
232445 | 0 | 0 | 23158 | 23156 | ||
232445 | 0 | 0 | 23159 | 23157 | ||
232445 | 0 | 0 | 23159 | 23157 | ||
232445 | 0 | 1 | 23159 | 23157 | ||
232445 | 0 | 0 | 23160 | 23158 | ||
112344 | 0 | 0 | 23135 | 23133 | ||
112344 | 0 | 0 | 23135 | 23133 | ||
112344 | 0 | 1 | 23135 | 23133 | ||
112344 | 0 | 1 | 23136 | 23134 | ||
112344 | 0 | 0 | 23136 | 23134 | ||
112344 | 0 | 1 | 23137 | 23135 | ||
112344 | 0 | 1 | 23137 | 23135 | ||
WANT | ||||||
ID | var1 | var2 | dt | sum_to_date | sum(var1) | sum(var2) |
232445 | 0 | 1 | 23145 | 23143 | ||
232445 | 0 | 0 | 23145 | 23143 | ||
232445 | 0 | 1 | 23150 | 23148 | 0 | 1 |
232445 | 0 | 0 | 23151 | 23149 | 0 | 1 |
232445 | 0 | 1 | 23151 | 23149 | 0 | 2 |
232445 | 0 | 1 | 23151 | 23149 | 0 | 3 |
232445 | 0 | 0 | 23152 | 23150 | 0 | 3 |
232445 | 0 | 1 | 23152 | 23150 | 0 | 4 |
232445 | 0 | 0 | 23153 | 23151 | 0 | 3 |
232445 | 0 | 1 | 23157 | 23155 | 0 | 1 |
232445 | 0 | 1 | 23157 | 23155 | 0 | 2 |
232445 | 0 | 0 | 23158 | 23156 | 0 | 2 |
232445 | 0 | 0 | 23159 | 23157 | 0 | 2 |
232445 | 0 | 0 | 23159 | 23157 | 0 | 2 |
232445 | 0 | 1 | 23159 | 23157 | 0 | 3 |
232445 | 0 | 0 | 23160 | 23158 | 0 | 1 |
112344 | 0 | 0 | 23135 | 23133 | 0 | 0 |
112344 | 0 | 0 | 23135 | 23133 | 0 | 0 |
112344 | 0 | 1 | 23135 | 23133 | 0 | 1 |
112344 | 0 | 1 | 23136 | 23134 | 0 | 2 |
112344 | 0 | 0 | 23136 | 23134 | 0 | 2 |
112344 | 0 | 1 | 23137 | 23135 | 0 | 3 |
112344 | 0 | 1 | 23137 | 23135 | 0 | 4 |
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
You have duplicate rows with nothing that uniquely identifies a row making this a bit messier.
This seems to roughly work. Not sure of your logic though as you start the answers at 5 but line 4 should also have an answer as well? Are the end of the intervals included?
data have;
infile cards truncover;
input ID $ var1 var2 dt sum_to_date;
format dt sum_to_date date9.;
cards;
232445 0 1 23145 23143
232445 0 0 23145 23143
232445 0 1 23150 23148
232445 0 0 23151 23149
232445 0 1 23151 23149
232445 0 1 23151 23149
232445 0 0 23152 23150
232445 0 1 23152 23150
232445 0 0 23153 23151
232445 0 1 23157 23155
232445 0 1 23157 23155
232445 0 0 23158 23156
232445 0 0 23159 23157
232445 0 0 23159 23157
232445 0 1 23159 23157
232445 0 0 23160 23158
112344 0 0 23135 23133
112344 0 0 23135 23133
112344 0 1 23135 23133
112344 0 1 23136 23134
112344 0 0 23136 23134
112344 0 1 23137 23135
112344 0 1 23137 23135
;
run;
data have;
set have;
obs= _n_;
run;
proc sql;
create table want as
select distinct t1.*, sum(t2.var1) as sum_var1, sum(t2.var2) as sum_var2
from have as t1
left join have as t2
on t1.id=t2.id and t2.dt between t1.sum_to_date and t1.dt-1
group by t1.obs
order by t1.obs;
quit;
@mfowlk wrote:
I'm trying find a sum for var1 and separate sum for var2 for each row based on the sas date column labelled dt and two days prior (which is in the sum_to_date) within each ID.
For example,
ID232445
line 5 sum(var1) = 0 sum(var2) =1
line 6 sum(var1) = 0 sum(var2) =2
line 7 sum(var1) = 0 sum(var2) =3
line 8 sum(var1) = 0 sum(var2) =3
line 9 sum(var1) = 0 sum(var2) =4
line 10 sum(var1) = 0 sum(var2) =3
line 11 sum(var1) = 0 sum(var2) =1
line 12 sum(var1) =0 sum(var2)=2
continue summing until reaching ID 112344, then begin the summing again.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Thank you. Yes, lines 1-4 should have answers as well, but I only included 1 month and the dates early in the month will not have prior dates to sum. For the ouput, I would like to include all the columns that are listed for each observation (ID, var1, var2, dt, sum_to_date) as well as the summation of var1 and var2.
Would the monotonic function help to uniquely identify each row?
Yes, the end of the interval would be include in the sum. So the observation/row would be possibly summing 3 numbers if there is a date that is within 2 days prior to the date on the observation/row.