I'm trying find a sum for var1 and separate sum for var2 for each row based on the sas date column labelled dt and two days prior (which is in the sum_to_date) within each ID.
For example,
ID232445
line 5 sum(var1) = 0 sum(var2) =1
line 6 sum(var1) = 0 sum(var2) =2
line 7 sum(var1) = 0 sum(var2) =3
line 8 sum(var1) = 0 sum(var2) =3
line 9 sum(var1) = 0 sum(var2) =4
line 10 sum(var1) = 0 sum(var2) =3
line 11 sum(var1) = 0 sum(var2) =1
line 12 sum(var1) =0 sum(var2)=2
continue summing until reaching ID 112344, then begin the summing again.
Can you post this example data as a CARDS data step, for the data you HAVE, along with the data you WANT to create, and also the code you have tried? Are you familiar with BY-group processing? Seeing the code you have tried will help others help you.
I don't really understand how DT is involved in your logic. Seeing the data you HAVE as input and the data you WANT as output will help.
Quentin,
Thank you! The date that I have as input is in the picture of the spreadsheet. I will create what I want in another spreadsheet.
I'm new to this forum. Is this what you mean by 'posting data as text directly into forum'?
HAVE | ||||||
ID | var1 | var2 | dt | sum_to_date | ||
232445 | 0 | 1 | 23145 | 23143 | ||
232445 | 0 | 0 | 23145 | 23143 | ||
232445 | 0 | 1 | 23150 | 23148 | ||
232445 | 0 | 0 | 23151 | 23149 | ||
232445 | 0 | 1 | 23151 | 23149 | ||
232445 | 0 | 1 | 23151 | 23149 | ||
232445 | 0 | 0 | 23152 | 23150 | ||
232445 | 0 | 1 | 23152 | 23150 | ||
232445 | 0 | 0 | 23153 | 23151 | ||
232445 | 0 | 1 | 23157 | 23155 | ||
232445 | 0 | 1 | 23157 | 23155 | ||
232445 | 0 | 0 | 23158 | 23156 | ||
232445 | 0 | 0 | 23159 | 23157 | ||
232445 | 0 | 0 | 23159 | 23157 | ||
232445 | 0 | 1 | 23159 | 23157 | ||
232445 | 0 | 0 | 23160 | 23158 | ||
112344 | 0 | 0 | 23135 | 23133 | ||
112344 | 0 | 0 | 23135 | 23133 | ||
112344 | 0 | 1 | 23135 | 23133 | ||
112344 | 0 | 1 | 23136 | 23134 | ||
112344 | 0 | 0 | 23136 | 23134 | ||
112344 | 0 | 1 | 23137 | 23135 | ||
112344 | 0 | 1 | 23137 | 23135 | ||
WANT | ||||||
ID | var1 | var2 | dt | sum_to_date | sum(var1) | sum(var2) |
232445 | 0 | 1 | 23145 | 23143 | ||
232445 | 0 | 0 | 23145 | 23143 | ||
232445 | 0 | 1 | 23150 | 23148 | 0 | 1 |
232445 | 0 | 0 | 23151 | 23149 | 0 | 1 |
232445 | 0 | 1 | 23151 | 23149 | 0 | 2 |
232445 | 0 | 1 | 23151 | 23149 | 0 | 3 |
232445 | 0 | 0 | 23152 | 23150 | 0 | 3 |
232445 | 0 | 1 | 23152 | 23150 | 0 | 4 |
232445 | 0 | 0 | 23153 | 23151 | 0 | 3 |
232445 | 0 | 1 | 23157 | 23155 | 0 | 1 |
232445 | 0 | 1 | 23157 | 23155 | 0 | 2 |
232445 | 0 | 0 | 23158 | 23156 | 0 | 2 |
232445 | 0 | 0 | 23159 | 23157 | 0 | 2 |
232445 | 0 | 0 | 23159 | 23157 | 0 | 2 |
232445 | 0 | 1 | 23159 | 23157 | 0 | 3 |
232445 | 0 | 0 | 23160 | 23158 | 0 | 1 |
112344 | 0 | 0 | 23135 | 23133 | 0 | 0 |
112344 | 0 | 0 | 23135 | 23133 | 0 | 0 |
112344 | 0 | 1 | 23135 | 23133 | 0 | 1 |
112344 | 0 | 1 | 23136 | 23134 | 0 | 2 |
112344 | 0 | 0 | 23136 | 23134 | 0 | 2 |
112344 | 0 | 1 | 23137 | 23135 | 0 | 3 |
112344 | 0 | 1 | 23137 | 23135 | 0 | 4 |
You have duplicate rows with nothing that uniquely identifies a row making this a bit messier.
This seems to roughly work. Not sure of your logic though as you start the answers at 5 but line 4 should also have an answer as well? Are the end of the intervals included?
data have;
infile cards truncover;
input ID $ var1 var2 dt sum_to_date;
format dt sum_to_date date9.;
cards;
232445 0 1 23145 23143
232445 0 0 23145 23143
232445 0 1 23150 23148
232445 0 0 23151 23149
232445 0 1 23151 23149
232445 0 1 23151 23149
232445 0 0 23152 23150
232445 0 1 23152 23150
232445 0 0 23153 23151
232445 0 1 23157 23155
232445 0 1 23157 23155
232445 0 0 23158 23156
232445 0 0 23159 23157
232445 0 0 23159 23157
232445 0 1 23159 23157
232445 0 0 23160 23158
112344 0 0 23135 23133
112344 0 0 23135 23133
112344 0 1 23135 23133
112344 0 1 23136 23134
112344 0 0 23136 23134
112344 0 1 23137 23135
112344 0 1 23137 23135
;
run;
data have;
set have;
obs= _n_;
run;
proc sql;
create table want as
select distinct t1.*, sum(t2.var1) as sum_var1, sum(t2.var2) as sum_var2
from have as t1
left join have as t2
on t1.id=t2.id and t2.dt between t1.sum_to_date and t1.dt-1
group by t1.obs
order by t1.obs;
quit;
@mfowlk wrote:
I'm trying find a sum for var1 and separate sum for var2 for each row based on the sas date column labelled dt and two days prior (which is in the sum_to_date) within each ID.
For example,
ID232445
line 5 sum(var1) = 0 sum(var2) =1
line 6 sum(var1) = 0 sum(var2) =2
line 7 sum(var1) = 0 sum(var2) =3
line 8 sum(var1) = 0 sum(var2) =3
line 9 sum(var1) = 0 sum(var2) =4
line 10 sum(var1) = 0 sum(var2) =3
line 11 sum(var1) = 0 sum(var2) =1
line 12 sum(var1) =0 sum(var2)=2
continue summing until reaching ID 112344, then begin the summing again.
Thank you. Yes, lines 1-4 should have answers as well, but I only included 1 month and the dates early in the month will not have prior dates to sum. For the ouput, I would like to include all the columns that are listed for each observation (ID, var1, var2, dt, sum_to_date) as well as the summation of var1 and var2.
Would the monotonic function help to uniquely identify each row?
Yes, the end of the interval would be include in the sum. So the observation/row would be possibly summing 3 numbers if there is a date that is within 2 days prior to the date on the observation/row.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 16. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.