I have a dataset with values by date and county. I want to look at rolling averages over time and compare them by county, but I think I need to back fill-in the dates between a specific range of values first. I want to all the dates from March 1 to March 6. If the date isn't currently there then the values should be set to 0.
Can anyone help me find a way to do that?
data have; input county $ date :mmddyy10. value1 value2 total; format date mmddyy10.; datalines; CountyA 03/02/2020 2 0 2 CountyA 03/03/2020 3 1 4 CountyA 03/05/2020 5 2 7 CountyB 03/02/2020 1 0 1 CountyB 03/04/2020 4 1 5 CountyB 03/06/2020 2 1 3 ; run; data want; input county $ date :mmddyy10. value1 value2 total; format date mmddyy10.; datalines; CountyA 03/01/2020 0 0 0 CountyA 03/02/2020 2 0 2 CountyA 03/03/2020 3 1 4 CountyA 03/04/2020 0 0 0 CountyA 03/05/2020 5 2 7 CountyA 03/06/2020 0 0 0 CountyB 03/01/2020 0 0 0 CountyB 03/02/2020 1 0 1 CountyB 03/03/2020 0 0 0 CountyB 03/04/2020 4 1 5 CountyB 03/05/2020 0 0 0 CountyB 03/06/2020 2 1 3 ; run;
data have;
input county $ date :mmddyy10. value1 value2 total;
format date mmddyy10.;
datalines;
CountyA 03/02/2020 2 0 2
CountyA 03/03/2020 3 1 4
CountyA 03/05/2020 5 2 7
CountyB 03/02/2020 1 0 1
CountyB 03/04/2020 4 1 5
CountyB 03/06/2020 2 1 3
;
run;
%let start_date=01mar2020;
%let end_date=06mar2020;
data want;
if _n_=1 then do;
dcl hash H () ;
h.definekey ("date") ;
h.definedata ("value1", "value2", "total") ;
h.definedone () ;
end;
do until(last.county);
set have;
by county;
h.add();
end;
do date="&start_date"d to "&end_date"d;
value1=0;value2=0;total=0;
_n_=h.find();
output;
end;
h.clear();
run;
county | date | value1 | value2 | total |
---|---|---|---|---|
CountyA | 03/01/2020 | 0 | 0 | 0 |
CountyA | 03/02/2020 | 2 | 0 | 2 |
CountyA | 03/03/2020 | 3 | 1 | 4 |
CountyA | 03/04/2020 | 0 | 0 | 0 |
CountyA | 03/05/2020 | 5 | 2 | 7 |
CountyA | 03/06/2020 | 0 | 0 | 0 |
CountyB | 03/01/2020 | 0 | 0 | 0 |
CountyB | 03/02/2020 | 1 | 0 | 1 |
CountyB | 03/03/2020 | 0 | 0 | 0 |
CountyB | 03/04/2020 | 4 | 1 | 5 |
CountyB | 03/05/2020 | 0 | 0 | 0 |
CountyB | 03/06/2020 | 2 | 1 | 3 |
data have;
input county $ date :mmddyy10. value1 value2 total;
format date mmddyy10.;
datalines;
CountyA 03/02/2020 2 0 2
CountyA 03/03/2020 3 1 4
CountyA 03/05/2020 5 2 7
CountyB 03/02/2020 1 0 1
CountyB 03/04/2020 4 1 5
CountyB 03/06/2020 2 1 3
;
run;
%let start_date=01mar2020;
%let end_date=06mar2020;
data want;
if _n_=1 then do;
dcl hash H () ;
h.definekey ("date") ;
h.definedata ("value1", "value2", "total") ;
h.definedone () ;
end;
do until(last.county);
set have;
by county;
h.add();
end;
do date="&start_date"d to "&end_date"d;
value1=0;value2=0;total=0;
_n_=h.find();
output;
end;
h.clear();
run;
county | date | value1 | value2 | total |
---|---|---|---|---|
CountyA | 03/01/2020 | 0 | 0 | 0 |
CountyA | 03/02/2020 | 2 | 0 | 2 |
CountyA | 03/03/2020 | 3 | 1 | 4 |
CountyA | 03/04/2020 | 0 | 0 | 0 |
CountyA | 03/05/2020 | 5 | 2 | 7 |
CountyA | 03/06/2020 | 0 | 0 | 0 |
CountyB | 03/01/2020 | 0 | 0 | 0 |
CountyB | 03/02/2020 | 1 | 0 | 1 |
CountyB | 03/03/2020 | 0 | 0 | 0 |
CountyB | 03/04/2020 | 4 | 1 | 5 |
CountyB | 03/05/2020 | 0 | 0 | 0 |
CountyB | 03/06/2020 | 2 | 1 | 3 |
try next code:
data have;
input county $ date :mmddyy10. value1 value2 total;
format date mmddyy10.;
datalines;
CountyA 03/02/2020 2 0 2
CountyA 03/03/2020 3 1 4
CountyA 03/05/2020 5 2 7
CountyB 03/02/2020 1 0 1
CountyB 03/04/2020 4 1 5
CountyB 03/06/2020 2 1 3
;
run;
proc sort data=have; by county date; run;
data want;
set have;
by county;
array dayx d1-d6;
if first.county then do 1 to 6;
dayx(i) = .;
end;
if not last.county then do;
if date < '01jun2020'd or
date > '06jun2020'd then output;
else dayx(day(date)) = 1;
end;
if last.county then do;
if '01jun2020'd le date le '06jun2020'd
then dayx(day(date)) = 1;
output;
do i=1 to 6;
if dayx(i) = 0 then do;
valu1=0; valu2=0; value3=0;
date = mdy(6,i,2020);
output;
end;
end;
end;
run;
proc sort data=have; by county date; run;
Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.