I have a dataset with values by date and county. I want to look at rolling averages over time and compare them by county, but I think I need to back fill-in the dates between a specific range of values first. I want to all the dates from March 1 to March 6. If the date isn't currently there then the values should be set to 0.
Can anyone help me find a way to do that?
data have; input county $ date :mmddyy10. value1 value2 total; format date mmddyy10.; datalines; CountyA 03/02/2020 2 0 2 CountyA 03/03/2020 3 1 4 CountyA 03/05/2020 5 2 7 CountyB 03/02/2020 1 0 1 CountyB 03/04/2020 4 1 5 CountyB 03/06/2020 2 1 3 ; run; data want; input county $ date :mmddyy10. value1 value2 total; format date mmddyy10.; datalines; CountyA 03/01/2020 0 0 0 CountyA 03/02/2020 2 0 2 CountyA 03/03/2020 3 1 4 CountyA 03/04/2020 0 0 0 CountyA 03/05/2020 5 2 7 CountyA 03/06/2020 0 0 0 CountyB 03/01/2020 0 0 0 CountyB 03/02/2020 1 0 1 CountyB 03/03/2020 0 0 0 CountyB 03/04/2020 4 1 5 CountyB 03/05/2020 0 0 0 CountyB 03/06/2020 2 1 3 ; run;
data have;
input county $ date :mmddyy10. value1 value2 total;
format date mmddyy10.;
datalines;
CountyA 03/02/2020 2 0 2
CountyA 03/03/2020 3 1 4
CountyA 03/05/2020 5 2 7
CountyB 03/02/2020 1 0 1
CountyB 03/04/2020 4 1 5
CountyB 03/06/2020 2 1 3
;
run;
%let start_date=01mar2020;
%let end_date=06mar2020;
data want;
if _n_=1 then do;
dcl hash H () ;
h.definekey ("date") ;
h.definedata ("value1", "value2", "total") ;
h.definedone () ;
end;
do until(last.county);
set have;
by county;
h.add();
end;
do date="&start_date"d to "&end_date"d;
value1=0;value2=0;total=0;
_n_=h.find();
output;
end;
h.clear();
run;
county | date | value1 | value2 | total |
---|---|---|---|---|
CountyA | 03/01/2020 | 0 | 0 | 0 |
CountyA | 03/02/2020 | 2 | 0 | 2 |
CountyA | 03/03/2020 | 3 | 1 | 4 |
CountyA | 03/04/2020 | 0 | 0 | 0 |
CountyA | 03/05/2020 | 5 | 2 | 7 |
CountyA | 03/06/2020 | 0 | 0 | 0 |
CountyB | 03/01/2020 | 0 | 0 | 0 |
CountyB | 03/02/2020 | 1 | 0 | 1 |
CountyB | 03/03/2020 | 0 | 0 | 0 |
CountyB | 03/04/2020 | 4 | 1 | 5 |
CountyB | 03/05/2020 | 0 | 0 | 0 |
CountyB | 03/06/2020 | 2 | 1 | 3 |
data have;
input county $ date :mmddyy10. value1 value2 total;
format date mmddyy10.;
datalines;
CountyA 03/02/2020 2 0 2
CountyA 03/03/2020 3 1 4
CountyA 03/05/2020 5 2 7
CountyB 03/02/2020 1 0 1
CountyB 03/04/2020 4 1 5
CountyB 03/06/2020 2 1 3
;
run;
%let start_date=01mar2020;
%let end_date=06mar2020;
data want;
if _n_=1 then do;
dcl hash H () ;
h.definekey ("date") ;
h.definedata ("value1", "value2", "total") ;
h.definedone () ;
end;
do until(last.county);
set have;
by county;
h.add();
end;
do date="&start_date"d to "&end_date"d;
value1=0;value2=0;total=0;
_n_=h.find();
output;
end;
h.clear();
run;
county | date | value1 | value2 | total |
---|---|---|---|---|
CountyA | 03/01/2020 | 0 | 0 | 0 |
CountyA | 03/02/2020 | 2 | 0 | 2 |
CountyA | 03/03/2020 | 3 | 1 | 4 |
CountyA | 03/04/2020 | 0 | 0 | 0 |
CountyA | 03/05/2020 | 5 | 2 | 7 |
CountyA | 03/06/2020 | 0 | 0 | 0 |
CountyB | 03/01/2020 | 0 | 0 | 0 |
CountyB | 03/02/2020 | 1 | 0 | 1 |
CountyB | 03/03/2020 | 0 | 0 | 0 |
CountyB | 03/04/2020 | 4 | 1 | 5 |
CountyB | 03/05/2020 | 0 | 0 | 0 |
CountyB | 03/06/2020 | 2 | 1 | 3 |
try next code:
data have;
input county $ date :mmddyy10. value1 value2 total;
format date mmddyy10.;
datalines;
CountyA 03/02/2020 2 0 2
CountyA 03/03/2020 3 1 4
CountyA 03/05/2020 5 2 7
CountyB 03/02/2020 1 0 1
CountyB 03/04/2020 4 1 5
CountyB 03/06/2020 2 1 3
;
run;
proc sort data=have; by county date; run;
data want;
set have;
by county;
array dayx d1-d6;
if first.county then do 1 to 6;
dayx(i) = .;
end;
if not last.county then do;
if date < '01jun2020'd or
date > '06jun2020'd then output;
else dayx(day(date)) = 1;
end;
if last.county then do;
if '01jun2020'd le date le '06jun2020'd
then dayx(day(date)) = 1;
output;
do i=1 to 6;
if dayx(i) = 0 then do;
valu1=0; valu2=0; value3=0;
date = mdy(6,i,2020);
output;
end;
end;
end;
run;
proc sort data=have; by county date; run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.