I have a data set that has person, day, and events. The data is sparse; there are no observations showing events=0. I want to fill it out with all of the days from 01jan2012 to 31mar2012 for each person with events=0 so that I can eventually use proc summary or a data step to create monthly and global sums by person, where I'll need to report months with zero events.
data have;
input person day : date9. events;
format day date9.;
datalines;
1 03jan2012 1
1 15jan2012 2
1 01mar2012 1
2 02feb2012 3
2 01mar2012 1
;
run;
want:
person day events
1 01jan2012 0
1 02jan2012 0
1 03jan2012 1
1 04jan2012 0
...
1 31mar2012 0
2 01jan2012 0
...
10 31mar2012 0
How can I use a data step to loop through all people (1 to 10) and from 01jan2012 to 31mar2012 and append the missing observations with events=0 ?
Thanks,
pteranodon
This is by person/date but you could do the same by date only.
data dates;
do person=1 to 10;
do date='01Jan2012'd to '31Mar2012'd by 1;
event=1; output;
end;
run;
proc sql;
create table want as
select people, date, coalesce(b.event, a.event) as event
from dates a
left join have b
on a.person=b.person and a.date=b.date;
quit;
I suggest creating your empty data set, Jan1-mar31 with 0 events and then merge in your actual data by date, allowing your actual data to overwrite the events.
This is by person/date but you could do the same by date only.
data dates;
do person=1 to 10;
do date='01Jan2012'd to '31Mar2012'd by 1;
event=1; output;
end;
run;
proc sql;
create table want as
select people, date, coalesce(b.event, a.event) as event
from dates a
left join have b
on a.person=b.person and a.date=b.date;
quit;
It looks like Coalesce on (empty set) Left Join (sparse data) is what I'm looking for.
The solution Reeza suggested will work if you know you have persons in sequential order of 1 through 10. If you don't, here's a more flexible solution that gets the list of persons from the have dataset.
data have;
input person day : date9. events;
format day date9.;
datalines;
1 03jan2012 1
1 15jan2012 2
1 01mar2012 1
2 02feb2012 3
2 01mar2012 1
23 28jan2012 4
;
run;
proc sort data=have (keep=person) out=persons nodup;
by person;
run;
data person_days;
keep person day;
set persons;
format day date9.;
do day = '01jan2012'd to '31mar2012'd;
output;
end;
run;
proc sql;
create table want as
select a.person, a.day, coalesce (b.events, 0) as events
from person_days a
left join have b on a.person = b.person
and a.day = b.day
;
quit;
Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.
Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.