I have a data set that has person, day, and events. The data is sparse; there are no observations showing events=0. I want to fill it out with all of the days from 01jan2012 to 31mar2012 for each person with events=0 so that I can eventually use proc summary or a data step to create monthly and global sums by person, where I'll need to report months with zero events.
data have;
input person day : date9. events;
format day date9.;
datalines;
1 03jan2012 1
1 15jan2012 2
1 01mar2012 1
2 02feb2012 3
2 01mar2012 1
;
run;
want:
person day events
1 01jan2012 0
1 02jan2012 0
1 03jan2012 1
1 04jan2012 0
...
1 31mar2012 0
2 01jan2012 0
...
10 31mar2012 0
How can I use a data step to loop through all people (1 to 10) and from 01jan2012 to 31mar2012 and append the missing observations with events=0 ?
Thanks,
pteranodon
This is by person/date but you could do the same by date only.
data dates;
do person=1 to 10;
do date='01Jan2012'd to '31Mar2012'd by 1;
event=1; output;
end;
run;
proc sql;
create table want as
select people, date, coalesce(b.event, a.event) as event
from dates a
left join have b
on a.person=b.person and a.date=b.date;
quit;
I suggest creating your empty data set, Jan1-mar31 with 0 events and then merge in your actual data by date, allowing your actual data to overwrite the events.
This is by person/date but you could do the same by date only.
data dates;
do person=1 to 10;
do date='01Jan2012'd to '31Mar2012'd by 1;
event=1; output;
end;
run;
proc sql;
create table want as
select people, date, coalesce(b.event, a.event) as event
from dates a
left join have b
on a.person=b.person and a.date=b.date;
quit;
It looks like Coalesce on (empty set) Left Join (sparse data) is what I'm looking for.
The solution Reeza suggested will work if you know you have persons in sequential order of 1 through 10. If you don't, here's a more flexible solution that gets the list of persons from the have dataset.
data have;
input person day : date9. events;
format day date9.;
datalines;
1 03jan2012 1
1 15jan2012 2
1 01mar2012 1
2 02feb2012 3
2 01mar2012 1
23 28jan2012 4
;
run;
proc sort data=have (keep=person) out=persons nodup;
by person;
run;
data person_days;
keep person day;
set persons;
format day date9.;
do day = '01jan2012'd to '31mar2012'd;
output;
end;
run;
proc sql;
create table want as
select a.person, a.day, coalesce (b.events, 0) as events
from person_days a
left join have b on a.person = b.person
and a.day = b.day
;
quit;
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.