- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hi. I have a data table where the data for different Names only shows entry rows for dates (MY)
which had a Value.
I would like to get to finding average monthly Values SO I would like a table with a row for each month.
I would like to fill out the table for the Names data such that all Names rows start on the same date
and fills in value=0 for the added rows.
For example: I would like to say start all the date ranges at 11/2017 and
fill in up to the Previous full month (07/2019 since we are in August; Previous will change each month).
Thanks!
HAVE: WANT:
NAME DATE VALUE NAME DATE VALUE
Name1 12/2017 2 Name1 11/2017 0
Name1 07/2018 1 Name1 12/2017 2
Name1 02/2019 3 Name1 01/2019 0
Name2 01/2018 3 Name1 02/2019 0 etc.
Name2 11/2018 4 ......Name1 07/2019 0 end of the Name1 series
Name2 07/2019 1 Name2 11/2017 0
Name3.....etc Name2 12/2017 0
Name2 01/2018 3
...etc.
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
There are more slick ways to do this, but here is one
data have;
input name $ date :anydtdte7. value;
format date mmyys7.;
datalines;
NAME1 12/2017 2
NAME1 07/2018 1
NAME1 02/2019 3
NAME2 01/2018 3
NAME2 11/2018 4
NAME2 07/2019 1
;
proc summary data=have nway;
class name date;
var value;
format date mmyys7.;
output out=agg(drop=_TYPE_ _FREQ_) sum=;
run;
data want(drop=rc);
declare hash h(dataset:'agg');
h.definekey('name', 'date');
h.definedata('value');
h.definedone();
set have;
by name;
if first.name then do;
date='01oct2017'd;
do until (date=intnx('month', today(), -1, 'b'));
value=0;
rc=h.find();
output;
date=intnx('month', date, 1, 'b');
end;
end;
format date mmyys7.;
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Do you have a SAS/ETS license?
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
There are more slick ways to do this, but here is one
data have;
input name $ date :anydtdte7. value;
format date mmyys7.;
datalines;
NAME1 12/2017 2
NAME1 07/2018 1
NAME1 02/2019 3
NAME2 01/2018 3
NAME2 11/2018 4
NAME2 07/2019 1
;
proc summary data=have nway;
class name date;
var value;
format date mmyys7.;
output out=agg(drop=_TYPE_ _FREQ_) sum=;
run;
data want(drop=rc);
declare hash h(dataset:'agg');
h.definekey('name', 'date');
h.definedata('value');
h.definedone();
set have;
by name;
if first.name then do;
date='01oct2017'd;
do until (date=intnx('month', today(), -1, 'b'));
value=0;
rc=h.find();
output;
date=intnx('month', date, 1, 'b');
end;
end;
format date mmyys7.;
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Draycut, thanks. It worked just right.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Using straight "old school" data step means (and a sort):
data have;
input name $ date :anydtdte7. value;
format date mmyys7.;
datalines;
NAME1 12/2017 2
NAME1 07/2018 1
NAME1 02/2019 3
NAME2 01/2018 3
NAME2 11/2018 4
NAME2 07/2019 1
;
%let start='01nov2017'd;
%let end='01jul2019'd;
data want;
set have;
by name;
retain olddate;
if first.name
then do;
output; /* keep current observation */
olddate = date;
value = 0;
date = &start;
do while (date < olddate);
output;
date = intnx('month',date,1);
end;
end;
else do;
output;
intdate = date;
value = 0;
date = intnx('month',olddate,1);
do while (date < intdate);
output;
date = intnx('month',date,1);
end;
olddate = intdate;
end;
if last.name
then do;
value = 0;
date = intnx('month',olddate,1);
do while (date le &end);
output;
date = intnx('month',date,1);
end;
end;
drop olddate intdate;
run;
proc sort data=want;
by name date;
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
If you have SAS/ETS license, then this is a job for PROC TIMESERIES
data have;
input name $ date :anydtdte7. value;
format date mmyys7.;
datalines;
NAME1 12/2017 2
NAME1 07/2018 1
NAME1 02/2019 3
NAME2 01/2018 3
NAME2 11/2018 4
NAME2 07/2019 1
;
proc timeseries data=have out=want;
by name;
id date interval=month
accumulate=total
setmiss=0
start='01nov2017'd
end ='01jul2019'd;
var value;
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
data have;
input name $ date :anydtdte7. value;
format date mmyys7.;
datalines;
NAME1 12/2017 2
NAME1 07/2018 1
NAME1 02/2019 3
NAME2 01/2018 3
NAME2 11/2018 4
NAME2 07/2019 1
;
data want;
if _n_=1 then do;
declare hash h();
h.definekey('date');
h.definedata('value');
h.definedone();
max=intnx('mon',today(),-1,'e');
end;
do until(last.name);
set have;
by name;
rc=h.add();
end;
retain max min '01nov2017'd;
date=min;
do while(date<max);
if h.find() ne 0 then value=0;
output;
date=intnx('mon',date,1);
end;
h.clear();
drop min max rc;
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Thanks, everyone! (I don't have ETS btw).
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
If your goal is "to get to finding average monthly Values", I see no sense in this exercise of expanding the data set and filling in with zeroes. This is because the date ranges you already have are sufficient for computing your averages directly:
data have ;
input name :$5. date :anydtdte7. value ;
format date yymms7. ;
cards ;
NAME1 12/2017 1
NAME1 07/2018 2
NAME1 02/2019 3
NAME2 01/2018 4
NAME2 11/2018 5
NAME2 07/2019 6
;
run ;
data mean (keep = name sum N mean) ;
do until (last.name) ;
set have ;
by name ;
if first.name then _d = date ;
sum = sum (sum, value) ;
end ;
N = 1 + intck ("mon", _d, date) ;
Mean = divide (sum, N) ;
run ;
Doing so will also save you computer time, disk space, and the pain of figuring out how to create the "fill in" data set.
However, if you still want to create it, it can be done on the fly by looking forward to the next date, for examplei:
data fill (drop = _:) ;
set have ;
by name ;
if last.name then output ;
else do ;
p = _n_ + 1 ;
set have (keep=date rename=date=_d) point = p ;
do while (date < _d) ;
output ;
value = 0 ;
date = intnx ("mon", date, 1) ;
end ;
end ;
run ;
For those averse to explicit POINT= look-ahead logic, perhaps even simpler would be:
data v / view = v ;
set have (keep = name date rename=date=_d) ;
by name ;
if not first.name ;
run ;
data fill (drop = _:) ;
merge have v ;
by name ;
if last.name then output ;
else do while (date < _d) ;
output ;
value = 0 ;
date = intnx ("mon", date, 1) ;
end ;
run ;
Kind regards
Paul D.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Thanks for a different approach!
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
data have;
input name $ _date $ value;
date=input(cats('01/',_date),ddmmyy10.);
drop _:;
format date mmyys7.;
datalines;
NAME1 12/2017 2
NAME1 07/2018 1
NAME1 02/2019 3
NAME2 01/2018 3
NAME2 11/2018 4
NAME2 07/2019 1
;
data date;
do date='01nov2017'd to '01jul2019'd;
if month(date) ne month then output;
month=month(date);
end;
drop month;
format date mmyys7.;
run;
proc sql;
create table want as
select a.*,coalesce(b.value,0) as value
from (
select * from
(select distinct name from have),(select date from date)
) as a left join have as b
on a.name=b.name and a.date=b.date;
quit;