I want create a starting date and ending date by group of variables. Below is an abbreviated example of what I want.
data have;
input Scenario $ Date $;
datalines;
A 20180601
A 20180701
A 20180801
B 20180701
B 20180801
C 20180601
C 20180701
;
run;
data want;
input Scenario $ Date $;
datalines;
A 20180601
A 20180701
A 20180801
B 20180601
B 20180701
B 20180801
C 20180601
C 20180701
C 20180801
;
run;
I want to figure out for each scenario A, B, C how to make the first date 20180601 and the last date 20180801? My real problem has several variables I need to group by.
So you want all groups to have all of the dates that appear in your data?
So if there are three variables that define the groups then something like this will make sure that all groups have all of the dates.
proc sql noprint;
create skeleton as
select *
from (select distinct var1,var2,var3 from have)
, (select distinct date from have)
order by var1,var2,var3,date
;
quit;
data want ;
merge skeleton have;
by var1 var2 var3 date;
run;
If instead you want to have all groups have every month between the min and max date for all groups then you need to first generate that list. This time let's show a method that doesn't use PROC SQL.
proc summary data=have ;
var date ;
output dates min=min_date max=max_date;
run;
data skeleton ;
set have (keep=var1 var2 var3 date);
by var1 var2 var3;
if first.var3;
if _n_=1 then set dates;
do i=0 to intck('month',min_date,max_date);
date=intnx('month',min_date,i);
output;
end;
drop i min_date max_date;
run;
data want;
merge skeleton have;
by var1 var2 var3 date;
run;
First thing is likely not to treat "dates" as character variables.
data have; input Scenario $ Date :yymmdd.; format date date9.; datalines; A 20180601 A 20180701 A 20180801 B 20180701 B 20180801 C 20180601 C 20180701 ; run;
Since you state that you have multiple variables that you need to "group by" then you should provide example data with multiple variables and what the "grouping by" would look like.
I actually don't see any "grouping" going on but adding records to have common values. And if there are other variables involved the approaches need to know what would be going on with other variables' values.
So you want all groups to have all of the dates that appear in your data?
So if there are three variables that define the groups then something like this will make sure that all groups have all of the dates.
proc sql noprint;
create skeleton as
select *
from (select distinct var1,var2,var3 from have)
, (select distinct date from have)
order by var1,var2,var3,date
;
quit;
data want ;
merge skeleton have;
by var1 var2 var3 date;
run;
If instead you want to have all groups have every month between the min and max date for all groups then you need to first generate that list. This time let's show a method that doesn't use PROC SQL.
proc summary data=have ;
var date ;
output dates min=min_date max=max_date;
run;
data skeleton ;
set have (keep=var1 var2 var3 date);
by var1 var2 var3;
if first.var3;
if _n_=1 then set dates;
do i=0 to intck('month',min_date,max_date);
date=intnx('month',min_date,i);
output;
end;
drop i min_date max_date;
run;
data want;
merge skeleton have;
by var1 var2 var3 date;
run;
Brilliant! Thanks!
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.