/*1*/
data temp_actuals;
format day date8.;
do day= 21185 to 21243;
temp=35+rand('uniform')*10;
output;
end;
run;
/*2*/
data dist_periods (drop=counter);
ID=0;
format
period_start_date date8.
run_date date8.
period_end_date date8.;
do period_start_date= 21212 to 21214 by 2;
run_date=period_start_date+14;
period_end_date=period_start_date+29;
if _N_ = 1 then counter =1;
id+counter;
output;
end;
run;
/*3*/
proc sql;
create table comb1 as
select
t2.*,
t1.*
from
temp_actuals t1,
(select * from dist_periods where period_start_date=21212) t2
where
t1.day >= t2.period_start_date and
t1.day <= t2.run_date
;
/*4*/
proc sql;
create table comb2 as
select
t2.*,
t1.*
from
temp_actuals t1,
(select * from dist_periods where period_start_date=21214) t2
where
t1.day >= t2.period_start_date and
t1.day <= t2.run_date
;
/*5*/
data combined;
set comb1 comb2;
run;
See the 5 steps numbered and commented out in the code.
I want to merge temp_actuals
and dist_periods
into combined
in one data step without using a macro (there are a few reasons for avoiding macro that I don't want to get into, please don't provide a solution like that, I'm 99% sure this can be done in a data step).
To resummarize, i want to combine the data in step 1 with the data in step 2 to get the data in step 5.
Although the example has 2 rows, the real dist_periods
file can have any number of rows in the hundreds, so writing out joins for every iteration of dist_periods
is not feasible.
Here's me trying to describe the routine, probably not that helpful:
for each row 1 to n in dist_periods,
join each dist_periods row to every row in temp_actuals that is within the date range in dist_periods
repeat process for row 2 to n of dist_periods
Sorry I can't be more descriptive, but that's why google has failed me, i don't know how to describe it well but the code should do the trick. I'm trying to be a better data step programmer; leaning in proc sql is an easy way to forget Programming II class.
Why not SQL:
proc sql; create table want as select a.*, b.* from dist_periods as a, temp_actuals as b where a.period_start_date le b.day le a.run_date ; quit;
Of course if you have large data that may take some time as it is not optimized in any way.
Why not SQL:
proc sql; create table want as select a.*, b.* from dist_periods as a, temp_actuals as b where a.period_start_date le b.day le a.run_date ; quit;
Of course if you have large data that may take some time as it is not optimized in any way.
Thanks, that works--what a brain fart on my part, I don't know why I assumed that wouldn't work.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.