/*1*/
data temp_actuals;
format day date8.;
do day= 21185 to 21243;
temp=35+rand('uniform')*10;
output;
end;
run;
/*2*/
data dist_periods (drop=counter);
ID=0;
format
period_start_date date8.
run_date date8.
period_end_date date8.;
do period_start_date= 21212 to 21214 by 2;
run_date=period_start_date+14;
period_end_date=period_start_date+29;
if _N_ = 1 then counter =1;
id+counter;
output;
end;
run;
/*3*/
proc sql;
create table comb1 as
select
t2.*,
t1.*
from
temp_actuals t1,
(select * from dist_periods where period_start_date=21212) t2
where
t1.day >= t2.period_start_date and
t1.day <= t2.run_date
;
/*4*/
proc sql;
create table comb2 as
select
t2.*,
t1.*
from
temp_actuals t1,
(select * from dist_periods where period_start_date=21214) t2
where
t1.day >= t2.period_start_date and
t1.day <= t2.run_date
;
/*5*/
data combined;
set comb1 comb2;
run;See the 5 steps numbered and commented out in the code.
I want to merge temp_actuals and dist_periods into combined in one data step without using a macro (there are a few reasons for avoiding macro that I don't want to get into, please don't provide a solution like that, I'm 99% sure this can be done in a data step).
To resummarize, i want to combine the data in step 1 with the data in step 2 to get the data in step 5.
Although the example has 2 rows, the real dist_periods file can have any number of rows in the hundreds, so writing out joins for every iteration of dist_periods is not feasible.
Here's me trying to describe the routine, probably not that helpful:
for each row 1 to n in dist_periods,
join each dist_periods row to every row in temp_actuals that is within the date range in dist_periods
repeat process for row 2 to n of dist_periods
Sorry I can't be more descriptive, but that's why google has failed me, i don't know how to describe it well but the code should do the trick. I'm trying to be a better data step programmer; leaning in proc sql is an easy way to forget Programming II class.
Why not SQL:
proc sql;
create table want as
select a.*, b.*
from dist_periods as a, temp_actuals as b
where a.period_start_date le b.day le a.run_date
;
quit;
Of course if you have large data that may take some time as it is not optimized in any way.
Why not SQL:
proc sql;
create table want as
select a.*, b.*
from dist_periods as a, temp_actuals as b
where a.period_start_date le b.day le a.run_date
;
quit;
Of course if you have large data that may take some time as it is not optimized in any way.
Thanks, that works--what a brain fart on my part, I don't know why I assumed that wouldn't work.
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
Still thinking about your presentation idea? The submission deadline has been extended to Friday, Nov. 14, at 11:59 p.m. ET.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.