I achieve my goal (code below), but it feels complicated.
I would like to make an sql join with update, but I don't know how to do this.
The data step merge to my surprise didn't work either.
My data:
data work.DAT1;
infile datalines dsd truncover;
input GR_VR:$10. term:32. end:DDMMYY10. start_date:DDMMYY10. rv_mean:PERCENT9.1;
format 'end'n start_date ddmmyy10.;
datalines4;
102,12,01/08/2012,09/03/2012,57.0%
102,12,31/01/2013,02/08/2012,57.0%
102,12,18/04/2013,01/02/2013,52.0%
102,12,22/07/2013,19/04/2013,55.0%
102,12,01/01/2014,23/07/2013,51.0%
102,12,,02/01/2014,51.0%
103,12,08/03/2012,16/11/2011,65.0%
103,12,01/08/2012,09/03/2012,65.0%
103,12,31/01/2013,02/08/2012,61.0%
103,12,18/02/2013,01/02/2013,56.0%
103,12,18/04/2013,19/02/2013,59.0%
103,12,22/07/2013,19/04/2013,59.0%
103,12,01/01/2014,23/07/2013,57.0%
103,12,,02/01/2014,58.0%
;;;;
data work.DAT2;
infile datalines dsd truncover;
input GR_VR:$10. term:32. start_date:DDMMYY10. rv_mean:PERCENT9.1;
format start_date ddmmyy10.;
datalines4;
102,24,09/03/2012,48.0%
102,36,09/03/2012,41.0%
102,48,09/03/2012,34.0%
102,60,09/03/2012,25.0%
102,72,09/03/2012,20.0%
102,24,02/08/2012,48.0%
102,36,02/08/2012,41.0%
102,48,02/08/2012,32.0%
102,60,02/08/2012,25.0%
102,72,02/08/2012,20.0%
102,24,01/02/2013,43.0%
102,36,01/02/2013,36.0%
102,48,01/02/2013,27.0%
102,60,01/02/2013,20.0%
102,72,01/02/2013,16.0%
102,24,19/04/2013,45.0%
102,36,19/04/2013,39.0%
102,48,19/04/2013,29.0%
102,60,19/04/2013,26.0%
102,72,19/04/2013,21.0%
102,24,23/07/2013,45.0%
102,36,23/07/2013,39.0%
102,48,23/07/2013,29.0%
102,60,23/07/2013,26.0%
102,72,23/07/2013,22.0%
102,24,02/01/2014,45.0%
102,36,02/01/2014,38.0%
102,48,02/01/2014,29.0%
102,60,02/01/2014,26.0%
102,72,02/01/2014,22.0%
103,24,16/11/2011,55.0%
103,36,16/11/2011,47.0%
103,48,16/11/2011,39.0%
103,60,16/11/2011,25.0%
103,72,16/11/2011,19.0%
103,24,09/03/2012,55.0%
103,36,09/03/2012,47.0%
103,48,09/03/2012,39.0%
103,60,09/03/2012,30.0%
103,72,09/03/2012,24.0%
103,24,02/08/2012,52.0%
103,36,02/08/2012,45.0%
103,48,02/08/2012,38.0%
103,60,02/08/2012,30.0%
103,72,02/08/2012,25.0%
103,24,01/02/2013,47.0%
103,36,01/02/2013,40.0%
103,48,01/02/2013,33.0%
103,60,01/02/2013,25.0%
103,72,01/02/2013,20.0%
103,24,19/02/2013,47.0%
103,36,19/02/2013,40.0%
103,48,19/02/2013,33.0%
103,60,19/02/2013,25.0%
103,72,19/02/2013,20.0%
103,24,19/04/2013,46.0%
103,36,19/04/2013,39.0%
103,48,19/04/2013,33.0%
103,60,19/04/2013,28.0%
103,72,19/04/2013,23.0%
103,24,23/07/2013,46.0%
103,36,23/07/2013,40.0%
103,48,23/07/2013,33.0%
103,60,23/07/2013,28.0%
103,72,23/07/2013,23.0%
103,24,02/01/2014,46.0%
103,36,02/01/2014,41.0%
103,48,02/01/2014,33.0%
103,60,02/01/2014,28.0%
103,72,02/01/2014,23.0%
;;;;
My solution:
data dat1a;
set dat1;
do temp=12, 24, 36, 48, 60, 72;
term=temp;
output;
end;
run;
proc sql;
create table all_tarifs(drop=rv:) as
select a.*, b.rv_mean as rv, case when rv=. then a.rv_mean else rv end as VR_tablas format=percent9.1
from dat1a a left join dat2 b
on a.gr_vr=b.gr_vr and a.start_date=b.start_date and a.term=b.term ;
quit;
proc sort data=all_tarifs;
by gr_vr start_date term;
run;
My failed attempts:
/* not working */
data all_tarifs;
merge dat1 dat2;
by gr_vr start_date term;
run;
here simplified the 'end'n date is missing but should be copied from dat1 if matches start_date and gr_vr
Is this what you want?
data all_tarifs;
merge dat1(in=in1) dat2;
retain end1;
if in1 then end1=end;
by gr_vr start_date term;
format end1 ddmmyy10.;
drop end;
run;
I achieve my goal (code below), but it feels complicated.
I would like to make an sql join with update, but I don't know how to do this.
The data step merge to my surprise didn't work either.
How can we help you if you don't tell us what is wrong and what you expect as the result? Saying "didn't work" really doesn't help. What didn't work? What did you expect as the result?
@PaigeMiller, the resulting file all_tarifs shows you the desired output. As I said, I achieve what I want.
But the data step merge for example doesn't create all rows as expected.
the dat1 file contains the term=12 and additional information.
the dat2 includes terms in (24 36 48 60 72) but has less information (here simplified the 'end'n date is missing but should be copied from dat1 if matches start_date and gr_vr).
And I wonder if the update function in sql can do what I want.
here simplified the 'end'n date is missing but should be copied from dat1 if matches start_date and gr_vr
Is this what you want?
data all_tarifs;
merge dat1(in=in1) dat2;
retain end1;
if in1 then end1=end;
by gr_vr start_date term;
format end1 ddmmyy10.;
drop end;
run;
Does below SQL return what you desire?
proc sql;
create table all_tarifs as
select distinct
a.gr_vr,
a.term,
a.start_date,
coalesce(a.rv_mean,b.rv_mean) as VR_tablas format=percent9.1
from dat1a a left join dat2 b
on
a.gr_vr=b.gr_vr
and a.start_date=b.start_date
order by
a.gr_vr, a.start_date, a.term
;
quit;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.