Dear
I am trying to use a PROC SQL step to reduce the amount of code in my program.
I need to calculate baseline values for weight and height by comparing dates in ex with vs.
Output needed:
id weightbl heightbl
1 70 170
The vdate should be less than or equal to edate, and I need to select the last obs on or before edate. Please help with my code. Thank you.
data vsone;
input id tcd $ vdate $10. value;
datalines;
1 weight 2016-04-06 69
1 weight 2016-04-18 70
1 weight 2016-04-19 .
1 weight 2016-05-18 80
1 height 2016-04-16 169
1 height 2016-04-18 170
;
data exone;
input id edate $19.;
datalines;
1 2016-04-26T13:10
;
proc sql;
create table want as
select *
from vone as a left join eone as b
on a.id=b.id and input(a.vdate, is8601da.)- datepart(input(b.edate, is8601dt.) =< 1
where value ne .
group by id
having max(input(a.vdate,is8601da.)- datepart(input(b.edate,is8601dt.)));
quit;
Your SQL is certainly not much shorter, and absolutely much more difficult to follow than a data step.
> the last obs close to on or before edate
Processing records in order screams data step.
data VSONE;
input ID TCD $ VDATE yymmdd10. VALUE;
datalines;
1 weight 2016-04-06 69
1 weight 2016-04-18 70
1 weight 2016-04-19 .
1 weight 2016-05-18 80
1 height 2016-04-16 169
1 height 2016-04-18 170
run;
data EXONE;
input ID EDATE yymmdd10.;
datalines;
1 2016-04-26T13:10
run;
data WANT;
retain WEIGHTBL HEIGHTBL;
keep ID WEIGHTBL HEIGHTBL;
merge VSONE (rename=(VALUE=VALW VDATE=DATEW) where=(TCD='weight' & VALW))
VSONE (rename=(VALUE=VALH VDATE=DATEH) where=(TCD='height' & VALH))
EXONE ;
by ID;
if first.ID then call missing( WEIGHTBL, HEIGHTBL);
if DATEW<=EDATE then WEIGHTBL=VALW;
if DATEH<=EDATE then HEIGHTBL=VALH;
if last.ID then output;
run;
WEIGHTBL HEIGHTBL ID
70 170 1
If you insist on SQL then:
data VSONE;
input ID TCD $ VDATE yymmdd10. VALUE;
datalines;
1 weight 2016-04-06 69
1 weight 2016-04-18 70
1 weight 2016-04-19 .
1 weight 2016-05-18 80
1 height 2016-04-16 169
1 height 2016-04-18 170
;
data EXONE;
input ID EDATE yymmdd10.;
datalines;
1 2016-04-26T13:10
;
proc sql;
select
a.id, b.value as weight, c.value as height
from
exone as a left join
(select * from vsone where tcd="weight" and value is not missing) as b
on a.id=b.id and a.edate>= b.vdate left join
(select * from vsone where tcd="height" and value is not missing) as c
on a.id=c.id and a.edate>= c.vdate
group by a.id
having b.vdate=max(b.vdate) and c.vdate=max(c.vdate);
quit;
Rather than hardcoding "height" and "weight", I would use PROC TRANSPOSE to get the final table:
proc sql;
create table last as
select vsone.*
from vsone join exone
on vsone.id=exone.id and vsone.vdate <= exone.edate
where value ne .
group by vsone.id,tcd
having vsone.vdate=max(vsone.vdate);
quit;
proc transpose data=last out=want(drop=_name_) suffix=bl;
by id;
var value;
id tcd;
run;
I would have used the code window to post code, but for some reason "paste" does not work right now (I just got a new Windows version, maybe that is why).
Rather than converting all the date texts to SAS dates, you can use the fact that they sort correctly as text (both formats start with yyyy-mm-dd). I dropped the aliases; they do save a few keystrokes, but otherwise they just make the program harder to read.
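To illustrate why the text comparison works, here is a small sketch using two values from the sample data. The variable names on_or_before and after are made up for the illustration. SAS pads the shorter string with blanks before comparing, and a blank sorts before "T", so a date-only value on the same day as the datetime should still count as on or before it.

```sas
data _null_;
  length vdate $10 edate $19;
  edate = '2016-04-26T13:10';

  /* Same calendar day: the first 10 characters match, then blank < 'T' */
  vdate = '2016-04-26';
  on_or_before = (vdate <= edate);
  put vdate= edate= on_or_before=;

  /* Later calendar day: the comparison is decided within the date part */
  vdate = '2016-05-18';
  after = (vdate > edate);
  put vdate= edate= after=;
run;
```

Both flags should be written to the log as 1. This only holds while every value is a well-formed yyyy-mm-dd string; partial or differently formatted dates would need real conversion.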
I also dropped the left join. But maybe you want the highest VDATE when there is no EDATE? Then use the left join again, and add "or exone.edate is null" to the where clause.
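One possible reading of that left join variant is sketched below; it is an assumption about what was meant, not tested against real data. The date condition is moved from the ON clause into the WHERE clause, so that rows with no matching EXONE record are kept and rows with a later VDATE are still excluded when an EDATE exists.

```sas
proc sql;
  create table last as
  select vsone.*
  from vsone left join exone
    on vsone.id = exone.id
  /* keep non-missing values that are on or before edate,
     or any non-missing value when the id has no edate at all */
  where value ne .
    and (vsone.vdate <= exone.edate or exone.edate is null)
  group by vsone.id, tcd
  having vsone.vdate = max(vsone.vdate);
quit;
```

Leaving the date comparison in the ON clause of a left join would behave differently: VSONE rows dated after EDATE would come through with a null EDATE and then pass the "is null" test.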