PROC SQL works as intended:
proc sql;
create table ds as select * from sdtm.ds
where dscat='DISP EVENT'
group by usubjid
having visitnum=max(visitnum)
order by usubjid, visitnum;
run;
quit;
PROC FEDSQL gives ERROR: Column "DS.STUDYID" must be GROUPed or used in an aggregate function
It seems that FedSQL requires every selected column to be listed in the GROUP BY clause, but if I group by all of them, the query no longer returns only the latest visit.
Is there a way to replicate the PROC SQL code in FedSQL?
proc fedsql;
create table ds3 as select * from sdtm.ds
where dscat='DISP EVENT'
group by usubjid
having visitnum=max(visitnum)
order by usubjid, visitnum;
run;
quit;
The remerging of summary statistics with the original data that PROC SQL does is not ANSI standard, and FedSQL appears to follow the standard, at least in this case.
In later versions of the SQL standard, this kind of functionality is handled by window functions. To my knowledge, however, they are not (yet) implemented in FedSQL.
So the alternative would be to do the grouping in an in-line view and join with the original data.
Thank you for the information; I used your suggestion and looked up an ANSI solution. Here it is:
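proc fedsql;
create table ds3 as select a.*
from sdtm.ds a,
/* filter inside the inline view too, so the max is taken over DISP EVENT rows only, as in the PROC SQL version */
(select max(visitnum) as maxvis, usubjid from sdtm.ds
where dscat='DISP EVENT' group by usubjid) b
where a.usubjid=b.usubjid and
a.visitnum=b.maxvis and
a.dscat='DISP EVENT'
;
run;
quit;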
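One thing worth checking with the inline-view approach is that the maximum is computed over the same rows the outer query keeps. Here is a minimal sketch on made-up data; the table names work.ds_toy and work.latest_disp and all values are hypothetical, and the DROP TABLE ... FORCE line is only there so the step can be rerun. Subject 01 has a later record that is not a disposition event. With the filter inside the inline view, each subject should come back with visit 2. If the inner WHERE is left out, subject 01's maximum becomes visit 3, which the outer filter then discards, so that subject drops out of the result.

```sas
/* Toy data (hypothetical): subject 01's latest record is not a DISP EVENT */
data work.ds_toy;
  length usubjid $8 dscat $20;
  usubjid='01'; visitnum=1; dscat='DISP EVENT';         output;
  usubjid='01'; visitnum=2; dscat='DISP EVENT';         output;
  usubjid='01'; visitnum=3; dscat='PROTOCOL MILESTONE'; output;
  usubjid='02'; visitnum=1; dscat='DISP EVENT';         output;
  usubjid='02'; visitnum=2; dscat='DISP EVENT';         output;
run;

proc fedsql;
  /* allow reruns: drop the output table if it already exists */
  drop table work.latest_disp force;

  /* latest DISP EVENT record per subject, using an inline view */
  create table work.latest_disp as
  select a.*
  from work.ds_toy a,
       (select usubjid, max(visitnum) as maxvis
        from work.ds_toy
        where dscat='DISP EVENT'   /* same filter as the outer query */
        group by usubjid) b
  where a.usubjid=b.usubjid
    and a.visitnum=b.maxvis
    and a.dscat='DISP EVENT';
quit;

/* inspect the result */
proc print data=work.latest_disp;
run;
```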