PROC SQL works as intended:
proc sql;
create table ds as select * from sdtm.ds
where dscat='DISP EVENT'
group by usubjid
having visitnum=max(visitnum)
order by usubjid, visitnum;
run;
quit;
PROC FEDSQL, with the same query (shown below), gives ERROR: Column "DS.STUDYID" must be GROUPed or used in an aggregate function
It seems that FedSQL requires every selected column to appear in the GROUP BY clause, but if I group by all of them, the query no longer returns only the latest visit.
Is there a way to replicate the SQL code with FEDSQL?
proc fedsql;
create table ds3 as select * from sdtm.ds
where dscat='DISP EVENT'
group by usubjid
having visitnum=max(visitnum)
order by usubjid, visitnum;
run;
quit;
The remerging of summary statistics with the original data that PROC SQL performs is not ANSI standard, and FedSQL appears to follow the standard, at least in this case.
In later implementations of SQL, this kind of functionality is handled by window functions. To my knowledge, however, they are not (yet) implemented in FedSQL.
So the alternative would be to do the grouping in an in-line view and join the result with the original data.
Thank you for the information; I used your suggestion and looked up an ANSI solution. Here it is:
proc fedsql;
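create table ds3 as select a.*
from sdtm.ds a,
/* Latest disposition-event visit per subject. The dscat filter is applied */
/* here too, so maxvis is computed over the same rows as in the PROC SQL query. */
(select max(visitnum) as maxvis, usubjid from sdtm.ds
where dscat='DISP EVENT' group by usubjid) b
where a.usubjid=b.usubjid and
a.visitnum=b.maxvis and
a.dscat='DISP EVENT'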
;
run;
quit;
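One way to confirm that the FedSQL table matches the PROC SQL one is PROC COMPARE. This is only a minimal sketch: it assumes both ds and ds3 are still in WORK and that the standard SDTM sequence variable DSSEQ is present to give a stable sort order within a subject. The two queries can only agree when max(visitnum) is computed over the same rows, so any subject whose latest record is not a disposition event is worth checking.

```sas
/* Assumption: DS and DS3 exist in WORK and contain DSSEQ. */
/* Sort both tables identically so rows line up by position. */
proc sort data=ds;
  by usubjid dsseq;
run;

proc sort data=ds3;
  by usubjid dsseq;
run;

/* LISTALL reports variables and observations found in only one table. */
proc compare base=ds compare=ds3 listall;
run;
```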