Hi!
I am trying to run the code below in SAS enterprise guide:
proc sql;
create table temp.lab as
select l.gp_patidf, l.gp_exadate as datum, coalesce(l.gp_examne, l.gp_exacod, l.gp_exashb) as code, l.gp_exatxt1,
ROW_NUMBER() OVER(PARTITION BY gp_patidf, gp_examne, gp_exashb ORDER BY gp_exadate desc) AS tel
from sample_lab as l
where exists (
select *
from clean.COVID_infection ep
where l.gp_patidf=ep.gp_patidf
)
and year(gp_exadate)>=2019
and (
gp_exanote like '%covid%'
or gp_examne in ('SCVTBMT','SCVAB','SCVGB','SCVMB','SCATRA','SCASRA','SCVRBK','SCVPRK','SCVRRM','SCVPRM','SCVRRS','SCVPRS','SCVRRA','SCVPRA','ZKTDAQ')
or gp_exacod in ('4002','4005','4004','4003','4033','4012','4029','3999','4030','4000','4031','4001','4032','4011','3234')
or gp_exashb in ('COVID19STA','COVIDBEL')
or gp_epsicpc='R83.03'
);
quit;
I then receive the following error:
28 proc sql;
29 create table temp.lab as
30 select l.gp_patidf, l.gp_exadate as datum, coalesce(l.gp_examne, l.gp_exacod, l.gp_exashb) as code, l.gp_exatxt1,
31 ROW_NUMBER() OVER(PARTITION BY gp_patidf, gp_examne, gp_exashb ORDER BY gp_exadate desc) AS tel
____
22
76
ERROR 22-322: Syntax error, expecting one of the following: !, !!, &, *, **, +, ',', -, /, <, <=, <>, =, >, >=, ?, AND, BETWEEN,
CONTAINS, EQ, EQT, GE, GET, GT, GTT, LE, LET, LIKE, LT, LTT, NE, NET, OR, ^=, |, ||, ~=.
ERROR 76-322: Syntax error, statement will be ignored.
From other topics I have read, I now understand that row_number () and partition by functions are not available in SAS SQL.
Some solutions to this, suggested in other topics, I found difficult to incorporate in my code.
Does anyone have an alternative?
To clarify, I want to know the most recent date (gp_exadate)
and partition by person (gp_patidf)
, gp_examne, gp_exashb.
This is needed to further on use in code below (using created variable 'tel'):
proc sql;
create table covid_infection_def as
select epi.gp_patidf,
epi.gp_epsicpc,
case when bel.code is not null then 1 else 0 end as quest,
case when epi.datum<j.datum then epi.datum else coalesce(j.datum, epi.datum) end as datum,
v.datum as cordatum,
sta.gp_exatxt1 as Covstatus,
bel.gp_exatxt1 as CovBeleid,
zdag.gp_exatxt1 as ZKTDAQ,
case when epi.hosp=1 or j.hosp=1 or v.hosp=1 then 1 else 0 end as hosp,
case when epi.ic=1 or j.ic=1 or v.ic=1 then 1 else 0 end as ic,
epi.overleden,
epi.gp_epsnote as icpcprob
from clean.COVID_infection epi
left join temp.jour j
on epi.gp_patidf=j.gp_patidf
left join lab sta
on epi.gp_patidf=sta.gp_patidf and sta.code='COVID19STA' and sta.tel=1 /*laatste resultaat per meting per patient*/
left join lab bel
on epi.gp_patidf=bel.gp_patidf and bel.code='COVIDBEL' and bel.tel=1
left join lab zdag
on epi.gp_patidf=zdag.gp_patidf and (zdag.code='ZKTDAQ' or zdag.code='3234') and zdag.tel=1
left join temp.verw v
on epi.gp_patidf=v.gp_patidf;
quit;
Instead of creating a row number for a group and then selecting the lowest of that, use a HAVING clause for the group
having exadate = max(exadate)
Quick example:
data have;
input id date :yymmdd10. value;
format date yymmdd10.;
datalines;
1 2020-02-25 5
1 2020-03-20 6
2 2022-04-01 9
2 2022-03-10 7
;
proc sql;
create table want as
select * from have
group by id
having date = max(date);
quit;
Check if you can run a SQL-PASS-THROUGH Query so you'd not have to change the code
- Cheers -
data have;
set sashelp.class;
run;
ods select none;
ods output sql_results=temp;
proc sql number;
select * from have order by sex;
create table want as
select *,row-min(row)+1 as n
from temp
group by sex
order by row;
quit;
ods select all;
Amazing! I can have multiple columns in group/order by clause. SAS log always have "NOTE: The query requires remerging summary statistics back with the original data." Good alternative to row_number() over ([partition_by_clause] order_by_clause) say Impala SQL. Thanks! @Ksharp
Instead of creating a row number for a group and then selecting the lowest of that, use a HAVING clause for the group
having exadate = max(exadate)
Quick example:
data have;
input id date :yymmdd10. value;
format date yymmdd10.;
datalines;
1 2020-02-25 5
1 2020-03-20 6
2 2022-04-01 9
2 2022-03-10 7
;
proc sql;
create table want as
select * from have
group by id
having date = max(date);
quit;
Are you just looking for the maximum date? Then use PROC SUMMARY.
If you want to also get the value of some other variables from the observation with the maximum date then look at the IDGROUP option of the OUTPUT statement in PROC SUMMARY.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.