I am connecting to a DB2 server and trying to run this query:
proc sql;
connect odbc (dsn);
create table claims as select * from connection to odbc
(select distinct
mbr_id,
prov_mpin,
prov_id,
(max(prov_row_eff_dt)) as prov_eff_dt
from co.table
where mbr_id>0);
order by mbr_id;
disconnect from odbc;
quit;
an expression starting with prov_id specified in select clause, having clause or order by clause is not specified in the group by or the select clause
having clause or oder by clause with a column function and no group by clause is specified.
So I moved the prov_id to above the prov_mpin and then I get the same error but it relates to prov_mpin. so that tells me something is wrong with the max statement. if i taked the max out and just put in prov_row_eff_dt as prov_eff_dt the query runs. So something I am doing is not right with just that max portion. I tried removing the () around the max and that did not work. I am trying to get the max date of the prov_row_eff_dt so I get the newest prov_id.
So I should have
where mbr_id>0
group by mbr_id, prov_id, prov_row_eff_dt);
order by mbr_id;
disconnect from odbc;
quit;
If you use max(), min(), sum(), count(), freq, n, range, nmiss, std and many many others, which are so called summary functions, you HAVE to use 'group by' at the same time.
Regards,
Haikuo
As an add on to Hai.Kuo comments, if you're using GROUP BY in SQL pass through you'll also have to explicitly remerge back all other fields you want that aren't in the Group BY statement or in a summary function because of SQL restrictions.
So I should have
where mbr_id>0
group by mbr_id, prov_id, prov_row_eff_dt);
order by mbr_id;
disconnect from odbc;
quit;
I don't think so. Try this:
proc sql;
connect odbc (dsn);
create table claims as
select * from connection to odbc (
select mbr_id
, prov_mpin
, prov_id
, max(prov_row_eff_dt) as prov_eff_dt
from co.table
where mbr_id > 0
group by 1,2,3
)
order by mbr_id;
disconnect from odbc;
quit;
I reformatted your query to make it easier to read. I removed the DISTINCT keyword because I don't think it is needed, and I used column references (1,2,3) in the GROUP BY clause, which I'm pretty sure will work. I also deleted a stray semi-colon from your example code (in front of your ORDER BY clause).
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.