I have tried to find the answer for this but have had no luck. I have searched and searched my SAS books and online and found nothing that works. Here is my code:
proc sql;
create table pci.alla as
(select distinct
mbr_sys_id,
substr(clm_aud_nbr, 1,10) as claim,
fst_srvc_dt,
adj_hccc_cd,
prov_tin,
mpin,
full_nm,
chrg_sts_cd,
AMA_PL_OF_SRVC_DESC,
MAJ_MKT_NM,
max(mkt_row_eff_dt) as mkt_row_eff_dt format=date9.,
mkt_row_end_dt,
finc_arng_cd,
sys_drg_cd
from pci.all
group by mbr_sys_id,
clm_aud_nbr,
fst_srvc_dt,
adj_hccc_cd,
prov_tin,
mpin,
full_nm,
chrg_sts_cd,
AMA_PL_OF_SRVC_DESC,
MAJ_MKT_NM,
mkt_row_end_dt,
finc_arng_cd,
sys_drg_cd);
run;
The results are duplicated for the mkt row eff dt. It comes over like this
taxid mkt row eff dt maj mkt nm
111 01jan1996 new england
111 01jan1997 maine
111 01jan1998 connecticut
222 01feb1990 new england
222 01feb1991 new jersey
222 01feb1997 new york
333 01sep1990 new york
333 01sep1993 new jersey
333 01sep2004 philadelphia
444 01dec1998 pittsburgh
444 01sep2006 new york
The goal is to get the maximum of the mkt row eff dt with its corresponding market name. So in doing the max (and I am only doing these rows because the output is very large and also PHI related and just doing an example here so I realize I am missing the mbr_sys_id output, etc, etc.........the problem is solely in the mkt row eff dt that I am not getting the maximum date per the taxid) What I should see is all the fields returned but it should be like this:
taxid mkt row eff dt maj mkt nm
111 01jan1998 connecticut
222 01feb1997 new york
333 01sep2004 philadelphia
444 01sep2006 new york
This is because for instance taxid 111 that provider practiced in new england and maine prior to connecticut. Connecticut is the last place they practiced as of 1998 therefore I need that date but my max is not working. It gives me all of the rows and I know it has something to do with it is seeing 01jan 3 times and not the year. However, as you see......there are also variances in the months.
Remove MAJ_MKT_NM from your group by clause. You're telling SAS to group by that variable so it calculates the max per market, as you've requested.
And add in having max(effective_date)=effective date AFTER your group by clause, correcting the variable name.
When I run this it gives me an error: summary functions nested this way are not supported.
I did the end of it like:
group by mbr_sys_id,
clm_aud_nbr,
fst_srvc_dt,
adj_hccc_cd,
prov_tin,
mpin,
full_nm,
chrg_sts_cd,
AMA_PL_OF_SRVC_DESC,
mkt_row_end_dt,
finc_arng_cd,
sys_drg_cd
having max(mkt_row_eff_dt)= mkt_row_eff_dt);
run;
I don't have time to test anything but that does seem strange.
Things I would test:
1) what happens if I leave out the having clause
2) What happens if I don't call the max the same variable name
3) What happens if I don't select the market name in the sql call but merge it in afterwards.
would you be able to use something like this?
proc sql;
select
t1.*
from
have t1
where
mkt_row_eff_dt = (select max(mkt_row_eff_dt) from have where taxid=t1.taxid);
quit;
Hey tmm,
I would use "first." processing in the DATA step to achieve your goal.
Would you consider using the DATA step for this? If so, the code would be much easier to manage for this application.
Huey
I think tmm would actually need last. processing but that would be much simpler.
tmm,
Use the DATA step if you can help it!!!!
Huey
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.