I have insurance claim database. I want to know from which hospital I got maximum claims. How can I do this in PROC SQL procedure?
Thanks in advance.
Arun
Linlin,
Count(*) will count all of the rows. In the cases when only populated fields need be counted, the variable names will be needed:
data have;
input hospital $ claim;
cards;
a 23
b 45
a .
b 43
a 55
;
proc sql ;
select
hospital,count(claim) as count
from have
group by hospital
order by count desc;
quit;
Haikuo
Can you post some sample/fake data that we can have a taste of it?
Here is sample data for Claim table:
BenefUserID | ClmAmount | ApproveAmount | StatusName | TreatmentName | HospId |
108 | 20133 | 17121 | Settled | Surgical Management | HopID-846 |
114 | 37000 | 28080 | Settled | Cataract | HopID-490 |
117 | 40000 | 40000 | Settled | Surgical Management | HopID-32 |
141 | 14078 | 0 | Cancelled | Cataract | HopID-766 |
147 | 16837 | 15637 | Settled | Surgical Management | HopID-490 |
148 | 12053 | 11953 | Settled | Medical Management ( Conservartive Management) | HopID-846 |
159 | 50000 | 20000 | Settled | Surgical Management | HopID-499 |
186 | 21231 | 16000 | Settled | Medical Management ( Conservartive Management) | HopID-490 |
If one row is equivalent to one patient, then linlin's solution is good for you. if you only care the number of different patient, the use: count (distinct BenefUserID)
Regards,
Haikuo
do you want something like below?
data have;
input hospital $ claim;
cards;
a 23
b 45
a 56
b 43
a 55
;
proc sql outobs=1;
select
hospital,count(claim) as count
from have
group by hospital
order by count desc;
quit;
updated after Doc Muhlbaier and Haikuo's comments. Thank you!
Linlin -- slick! Learned something new; good way to start the day.
One correction, no parentheses around the OUTOBS=1 option.
Linlin,
Count(*) will count all of the rows. In the cases when only populated fields need be counted, the variable names will be needed:
data have;
input hospital $ claim;
cards;
a 23
b 45
a .
b 43
a 55
;
proc sql ;
select
hospital,count(claim) as count
from have
group by hospital
order by count desc;
quit;
Haikuo
A nice way of limiting the output - looks like it'll be a handy workaround for the lack of "top" in SAS's SQL implementation.
One thing worth noting though is that in the event of a tie it would only print out the first. For example if two hopitals both had 50 claims, only the first would be output - presumably this is determined by input data order, but I guess we'd have to take it as arbitrary.
DF: While I, too, learned something new from Linlin's post, I agree that neither of the suggestions address ties. How about:
data have;
input hospital $ claim;
cards;
a 23
b 45
a .
b 43
a 55
c 55
;
proc sql ;
select hospital from
(select hospital,count(claim) as counts
from have
group by hospital)
having counts=max(counts)
;
quit;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.