BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
tmm
Fluorite | Level 6 tmm
Fluorite | Level 6

I have this as my code:

proc sql;

create table implants.implants_sum2 as

(select distinct

sys_drg_cd,

drg_desc,

count(distinct claimnum) as Total_Claims,

sum(alloc_allw_amt) as Total_Allowed,

sum(epd_calc_allw_amt) as Total_EPD

from implants.implants_sum

where epd_calc_allw_amt= alloc_allw_amt or

(epd_calc_allw_amt between (alloc_allw_amt + 0.01) and (alloc_allw_amt + 0.05)) or

(epd_calc_allw_amt between (alloc_allw_amt - 0.01) and (alloc_allw_amt - 0.05))

group by sys_drg_cd, drg_desc);

run;

It is only supposed to return any alloc_allw_amt that is between 0.01 cent and 0.05 cents compared to the epd_calc_allw_amt

so if the column looks like the below a X is what should be returned based on code above:

alloc_allw_amt                          epd_calc_allw_amt                          check

1010.00                                         1010.00                                        X

1010.05                                         1010.00                                        X

1010.10                                         1010.00                      

1010.01                                         1010.00                                        X

1010.02                                         1010.00                                        X

1010.06                                         1010.02                                        X

1010.09                                         1010.20

1010.01                                         1010.00                                        X

The problem is I am getting all those amounts you see in alloc_allw_amt column returned. It should not be returning the .10 and the .11 cent differing ones.

1 ACCEPTED SOLUTION

Accepted Solutions
JasonDiVirgilio
Quartz | Level 8

select distinct 

sys_drg_cd,

drg_desc,

count(distinct claimnum) as Total_Claims,

...i'm curious as to what the red DISTINCT is intended to do that the blue one does not and that the GROUP BY cannot do.

I'm not sure we can answer the question without seeing some values for the other two columns mentioned.

View solution in original post

10 REPLIES 10
JasonDiVirgilio
Quartz | Level 8

I'm not sure what values you are using for sys_drg_cd and drg_desc, but you wouldn't need a DISTINCT clause when you are also using a GROUP BY.  I don't think you need the DISTINCT option where you are selecting columns in the SELECT part.

tmm
Fluorite | Level 6 tmm
Fluorite | Level 6

I have to have the sys_drg_cd and sys_drg_desc

This is how I am group by and must have the group by these.

JasonDiVirgilio
Quartz | Level 8

You can keep those two columns by using the GROUP BY, you don't need the DISTINCT option.

Your example data doesn't show what values those two columns have, the DISTINCT option may be causing the issue.

tmm
Fluorite | Level 6 tmm
Fluorite | Level 6

Without the distinct it does not distinctly count my claimnums. If I take it out for example it will group the sys_drg_cd that is 0001 as 479 distinct claim numbers. By putting in distinct it does the correct claim count of 173 and sum the monetary columns as it should. All I need is to have the between stuff work right. That is what the problem is.

Haikuo
Onyx | Level 15

This is indeed puzzling. I have tried the following, and it does what it supposedly do.

data have;

infile cards truncover;

input alloc_allw_amt epd_calc_allw_amt check$;

cards;

1010.00 1010.00 X

1010.05 1010.00 X

1010.10 1010.00 

1010.01 1010.00 X

1010.02 1010.00 X

1010.06 1010.02 X

1010.09 1010.20

1010.01 1010.00 X

;

proc sql;

create table want as

(select *

from have

where epd_calc_allw_amt= alloc_allw_amt or

(epd_calc_allw_amt between (alloc_allw_amt + 0.01) and (alloc_allw_amt + 0.05)) or

(epd_calc_allw_amt between (alloc_allw_amt - 0.01) and (alloc_allw_amt - 0.05))

);

run;

proc print;run;

And I don't think 'group by' will change the behavior, as 'where' takes effect before 'group by'. I know it may sound highly unlikely, but what is the length of your numeric variables? If it is much less than the default '8', then precision could be the issue. Also, what is your running platform?

Regards,

Haikuo

JasonDiVirgilio
Quartz | Level 8

select distinct 

sys_drg_cd,

drg_desc,

count(distinct claimnum) as Total_Claims,

...i'm curious as to what the red DISTINCT is intended to do that the blue one does not and that the GROUP BY cannot do.

I'm not sure we can answer the question without seeing some values for the other two columns mentioned.

tmm
Fluorite | Level 6 tmm
Fluorite | Level 6

Ah that distinct. I took that out and got the correct claim counts now. I also think that the reason I am confused on the monetary is because when it sums it up by the sys_drg_cd or groups by I should say, it takes all DRG's that are 0001 which is a heart transplant and totals those claims and then I get the sum of the alloc_amt and epd_amt which is not within the 0.01 or 0.05 cents however I am asking for each claim to be within that monetary amount, not the final sum of information. So I can see why some of these are off by more than 0.05 cents.


JasonDiVirgilio
Quartz | Level 8

I'm glad you got the first part working...I'm not sure I understand the issue with the second part. I might suggest trying to break it into two separate steps if you are truly trying to do two separate things in one step. It may be easier to maintain/understand your own code.

art297
Opal | Level 21

It would help if you provided a sample input data and, while it probably won't affect your result, you should end your proc sql code with a quit rather than a run statement.

With the following data, you would get the desired result, using YOUR current code:

data implants.implants_sum;

  input sys_drg_cd drg_desc claimnum alloc_allw_amt  epd_calc_allw_amt;

  cards;

1 1 1 1010.00  1010.00

2 2 2 1010.05  1010.00

3 3 3 1010.10  1010.00

4 4 4 1010.01  1010.00

5 5 5 1010.02  1010.00

6 6 6 1010.06  1010.02

7 7 7 1010.09  1010.20

8 8 8 1010.01  1010.00

;

proc sql;

create table implants.implants_sum2 as

(select distinct

sys_drg_cd,

drg_desc,

count(distinct claimnum) as Total_Claims,

sum(alloc_allw_amt) as Total_Allowed,

sum(epd_calc_allw_amt) as Total_EPD

from implants.implants_sum

where epd_calc_allw_amt= alloc_allw_amt or

(epd_calc_allw_amt between (alloc_allw_amt + 0.01) and (alloc_allw_amt + 0.05)) or

(epd_calc_allw_amt between (alloc_allw_amt - 0.01) and (alloc_allw_amt - 0.05))

group by sys_drg_cd, drg_desc);

quit;

ballardw
Super User

I suspect ou are getting the unwanted values because you are asking for them here:

where epd_calc_allw_amt= alloc_allw_amt or <other stuff>

if the above 2 variables have the objectionable value 1010.10 they are accepted as this part of the selection criteria is true.

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 10 replies
  • 2768 views
  • 0 likes
  • 5 in conversation