Desktop productivity for business analysts and programmers

MAX and Distinct

Accepted Solution Solved
Reply
Contributor
Posts: 59
Accepted Solution

MAX and Distinct

I have a file with over 187,000 records:

Claim number

Sequence number

Date of service

Provider number

Claim Status

The issue is when I try to distinct the list using MAX and DISTINCT - I get the same results.

Mock sample of data

claim #          Seq #              DOS         

123456          001                 01/01/2013

123456          002                 01/01/2013

121212          001                 02/01/2013

555555          001                 01/01/2013

555555          002                 01/01/2013

555555          003                 01/01/2013

In Access this would simply require a sort on claim # and Seq # and then use the Last function to get the following output:             
123456          002                 01/01/2013 

121212          001                 02/01/2013

555555          003                 01/01/2013

Thanks in advance


Accepted Solutions
Solution
‎12-26-2013 01:24 PM
Occasional Contributor
Posts: 18

Re: MAX and Distinct

Hi,

I am a little confused with your output requirement. Is it really as simple as this that you want?-

data have;                                                                                                                       
input claim $ seq $ DOS ddmmyy10.;                                                                                               
format dos ddmmyy10.;                                                                                                             
datalines;                                                                                                                       
123456 001 01/01/2013                                                                                                             
123456 002 01/01/2013                                                                                                             
121212 001 02/01/2013                                                                                                             
555555 001 01/01/2013                                                                                                             
555555 002 01/01/2013                                                                                                             
555555 003 01/01/2013                                                                                                             
;                                                                                                                                 
run;                                                                                                                              
                                                                                                                                  
                                                                                                                                  
proc sort data=  have;                                                                                                            
by claim descending seq;                                                                                                          
run;                                                                                                                              
                                                                                                                                  
data want;                                                                                                                        
set have;                                                                                                                         
by claim;                                                                                                                         
if first.claim;                                                                                                                   
run; 

You can do some similar stuff like you mentioned sorting Access. Since it looks too simple to be true, I fear my understanding is perhaps not right. Give the above a try. No harmSmiley Happy lol

Thanks,

Naveen

View solution in original post


All Replies
Trusted Advisor
Posts: 1,137

Re: MAX and Distinct

Hi,

Please try

proc sql;

  create table want as select distinct claim,max(seq) format=z3. ,dos from have

  group by claim,dos;

quit;

Thanks,

Jagadish

Thanks,
Jag
Contributor
Posts: 59

Re: MAX and Distinct

When I run the following I'm still getting unwanted lines:

claim #             seq#     dos               prvd#       clmstatus     paymentstatus

2013499501      1          09AUG2013   DR0891   31               11

2013499501      2          09AUG2013:  DR0891   31               31

In the above data (actual data but modified for HIPPA) the line of data I should see only is:

2013499501      2          09AUG2013:  DR0891   31               31

The claimnumber, provider# and dos will repeat, but the seq# increments so I need to make sure I pull the latest seq#.  Other variables will change within the dataset, which is why the distinct isnt working

PROC SQL;

     CREATE TABLE WORK.QUERY_FOR_ENCOUNTER AS

           SELECT DISTINCT      t1.CLAIMNUMBER,

                (MAX(t1.SEQNUM)) AS SEQNUM,

                t1.BEGSERVICEDATE,

                t1.PROVNUM,     

                t1.CLAIMSTATUS,

                t1.PAYMENTSTATUS

           FROM WORK.ENCOUNTER t1

                GROUP BY t1.CLAIMNUMBER, t1.BEGSERVICEDATE;

QUIT;

Super Contributor
Posts: 578

Re: MAX and Distinct

how about this?

proc sql;

create table work.tmp as

select

     claimnumber,

     beginservicedate,

     max(seqnum) as LastSeq

from

     work.encounter

group by claimnumber,beginservicedate;

create table work.want as

select t1.*

from

     work.encounter t1

     inner join (

          select

               claimnumber,

          beginservicedate,

               max(seqnum) as LastSeq

          from

               work.encounter

          group by claimnumber,beginservicedate

          ) t2

          on t1.claimnumber=t2.claimnumber

               and t1.beginservicedate=t2.beginservicedate;

quit;

you might be able to insert the tmp step into the query but I wasn't sure about your performance.

proc sql;

create table work.want as

select t1.*

from

     work.encounter t1

     inner join work.tmp t2

          on t1.claimnumber=t2.claimnumber

               and t1.beginservicedate=t2.beginservicedate;

quit;

Super Contributor
Posts: 578

Re: MAX and Distinct

another option....

proc sql;

create table work.want as

select * from work.encounter t1

where

     seqnum=(select max(seqnum) from work.encounter where claimnumber=t1.claimnumber);

quit;

Contributor
Posts: 59

Re: MAX and Distinct

I'm looking into both Smiley Happy


Solution
‎12-26-2013 01:24 PM
Occasional Contributor
Posts: 18

Re: MAX and Distinct

Hi,

I am a little confused with your output requirement. Is it really as simple as this that you want?-

data have;                                                                                                                       
input claim $ seq $ DOS ddmmyy10.;                                                                                               
format dos ddmmyy10.;                                                                                                             
datalines;                                                                                                                       
123456 001 01/01/2013                                                                                                             
123456 002 01/01/2013                                                                                                             
121212 001 02/01/2013                                                                                                             
555555 001 01/01/2013                                                                                                             
555555 002 01/01/2013                                                                                                             
555555 003 01/01/2013                                                                                                             
;                                                                                                                                 
run;                                                                                                                              
                                                                                                                                  
                                                                                                                                  
proc sort data=  have;                                                                                                            
by claim descending seq;                                                                                                          
run;                                                                                                                              
                                                                                                                                  
data want;                                                                                                                        
set have;                                                                                                                         
by claim;                                                                                                                         
if first.claim;                                                                                                                   
run; 

You can do some similar stuff like you mentioned sorting Access. Since it looks too simple to be true, I fear my understanding is perhaps not right. Give the above a try. No harmSmiley Happy lol

Thanks,

Naveen

Contributor
Posts: 59

Re: MAX and Distinct

Hmmmmm - I didn't know there is a first.function available in SAS.  I'm still new to the whole thing.  Basically I need to be able to grab the latest line of data based on the last sequence regardless of how many other variables are in the same column and if they are distinct or non distinct of one another.

I'll test out your dataset and let you know

Super Contributor
Posts: 578

Re: MAX and Distinct

the sorting option is fine...unless the data resides in an external database.  If it does, then SAS will create a local copy for sorting.  That might not be an issue for you but it can be a time/disk intensive process depending on how data volume.

Occasional Contributor
Posts: 18

Re: MAX and Distinct

True, 187,000 records seems no big deal to have a local copy i guess. But when we take time/disk into consideration, I guess proc sql will help-

proc sql;                                                                                                                          
create table want1 as                                                                                                              
select claim, max(seq) as max_seq,dos                                                                                              
from have                                                                                                                          
group by claim, dos                                                                                                                
order by claim,dos;                                                                                                                

quit;

Contributor
Posts: 59

Re: MAX and Distinct

The proc sort and data step worked great.  I couldn't get the proc sql to work due to the other variables having distinct data.

The proc sort and data set ignores all other columns and only looks at the field placed after the FIRST.

CLAIMNUMBER,

SUBCLM,

SEQNUM,

BEGSERVICEDATE,

PROVNUM,

PROVLOC,

FACILITYNAME,

FFSORCAPIND,

CLAIMSTATUS,

PAYMENTSTATUS

🔒 This topic is solved and locked.

Need further help from the community? Please ask a new question.

Discussion stats
  • 10 replies
  • 570 views
  • 5 likes
  • 4 in conversation