BookmarkSubscribeRSS Feed
vpgodbole
Fluorite | Level 6

Hi,

 

I have a dataset containing one record for each claim. it contains claim number , clm created date and clm closed date. I want to know number of pending claims at each month end, point in time.

Cl no. Clmcreatedt  Clmclosedt

101    30APR2017  01JUL2017

102    02JAN2017  08JAN2017

103    25MAR2017 07MAY2017

104    27AUG2016  29SEP2018

105    30JUN2018  .

106    30APR2018  01JUL2018

107    06OCT2015  .

108    14FEB2017  30DEC2017

109    16FEB2017 .

110    01MAY2018  27JUN2018

 

 

if i do it for single month manually the code would be ,

data want ;

set have (where=(clm_created_dt <= '30APR2018'd and (clm_closed_dt = . or clm_closed_dt > '30APR2018'd ))) ;

run ;

No. of records will be pending count for April 2018.

 

However, i don't want to write this for each month from 2015 to 2018, so i have put all month end dates (using intnx and call symput. - code not shown here) into macro variables:  lastDay1 to lastDay48.   1 for 31JAN2015 and 48 for  31DEC2018. 

 

How can i subset in the above step automatically and give one dataset for each month. the record number of each dataset is pending claim count for respective month.

 

Thank you,

16 REPLIES 16
PeterClemmensen
Tourmaline | Level 20

So your data looks like this

 

data have;
input Clno (Clmcreatedt Clmclosedt)(:date9.);
format Clmcreatedt Clmclosedt date9.;
datalines;
101 30APR2017 01JUL2017
102 02JAN2017 08JAN2017
103 25MAR2017 07MAY2017
104 27AUG2016 29SEP2018
105 30JUN2018 .
106 30APR2018 01JUL2018
107 06OCT2015 .
108 14FEB2017 30DEC2017
109 16FEB2017 .
110 01MAY2018 27JUN2018
;

Why do you want to split into separate data sets?

vpgodbole
Fluorite | Level 6

Thanks for replying, even if it doesn't require splitting its fine, I am interested in counts. Could be any method.

 

PeterClemmensen
Tourmaline | Level 20

Something like this?

 

data have;
input Clno (Clmcreatedt Clmclosedt)(:date9.);
format Clmcreatedt Clmclosedt date9.;
datalines;
101 30APR2017 01JUL2017
102 02JAN2017 08JAN2017
103 25MAR2017 07MAY2017
104 27AUG2016 29SEP2018
105 30JUN2018 .
106 30APR2018 01JUL2018
107 06OCT2015 .
108 14FEB2017 30DEC2017
109 16FEB2017 .
110 01MAY2018 27JUN2018
;

data temp;
   set have;
   Clmclosedt_end=intnx('month',Clmclosedt, 0, 'end');
   format Clmclosedt_end date9.;
run;

proc freq data=temp;
   tables Clmclosedt_end / out=want nocum;
run;
vpgodbole
Fluorite | Level 6

Thanks, what i am looking for is how many claims are pending (sill open at that time, either closing date is missing or closing date is after the respective month, 

 

So, for FEB 2017 the count would be  2, for APR 2017 count would be 1, etc.. so the frequency is taking open and closed date into consideration, it only shifts time at which we are counting (point in time).

 

E.g. For FEB2017, I want all claims opened anytime before FEB2017 (not necessarily in FEB2017),  and either not closed or closing date is after FEB2017. That will give me pending claim count point in time for FEB 2017.

 

Thank you for your time.

PeterClemmensen
Tourmaline | Level 20

You can do something like this

 

data have;
input Clno (Clmcreatedt Clmclosedt)(:date9.);
format Clmcreatedt Clmclosedt date9.;
datalines;
101 30APR2017 01JUL2017
102 02JAN2017 08JAN2017
103 25MAR2017 07MAY2017
104 27AUG2016 29SEP2018
105 30JUN2018 .
106 30APR2018 01JUL2018
107 06OCT2015 .
108 14FEB2017 30DEC2017
109 16FEB2017 .
110 01MAY2018 27JUN2018
;

proc sql;
   create table want as
   select *
         ,count(distinct Clno) as count
   from have
   where Clmclosedt=. or Clmcreatedt<=Clmclosedt
   group by year(Clmcreatedt), month(Clmcreatedt);
quit;

 

and edit the where clause to consider only year 2015 through 2018. Then use a simple RETAIN Statement in a data step to give you the accumulated count. 

ballardw
Super User

@vpgodbole wrote:

Thanks, what i am looking for is how many claims are pending (sill open at that time, either closing date is missing or closing date is after the respective month, 

 

So, for FEB 2017 the count would be  2, for APR 2017 count would be 1, etc.. so the frequency is taking open and closed date into consideration, it only shifts time at which we are counting (point in time).

 

E.g. For FEB2017, I want all claims opened anytime before FEB2017 (not necessarily in FEB2017),  and either not closed or closing date is after FEB2017. That will give me pending claim count point in time for FEB 2017.

 

Thank you for your time.


For this you need to explicitly tell us what the end date rule to consider would be. Otherwise we don't have any idea when to stop. Either a specific date, a relative date base on the creation date (do you allow "pending" to run for 5 years? 10 Years? 25 years?) or perhaps program run date.

 

Also since you show

104 27AUG2016 29SEP2018

107    06OCT2015  .

108    14FEB2017  30DEC2017

109    16FEB2017 .

that is 4 claims pending at the end of Feb 2017.

 

Perhaps:

data have;
input Clno (Clmcreatedt Clmclosedt)(:date9.);
format Clmcreatedt Clmclosedt date9.;
datalines;
101 30APR2017 01JUL2017
102 02JAN2017 08JAN2017
103 25MAR2017 07MAY2017
104 27AUG2016 29SEP2018
105 30JUN2018 .
106 30APR2018 01JUL2018
107 06OCT2015 .
108 14FEB2017 30DEC2017
109 16FEB2017 .
110 01MAY2018 27JUN2018
;
run;

data temp;
   set have;
   EndMonth = coalesce(intnx('month',Clmclosedt,0,'E'),intnx('month',today(),0,'E'));
   format endmonth PendMonth date9.;
   PendMonth = intnx('month',Clmcreatedt,0,'E');
   do while (pendmonth lt endmonth);
      Pending =1;
      output;
      pendmonth= intnx('month',pendmonth,1,'E');
   end;
run;

proc freq data=temp;
   tables pendmonth;
run;
vpgodbole
Fluorite | Level 6

The start year month would be JAN 2015 and End month and year would be DEC 2018 or could be current month and year.

novinosrin
Tourmaline | Level 20
data have;
input Clno (Clmcreatedt Clmclosedt)(:date9.);
format Clmcreatedt Clmclosedt date9.;
datalines;
101 30APR2017 01JUL2017
102 02JAN2017 08JAN2017
103 25MAR2017 07MAY2017
104 27AUG2016 29SEP2018
105 30JUN2018 .
106 30APR2018 01JUL2018
107 06OCT2015 .
108 14FEB2017 30DEC2017
109 16FEB2017 .
110 01MAY2018 27JUN2018
;
data temp;
   set have;
   Clmmonth_end=intnx('month',Clmcreatedt, 0, 'end');
    format Clmmonth_end date9.;
   keep   Clmmonth_end ;
run;


proc sql;
create table want as
select *
from temp a, have b
group by Clmmonth_end
having (Clmcreatedt <= Clmmonth_end and (Clmclosedt = . or Clmclosedt > Clmmonth_end))=1
order by Clmmonth_end,clno;
quit;
novinosrin
Tourmaline | Level 20

If the above is ok, and you want just the counts, let me know

vpgodbole
Fluorite | Level 6

Yes, this is very helpful, next I can get counts by simply running freq on Clmmonth_end?

novinosrin
Tourmaline | Level 20

Yep whichever approach you want, or do you want me to modify the above? or you wanna take a stab at it yourself? feel free!

novinosrin
Tourmaline | Level 20

just the counts:

 

data have;
input Clno (Clmcreatedt Clmclosedt)(:date9.);
format Clmcreatedt Clmclosedt date9.;
datalines;
101 30APR2017 01JUL2017
102 02JAN2017 08JAN2017
103 25MAR2017 07MAY2017
104 27AUG2016 29SEP2018
105 30JUN2018 .
106 30APR2018 01JUL2018
107 06OCT2015 .
108 14FEB2017 30DEC2017
109 16FEB2017 .
110 01MAY2018 27JUN2018
;
data temp;
   set have;
   Clmmonth_end=intnx('month',Clmcreatedt, 0, 'end');
    format Clmmonth_end Clmmonth_begin date9.;
   keep   Clmmonth_end ;
run;


proc sql;
create table want as 
select  Clmmonth_end format=monyy.,sum(Clmcreatedt <= Clmmonth_end and (Clmclosedt = . or Clmclosedt > Clmmonth_end)) as pending_claims
from temp a, have b
group by Clmmonth_end
order by Clmmonth_end;
quit;
vpgodbole
Fluorite | Level 6

Sorry everyone for the confusion..

 

If we can further develop:-

I want counts till current month and year, Your code is taking the last End date into consideration which is Jun2018.

 

What if last dt is May2018 and I want counts for June and July to show 0? or in the above case July should be zero.

 

Months to consider for this query can be JAN2015 to  current month year or DEC2018.

 

 

ballardw
Super User

@vpgodbole wrote:

Sorry everyone for the confusion..

 

If we can further develop:-

I want counts till current month and year, Your code is taking the last End date into consideration which is Jun2018.

 

What if last dt is May2018 and I want counts for June and July to show 0? or in the above case July should be zero.

 

Months to consider for this query can be JAN2015 to  current month year or DEC2018.

 

 


If this is reply to my example please note that it was prefaced with a question directly related to this: what is the end.

The COALESCE statement was to select one of two conditions since you indicated you have blank data and the order would be important. So the coalesce picks the first of the dates: the end date if present or todays date (since lacking any explicit prior requirement). You could replace the second part of the endmonth assignment with a specific date:

 

EndMonth = coalesce(intnx('month',Clmclosedt,0,'E'),'31DEC2018'd));

for example

 

Pick one: current month year (already provided) or Dec 2018 ^^^

 

You may have to describe in more detail what "Months to consider for this query can be JAN2015 " might mean as do you only want to count "pending" starting at Jan2015 if the creation date was prior to that?

Or only consider creation dates of Jan2015 or later.

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 16 replies
  • 1738 views
  • 1 like
  • 4 in conversation